Should you bother with scaling?
• Well, it depends
• But if you're launching a startup, probably
• The best way to launch a startup these days
The Predecessors
• Other great places to look for info on this:
  • poocs.net: The Adventures of Scaling Rails
    http://poocs.net/2006/3/13/theadventuresofscalingstage1
  • Stefan Kaes: Performance Rails
    http://railsexpress.de/blog/files/slides/rubyenrails2006.pdf
  • RobotCoop blog and gems
    http://www.robotcoop.com/articles/2006/10/10/thesoftwareandhardwarethatrunsoursites
  • O'Reilly book: High Performance MySQL
    It's not Rails, but it's really useful
Big Picture
• This presentation will concentrate on what's
Who we are
• Scribd.com
• Like YouTube for documents
• Launched in March 2007
• Handles ~1M requests per day
Key Points
• General architecture
• Use fragment caching!
• Rolling your own traffic analytics and some SQL tips
Current Scribd architecture
• 1 web server
• 3 database servers
• 3 document conversion servers
• Test and backup machines
• Amazon S3
Server Hardware
• Dual, dual-core Woodcrests at 3 GHz
• 16 GB of memory
• 4 15K SCSI hard drives in a RAID 10
• We learned: disk speed is important
• Don't skimp: you're not Google, and it's
Various software details
• CentOS
• Apache/Mongrel
• Memcached, RobotCoop's memcache-client
• Stefan Kaes' SQLSessionStore
  • Best way to store persistent sessions
• Monit, Capistrano
• Postfix
Fragment Caching
• "We don't use any page or fragment caching." (robotcoop)
• "Play with fragment caching... no improvement, changes were reverted at a later time." (poocs.net)
• Well, maybe it's application specific
• Scribd uses fragment caching extensively, enormous performance improvement
(screenshot)
How to Use Fragment Caching
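The bullets for this slide did not survive, so as a rough illustration of the idea only: fragment caching stores the rendered output of an expensive piece of a page under a key and reuses it on later requests. The plain-Ruby sketch below is not Rails code. `FragmentStore`, `render_sidebar`, and the key name are hypothetical stand-ins for the `cache` view helper and a real partial.

```ruby
# A tiny in-memory stand-in for a fragment store (hypothetical, not Rails).
class FragmentStore
  attr_reader :misses

  def initialize
    @fragments = {}
    @misses = 0
  end

  # Return the stored fragment for +key+; run the block to render it
  # only when nothing is stored under that key yet.
  def cache(key)
    return @fragments[key] if @fragments.key?(key)

    @misses += 1
    @fragments[key] = yield
  end
end

store = FragmentStore.new

# Hypothetical expensive partial: imagine several SQL queries behind it.
render_sidebar = -> { "<ul><li>Most viewed documents</li></ul>" }

first  = store.cache("sidebar/popular") { render_sidebar.call } # renders
second = store.cache("sidebar/popular") { render_sidebar.call } # served from the store
```

Only the first call pays the rendering cost; every later call with the same key skips the block entirely.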
Expiring fragments, 1. Time-based
• You should really use memcached for storing fragments
  • Better performance
  • Easier to scale to multiple servers
  • Most important: allows time-based expiration
• Use plugin http://agilewebdevelopment.com/plugins/memcache_fragments_with_time_expiry
• Dead easy:
<% cache 'keyname', :expire => 10.minutes do %>
...
<% end %>
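To make the time-based behaviour concrete, here is a minimal plain-Ruby sketch of a store that re-renders a fragment once its lifetime has passed. It is not the plugin's implementation. `ExpiringFragmentStore` and the injectable clock are assumptions made so the example runs without memcached and without sleeping.

```ruby
# Hypothetical in-memory store with per-fragment expiry (memcached does this for you).
class ExpiringFragmentStore
  Entry = Struct.new(:html, :expires_at)

  # +clock+ is injectable so expiry can be demonstrated without waiting.
  def initialize(clock = -> { Time.now })
    @clock = clock
    @entries = {}
  end

  # Serve the stored fragment while it is still fresh; otherwise render
  # it again with the block and store it for +expire+ seconds.
  def cache(key, expire:)
    now = @clock.call
    entry = @entries[key]
    return entry.html if entry && now < entry.expires_at

    html = yield
    @entries[key] = Entry.new(html, now + expire)
    html
  end
end

now = Time.at(0)
store = ExpiringFragmentStore.new(-> { now })
renders = 0
render = -> { renders += 1; "rendered #{renders}" }

a = store.cache("keyname", expire: 600) { render.call } # first request: renders
now += 300
b = store.cache("keyname", expire: 600) { render.call } # 5 minutes later: still cached
now += 301
c = store.cache("keyname", expire: 600) { render.call } # past 10 minutes: renders again
```

The appeal of this style is that no code has to know when the underlying data changed; the cost is that readers may see data up to one lifetime old.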
Expiring fragments, 2. Manually
• No need to serve stale data
• Just use:
Cache.delete("fragment:/partials/whatever")
• Clear fragments whenever data changes
• Again, easier with memcached
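A small sketch of "clear fragments whenever data changes", assuming the delete is triggered from the code path that modifies the record. `FakeCache`, `Document`, `update_title`, and the key layout are hypothetical. In a Rails app the delete would more likely sit in a model callback or a cache sweeper.

```ruby
# Hypothetical stand-in for the memcached wrapper used on the slide.
class FakeCache
  def initialize
    @data = {}
  end

  def get(key)
    @data[key]
  end

  def put(key, value)
    @data[key] = value
  end

  def delete(key)
    @data.delete(key)
  end
end

CACHE = FakeCache.new

class Document
  attr_reader :id, :title

  def initialize(id, title)
    @id = id
    @title = title
  end

  # Assumed key layout, for illustration only.
  def fragment_key
    "fragment:/documents/#{id}/summary"
  end

  # Serve the cached fragment, rendering and storing it on a miss.
  def summary_html
    CACHE.get(fragment_key) || CACHE.put(fragment_key, "<h2>#{title}</h2>")
  end

  # Changing the data deletes the fragment, so the next read re-renders it.
  def update_title(new_title)
    @title = new_title
    CACHE.delete(fragment_key)
  end
end

doc = Document.new(1, "Old title")
before = doc.summary_html
doc.update_title("New title")
after = doc.summary_html
```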
Traffic Analytics
• Google Analytics is nice, but there are a lot of reasons to roll your own traffic analytics too
Scribd's analytics (screenshots)
Building traffic analytics, part 1
create_table :page_views do |t|
  t.column :user_id, :integer
  t.column :request_url, :string, :limit => 200
  t.column :session, :string, :limit => 32
  t.column :ip_address, :string, :limit => 16
  t.column :referer, :string, :limit => 200
  t.column :user_agent, :string, :limit => 200
  t.column :created_at, :timestamp
end
• Add a whole bunch of indexes, depending on queries
Building traffic analytics, part 2
• Create a PageView on every request
• We used a hand-built SQL query to take out
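The slide is cut off, so what exactly the hand-built query was meant to avoid is not recoverable. As a rough sketch of what assembling the per-request INSERT by hand could look like, the plain-Ruby example below builds the statement for the page_views columns from part 1. `sql_quote` and `page_view_insert_sql` are hypothetical helpers. The quoting is deliberately minimal and for illustration only; a real application should rely on the database adapter's own quoting.

```ruby
PAGE_VIEW_COLUMNS = %w[
  user_id request_url session ip_address referer user_agent created_at
].freeze

# Minimal quoting for illustration: nil becomes NULL, integers pass through,
# everything else is wrapped in single quotes with backslashes and quotes escaped.
def sql_quote(value)
  return "NULL" if value.nil?
  return value.to_s if value.is_a?(Integer)

  "'" + value.to_s.gsub("\\") { "\\\\" }.gsub("'") { "''" } + "'"
end

# Build one INSERT statement for a single request.
def page_view_insert_sql(attrs)
  values = PAGE_VIEW_COLUMNS.map { |col| sql_quote(attrs[col.to_sym]) }
  "INSERT INTO page_views (#{PAGE_VIEW_COLUMNS.join(', ')}) " \
    "VALUES (#{values.join(', ')})"
end

sql = page_view_insert_sql(
  user_id: 42,
  request_url: "/doc/1",
  session: "abc",
  ip_address: "10.0.0.1",
  referer: nil,
  user_agent: "O'Browser",
  created_at: "2007-05-01 12:00:00"
)
```

The resulting string could then be handed to the connection's `execute` method, the same call the slides use later for slave queries.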
Building Traffic Analytics, part 3
• Scales pretty well
• BUT analytics queries expensive, can clog up main DB server
• Our solution:
  • use two DB servers in a master/slave setup
  • move all the analytics queries to the slave
Rails with multiple databases, part 1
• "At this point in time there's no facility in Rails to talk to more than one database at a time." (Alex Payne, Twitter developer)
• Well, that's true
• But setting things up yourself is about 10 lines of code.
• There are now also two great plugins for doing this:
  • Magic multi-connections
    http://magicmodels.rubyforge.org/magic_multi_connections/
  • Acts as readonlyable
    http://rubyforge.org/frs/?group_id=3451
Rails with multiple databases, part 2
• At Scribd we use this to send predefined
Rails with multiple databases, code
• In database.yml:
slave1:
  host: 18.48.43.29 # your slave's IP
  database: production
  username: root
  password: pass
• Define a model Slave1.rb
• When you need to run a query on the slave, just do
Slave1.connection.execute("select * from some_table")
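The body of Slave1.rb is not shown on the slide. In Rails, a model class can be tied to a named database.yml entry with `establish_connection`, which gives that class its own connection. The plain-Ruby sketch below only illustrates the resulting pattern of one class per database. `FakeConnection` and `Primary` are hypothetical stand-ins and no real database is involved.

```ruby
# Hypothetical stand-in for a database connection; it records what it was asked to run.
class FakeConnection
  attr_reader :name, :log

  def initialize(name)
    @name = name
    @log = []
  end

  def execute(sql)
    @log << sql
    "#{name}: #{sql}"
  end
end

# One class per database, each holding its own connection.
class Primary
  def self.connection
    @connection ||= FakeConnection.new("primary")
  end
end

class Slave1
  def self.connection
    @connection ||= FakeConnection.new("slave1")
  end
end

# Writes stay on the main server; the expensive analytics read goes to the slave.
Primary.connection.execute("insert into page_views (request_url) values ('/doc/1')")
result = Slave1.connection.execute("select count(*) from page_views")
```

Because the choice of server is made by which class the query is sent through, analytics code can be moved off the main database without touching the rest of the application.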
Shameless Self-Promotion
• Scribd.com: VC-backed and hiring
• Just 3 people so far! >10 by end of year.
• Awesome salary/equity combination
• If you're reading this, you're probably the