Make It Scale - Optimise and Scale MySQL

Optimise and Scale MySQL
Mike Griffiths
Proven Scaling
About me and us
• Me:
 Used & administered MySQL databases for 10 years
 Consultant with Proven Scaling
 Used to be a DBA and Service Architect at Yahoo!
 I don’t always follow the party line
• Us:
 Founded in 2006
 Specialise in MySQL, but work with whole stack
 Primarily consult on architecture, design and
optimisation for large scalable systems
 We also do training, DBA work, audits, coding…
Copyright 2008 Proven Scaling Ltd / Proven Scaling LLC

Mike’s Mantras
• Scalability + Availability = Service Quality
• Be as lazy as possible
• Be paranoid
• You’re only as strong as your weakest link

Common Scenario #1
• “I don’t think scalability will ever be a problem for
me.”
 Denial
 Overconfidence
 No-one uses my product

Common Scenario #2
• “It’s all melting down! Help!”
• Find the cause
 External Factors
 Slashdot, Digg, Blogs
 Internal Factors
 New features
 Bad code
 Marketing

Pre-requisites
• Solid development process
• Version & Release control
• Test / Staging Environment
• Load testing strategy
• Good cross-functional communication

Common Scenario #3
• “I might have problems with scaling up in the
future…
 … I’ll fix them if they happen.”
 … I’ll fix them now. All of them.”

Extreme Approaches
• Wait & See
 Faster, Cheaper
 Quality of service compromised
 Slower to react to changes in workload
 Sometimes chosen because of lack of skill or knowledge
 Prototypes are useful, but they are what they are!
• Fix All Now
 Slower, Expensive
 Time to market increased
 Risk of wasted time
 Increased quality of service
 Better prepared for unexpected

Sensible Approach
• Somewhere between the two extremes
• Architect for what you might conceivably face in
the future
• Implement now what you know you will face
• Always avoid dead ends and short cuts
 Even if it means much more effort now

MySQL Scalability
• Query Optimisation
• Functional Separation
• Replication
• Sharding
• Archiving
• Isolation
• Error Handling
• Caching
• Configuration
• Hardware
• Capacity Planning

Query Optimisation
• Ensure queries are correctly indexed
• Use the slow query log
• Learn EXPLAIN
• Look carefully at queries which can’t be optimised
 Rewrite queries
 Change schema
 Denormalise
 Split into multiple queries
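
A minimal sketch of the "index it, then check the plan" loop described above. It uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN purely as a stand-in so that it runs without a server; MySQL's EXPLAIN prints different columns, but the workflow is the same. The users table, its columns and the index name are made up for illustration.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the human-readable plan lines SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # last column holds the plan detail text

# Hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

sql = "SELECT * FROM users WHERE email = ?"
args = ("user42@example.com",)

plan_before = query_plan(conn, sql, args)   # no usable index: whole table is scanned

conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn, sql, args)    # the new index is now used for the lookup

print(plan_before)
print(plan_after)
```

The same before/after comparison against a real MySQL server would be the first step for any query that shows up in the slow query log.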

Query Optimisation
• Big is generally slower
 Longer to recover, query, maintain
• Partitioning in MySQL 5.1 can sometimes help
query performance
• Is your server optimally configured?

Functional Separation
• Common prototype/start-up scenario is to have
multiple functions on one database host
 Blog
 Forum
 Monitoring
 Application
• Split into different databases
• Move different databases to different hosts
• Can provide short-term breathing space

Replication
• Can help scalability problems with reads
• Doesn’t help with scaling writes
• Asynchronous nature adds complexity to
applications
• Can be used to increase availability
 Actually, replace “can” with “should” here
• Part of a scalability solution
 … but not the whole solution
 Be aware of its strengths and weaknesses
 Know what doesn’t work with replication

Replication: Read Scalability
• Single master (write point) replicates to multiple
read-only slaves
• Too many slaves can overload a master
 Use “relay” slaves to build a replication tree
• Inefficient (cost, storage, memory) strategy for
scaling reads in isolation
 Load balancing strategy is key
 Consider having different roles for slaves
 Combine with sensible caching elsewhere in stack
for best results
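
One way to picture "different roles for slaves" plus a simple load balancing strategy is sketched below: writes always go to the single master, reads are spread round-robin over the slaves assigned to a role. The class, host names and role names are hypothetical, and a real deployment would also need health checks and failover.

```python
import itertools

class ReplicaPools:
    """Send writes to the master and round-robin reads over slaves, per role."""

    def __init__(self, master, slaves_by_role):
        self.master = master
        # One independent round-robin iterator per role.
        self._cycles = {
            role: itertools.cycle(hosts) for role, hosts in slaves_by_role.items()
        }

    def host_for(self, is_write, role="web"):
        if is_write:
            return self.master            # single write point
        return next(self._cycles[role])   # next slave in this role's rotation

# Hypothetical topology: two slaves for web traffic, one kept for slow reports.
pools = ReplicaPools(
    master="db-master",
    slaves_by_role={
        "web": ["db-web-1", "db-web-2"],
        "reporting": ["db-report-1"],
    },
)
```

Keeping reporting queries on their own slave stops a slow report from evicting the hot data that the web-facing slaves hold in memory.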

Replication: Write Scalability
• Replication doesn’t help with scaling writes
• Write-saturated master normally means write-
saturated slaves
 Row-based replication in MySQL 5.1 can help
sometimes
• Updates to slaves done in single thread
 Overall write capacity on a slave is less than the
master
• Read performance on slaves drops rapidly as write
load increases

Replication: Asynchronous
• Applications need to be aware of replication delays
• Worst case: write followed by a read
• How to handle?
 Promotion of read-only database handles to
read/write when required
 Use binary log positions to work out if a slave has
new enough data
 Can other users be shown out-of-date data?
• Can be very difficult to add handling for replication
delays into existing applications
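
The binary log position idea can be sketched as follows, assuming the application records the master's coordinates after a write (File and Position from SHOW MASTER STATUS) and knows how far each slave has applied (Relay_Master_Log_File and Exec_Master_Log_Pos from SHOW SLAVE STATUS). The helper names and host names are hypothetical; if no slave has caught up, the read is "promoted" to the master.

```python
from typing import Dict, NamedTuple, Optional

class BinlogPos(NamedTuple):
    file_index: int  # numeric suffix of the log file, e.g. 42 for "mysql-bin.000042"
    offset: int      # byte position within that file

def parse_pos(log_file: str, offset: int) -> BinlogPos:
    """Turn a (file name, position) pair into a comparable tuple."""
    return BinlogPos(int(log_file.rsplit(".", 1)[1]), offset)

def pick_read_host(
    last_write: Optional[BinlogPos],
    slave_positions: Dict[str, BinlogPos],
    master: str = "master",
) -> str:
    """Pick a host that is guaranteed to contain the session's last write."""
    if last_write is None:
        # Nothing written recently in this session: any slave will do.
        return next(iter(slave_positions), master)
    for host, applied in slave_positions.items():
        if applied >= last_write:   # tuple comparison: file first, then offset
            return host
    return master                   # no slave is fresh enough: read from the master
```

Other users, who did not make the write, can usually be sent to any slave; whether they may see out-of-date data is a product decision rather than a technical one.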

Replication: Availability
• Deploy servers in pairs using master-master
replication
• Never use master-master as a way of scaling writes
• Use virtual IP addresses to control access - see
MMM, Flipper, Linux HA
• Use “inactive” machine for maintenance, backups,
slow reads for reporting

Sharding
• Splitting database into smaller chunks
• Only solution for scaling writes
• Needs careful planning
• Can be difficult to implement
• Even more difficult, if not impossible, to retrofit to
an existing system
• Architect your system with sharding in mind from
Day One
 … even if not immediately implemented or used

Sharding
• How to split the data?
 Hashing on user-supplied data
 Username, email address
 Splitting on other data
 Time
 Data dictionary
 Any combination of the above
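
A minimal sketch of hashing on user-supplied data, assuming a fixed number of shards. The function name and the normalisation rule (trim and lower-case) are illustrative choices, not something the talk prescribes. A cryptographic digest is used because Python's built-in hash() is randomised per process and so cannot be used for placement.

```python
import hashlib

def shard_for(key: str, shard_count: int) -> int:
    """Map a user-supplied key (username, email address) to a shard number."""
    normalised = key.strip().lower()   # so "Bob@Example.com" and "bob@example.com" agree
    digest = hashlib.sha256(normalised.encode("utf-8")).digest()
    # Take the first 8 bytes as an integer and reduce it to a shard number.
    return int.from_bytes(digest[:8], "big") % shard_count
```

Every part of the application that touches sharded data has to use exactly the same function, so it belongs in one shared library.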

Sharding
• What data to split?
 Cross-shard queries are not impossible, but
logistically more difficult
 Use your application design to decide
• Popular choice: Primary/Secondary split
 Primary dataset
 Frequently used information (highly cacheable)
 Relationships
 Pointers to secondary data
 Secondary dataset(s)
 Vertically split data
 Less frequently used information

Sharding
• Keep shards to the right size
 Small enough to be manageable
 Large enough so you don’t need thousands
 Consistent size per shard
• Architectures should allow for adding shards
 Hash-based sharding often falls down here
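
Why hash-based sharding tends to fall down when shards are added can be shown with a small experiment: with simple modulo hashing, going from 4 to 5 shards changes the placement of most keys. The second half sketches a data-dictionary alternative in which a key's shard is recorded once and never recomputed. Key names, shard counts and the least-loaded placement policy are all assumptions for illustration.

```python
import hashlib

def hash_shard(key: str, shard_count: int) -> int:
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

# Fraction of keys whose shard changes when a fifth shard is added.
keys = [f"user{i}" for i in range(10_000)]
moved = sum(1 for k in keys if hash_shard(k, 4) != hash_shard(k, 5))
moved_fraction = moved / len(keys)   # roughly 0.8: most rows would have to move

class ShardDirectory:
    """Data-dictionary lookup: existing keys never move when shards are added."""

    def __init__(self, shard_count: int):
        self.assignments = {}             # key -> shard number
        self.counts = [0] * shard_count   # keys currently placed on each shard

    def add_shard(self) -> None:
        self.counts.append(0)

    def shard_for(self, key: str) -> int:
        if key not in self.assignments:
            # New keys go to the least-loaded shard, keeping shard sizes consistent.
            shard = min(range(len(self.counts)), key=self.counts.__getitem__)
            self.assignments[key] = shard
            self.counts[shard] += 1
        return self.assignments[key]
```

The price of the directory is an extra lookup on every access, which is one reason that primary data of this kind is usually kept small and highly cacheable.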

Sharding
• Accessing data across shards is more difficult, but
not impossible
 Handle JOINs in your application
 Replicate primary data to secondary shards
 Maintain summary tables outside of shards
 Parallelisation of execution across shards can give
significant performance boost
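
A sketch of parallel execution across shards with the aggregation done in the application. Each shard is represented by an in-memory dictionary so the example runs on its own; in practice count_posts_on_shard would issue a SQL aggregate over that shard's connection. All names and numbers are made up.

```python
from concurrent.futures import ThreadPoolExecutor

# In-memory stand-ins for three shards: user -> number of posts.
SHARDS = [
    {"alice": 3, "bob": 5},
    {"carol": 2},
    {"dave": 7, "erin": 1},
]

def count_posts_on_shard(shard):
    """Partial result for one shard (a SUM() query in a real system)."""
    return sum(shard.values())

def total_posts(shards):
    """Query every shard in parallel, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(count_posts_on_shard, shards))
    return sum(partials)

print(total_posts(SHARDS))
```

Because the shards are queried concurrently, the elapsed time is close to that of the slowest shard rather than the sum of all of them.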

Archiving
• Big is generally slower
• Keep your data size as small as possible
• Move older & less frequently accessed data
 To slower and/or cheaper infrastructure
 To infrastructure with real or imagined lower service
level
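
A sketch of moving older rows out of a live table in small batches, so that each transaction stays short. It uses sqlite3 and a single connection only so that it is runnable as-is; with MySQL the archive would normally live on a different server, so the copy would be a read from one connection and a write to another. Table and column names are hypothetical.

```python
import sqlite3

def archive_old_rows(conn, cutoff, batch_size=500):
    """Move rows with created_at < cutoff from orders to orders_archive."""
    moved = 0
    while True:
        ids = [row[0] for row in conn.execute(
            "SELECT id FROM orders WHERE created_at < ? ORDER BY id LIMIT ?",
            (cutoff, batch_size),
        )]
        if not ids:
            return moved
        marks = ",".join("?" * len(ids))
        with conn:  # one transaction per batch: copy, then delete
            conn.execute(
                f"INSERT INTO orders_archive SELECT * FROM orders WHERE id IN ({marks})",
                ids,
            )
            conn.execute(f"DELETE FROM orders WHERE id IN ({marks})", ids)
        moved += len(ids)

# Illustrative data only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, created_at TEXT, total REAL)")
conn.execute("CREATE TABLE orders_archive (id INTEGER PRIMARY KEY, created_at TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders (created_at, total) VALUES (?, ?)",
    [("2006-03-01", 10.0), ("2006-11-15", 20.0), ("2007-06-30", 5.0),
     ("2008-01-10", 7.5), ("2008-02-20", 12.0)],
)
conn.commit()

moved_count = archive_old_rows(conn, cutoff="2008-01-01", batch_size=2)
```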

Caching
• Caching closer to the user is more efficient
 Use a content delivery network
 Use squid
 Use memcached
 Maybe use MySQL’s query cache
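
The usual way to put memcached in front of MySQL is the cache-aside pattern: look in the cache first, fall back to the database on a miss, then store the result with an expiry. The sketch below uses a tiny in-process class as a stand-in for memcached and a fake database function, both hypothetical, so that it runs without any services.

```python
import time

class TTLCache:
    """In-process stand-in for memcached: get/set with an expiry time."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._items[key]      # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, time.monotonic() + ttl_seconds)

stats = {"db_hits": 0}

def load_profile_from_db(user_id):
    """Pretend database read; counts how often the database is touched."""
    stats["db_hits"] += 1
    return {"id": user_id, "name": f"User {user_id}"}

cache = TTLCache()

def get_profile(user_id, ttl_seconds=60):
    key = f"profile:{user_id}"
    profile = cache.get(key)
    if profile is None:               # cache miss: go to the database once
        profile = load_profile_from_db(user_id)
        cache.set(key, profile, ttl_seconds)
    return profile
```

The expiry bounds how stale a cached value can get; writes that must be visible immediately should also delete or overwrite the cached entry.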

Isolation & Error Handling
• Isolate your application from MySQL
• Design your application to run without the MySQL
server (as much as possible)
• Pre-produce full and/or partial web pages
• No need to process web access logs in real-time
• Cope (more) gracefully when there’s an outage
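
Coping gracefully with an outage can be as simple as falling back to a pre-produced page when the database cannot be reached. In this sketch the exception class, the fetch function and the HTML are all hypothetical; the point is only that the failure is caught at the edge of the data layer and the user still gets a page.

```python
import html

class DatabaseUnavailable(Exception):
    """Raised by the (hypothetical) data layer when MySQL cannot be reached."""

def render_homepage(fetch_latest_posts, prebuilt_html):
    """Return (page, degraded): live content if possible, else the pre-built page."""
    try:
        titles = fetch_latest_posts()
    except DatabaseUnavailable:
        return prebuilt_html, True    # serve the pre-produced copy
    items = "".join(f"<li>{html.escape(title)}</li>" for title in titles)
    return f"<ul>{items}</ul>", False

def broken_fetch():
    raise DatabaseUnavailable("connection refused")

page, degraded = render_homepage(broken_fetch, "<ul><li>Cached headline</li></ul>")
```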

Configuration
• Ensure your servers are configured correctly
 Use LVM or ZFS for snapshots for backup
 Ensure the battery-backed write cache (BBWC) is on and working
 Enables InnoDB to commit without disk head movement
 Make sure MySQL’s configured right
 Use InnoDB. MyISAM is rarely the right choice.
 Give as much memory as possible to InnoDB
 Aim: Get all data in InnoDB Buffer Pool
• Ensure everything’s monitored
• Automate whatever you can

Hardware for MySQL
• Should I scale up or scale out?
• Each approach has benefits
 Don’t listen to sales & marketing people
• Each approach has problems
 Don’t listen to sales & marketing people
• Should I consider cloud computing?

Scale up
• Can be cheaper
 Smaller power and space usage
 Fewer machines to administer
• Many eggs in one basket
• MySQL’s own scaling problems add complexity
 Need to run multiple instances to take advantage of
massively parallel machines

Scale up
• Storage doesn’t scale up cheaply
 Significant storage infrastructure might be required
 Cost per I/O operation and terabyte likely to be
significantly higher
 … but, with a SAN, you only have one “live” copy of your
data
 Power & space savings could be cancelled out
 Single copy of data reduces maintainability
 I/O latency with SAN normally higher than with local
disk

Scale out
• Can be cheaper
 Lower initial cost
 Local storage cheaper
 … offset by duplication of data
• Can be more expensive
 Power, space, administration
• More frequent failures
 … but potentially less damaging

Cloud computing
• More appropriate in other parts of the application
stack
• Worth considering if you have peaky, CPU-
intensive workload
 Cater for your baseline yourself
 Use on-demand services for peaks
• Otherwise, avoid!
 Poor I/O performance will hurt you

Hardware
• Hardware strategy depends on:
 Application Architecture
 Predicted Growth Rates
 Funding

Capacity Planning
• Working out what you need to provide your
product(s) to your users within an acceptable
timeframe
• Constant process
• Use all the data available
 Extrapolate from monitoring data
 Use load testing data
 Use data gained from prototype testing
• Allow for hardware failure
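
A back-of-envelope version of the calculation, assuming the only inputs are a measured peak query rate, a per-server rate from load testing, an expected growth factor, a utilisation ceiling and a number of spare machines for hardware failure. The function, its parameters and all the figures are illustrative; real planning would also look at storage, memory and replication lag.

```python
import math

def servers_needed(peak_qps, qps_per_server, growth_factor=1.0,
                   target_utilisation=0.7, spare_servers=1):
    """Estimate how many database servers a read workload needs."""
    required_qps = peak_qps * growth_factor            # extrapolated peak load
    usable_qps = qps_per_server * target_utilisation   # leave headroom per server
    return math.ceil(required_qps / usable_qps) + spare_servers

# Made-up numbers: 12,000 qps peak today, 50% growth expected,
# 2,500 qps per server in load tests, run servers at no more than 70%.
estimate = servers_needed(12_000, 2_500, growth_factor=1.5)
```

Since capacity planning is a constant process, the inputs should be refreshed from monitoring and load testing data rather than set once.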

Capacity Planning
[Diagram labels: Past traffic data; Future traffic predictions; Traffic profile; Service Level Requirements; Performance data; 42 … and a new load balancing strategy]

Summary
• Sharding for scaling writes
• Replication and caching for scaling reads
• Big is bad. Small is super. Manageable is magic.
• Get the right hardware for the job
• Use your kit as efficiently as possible
 Optimise your queries
 Configure things correctly
• Monitor as much as possible
• Think about scaling now, not later

Thanks!
Want help to make it scale? Get in touch!

consulting@provenscaling.com
