
Scaling to 200K Transactions per Second
with Open Source - MySQL, Java, curl, PHP
by Dathan Vance Pattishall
Contents
Who am I?
Introduction
Now I work at RockYou

When I started, Facebook shards alone
do 100K TPS.
MySpace, Hi5, Orkut, Ads, the Main
Site, and various other DB servers
sum to 100K TPS
on less than 120 database servers.

• 32–48 GB of RAM
• 8-disk RAID 10 with a 256 MB PERC 6
controller
• We can support any logical SQL query
TEAM
The Requirements
• Scale linearly
• Store some data forever
• Allow for change
• Keep it cheap
• Oh, and downtime is not an option
Design to Meet the
Requirements
Federation

[Diagram: User 1's Data, User 2's Data,
User 3's Data, …, User N's Data
partitioned across multiple shards]


Federation
Does NOT increase write throughput
Federation

This Increases Write Throughput
How does one Federate?
Enter Global Lookup Cluster
• Hash lookups are fast: 45K QPS on a
single server
• Ownerid -> Shard_id
• Groupid -> Shard_id
• Tagid -> Shard_id
• Url_id -> Shard_id
• Fronted by memcache
• Use consistent hashing to add capacity
horizontally and for HA
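The slide's consistent-hashing idea can be sketched as below. This is a minimal, hypothetical illustration (the class name `ShardRing` and the virtual-node count are mine, not from the talk): shards are hashed onto a ring, a key maps to the next shard clockwise, and adding a shard only remaps a fraction of the keys.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.TreeMap;

// Illustrative consistent-hashing ring: ownerid -> shard_id lookups stay
// mostly stable when capacity is added horizontally.
public class ShardRing {
    private static final int VNODES = 100; // virtual nodes per shard, for even spread
    private final TreeMap<Long, Integer> ring = new TreeMap<>();

    public void addShard(int shardId) {
        for (int v = 0; v < VNODES; v++) {
            ring.put(hash("shard-" + shardId + "-" + v), shardId);
        }
    }

    public int shardFor(String ownerId) {
        if (ring.isEmpty()) throw new IllegalStateException("no shards");
        Long h = ring.ceilingKey(hash(ownerId));
        if (h == null) h = ring.firstKey(); // wrap around the ring
        return ring.get(h);
    }

    private static long hash(String key) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) h = (h << 8) | (d[i] & 0xFF);
            return h;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        ShardRing ring = new ShardRing();
        for (int s = 1; s <= 4; s++) ring.addShard(s);
        int[] before = new int[1000];
        for (int i = 0; i < 1000; i++) before[i] = ring.shardFor("owner:" + i);
        ring.addShard(5); // add capacity
        int moved = 0;
        for (int i = 0; i < 1000; i++)
            if (ring.shardFor("owner:" + i) != before[i]) moved++;
        System.out.println(moved < 500); // only a fraction (~1/5) of keys remap
    }
}
```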
Write Multiple Views of the
Data
Keep Data Consistent
What if I need an ID to
represent a row
REPLACE INTO Tickets VALUES ('a');
Get an ID back

CREATE TABLE `TicketsGeneric` (
  `id` bigint(20) unsigned NOT NULL auto_increment,
  `stub` char(1) NOT NULL default '',
  PRIMARY KEY (`id`),
  UNIQUE KEY `stub` (`stub`)
) ENGINE=MyISAM
AUTO_INCREMENT=7445309740
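A usage sketch of the ticket table above: REPLACE deletes any existing row with the same stub and inserts a fresh one, which bumps the auto_increment counter, and the new global ID is then read back on the same connection.

```
REPLACE INTO TicketsGeneric (stub) VALUES ('a');
SELECT LAST_INSERT_ID();
```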
But what if I need a global view
of the table
• Cron jobs
• Front by memcache
• Offline tasks to atomically write the
job and return the page quickly,
i.e. defer writes to Many RECPT
– Pure PHP
– Like GEARMAND, uses IPC distributed
across servers
– Does 100 million actions per day and
scales linearly
• @see Friend Query Section
What about maintenance
What about Shard
Misbalance?
Migrate them
• object_id -> shard_id; lock shard_id
for object_id
• Migrate the user
• If error, die and send alert
• Takes less than 30 seconds per
primary object
• Currently shards are self-balancing;
can migrate 4 million users in 8
days at the slowest setting
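The migration steps above can be sketched roughly as follows. This is a hypothetical in-memory illustration (all names are mine): in production the mapping lives in the Global Lookup Cluster and `copyRows` is a real data move between shards.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative migration protocol: lock the object's mapping, copy its rows
// to the target shard, flip the lookup entry, unlock.
public class ShardMigrator {
    private final Map<Long, Integer> lookup = new ConcurrentHashMap<>();
    private final Set<Long> locked = ConcurrentHashMap.newKeySet();

    public ShardMigrator(Map<Long, Integer> initial) {
        lookup.putAll(initial);
    }

    public boolean migrate(long objectId, int targetShard) {
        if (!locked.add(objectId)) return false; // another migration in flight
        try {
            Integer source = lookup.get(objectId);
            if (source == null || source == targetShard) return false;
            copyRows(objectId, source, targetShard); // real code moves the rows
            lookup.put(objectId, targetShard);       // flip the global lookup
            return true;
        } finally {
            // per the slide, a failure inside the try would die and send an alert
            locked.remove(objectId);
        }
    }

    private void copyRows(long objectId, int from, int to) {
        // placeholder for the actual cross-shard data copy
    }

    public int shardOf(long objectId) {
        return lookup.get(objectId);
    }

    public static void main(String[] args) {
        Map<Long, Integer> init = new ConcurrentHashMap<>();
        init.put(42L, 1);
        ShardMigrator m = new ShardMigrator(init);
        System.out.println(m.migrate(42L, 2)); // true
        System.out.println(m.shardOf(42L));    // 2
    }
}
```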
What about managing
datasize
• Enter Shard Types
– Archive Shard
– Sub Shards
• One way a DBA can scale is to
partition and allocate a server per
table. Why not partition by shard
type?
• Allows for bleeding-edge tech: have
10 shards running XTRA-DB

What about Split Brain?
Friend Queries
MULTI-GET from Shards
Jetty + J/Connect
(AsyncShard Server)
• Can query 8 shards at a time in
parallel
• Data is merged on the fly
• JSON is the communication protocol
• private ExecutorService exec =
    Executors.newFixedThreadPool(8);
  // 4 CPU * 0.8 Ut * (1 + W/C) =~ 8
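A sketch of the multi-get fan-out around that thread pool. The shard query here is simulated (the real AsyncShard server talks JDBC to each shard and speaks JSON to callers); class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative parallel MULTI-GET: fan one query out to N shards on a
// fixed pool, then merge the per-shard results.
public class MultiGet {
    // pool sized per the slide: 4 CPU * 0.8 utilization * (1 + wait/compute) =~ 8
    private static final ExecutorService exec =
            Executors.newFixedThreadPool(8, r -> {
                Thread t = new Thread(r);
                t.setDaemon(true); // don't keep the JVM alive after main exits
                return t;
            });

    static List<String> queryShard(int shardId, String query) {
        // stand-in for a JDBC call against one shard
        return List.of("shard" + shardId + ":" + query);
    }

    public static List<String> multiGet(String query, int shards)
            throws InterruptedException, ExecutionException {
        List<Future<List<String>>> futures = new ArrayList<>();
        for (int s = 0; s < shards; s++) {
            final int shardId = s;
            futures.add(exec.submit(() -> queryShard(shardId, query)));
        }
        List<String> merged = new ArrayList<>(); // merge on the fly
        for (Future<List<String>> f : futures) merged.addAll(f.get());
        return merged;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(multiGet("friends:42", 8).size()); // 8
    }
}
```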


J/Connect
/* mysql-connector-java-5.1.7 ( Revision: ${svn.Revision} ) */
SHOW VARIABLES WHERE Variable_name = 'language'
  OR Variable_name = 'net_write_timeout'
  OR Variable_name = 'interactive_timeout'
  OR Variable_name = 'wait_timeout'
  OR Variable_name = 'character_set_client'
  OR Variable_name = 'character_set_connection'
  OR Variable_name = 'character_set'
  OR Variable_name = 'character_set_server'
  OR Variable_name = 'tx_isolation'
  OR Variable_name = 'transaction_isolation'
  OR Variable_name = 'character_set_results'
  OR Variable_name = 'timezone'
  OR Variable_name = 'time_zone'
  OR Variable_name = 'system_time_zone'
  OR Variable_name = 'lower_case_table_names'
  OR Variable_name = 'max_allowed_packet'
  OR Variable_name = 'net_buffer_length'
  OR Variable_name = 'sql_mode'
  OR Variable_name = 'query_cache_type'
  OR Variable_name = 'query_cache_size'
  OR Variable_name = 'init_connect';
Fix
 Add:
 &cacheServerConfiguration=true
 To your JDBC url directive

 @see
http://assets.en.oreilly.com/1/event/21/Connecto
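With that flag added, a JDBC URL would look something like the line below (host, port, and database name here are illustrative, not from the talk):

```
jdbc:mysql://dbhost:3306/rockyou?cacheServerConfiguration=true
```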


Writing Large Strings
REALTIME
• Incrementing impressions is easy,
but storing referrer URLs is not as
easy in real time
• Why? You must know the limits of the
storage engine you are using
INNODB & Strings
• Indexing a string takes a lot of space
• Indexing a large string takes even
more space
• Each index has its own 16KB page.
• Fragmentation across pages was
hurting the app – chewing up I/O
• Lots of disk space chewed up per day
• Due to a bunch of overhead with
Strings & Deadlock detection

INNODB & High Concurrency of
Writes
• Requirement: 300 ms for total DB
access FOR ALL apps
• Writes slow down at high concurrency
when the datafile(s) size is greater
than the buffer pool size
• 10 ms to 20 seconds sometimes for
the full transaction
• Fixed by offloading the query to an
OfflineTask that writes it as a single
thread

Deadlock / Transaction
Overhead Solved
• Put a Java daemon that buffers up to
4000 messages (transactions) and
apply it serially with one thread
• It does not go down & if it does we
can fail over
• Log data to local disk for outstanding
trans
• It does not use much memory or cpu
• Even during peak messages do not
exceed 200 outstanding
transactions
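The buffering daemon above can be sketched as a bounded queue drained by one writer thread, so the database never sees concurrent writers. This is a minimal illustration (names are mine; `applyToDb` stands in for the real transaction apply); the 4000-message cap matches the slide.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative serial-apply daemon: producers enqueue transactions,
// a single thread applies them one at a time.
public class SerialWriter {
    private final BlockingQueue<String> buffer = new ArrayBlockingQueue<>(4000);
    private final AtomicInteger applied = new AtomicInteger();
    private final Thread writer;

    public SerialWriter() {
        writer = new Thread(() -> {
            try {
                while (true) applyToDb(buffer.take());
            } catch (InterruptedException e) {
                // shutdown: drain outstanding transactions, then exit
                String m;
                while ((m = buffer.poll()) != null) applyToDb(m);
            }
        });
        writer.start();
    }

    public void submit(String message) throws InterruptedException {
        buffer.put(message); // blocks if 4000 messages are outstanding
    }

    private void applyToDb(String message) {
        applied.incrementAndGet(); // real code runs the transaction here
    }

    public int applied() {
        return applied.get();
    }

    public void shutdown() throws InterruptedException {
        writer.interrupt();
        writer.join();
    }

    public static void main(String[] args) throws InterruptedException {
        SerialWriter w = new SerialWriter();
        for (int i = 0; i < 100; i++) w.submit("msg" + i);
        w.shutdown();
        System.out.println(w.applied()); // 100
    }
}
```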
Disk Consumption solved
• Archive data
• Compress using INNODB 1.0.4
• innodb_file_format = Barracuda
• 8K key block size – best bang for the
buck for our data; smaller key block
sizes cause major slowdowns in
transactions
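A sketch of the compression setup described above (the table and column names are illustrative, not from the talk): compressed row format requires the Barracuda file format and per-table tablespaces, and the key block size is set per table.

```
-- settings required for compressed InnoDB tables
SET GLOBAL innodb_file_format = Barracuda;
SET GLOBAL innodb_file_per_table = 1;

-- hypothetical archive table with an 8K key block size
CREATE TABLE impressions_archive (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  referrer_url VARCHAR(2048) NOT NULL,
  hits INT UNSIGNED NOT NULL DEFAULT 0,
  PRIMARY KEY (id)
) ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;
```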
Stats Across All Services
Questions / Want to Work
here?

dathan@rockyou.com
