
Falcon from the Beginning

Presented by,
MySQL & O’Reilly Media, Inc.

Jim Starkey
jstarkey@mysql.com
Why Falcon?
Because the World is Changing!

 Hardware is evolving rapidly


 Customers need ACID transactions
Atomic – the books should balance
Consistent – the alternative is chaos
Isolated – preserve programmer’s sanity (sic)
Durable – who wants to lose data?
Where Hardware is going
 CPUs breed like rabbits – more sockets, more
cores per socket, more threads per core
 Memory is bigger, faster, and cheaper
 Disks are bigger and cheaper but not much
faster
 (Boxes are cheaper and more plentiful, but
that’s a different story)
Where Applications are going

 Batch – dead!
 Timesharing – dead!
 Departmental computing – dead!
 Client server – fading fast
 Application servers for most of us
 Web services for the really big guys
The Database challenge

 Traditional challenge:
Exhaust CPU, memory, and disk simultaneously

 Today’s challenge:
Exhaust CPU and memory and avoid the disk
Falcon tradeoffs
 Use memory (page cache) to avoid disk reads
 Use memory (record cache) to avoid the page
cache manipulation.
 Use CPU to find the fastest path to a record
 Use CPU to minimize record size
 Synchronize most data structures with user
mode read/write locks
 Synchronize high contention data structures
with interlocked instructions.
The Falcon architecture

 Incomplete in-memory database with disk
backfill
 Multi-version concurrency control in memory
 Updates in memory until commit
 Group commits to a single serial log write
 Post-commit multi-threaded pipeline to move
updates to disk
Incomplete in-memory database

 Selected records cached in memory


 Separate cache for disk pages
 Record cache hit is 15% of the cost of a page
cache hit
 Record cache is more memory efficient than
page cache
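
The lookup order described above (record cache first, then page cache, then disk) can be sketched as a toy model. This is only an illustration under stated assumptions: the class name `TwoTierStore`, the LRU eviction policy, the capacities, and the fixed records-per-page layout are all hypothetical and are not taken from Falcon.

```python
from collections import OrderedDict


class TwoTierStore:
    """Toy model: a record cache in front of a page cache in front of 'disk'.

    All names and policies here are hypothetical, chosen for illustration.
    """

    def __init__(self, disk_pages, records_per_page=4, record_cap=8, page_cap=2):
        self.disk_pages = disk_pages            # page_no -> list of records (stand-in for disk)
        self.records_per_page = records_per_page
        self.record_cache = OrderedDict()       # record_id -> record, in LRU order
        self.page_cache = OrderedDict()         # page_no -> page, in LRU order
        self.record_cap = record_cap
        self.page_cap = page_cap
        self.stats = {"record_hits": 0, "page_hits": 0, "disk_reads": 0}

    def _get_page(self, page_no):
        # Second tier: a cached page avoids the disk read.
        if page_no in self.page_cache:
            self.page_cache.move_to_end(page_no)
            self.stats["page_hits"] += 1
            return self.page_cache[page_no]
        # Last resort: read the page from "disk" and cache it.
        self.stats["disk_reads"] += 1
        page = self.disk_pages[page_no]
        self.page_cache[page_no] = page
        if len(self.page_cache) > self.page_cap:
            self.page_cache.popitem(last=False)  # evict least recently used page
        return page

    def fetch(self, record_id):
        # First tier: a cached record avoids touching the page cache at all.
        if record_id in self.record_cache:
            self.record_cache.move_to_end(record_id)
            self.stats["record_hits"] += 1
            return self.record_cache[record_id]
        page_no, slot = divmod(record_id, self.records_per_page)
        record = self._get_page(page_no)[slot]
        self.record_cache[record_id] = record
        if len(self.record_cache) > self.record_cap:
            self.record_cache.popitem(last=False)  # evict least recently used record
        return record


disk = {0: ["a", "b", "c", "d"], 1: ["e", "f", "g", "h"]}
store = TwoTierStore(disk)
store.fetch(1)   # disk read, fills both caches
store.fetch(1)   # record cache hit
store.fetch(2)   # page cache hit (same page, different record)
print(store.stats)
```

Only the selected records are held in the record cache, which is why the in-memory database is "incomplete": anything not cached is backfilled from pages on demand.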
Record Encoding - Cache Efficiency

 Records encoded by value, not declaration


 String “abc” occupies the same space in
varchar(3) or varchar(4096)
 The number 7 occupies the same space whether declared
smallint, mediumint, int, bigint, decimal, or numeric
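
A minimal sketch of encoding by value rather than by declaration follows. The tag bytes and layout are invented for illustration and are not Falcon's actual record format; the function name `encode_value` is hypothetical. The point is only that the encoder never looks at the declared column type.

```python
def encode_value(value):
    """Encode a value by what it is, ignoring the declared column type.

    Hypothetical layout: one tag byte, then the shortest payload that fits.
    """
    if value is None:
        return bytes([0x00])
    if isinstance(value, int):
        # Shortest two's-complement width that leaves room for the sign bit.
        length = max(1, (value.bit_length() + 8) // 8)
        if length > 15:
            raise ValueError("integer too large for this toy encoding")
        return bytes([0x10 + length]) + value.to_bytes(length, "big", signed=True)
    if isinstance(value, str):
        data = value.encode("utf-8")
        if len(data) > 255:
            raise ValueError("string too long for this toy encoding")
        return bytes([0x20, len(data)]) + data
    raise TypeError(f"unsupported type: {type(value).__name__}")


# The declared type plays no part, so these columns all store 7 identically.
for declared in ("smallint", "int", "bigint", "decimal(10,0)"):
    print(declared, encode_value(7).hex())

# "abc" takes the same five bytes under varchar(3) or varchar(4096).
print(encode_value("abc").hex())
```

Smaller encoded records mean more records fit in a record cache of a given size, which is the cache-efficiency argument of this slide.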
Multi-Version Concurrency Control
 Update operations create new record versions
 New version is tagged with transaction id, points
to old version
 System tracks which transactions should see
which versions
 Readers don’t block writers
 Everyone sees a consistent view of the data
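
The version-chain idea above can be sketched in a few lines. This is a simplified model, not Falcon's implementation: the names `RecordVersion`, `Transaction`, and `VersionedStore` are hypothetical, visibility is decided from a snapshot of the transactions committed when the reader started, and write-write conflict detection is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecordVersion:
    value: object
    txn_id: int                               # transaction that created this version
    prior: Optional["RecordVersion"] = None   # pointer to the older version


class Transaction:
    def __init__(self, txn_id, committed_at_start):
        self.txn_id = txn_id
        # Snapshot: which transactions were committed when this one began.
        self.visible = frozenset(committed_at_start)


class VersionedStore:
    def __init__(self):
        self.heads = {}          # key -> newest RecordVersion
        self.committed = set()   # ids of committed transactions
        self.next_txn_id = 1

    def begin(self):
        txn = Transaction(self.next_txn_id, self.committed)
        self.next_txn_id += 1
        return txn

    def write(self, txn, key, value):
        # An update creates a new version that points back to the old one.
        self.heads[key] = RecordVersion(value, txn.txn_id, self.heads.get(key))

    def read(self, txn, key):
        # Walk the chain until a version this transaction may see.
        version = self.heads.get(key)
        while version is not None:
            if version.txn_id == txn.txn_id or version.txn_id in txn.visible:
                return version.value
            version = version.prior
        return None

    def commit(self, txn):
        self.committed.add(txn.txn_id)


store = VersionedStore()
setup = store.begin()
store.write(setup, "balance", 100)
store.commit(setup)

reader = store.begin()
writer = store.begin()
store.write(writer, "balance", 250)      # the reader is not blocked
print(store.read(reader, "balance"))     # reader still sees the old version
print(store.read(writer, "balance"))     # writer sees its own update
```

Because the reader follows the chain to the version it is entitled to see, it never waits for the writer, and its view stays consistent even after the writer commits.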
Updates Are in Memory Until
Commit

 Updates held in memory pending commit (well,
usually)
 Index changes held in memory pending commit
(same caveat)
 Verb rollback is dirt cheap
 Transaction rollback is dirt cheap
At Commit…

 Pending record updates flushed to serial log


 Pending index updates flushed to serial log
 Commit record written to serial log
 Serial log flushed to the oxide
 And the transaction is committed!
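
The commit sequence above, together with the earlier point that commits are grouped into a single serial log write, can be sketched as follows. The classes `SerialLog` and `PendingTransaction` and the function `group_commit` are hypothetical names for illustration; the in-memory list stands in for a real log file and `flush` stands in for the physical write.

```python
class SerialLog:
    """Stand-in for the serial log: a buffer plus a 'durable' list."""

    def __init__(self):
        self.buffer = []    # entries written but not yet flushed
        self.durable = []   # entries that have reached "the oxide"
        self.flushes = 0

    def append(self, entry):
        self.buffer.append(entry)

    def flush(self):
        self.durable.extend(self.buffer)
        self.buffer.clear()
        self.flushes += 1


class PendingTransaction:
    def __init__(self, txn_id):
        self.txn_id = txn_id
        self.record_updates = []   # held in memory until commit
        self.index_updates = []    # held in memory until commit
        self.state = "active"

    def update_record(self, key, value):
        self.record_updates.append((key, value))

    def update_index(self, index_name, key):
        self.index_updates.append((index_name, key))

    def rollback(self):
        # Nothing has touched the log or disk, so rollback just forgets.
        self.record_updates.clear()
        self.index_updates.clear()
        self.state = "rolled back"


def group_commit(log, transactions):
    """Commit several transactions with a single log flush."""
    for txn in transactions:
        for key, value in txn.record_updates:
            log.append(("record", txn.txn_id, key, value))
        for index_name, key in txn.index_updates:
            log.append(("index", txn.txn_id, index_name, key))
        log.append(("commit", txn.txn_id))
    log.flush()                       # one flush covers the whole group
    for txn in transactions:
        txn.state = "committed"


log = SerialLog()
a, b = PendingTransaction(1), PendingTransaction(2)
a.update_record("r1", "x")
a.update_index("idx", "r1")
b.update_record("r2", "y")
group_commit(log, [a, b])
print(log.flushes, len(log.durable))
```

A transaction counts as committed only after the flush returns, and a rolled-back transaction never writes anything to the log in this model.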
Alas, Memory isn’t infinite, so
 Large transaction chills uncommitted data
(flushes it to the log early)
 Chilled records can be thawed (fetched from the
log)
 Scavenger garbage collects unloved records
periodically
 When things get really bad, entire record chains
flushed to backlog
 (Note: This is hard and we aren’t done.)
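
Chilling and thawing can be illustrated with a toy model. This sketch makes several assumptions that are not from the slides: the class name `ChillingTransaction` is hypothetical, the memory limit is a simple count of records rather than bytes, and the serial log is a plain list indexed by position.

```python
class ChillingTransaction:
    """Toy model of a large transaction that spills uncommitted data to the log."""

    def __init__(self, log, memory_limit):
        self.log = log                  # list standing in for the serial log
        self.memory_limit = memory_limit
        self.hot = {}                   # key -> value, uncommitted data in memory
        self.chilled = {}               # key -> position of the value in the log

    def update(self, key, value):
        self.chilled.pop(key, None)     # a newer value supersedes a chilled one
        self.hot[key] = value
        if len(self.hot) > self.memory_limit:
            self.chill()

    def chill(self):
        # Flush uncommitted data to the log early and free the memory.
        for key, value in self.hot.items():
            self.chilled[key] = len(self.log)
            self.log.append((key, value))
        self.hot.clear()

    def read(self, key):
        if key in self.hot:
            return self.hot[key]
        if key in self.chilled:
            # Thaw: fetch the value back from the log into memory.
            _, value = self.log[self.chilled.pop(key)]
            self.hot[key] = value
            return value
        return None


log = []
txn = ChillingTransaction(log, memory_limit=2)
for key, value in [("a", 1), ("b", 2), ("c", 3)]:
    txn.update(key, value)       # the third update exceeds the limit and chills
print(len(txn.hot), len(log))    # memory is empty, three entries in the log
print(txn.read("a"))             # thawed from the log
```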
Falcon Weaknesses

 Transactions are ACID but not serializable


 Latency advantage disappears at saturation
 Very large transactions degrade performance
 Optimized for Web, not batch
Falcon Strengths

 Runs like a memory database when data fits in
cache
 Scales like disk-based database when data
doesn’t fit in cache
 Lowest possible latency for Web applications
 Absorbs huge spiky loads
Performance Measurement
 Generally benchmark against InnoDB
(transactional engines)
 We use the DBT2 benchmark:
High contention
Write intensive – 40% of records touched are
updated
Measures only performance at saturation
 DBT2 (we believe) is InnoDB’s best spot and
Falcon’s worst
Benchmarking Results
 16- and 8-CPU systems: Falcon exceeds InnoDB
performance
 4-CPU systems: Falcon exceeds InnoDB performance for
a moderate to large number of threads
 2-CPU systems: Rough parity, advantage to InnoDB
 1-CPU systems: InnoDB wins
 Caveat: Results subject to change! Both systems are
moving targets!!!
When should you use what?
 If you don’t need ACID, MyISAM is probably
fastest
 For uniprocessors and small-memory systems,
InnoDB is a good choice
 For large batch transactions, InnoDB may be
the best match
 For multi-core systems and large numbers of threads,
Falcon is probably best
 For the Web, Falcon is hard to beat.
 Questions?
