Program Agenda
The Evolving Real-time Data Caching Architecture
Move from process space to networked
Need cache server and API
Bottlenecks: data load, network
Algorithm dependencies
Cache hit/miss (FIFO, LFU, LRU, MRU, ARC, etc.)
Data composition, many seeks
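To make the dependency between eviction algorithm and cache hit/miss behavior concrete, here is a minimal sketch of one of the listed policies, LRU. The class name `LRUCache` and its methods are illustrative assumptions, not part of any product discussed in this deck.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key (illustrative)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest access first, newest access last
        self.hits = 0
        self.misses = 0

    def get(self, key):
        """Return the cached value, or None on a miss."""
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.entries[key]
        self.misses += 1
        return None

    def put(self, key, value):
        """Insert or update a value, evicting the LRU entry when over capacity."""
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the least recently used entry

    def hit_ratio(self):
        """Fraction of get() calls that were hits."""
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Swapping `popitem(last=False)` for a different victim choice (oldest insert for FIFO, lowest access count for LFU) changes which workloads hit or miss, which is the algorithm dependency the slide refers to.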
Recorded version available at http://bit.ly/1g6Eaj4
Expensive to grow
API (App, Cache)
Non-relational key-value database designed for cost-effective, simple queries over high-volume, high-velocity, high-variety data. Provides high-performance, highly available storage and retrieval of simple data using a scale-out server design.
Architecture
Flexible Key-Value Data Model
ACID transactions
Horizontally Scalable
Highly Available
Elastic Configuration
Simple administration
Intelligent Driver
Commercial grade software and support
[Diagram: Applications, each with a NoSQL DB Driver, connected to Storage Nodes in Datacenter A and Datacenter B]
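One way a client-side driver can route a request without an extra lookup hop is to hash the key to a partition and map the partition to a shard. The sketch below illustrates that idea only; the function names, the use of MD5, and the round-robin partition-to-shard mapping are assumptions for illustration, not the documented behavior of the NoSQL DB Driver.

```python
import hashlib


def partition_for(major_key, num_partitions):
    """Map a major key to a partition number via a stable hash (assumed scheme)."""
    digest = hashlib.md5(major_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def shard_for(major_key, num_partitions, num_shards):
    """Map a major key to a shard, assuming partitions are spread round-robin."""
    return partition_for(major_key, num_partitions) % num_shards


if __name__ == "__main__":
    # Hypothetical cluster shape: 40 partitions over 4 shards.
    for key in ("user1", "user2", "tick/ORCL"):
        print(key, "-> partition", partition_for(key, 40),
              "-> shard", shard_for(key, 40, 4))
```

Because the hash is deterministic, every driver instance computes the same target for the same key, so requests can go straight to the owning shard.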
Integrated Caching
Single layer in architecture
WAL transactions
No disk seek time on write
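The combination of a write-ahead log and an in-process cache can be sketched as follows: every write is appended sequentially to a log file (so no seek is needed on the write path) and then applied to an in-memory table that serves reads. `WriteAheadLogStore` and its JSON-lines log format are hypothetical choices for illustration; the sketch does not handle torn records or log compaction.

```python
import json
import os


class WriteAheadLogStore:
    """Key-value store that appends each write to a log before caching it."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.cache = {}  # in-memory view that serves all reads
        self._replay()   # rebuild the cache from any existing log
        self.log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        """Re-apply logged writes in order so the latest value per key wins."""
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "r", encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                self.cache[record["key"]] = record["value"]

    def put(self, key, value):
        """Append the write to the end of the log, sync it, then update the cache."""
        self.log.write(json.dumps({"key": key, "value": value}) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())  # make the record durable before acknowledging
        self.cache[key] = value

    def get(self, key):
        """Serve reads from memory; returns None if the key is absent."""
        return self.cache.get(key)

    def close(self):
        self.log.close()
```

Reopening the store on the same path replays the log, so a value written before a restart is still visible afterwards.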
per operation
Configurable Consistency per operation
ACID by default
Transaction scope is single API call
Records share same major key
Multiple operations supported
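A minimal sketch of what "transaction scope is a single API call" over records that share a major key could look like: a batch of operations is accepted only if every operation targets the same major key, and it is applied all-or-nothing. `MajorKeyStore` and `execute` are hypothetical names for illustration and are not the Oracle NoSQL Database API.

```python
class MajorKeyStore:
    """In-memory store keyed by (major key, minor key), for illustration only."""

    def __init__(self):
        self.records = {}  # (major, minor) -> value

    def execute(self, operations):
        """Apply ("put" | "delete", major, minor, value) tuples atomically.

        All operations must share one major key; if any operation fails,
        none of them take effect.
        """
        majors = {op[1] for op in operations}
        if len(majors) > 1:
            raise ValueError("all operations must share one major key")
        staged = dict(self.records)  # work on a copy, commit only on success
        for kind, major, minor, value in operations:
            key = (major, minor)
            if kind == "put":
                staged[key] = value
            elif kind == "delete":
                if key not in staged:
                    raise KeyError(key)
                del staged[key]
            else:
                raise ValueError("unknown operation: " + str(kind))
        self.records = staged  # single assignment makes the batch visible at once

    def get_all(self, major):
        """Return every minor key and value stored under a major key."""
        return {minor: v for (m, minor), v in self.records.items() if m == major}
```

For example, `store.execute([("put", "user1", "name", "Ann"), ("put", "user1", "email", "ann@example.com")])` updates both records together, while a batch that mixes `user1` and `user2` is rejected before anything is written.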
indication of capacity
System allocates replicas per storage node
Intelligent Master/Replica load balancing
Ensures distribution of replicas
Efficient use of system resources
Reduces operator-caused configuration errors
[Diagram: Shard-1 and Shard-2, each with a Master and Replicas, distributed across StorageNodes]
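The allocation described above can be sketched as a simple greedy placement: each shard gets its replication nodes on distinct storage nodes, nodes with the most free capacity are preferred, and masters are spread so no node carries more than its share. The function `place_replicas` and the greedy rule are illustrative assumptions, not the product's actual topology algorithm.

```python
def place_replicas(capacities, num_shards, replication_factor):
    """Assign each shard's master and replicas to distinct storage nodes.

    capacities: dict of storage node name -> number of replication nodes it can host.
    Returns a dict of shard name -> list of node names, master first.
    """
    if replication_factor > len(capacities):
        raise ValueError("replication factor exceeds number of storage nodes")
    if num_shards * replication_factor > sum(capacities.values()):
        raise ValueError("not enough total capacity")

    free = dict(capacities)                  # remaining slots per node
    masters = {node: 0 for node in capacities}  # masters hosted per node
    layout = {}
    for shard in range(1, num_shards + 1):
        # Prefer nodes with the most free slots; break ties by name for determinism.
        candidates = sorted((n for n in free if free[n] > 0),
                            key=lambda n: (-free[n], n))
        chosen = candidates[:replication_factor]
        if len(chosen) < replication_factor:
            raise ValueError("cannot place shard on distinct nodes")
        for node in chosen:
            free[node] -= 1
        # Put the master on the chosen node that currently hosts the fewest masters.
        master = min(chosen, key=lambda n: (masters[n], n))
        masters[master] += 1
        layout["Shard-" + str(shard)] = [master] + [n for n in chosen if n != master]
    return layout


if __name__ == "__main__":
    # Hypothetical cluster: three storage nodes, each with capacity 2.
    print(place_replicas({"sn1": 2, "sn2": 2, "sn3": 2}, num_shards=2,
                         replication_factor=3))
```

Because the operator only states capacity per node and the placement is computed, there is no manual step in which two replicas of the same shard could be put on one machine by mistake.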
Twitter sees ~500M tweets/day
This is 750M a minute
Capture Twitter activity with 3 commodity servers
1.25M ops/sec
2 billion records
2 TB of data
95% read, 5% update
Low latency, High Scalability
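For scale, the daily and per-second rates quoted above can be converted to common units. This is a small arithmetic sketch using only the 500M/day, 1.25M ops/sec, 2 billion records, 2 TB, and 95/5 figures from the slide; it assumes a decimal terabyte (10^12 bytes) and an evenly spread load.

```python
SECONDS_PER_DAY = 24 * 60 * 60

tweets_per_day = 500_000_000
ops_per_sec = 1_250_000
records = 2_000_000_000
data_bytes = 2 * 10**12  # assumes 1 TB = 10^12 bytes

# Average tweet rate if the daily volume were spread evenly.
tweets_per_sec = tweets_per_day / SECONDS_PER_DAY
tweets_per_min = tweets_per_sec * 60

# Benchmark rate expressed per minute and per day.
ops_per_min = ops_per_sec * 60
ops_per_day = ops_per_sec * SECONDS_PER_DAY

# Average record size and the read/update split at 95% / 5%.
bytes_per_record = data_bytes // records
reads_per_sec = ops_per_sec * 95 // 100
updates_per_sec = ops_per_sec - reads_per_sec

if __name__ == "__main__":
    print("tweets/sec (avg):", round(tweets_per_sec))
    print("tweets/min (avg):", round(tweets_per_min))
    print("benchmark ops/min:", ops_per_min)
    print("benchmark ops/day:", ops_per_day)
    print("bytes/record (avg):", bytes_per_record)
    print("reads/sec:", reads_per_sec, "updates/sec:", updates_per_sec)
```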
Where to look for use cases?
Business decisions (rules) hard-coded in systems
Architecture
Technology
Oracle NoSQL Database
Ankush (Impetus open source tool)
StreamAnalytix (Impetus pre-built tech stack)
Deployment
Throughput
RF  Storage Nodes  Shards  Replication Nodes  Partitions  Memory   Cache    Throughput
2   4              4       8                  40          Default  Default  220k ticks per sec
3   6              4       12                 30          Default  Default  200k ticks per sec
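The two configurations in the table are internally consistent if the number of replication nodes equals shards times replication factor (RF). The short sketch below checks that relationship and derives replication nodes and throughput per storage node; the dictionary field names are illustrative, and the per-node throughput assumes load is spread evenly.

```python
configs = [
    {"rf": 2, "storage_nodes": 4, "shards": 4, "replication_nodes": 8,
     "partitions": 40, "ticks_per_sec": 220_000},
    {"rf": 3, "storage_nodes": 6, "shards": 4, "replication_nodes": 12,
     "partitions": 30, "ticks_per_sec": 200_000},
]


def summarize(config):
    """Derive per-node figures from one row of the throughput table."""
    expected_rns = config["shards"] * config["rf"]
    return {
        "rn_count_matches": expected_rns == config["replication_nodes"],
        "rns_per_storage_node": config["replication_nodes"] / config["storage_nodes"],
        "ticks_per_storage_node": config["ticks_per_sec"] / config["storage_nodes"],
    }


if __name__ == "__main__":
    for config in configs:
        print("RF", config["rf"], summarize(config))
```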
Functional View
Infrastructure View
Cluster Dashboard
Cluster Metrics
Node Monitoring
Operations Available
http://bit.ly/1f0d8wU
Developer Webcast Series http://bit.ly/1doV2jl
Q&A
Thank You