DB2 10.5
Analytics at the Speed of Thought
BLU Acceleration
DB2 10
pureScale (DB2 9.8)
Virtually Unlimited Capacity
Transparent Scalability
Leading Availability
Ease of Development
Application Transparent
Scaling
Avoid the risk & cost of
tuning your applications to
the database topology
Reliability / Availability
Maintain service across
planned & unplanned
events
3x Query Performance
New Index Exploitation
Adaptive Compression
Multi-temp Storage
Real-time Warehousing
Ease of Development
Temporal Query
98% SQL Compatibility
Graph Store
RCAC
Reliability / Availability
pureScale Integration &
Enhancements
WLM Enhancements
Reorg Avoidance
HADR Multiple Standby
Ease of Development
Enhanced SQL Compatibility
More noSQL Integration
Reliability / Availability
Rolling Updates
HADR with pureScale
pureScale Active / active DR
Enhancements
Online add/drop Member
Other availability enhancements
Revolution by Evolution
Built directly into the DB2 kernel
BLU tables can coexist with traditional row tables in the same schema, tablespaces, and bufferpools
Query any combination of BLU or row data
Memory-optimized (not in-memory)
[Chart: speedup over DB2 10.1 for several customers, including a Large Financial Services Company and a Global Retailer: 46.8x, 37.4x, 13.0x, 6.1x, 5.6x]
8x-25x improvement is common
It was amazing to see the faster query times compared to the performance
results with our row-organized tables. The performance of four of our
queries improved by over 100-fold! The best outcome was a query that
finished 137x faster by using BLU Acceleration.
- Kent Collins, Database Solutions Architect, BNSF Railway
Broad range of queries with varying selectivity / aggregation
Substantial Storage Savings with BLU Acceleration
2.5x less space than DB2 10.1
Massive Performance Gains
133x speedup over DB2 10.1
Maximum query speed up over 900x
~4TB of raw data
2 fact tables
5 dimension tables
[Chart: DB2 10.1 (621) vs. DB2 BLU (4.7), 133x]
Derived from Redbrick performance test
Classic sales analytics
5.5 years of data (2000 days) for 63 stores
Intel Xeon Processor E5-4650, 32 cores total (4 CPUs), 384 GB, DS5300 (2x16 disks)
Lab tests, YMMV
[Figure: conceptual compression dictionary for a STATE column (New York, California, Illinois, Michigan, Florida, Alaska, Rhode Isl), showing each value's encoding and the encoded values packed into a register length]
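The compression dictionary idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the rule that more frequent values receive smaller codes is assumed for the sketch, and nothing here reflects DB2's actual storage format.

```python
from collections import Counter


def build_dictionary(values):
    """Map each distinct value to a small integer code.

    Assumption for this sketch: more frequent values get smaller codes.
    Ties keep first-seen order because sorted() is stable.
    """
    counts = Counter(values)
    ordered = sorted(counts, key=lambda value: -counts[value])
    return {value: code for code, value in enumerate(ordered)}


def encode(values, dictionary):
    """Replace every value by its dictionary code."""
    return [dictionary[value] for value in values]


def bits_per_code(dictionary):
    """Number of bits needed to store the largest code (at least 1)."""
    return max(1, (len(dictionary) - 1).bit_length())


states = ["New York", "California", "New York", "Florida",
          "California", "New York", "Alaska"]
dictionary = build_dictionary(states)
codes = encode(states, dictionary)
```

With four distinct states, each value fits in two bits instead of a variable-length string, which is why many encoded values can share one CPU register.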
7 Big Ideas:
Performance increase with Single Instruction Multiple Data (SIMD)
Using hardware instructions, DB2 with BLU Acceleration can apply a single instruction to many data elements simultaneously
Predicate evaluation, joins, grouping, arithmetic
Without SIMD processing the CPU will apply each instruction to each data element
E.g. compare records to 2005
[Figure: without SIMD, the processor core applies the instruction "Compare = 2005" to one data value at a time (2001, 2002, ... 2012); with SIMD, the same instruction is applied to several values at once; in both cases the result stream is 2005]
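Python cannot issue hardware SIMD instructions directly, so the sketch below only imitates the idea in software: four 16-bit values are packed into one 64-bit word and compared with 2005 using a handful of whole-word operations instead of one comparison per value. The lane width, the function names, and the zero-lane detection trick are choices made for this illustration, not a description of how DB2 implements it.

```python
LANE_BITS = 16                      # each value occupies one 16-bit lane
LANES = 4                           # four lanes fit in a 64-bit word
LOW_BITS = 0x7FFF7FFF7FFF7FFF       # low 15 bits of every lane
MASK64 = (1 << 64) - 1              # keep results within 64 bits


def pack(values):
    """Pack up to four values (each below 65536) into one 64-bit word."""
    word = 0
    for lane, value in enumerate(values):
        word |= (value & 0xFFFF) << (lane * LANE_BITS)
    return word


def equal_lanes(word, target):
    """Compare all four lanes with target using whole-word operations."""
    x = word ^ pack([target] * LANES)        # a lane becomes 0 where it matches
    t = (x & LOW_BITS) + LOW_BITS            # lane high bit set if low bits nonzero
    flags = ~(t | x | LOW_BITS) & MASK64     # lane high bit set only for zero lanes
    return [bool((flags >> (lane * LANE_BITS + 15)) & 1) for lane in range(LANES)]


years = [2001, 2005, 2009, 2005]
matches = equal_lanes(pack(years), 2005)
```

The addition never carries across a lane boundary, so every lane is tested independently even though the processor sees a single integer.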
7 Big Ideas:
Core-Friendly Parallelism
[Figure: cacheline ping-pong between core caches (core 0 working on blue data, core 1 working on green data; queries SELECT c1 FROM and SELECT c2 FROM) compared with a main memory layout that produces minimal traffic]
7 Big Ideas:
Core-Friendly Parallelism
[Figure: a larger working set of memory accesses leads to frequent, slow memory access; BLU tries to match the working set to the actual cache size, which minimizes memory access, and the remaining portion of data is processed in sequence]
7 Big Ideas:
Rows are not materialized until absolutely necessary to build result set
No need to consume memory/cache space & bandwidth for unneeded columns
Columns stored separately and packed in different buffers in memory
C1 C2 C3 C4 C5 C6 C7 C8
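The late materialization described above can be illustrated with a toy column store. This is a minimal sketch with hypothetical names and data: the predicate is evaluated against a single column to find qualifying row positions, and only then are the requested columns stitched into rows.

```python
# Each column lives in its own list; a row is just a shared position.
table = {
    "c1": [10, 20, 30, 40],
    "c2": ["a", "b", "c", "d"],
    "c3": [2005, 2001, 2005, 2009],
}


def matching_positions(column, predicate):
    """Scan one column and return the positions that satisfy the predicate."""
    return [pos for pos, value in enumerate(column) if predicate(value)]


def materialize(columns, positions, wanted):
    """Build result rows from the wanted columns only, at the given positions."""
    return [tuple(columns[name][pos] for name in wanted) for pos in positions]


positions = matching_positions(table["c3"], lambda year: year == 2005)
rows = materialize(table, positions, ["c1", "c3"])
```

Column c2 is never read, which is the memory and bandwidth saving the slide refers to.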
7 Big Ideas:
A key BLU design point is to run well when all data fits in memory, and when it doesn't!
Even with large scans, BLU prefers selected pages in the bufferpool, using an algorithm that adaptively computes a target hit ratio for the current scan, based on the size of the bufferpool, the frequency of pages being re-accessed in the same scan, and other factors
RAM
Near optimal caching
7 Big Ideas:
Data skipping
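Data skipping can be sketched as follows, assuming a synopsis that keeps the minimum and maximum column value for each fixed-size region of the table. The region size, the function names, and the sample data are assumptions made for this illustration.

```python
def build_synopsis(column, region_size):
    """Record (start position, min, max) for every region of the column."""
    synopsis = []
    for start in range(0, len(column), region_size):
        region = column[start:start + region_size]
        synopsis.append((start, min(region), max(region)))
    return synopsis


def scan_equal(column, synopsis, region_size, target):
    """Find positions equal to target, reading only regions that may contain it."""
    hits, regions_read = [], 0
    for start, low, high in synopsis:
        if target < low or target > high:
            continue                      # the whole region is skipped unread
        regions_read += 1
        for pos in range(start, min(start + region_size, len(column))):
            if column[pos] == target:
                hits.append(pos)
    return hits, regions_read


years = [2001, 2001, 2002, 2002, 2005, 2005, 2006, 2007, 2011, 2012, 2012, 2012]
synopsis = build_synopsis(years, region_size=4)
hits, regions_read = scan_equal(years, synopsis, 4, 2005)
```

Only one of the three regions is read here; skipping works best when values are clustered, as with dates loaded in arrival order.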
The challenge:
Subsecond response to a 4TB query on a 32 core server,
without defining an index.
The action:
1. Compression reduces data size to 1/10th
Divide by 10
400GB
Divide by 100
4GB
Divide by 10
400MB
12.5MB
Divide by ~4
3.1MB
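The arithmetic on this slide can be replayed as a short calculation. Only the first step (compression to 1/10th) is named in the text; the labels on the remaining divisors are this sketch's reading of the numbers and are marked as assumptions. The division by 32 is inferred from 400MB / 12.5MB and matches the 32 core server in the challenge.

```python
def data_reduction_chain(raw_mb, cores):
    """Apply the slide's successive divisors to the raw data size in MB."""
    steps = [
        ("compression", 10),                      # stated on the slide
        ("assumed: column access", 100),          # label is an assumption
        ("assumed: data skipping", 10),           # label is an assumption
        ("assumed: split across cores", cores),   # 400MB / 32 = 12.5MB
        ("assumed: SIMD, roughly", 4),            # slide says "Divide by ~4"
    ]
    size = raw_mb
    chain = [("raw data", size)]
    for label, divisor in steps:
        size = size / divisor
        chain.append((label, size))
    return chain


# 4TB expressed in MB, using decimal units as the slide does
chain = data_reduction_chain(raw_mb=4_000_000, cores=32)
sizes = [size for _, size in chain]
```

The final value is 3.125 MB per core, which the slide rounds to 3.1MB.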
7 Big Ideas:
Indexes
REORG (it's automated)
RUNSTATS (it's automated)
MDC or MQTs or Materialized Views
Statistical views
Optimizer hints
It is just DB2!
Same SQL, language interfaces, administration
Same DB2 process model, storage, bufferpools
The BLU Acceleration technology has some obvious benefits: It makes our analytical
queries run 4-15x faster and decreases the size of our tables by a factor of 10x. But it's
when I think about all the things I don't have to do with BLU, it made me appreciate the
technology even more: no tuning, no partitioning, no indexes, no aggregates.
- Andrew Juarez, Lead SAP Basis and DBA
7 Big Ideas:
SALES
Use the new DFT_TABLE_ORG database configuration parameter to set default table organization (row or column)
created in
TS1
created using
Storage
Group
DB2 10.5 allows an arbitrarily high number of concurrent queries to be submitted, but limits the number that consume resources at any point in time
Lightweight queries that need instant response bypass this control
Up to tens of thousands of
SQL queries at once
Moderate number of
queries consume resources
SQL Queries
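[Figure: default workload management flow. Queries enter through SYSDEFAULTUSERWORKLOAD into SYSDEFAULTUSERCLASS; the work class set and SYSDEFAULTUSERWAS test "query cost > X?"; if so, the query runs in SYSDEFAULTMANAGEDSUBCLASS, where a threshold limits concurrency to N and queues the excess; else it runs in SYSDEFAULTSUBCLASS]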
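The admission logic in this flow (cheap queries bypass the control, expensive ones are limited to N at a time and the excess is queued) can be sketched as a small simulation. The class name, method names, and the cost and limit values are hypothetical; this is not DB2's workload manager API.

```python
from collections import deque


class AdmissionController:
    """Toy model: queries costlier than a threshold share N execution slots."""

    def __init__(self, max_concurrent, cost_threshold):
        self.max_concurrent = max_concurrent    # N in the diagram
        self.cost_threshold = cost_threshold    # X in the diagram
        self.running = set()
        self.queue = deque()

    def submit(self, query_id, cost):
        """Return how the query is handled: bypass, running, or queued."""
        if cost <= self.cost_threshold:
            return "bypass"                     # lightweight: not managed
        if len(self.running) < self.max_concurrent:
            self.running.add(query_id)
            return "running"
        self.queue.append(query_id)
        return "queued"

    def finish(self, query_id):
        """Release a slot and admit the oldest queued query, if any."""
        self.running.discard(query_id)
        if self.queue and len(self.running) < self.max_concurrent:
            admitted = self.queue.popleft()
            self.running.add(admitted)
            return admitted
        return None


controller = AdmissionController(max_concurrent=2, cost_threshold=100)
outcomes = [
    controller.submit("q1", cost=500),
    controller.submit("q2", cost=5),
    controller.submit("q3", cost=900),
    controller.submit("q4", cost=700),
]
admitted_next = controller.finish("q1")
```

Any number of queries can be submitted, but at most two heavy ones hold resources at once, mirroring the behavior the slide describes.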
[Figure: storage extents holding Column2 and Column3 values for 2012 and 2013]
Done automatically via DB2's automatic table maintenance when DB2_WORKLOAD=ANALYTICS
Row organized table
ALTER TABLE ADD CONSTRAINT uc1 UNIQUE (c2)
CREATE INDEX i1
CREATE INDEX i2
CREATE INDEX i3
Index (inx) object: uc1, i1, i2, i3
Column organized table
CREATE TABLE t1 ORGANIZE BY COLUMN IN TS1 INDEX IN TS2
ALTER TABLE ADD CONSTRAINT uc1 UNIQUE (c2)
Table (dat) object (meta data & compression dictionary)
Column (col) object (each extent contains pages of data for 1 column): c1, c2, c3, c4
Synopsis (records the range of column values existing in different regions of the table): c1, c2, c3, c4
Index (inx) object: uc1
[Figure: placement of these objects in TS1 and TS2; extents are shown as <empty> before data arrives]
[Chart: per-query results for Q01 to Q20 (including Q01a and Q12a), DB2 10.1 vs. BLU on pre-GA 10.5 Build; 771s vs. 18s, 42.8x speedup. Lab tests, YMMV]
2013 IBM Corporation
OLAP Application
Cognos BI
with BLU Acceleration
Load and Go
Analytic
Data Mart
(BLU Tables)
Multi-platform software
Cognos BI 10.2
Dynamic Cubes (ROLAP)
DB2 10.5
18x faster
DB2 10.5
14x faster
DB2 10.5 WITH BLU ACCELERATION
Typical experience
Super analytics
Super easy
DB2 10.5
DB2 10
pureScale (DB2 9.8)
Virtually Unlimited Capacity
Transparent Scalability
Leading Availability
Application Transparent
Scaling
Avoid the risk & cost of
tuning your applications to
the database topology
Availability
Maintain service across
planned & unplanned
events
Online Maintenance
pureScale HADR
3x Query Performance
New Index Exploitation
Adaptive Compression
Multi-temp Storage
Real-time Warehousing
Ease of Development
Temporal Query
98% SQL Compatibility
Graph Store
RCAC
Reliability / Availability
pureScale Integration &
Enhancements
WLM Enhancements
Reorg Avoidance
HADR Multiple Standby
Ease of Development
Enhanced SQL Compatibility
Index on Expression
More noSQL Integration
Reliability / Availability
Rolling Updates
HADR with pureScale
pureScale Active / active DR
Enhancements
Online add/drop Member
Other availability enhancements
Application Transparency
Avoid the risk and cost of
application changes
Continuous Availability
Deliver uninterrupted access to your data
with consistent performance
[Figure: Workload A, B and C applications served by DB2 members for DATABASE A, DATABASE B ... DATABASE N; other members stay passive until failover; the members share the CF and shared storage, and OLTP and Batch work run on separate members]
Member Subsets
[Figure: member subsets assign Batch and OLTP, or Workload 1, 2 and 3 of a database, to different groups of members that share the CF]
Current pureScale STMM design
[Figure: Workloads 1 to 5 spread across the members and the CF, with a single STMM daemon]
Control via:
CALL SYSPROC.ADMIN_CMD('get stmm tuning member')
CALL SYSPROC.ADMIN_CMD('update stmm tuning member -2')
[Figure: an STMM daemon on each member, with the CF and shared storage]
[Figure: each member works on its own table or partition (A, B, C, D); no member/CF communication necessary]
Multi-Tenancy Demo: 10 Independent Workloads
10 Members x 10 cores each, 2 CF x 8 cores each
Each member runs a separate 70% read / 30% write transactional workload, representing a different tenant
E.g. different regions, different subsidiaries, different customers in a SaaS environment
Over 90% scaling at 10 members!
More than 850,000 SQL statements per second across 10 members
[Figure: members M0 to M9 with CFp and CFs, connected through a 10 Gb RoCE switch; 4x IBM TMS820 Flash Storage Units, 20 TB each; scaling chart from 1 to 10 members. Lab tests, YMMV]
[Chart: TPS for regular index vs. random index; bar values 2216 and 17449. Lab tests, YMMV]
1.47x more users than best Oracle result (3)
1) Results of DB2 10.5 on IBM Power 780 on the three-tier SAP SD standard application benchmark on SAP enhancement package 5 for SAP ERP 6.0, achieved 266,000 SAP SD benchmark users, certification # 2013010. Configuration: 8 processors / 64 cores / 256 threads, POWER7+ 3.72 GHz, 512 GB memory, running AIX 7.1
2) Results of DB2 UDB 8.2.2 on IBM eServer p5 Model 595 on the three-tier SAP SD standard application benchmark running SAP R/3 Enterprise 4.70 (ERP) software, achieved 168,300 SAP SD benchmark users, certification # 2005021. Configuration: 32-core SMP, POWER5, 1.9 GHz, 256 GB memory, running AIX 5.3
3) Results of Oracle 11g Real Application Clusters (RAC) on SAP sales and distribution-parallel standard application benchmark running the SAP enhancement package 4 for SAP ERP 6.0, achieved 180,000 SAP SD benchmark users, certification # 2011037. Configuration: 8 x Sun Fire X4800 M2 each with 8 processors / 80 cores / 160 threads, Intel Xeon Processor E7-8870, 2.40 GHz, 8 x 512 GB memory, running Solaris 10
Source: http://www.sap.com/benchmark
SAP, R/3 and all SAP logos are trademarks or registered trademarks of SAP AG in Germany and several other countries.
All other trademarks are the property of their respective owners.
Index on Expression
What?
Allow indexes to be defined with an expression, e.g.
CREATE INDEX i1 ON emp (UPPER(lastname), salary+bonus)
Value Proposition
Efficient execution of SQL statements with such expressions, eg.
SELECT * FROM emp WHERE UPPER(lastname) = ?
SELECT * FROM emp WHERE salary+bonus = ?
Avoid the drawbacks of work-around (index on generated column)
Space consumption of extra column
Potential need to modify applications to reference the new column
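The effect of an index on expression can be tried out without a DB2 system. The sketch below uses Python's built-in sqlite3 module as a stand-in, because SQLite also accepts expressions in CREATE INDEX; the table contents are made up, and the query plan text it returns is SQLite's, not DB2's.

```python
import sqlite3


def demo_expression_index():
    """Create the slide's index in SQLite and query through the expression."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (lastname TEXT, salary INTEGER, bonus INTEGER)")
    conn.executemany(
        "INSERT INTO emp VALUES (?, ?, ?)",
        [("Smith", 50000, 5000), ("smith", 60000, 1000), ("Jones", 55000, 0)],
    )
    # Same statement as on the slide: both key parts are expressions.
    conn.execute("CREATE INDEX i1 ON emp (UPPER(lastname), salary+bonus)")

    query = (
        "SELECT lastname, salary+bonus FROM emp "
        "WHERE UPPER(lastname) = ? ORDER BY salary+bonus"
    )
    rows = conn.execute(query, ("SMITH",)).fetchall()
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query, ("SMITH",))]
    conn.close()
    return rows, plan


rows, plan = demo_expression_index()
```

The application keeps writing UPPER(lastname) in its predicate; no generated column has to be added or referenced, which is the drawback of the work-around listed above.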
Value Proposition
Support applications whose semantics require unique enforcement, but only where
keys are not NULL
Storage savings ! (Avoid indexing NULLs if they are infrequently queried)
Notes
A NULL key is one where all key components are NULL
[Figure: a row whose key (C1, C2) is NULL, NULL is excluded from the ENK index, while a row such as NULL, 1 is indexed. Inserting a second NULL, NULL key: Unique Constraint = Fail, Regular Index = Fail, ENK Index = Success]
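The difference between a regular unique index and one that excludes NULL keys can be modeled in a few lines. This is a toy in-memory model with hypothetical names; it assumes, as the Fail/Success outcomes in the figure indicate, that a regular unique index treats two all-NULL keys as duplicates.

```python
class UniqueIndex:
    """Toy unique index over tuple keys; None stands for SQL NULL."""

    def __init__(self, exclude_null_keys=False):
        self.exclude_null_keys = exclude_null_keys
        self.entries = {}

    def insert(self, key, row_id):
        """Index the key; return False if it was excluded, raise on duplicates."""
        if self.exclude_null_keys and all(part is None for part in key):
            return False                 # all components NULL: not indexed at all
        if key in self.entries:
            raise ValueError(f"duplicate key {key!r}")
        self.entries[key] = row_id
        return True


regular = UniqueIndex()
regular.insert((None, None), row_id=1)
try:
    regular.insert((None, None), row_id=2)
    regular_second_insert = "success"
except ValueError:
    regular_second_insert = "fail"

enk = UniqueIndex(exclude_null_keys=True)
enk_results = [
    enk.insert((None, None), row_id=1),
    enk.insert((None, None), row_id=2),
    enk.insert((None, 1), row_id=3),     # only partly NULL, so it is indexed
]
```

The ENK variant stores a single entry for three rows, which also illustrates the storage saving mentioned in the value proposition.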
Value Proposition
Support applications that require long row definitions
Avoid a lengthy table redefinition to change pagesize
Notes
New maximum row length: 1,048,319 bytes
Excess row data stored in a LOB
Performance penalty when need to go off page for a portion of row
Usually OK: in most scenarios, instances of long rows are rare
If you expect long rows to be common, try to use a larger page size
[Figure: excess row data is stored in the LOB (lob) object; one operation creates an extended row and another removes it; tablespaces TS1 and TS2]
SYSTABLES PCTEXTENDEDROWS column shows % of rows in a table that are extended
Social
Mobile
Big Data
Analytics
Application Characteristics
Engaging
Mobile
Dynamic
Competitive
Fashionable
Scalable
Rapidly Changing !
Cloud
{
  "firstName": "John",
  "lastName" : "Smith",
  "age" : 25,
  "address" :
  {
    "streetAddress": "21 2nd Street",
    "city" : "New York",
    "state" : "NY",
    "postalCode" : "10021"
  },
  "phoneNumber":
  [
    {
      "type" : "home",
      "number": "212 555-1234"
    },
    {
      "type" : "fax",
      "number": "646 555-4567"
    }
  ]
}
Self-describing, schema-less
Very simple (eg. tag:value format)
Human readable
Based on JavaScript, initially targeted for
web applications, but,
Analytics
E.g. an organization stores a large quantity of web statistics as JSON documents, and wants to perform analytics
No document-level locking
Applications manage a revision tag to detect document update conflicts
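Detecting update conflicts with a revision tag, rather than a document-level lock, can be sketched as a compare-and-set on the tag. The store below is a hypothetical in-memory stand-in; the class and method names are inventions of this example and not the DB2 JSON API.

```python
class ConflictError(Exception):
    """Raised when a writer's revision tag is out of date."""


class DocumentStore:
    """Toy document store that keeps a revision number next to each document."""

    def __init__(self):
        self._docs = {}                       # key -> (revision, document)

    def read(self, key):
        revision, doc = self._docs[key]
        return revision, dict(doc)            # return a copy the caller may edit

    def write(self, key, doc, expected_rev=None):
        """Store doc only if the caller saw the latest revision."""
        current_rev = self._docs[key][0] if key in self._docs else None
        if current_rev != expected_rev:
            raise ConflictError(f"{key}: expected {expected_rev}, found {current_rev}")
        new_rev = 1 if current_rev is None else current_rev + 1
        self._docs[key] = (new_rev, dict(doc))
        return new_rev


store = DocumentStore()
first_rev = store.write("cart", {"item": "camera"})
rev_a, doc_a = store.read("cart")             # application A reads revision 1
rev_b, doc_b = store.read("cart")             # application B reads revision 1
doc_a["item"] = "lens"
second_rev = store.write("cart", doc_a, expected_rev=rev_a)
try:
    store.write("cart", doc_b, expected_rev=rev_b)
    stale_write = "accepted"
except ConflictError:
    stale_write = "rejected"
```

Application B must re-read the document and retry, so no update is silently lost even though nothing was ever locked.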
[Figure: applications (Java, PHP, NodeJS) and the JSON command shell reach DB2 through the JSON API and the JDBC driver; a Shopping Cart collection is stored as rows of key plus <binary JSON>, with an index on item (e.g. item:camera)]
[Figure: geographically dispersed pureScale cluster. Applications send about 1/2 of all connections to each site; Site A and Site B are 10s of km apart and host members M1 to M4, with CFP at one site and CFS at the other]
[Chart: R/W ratio (portion of UPDATE, INSERT, or DELETE operations in workload; axis marked 70%, 80%, 90%, 100%) against site-to-site distance (10km to 70km), separating good candidate workloads / configurations for GDPC from good candidate workloads for other replication technologies]
Workload can be immediately directed to the newly added member once it is started.
Other notes
New optional mid option to indicate member number to be added
New member can be added to an existing member host
Backup no longer needed after adding new members
[Figure: a member, with its own log, is added online to a cluster of members and CFs]
$ db2start member 12
Topology-Changing Restore
Allow restore of M-member backup to N-member instance
Allow restore from pureScale to non-pureScale and vice versa
Backup image can be online if N is a superset of M
[Figure: BACKUP of DATABASE MYDB to a backup image and RESTORE into an instance with a different topology (different numbers of members, each side with CF, logs and shared storage; DB PARTITION and pureScale Feature)]
t1: Backup tablespace TBSP0 (online)
t2: Add Member 2
t3: Backup tablespace TBSP1 (online)
t4: Media Failure
[Figure: manual flash copy of the DB2 database from source LUNs to target LUNs, with numbered backup steps and numbered restore steps]
No history file entry
Error prone
[Figure: integrated flash copy backup and restore of the DB2 database, followed by DB2 ROLLFORWARD]
History file record
Simple!
Wide (but not exhaustive) storage support
[Figure: flash copy backup and restore driven through a script, followed by DB2 ROLLFORWARD]
History file record
Simple to use!
Wider storage support enabled
[Figure: for operations such as RESTORE, DELETE and QUERY, DB2 calls the DBA's script with actions including prepare, snapshot, restore, delete, verify, storemetadata and rollback]
Example script in samples/BARVendor/libacssc.sh
REORG Enhancements
Online inplace reorg support on a table using adaptive compression
Online inplace reorg support in pureScale
Fastpath option for online inplace reorg to clean up overflow records only
$ db2 reorg table T1 inplace cleanup overflows
[Figure: rows inserted at 8am, 9am, 10am, 11am and 12pm fill successive extents; after 1) INSERTS and 2) DELETE WHERE, 3) REORG RECLAIM EXTENTS frees space along extent boundaries]
Procedure (code level: GA to FP1):
member1: quiesce, db2stop, db2iupdt, db2start
member2: quiesce, db2stop, db2iupdt, db2start
COMMIT
FP1 committed. New function available. Cannot roll down to GA anymore
Transparent: ZERO database downtime
Simplified procedure (code level: GA to FP1):
member1: installFixPack, db2start member1
member2: installFixPack, db2start member2
CECL = 10.5 GA
installFixPack -commit_level
CECL = FP1
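The rolling update sequence can be modeled as a small simulation. This is a sketch under stated assumptions: the function and variable names are hypothetical, CECL is tracked as a single code level for the whole cluster, and each member is reduced to "stop, install, start". The point it illustrates is that at least one member stays online at every step and that the cluster level only moves at commit.

```python
def rolling_update(members, old_level, new_level):
    """Update members one at a time and return (event, online members, CECL)."""
    installed = {member: old_level for member in members}
    online = set(members)
    cecl = old_level                      # cluster level stays put until commit
    history = []
    for member in members:
        online.remove(member)             # quiesce and stop this member only
        history.append((f"stop {member}", sorted(online), cecl))
        installed[member] = new_level     # install the fix pack on this member
        online.add(member)                # restart the member at the new code
        history.append((f"start {member}", sorted(online), cecl))
    if all(level == new_level for level in installed.values()):
        cecl = new_level                  # commit: new function becomes available
        history.append(("commit", sorted(online), cecl))
    return history


history = rolling_update(["member1", "member2"], "GA", "FP1")
always_available = all(len(online) >= 1 for _, online, _ in history)
```

With two or more members the database is never fully offline, and the cluster can still fall back to the old level at any point before the commit step.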
pureScale HADR
Simple DR solution for pureScale
Built in resiliency
Tolerant of member failures on primary
and standby
Another member takes over sending/receiving log data
Simple configuration
No need to specify all addresses of the other side (an automatic discovery protocol does that)
[Figure: Primary Cluster and Standby DR Cluster, each with two CFs]
All primary members send log to parallel threads on a replay member on standby
The replay member is highly available
If the current replay member fails, DB2 will automatically run replay on another member
If one primary member is not available, standby can obtain its logs via another
primary member that is available
Standby requirement
Must also be running with pureScale with the same number of members (they can be
logical members)
Standby site
[Figure: transactions run on the primary members; the members' logs (Logs 1, Logs 2, Logs 3) flow over links to the replay member at the standby site; Member 3 sends member 1's logs]
Fully Functional
Unlimited Capacity
Limited Capacity
Workgroup Server Edition
Express Edition
Data Studio
BLU support
HADR Multiple Standbys
pureScale support
Enhancements
Unprecedented Affordability
In-memory speed and simplicity on existing infrastructure
Optimized for SAP workloads for faster performance and to help dramatically reduce costs
Upgrade to DB2 with average 98% Oracle Database application compatibility (7)
Future-Proof Versatility
Optimized capabilities for both OLTP and data warehousing
Business grade NoSQL and mobile database for greater application flexibility
1 Based on internal IBM testing of sample analytic workloads comparing queries accessing row-based tables on DB2 10.1 vs. columnar tables on DB2 10.5. Performance improvement figures are cumulative of all queries in the workload. Individual results will vary depending on individual workloads, configurations and conditions.
2 Based on internal IBM tests of pure analytic workloads comparing queries accessing row-based tables on DB2 10.1 vs. columnar tables on DB2 10.5. Results not typical. Individual results will vary depending on individual workloads, configurations and conditions, including size and content of the table, and number of elements being queried from a given table.
3 Client-reported testing results in DB2 10.5 early release program. Individual results will vary depending on individual workloads, configurations and conditions, including table size and content.
4 Based on IBM design for normal operation with rolling maintenance updates of DB2 server software on a pureScale cluster. Individual results will vary depending on individual workloads, configurations and conditions, network availability and bandwidth.
5 Based on IBM design for normal operation under typical workload. Individual results will vary depending on individual workloads, configurations and conditions, network availability and bandwidth.
6 Available with DB2 Advanced Enterprise Server Edition.
7 Based on internal tests and reported client experience from 28 Sep 2011 to 07 Mar 2012.