
Managed

Services

Cloud Services

Consulting Services

Licensing

Extreme Data Warehouse Performance with Oracle Exadata
Kasey Parker
Enterprise Architect
Kasey.Parker@centroid.com

Who is Centroid?
QUICK FACTS
Centroid is a leading provider of Oracle Technology, Applications and Infrastructure/Hosting solutions
Established in 1997
Office locations: Troy, MI (HQ); San Francisco, CA; Los Angeles, CA; Dallas, TX
200+ Consultants
Oracle Platinum Partner
Selected to Oracle's Top 25 Strategic Partner Program
Top 5 Oracle Partner for Hardware/Storage

100% Oracle Red Stack Focused


"Clients for life" approach to customer relationships
Oracle Exadata Center of Excellence established in 2011
Centroid Authored - Oracle Exadata Recipes (Published Feb-2013)

Agenda
Exadata Overview
Why Exadata?
Exadata's Secret Sauce
Getting the Most out of Exadata DW
Avoiding the 3X Club
Other Data Warehouse Best Practices

EXADATA OVERVIEW


Exadata Architecture
Database hardware and software platform in a box
Scale-Out Database Servers
8x 2-socket, or 2x 8-socket Xeon database servers
Oracle Database, ASM, RAC; Linux or Solaris
Standard Ethernet to data center

Scale-Out Intelligent Storage Servers
2-socket storage servers, Exadata Storage Software
Up to 672 terabytes disk per rack
56 PCI Flash memory cards per rack

InfiniBand Network
Unified internal connectivity (40 Gb/sec)

Exadata Configuration Options
Start small and grow as needed; upgraded onsite
Eighth Rack
Quarter Rack
Half Rack
Full Rack

Exadata Hardware Summary

                                             X4-2 Full        X4-2 Half        X4-2 Quarter     X4-2 Eighth
Database Servers
Database Grid Cores                          192              96               48               24
Database Grid Memory (GB)                    2048 (max 4096)  1024 (max 2048)  512 (max 1024)   512 (max 1024)
InfiniBand switches
Ethernet switch
Exadata Storage Servers                      14
Storage Grid CPU Cores                       168              84               36               18
Raw Flash Capacity                           44.8 TB          22.4 TB          9.6 TB           4.8 TB
Raw Storage Capacity, High Perf              200 TB           100 TB           43.2 TB          21.6 TB
Raw Storage Capacity, High Cap               672 TB           336 TB           144 TB           72 TB
Usable mirrored capacity, High Perf          90 TB            45 TB            19 TB            9 TB
Usable mirrored capacity, High Cap           300 TB           150 TB           63 TB            30 TB
Usable Triple mirrored capacity, High Perf   60 TB            30 TB            13 TB            6.3 TB
Usable Triple mirrored capacity, High Cap    200 TB           100 TB           43 TB            21.5 TB

Exadata Hardware
Exadata X4-2 SQL IO Performance

                                    X4-2 Full Rack   X4-2 Half Rack   X4-2 Quarter     X4-2 Eighth
Flash Cache SQL Bandwidth (1,3)
  High Cap Disk                     100 GB/s         50 GB/s          21.5 GB/s        10.7 GB/s
  High Perf Disk                    100 GB/s         50 GB/s          21.5 GB/s        10.7 GB/s
Flash SQL IOPS (2,3)
  8K Reads                          2,660,000        1,330,000        570,000          285,000
  8K Writes                         1,960,000        980,000          420,000          210,000
Disk SQL Bandwidth (1,3)
  High Cap Disk                     20 GB/s          10 GB/s          4.5 GB/s         2.25 GB/s
  High Perf Disk                    24 GB/s          12 GB/s          5.2 GB/s         2.6 GB/s
Disk SQL IOPS
  High Cap Disk                     32,000           16,000           7,000            3,500
  High Perf Disk                    50,000           25,000           10,800           5,400
Data Load Rate (4)                  20 TB/hr         10 TB/hr         5 TB/hr          2.5 TB/hr

1 - Bandwidth is peak physical scan bandwidth achieved running SQL, assuming no compression. Effective data bandwidth will be much higher when compression is factored in.
2 - IOPS are based on read IO requests of size 8K running SQL, typically with sub-millisecond latencies. Note that the IO size greatly affects flash IOPS. Others quote IOPS based on 2K, 4K or smaller IOs that are not relevant for databases and measure IOs using low-level tools instead of SQL.
3 - Actual performance varies by application.
4 - Load rates are typically limited by database server CPU, not IO. Rates vary based on load method, indexes, data types, compression, and partitioning.

WHY EXADATA?


Why Exadata?
Exadata is designed to eliminate the most common bottleneck for large databases:
Timely transfer of large data sets from the storage subsystem to the database server

Why Exadata?
Solving the IO Bottleneck
Solution 1: Enlarge the pipe
Physical disks, on all cells, work in parallel to serve IO requests
Large InfiniBand pipe (40 Gb/sec)

Why Exadata?
Can't we do that with other high performance storage solutions?

YES
There is nothing magical about Exadata hardware, and it's still the same Oracle Database

Why Exadata?
Solving the IO Bottleneck
Solution 2: Reduce the IO operations
Done using Exadata's Secret Sauce: Smart Storage, Smart Flash Cache and Hybrid Columnar Compression
10X reduction in data sent to database servers is common

Exadata Innovations
Some are automatic, with limited configuration ability
Storage Indexes
Smart Flash Cache

Some may require some effort
Smart Scans
Hybrid Columnar Compression (HCC)
IORM (Resource Manager)

Storage Indexes
Transparent I/O Elimination with No Overhead
Exadata Storage Indexes maintain summary information about table data in memory
Store MIN and MAX values of columns
Typically one index entry for every MB of disk
Eliminates disk I/Os if MIN and MAX can never match the WHERE clause of a query
Completely automatic and transparent

Example: table with columns A, B, C, D and a storage index on column B
First set of rows: B = 1, 3, 5 (Min B = 1, Max B = 5)
Second set of rows: B = 5, 8, 3 (Min B = 3, Max B = 8)
Select * from Table where B<2 - Only first set of rows can match
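The MIN/MAX elimination described on this slide can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `Region` class and the function name are hypothetical, and real storage indexes live in storage cell memory and are maintained by the Exadata Storage Software, not by user code. The column B values are the ones shown on the slide.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """One storage region, holding the column B values it contains (toy model)."""
    values: list

    @property
    def min_b(self):
        return min(self.values)

    @property
    def max_b(self):
        return max(self.values)


def regions_to_read_for_less_than(regions, bound):
    """Return the indexes of regions that could contain rows with B < bound.

    A region is skipped when its minimum is already >= bound, because no
    row in that region can satisfy the predicate.
    """
    return [i for i, region in enumerate(regions) if region.min_b < bound]


# Column B values from the slide: first set (1, 3, 5), second set (5, 8, 3).
regions = [Region([1, 3, 5]), Region([5, 8, 3])]

# WHERE B < 2: only the first region (index 0) has to be read.
print(regions_to_read_for_less_than(regions, 2))
```

With `B < 2` the second region is eliminated because its minimum is 3, which mirrors the "only first set of rows can match" note on the slide.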

Smart Flash Cache
Caches Read and Write I/Os in PCI flash
Transparently accelerates read and write intensive workloads
Up to 2.66 million 8K read IOPS from SQL
Up to 1.96 million 8K write IOPS from SQL

Persistent write cache speeds database recovery

Exadata Flash Cache is much more effective than flash tiering architectures used by others
Caches current hot data, not yesterday's
Caches data in granules 8x to 16x smaller than tiering
Greatly improves the effectiveness of flash

Other Flash Features can be configured if needed
E.g. Cache compression, Cache pinning, Flash Disks (for Temp)

Avoid the 3X Club


Some Exadata optimizations may require a little effort, but they're worth it.
Data Warehouse workloads should improve >7X on Exadata

Avoid the 3X Club


Tune for Smart Scans
Wisely use Parallelism
Compress with HCC where appropriate
Invoke Resource Management (IORM)
Still follow Data Warehouse Best Practices

Avoid the 3X Club: an Example
EDW for Large Organization in Salt Lake valley
Moved to Exadata beginning September 2012
Configured/Tuned Exadata optimizations for October 2012
[Chart: Average Response Time]

Avoid the 3X Club


Tune for Smart Scans
Wisely use Parallelism
Compress with HCC where appropriate
Invoke Resource Management (IORM)
Still follow Data Warehouse Best Practices

Smart Scan Processing
[Diagram: Oracle DB Grid and Exadata Storage Grid]
Client asks: Who are my customers in Salt Lake City?
Database issues: Select name, customer#... Where city='SALT LAKE CITY'
Smart Scan identifies rows / columns in the 1 TB tables that match the SQL (1000 rows)
IO is executed and 20MB returned from storage to PGA
1000 rows returned to client

Smart Scan Comparison
Standard Operations: 8K blocks flow from the Storage Servers to the SGA on the Database Servers
Smart Scans: rows and columns flow from the Storage Servers to the PGA on the Database Servers

Smart Scan Requirements
Full table scan or index fast full scan
No IOTs, Clustered Tables or LOBs

Direct path reads
Direct path reads happen for
Serial queries of large tables (11gR2)
Function of Buffer Cache Size, threshold and object size
_small_table_threshold

Parallel queries
Queries when _serial_direct_read = TRUE

Smart Scans: How do you know?
Execution Plan
TABLE ACCESS STORAGE FULL
Storage() predicate
Only indicates that a Smart Scan is eligible to be performed; it does not mean one actually was

Smart Scans: How do you know?
Statistic views (V$MYSTAT, V$SESSTAT)
cell physical IO bytes eligible for predicate offload
cell physical IO interconnect bytes
cell physical IO interconnect bytes returned by smart scan

V$SQL views (IO_ columns)
IO_CELL_OFFLOAD_RETURNED_BYTES
IO_CELL_OFFLOAD_ELIGIBLE_BYTES

Wait events
cell smart table scan
cell smart index scan
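One way to read the two V$SQL byte counters together is as an offload efficiency percentage. The sketch below is an illustration only: the function name and the formula (share of eligible bytes that did not have to cross the interconnect) are assumptions, not an Oracle-defined metric, and the sample numbers are the 1 TB scanned and 20 MB returned from the Smart Scan Processing slide.

```python
def offload_efficiency(eligible_bytes, returned_bytes):
    """Percent of offload-eligible bytes that were NOT shipped to the database servers.

    eligible_bytes: value in the spirit of IO_CELL_OFFLOAD_ELIGIBLE_BYTES
    returned_bytes: value in the spirit of IO_CELL_OFFLOAD_RETURNED_BYTES
    """
    if eligible_bytes <= 0:
        # Nothing was eligible for offload, so no smart scan savings to report.
        return 0.0
    return 100.0 * (1 - returned_bytes / eligible_bytes)


TB = 1024 ** 4
MB = 1024 ** 2

# 1 TB eligible for offload, 20 MB actually returned by the storage cells.
print(f"{offload_efficiency(1 * TB, 20 * MB):.4f}% of eligible IO avoided")
```

A value near zero for a large full scan is a hint that the statement was eligible for a Smart Scan but did not benefit from one.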


Smart Scans: How do you know?
An Easier Way: SQL Monitor
Accessed through DBMS_SQLTUNE or OEM

Smart Scans: Why don't they happen?
Index scan used instead
Buffer cache too large
Many table blocks in buffer cache

Chained rows
Tables with more than 255 columns

Certain functions (see v$sqlfn_metadata)
Table "too small" (_small_table_threshold)
Read consistency
Delayed block cleanout

Smart Scans: How to get them?
Accurate, Up-to-date Statistics
Are ETL jobs gathering stats appropriately?
Use auto sample size
Exadata System stats
This is how the optimizer becomes Exadata aware
exec dbms_stats.gather_system_stats('EXADATA');

Right Sized SGA
Most Data warehouses shouldn't need more than 16GB

Avoid row by row processing
Appropriate use of Indexes
Wise use of Parallelism

To Index or Not to Index


So if Smart Scans are so great, do we even need indexes anymore?

YES!...
You still need indexes for queries with single/few out of many row reads

Also keep many FK indexes, especially if used for Star Transformations

To Index or Not to Index


Many indexes will be obsolete and should be removed to help drive smart scans
Test by:
Making indexes invisible and testing queries
Comparing ETL without indexes

Avoid the 3X Club


Tune for Smart Scans
Wisely use Parallelism
Compress with HCC where appropriate
Invoke Resource Management (IORM)
Still follow Data Warehouse Best Practices

Parallelism on Exadata
Parallelism executes the same on or off Exadata
PX works much better on Exadata and can be a big performance boost
Pushes Direct Path Reads to enable smart scans
Exadata architecture enables parallelism through storage cell CPUs and disks all working together
Load split across DB and Cell CPUs
Allows lower DOP on Exadata to achieve optimal performance

Easy to overwhelm a system with Parallelism
But on Exadata, it can be controlled effectively

Parallelism Guidelines
Control parallel load
Parallel init parameters
Parallel Statement Queuing
DBRM resource plans
Set parallel degree limits and max % targets

Set parallel degree on large tables
ALTER TABLE [TABLE NAME] PARALLEL 12;

Use parallelism for direct path loads in ETL
CTAS, IAS or Merge with Append Hint, Bulk Load API
ALTER SESSION ENABLE PARALLEL DML;

Key Parallel Init Parameters
See Oracle Support Note 1274318.1 for Exadata best practices

PARALLEL_MAX_SERVERS
Max # of instance parallel workers
Recommend leaving at default (CPU_COUNT * PARALLEL_THREADS_PER_CPU * 10)

PARALLEL_MIN_SERVERS
Min # of instance parallel workers (default 0)
Helps control overhead of creating and destroying workers
Recommend setting to high daily average of workers
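The default quoted on this slide is simple arithmetic, sketched below. The function name is hypothetical, the multiplier of 10 is taken from the slide (the exact default varies by database version and memory settings), and the CPU_COUNT of 24 and PARALLEL_THREADS_PER_CPU of 2 are illustrative assumptions, so check the real values on your own instance.

```python
def default_parallel_max_servers(cpu_count, parallel_threads_per_cpu=2, multiplier=10):
    """Slide formula: CPU_COUNT * PARALLEL_THREADS_PER_CPU * 10.

    The multiplier is a parameter because the exact default depends on the
    database version and configuration; 10 is the figure quoted on the slide.
    """
    return cpu_count * parallel_threads_per_cpu * multiplier


# Illustrative values only: CPU_COUNT = 24, PARALLEL_THREADS_PER_CPU = 2.
print(default_parallel_max_servers(24))  # 480
```

Running the numbers this way makes it easier to see how large the default worker pool is before deciding on PARALLEL_MIN_SERVERS.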

Parallel Init Parameters


AUTO DOP
Enabled by parallel_degree_policy

Manual (Default), Limited, Auto

Each statement automatically evaluated as a candidate for parallelism, whether or not statements contain parallel hints or objects have a DOP set
Controlled by parallel_min_time_threshold
10 seconds by default
Statements expected to run longer are candidates for automatic parallelization

Use with Caution!

Parallel Statement Queuing


Limits concurrent parallel processes until enough slaves are available
Protects against overwhelming the server with parallel processes
Delivers a more consistent performance profile
Can be enabled without Auto DOP by setting _parallel_statement_queuing = TRUE

Control when queuing starts by using PARALLEL_SERVERS_TARGET

Statements queued in FIFO method
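The queuing behavior can be illustrated with a small simulation. This is a toy model, not Oracle's scheduler: the `StatementQueue` class is hypothetical, each statement is assumed to need a fixed number of parallel servers, and the target of 16 is an arbitrary stand-in for the queuing threshold.

```python
from collections import deque


class StatementQueue:
    """Toy FIFO queue: a statement runs only if its servers fit under the target."""

    def __init__(self, servers_target):
        self.servers_target = servers_target
        self.active = {}       # statement name -> parallel servers in use
        self.queued = deque()  # (name, servers) waiting in arrival order

    def in_use(self):
        return sum(self.active.values())

    def submit(self, name, servers):
        # Strict FIFO: queue behind earlier arrivals even if this one would fit.
        if self.queued or self.in_use() + servers > self.servers_target:
            self.queued.append((name, servers))
        else:
            self.active[name] = servers

    def finish(self, name):
        del self.active[name]
        # Start queued statements in order while the head of the queue fits.
        while self.queued and self.in_use() + self.queued[0][1] <= self.servers_target:
            queued_name, servers = self.queued.popleft()
            self.active[queued_name] = servers


q = StatementQueue(servers_target=16)
for name, servers in [("A", 8), ("B", 8), ("C", 8), ("D", 4)]:
    q.submit(name, servers)

q.finish("A")  # frees 8 servers: C starts, D still waits behind the limit
print(sorted(q.active), [name for name, _ in q.queued])
```

Statement D would fit on its own when it arrives, but it waits behind C, which is the point of FIFO ordering: response times become more predictable at the cost of some idle capacity.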




Parallel Statement Monitoring


OEM / Grid Control
SQL Monitoring specifically
GV$PX_PROCESS
One record per Parallel Worker

GV$SQL_MONITOR
Also shows queued parallel statements
See Oracle Support Note 135043.1 for more monitoring queries

Avoid the 3X Club


Tune for Smart Scans
Wisely use Parallelism
Compress with HCC where appropriate
Invoke Resource Management (IORM)
Still follow Data Warehouse Best Practices

Hybrid Columnar Compression
Data is organized and compressed by column in compression units (CU)

Query: Speed Optimized Query Compression for Data Warehousing
5X to 10X compression typical
Runs faster because of Exadata offload!

Archive: Space Optimized Archival Compression for infrequently accessed data
10X to 50X compression typical

Benefits Multiply: Faster and Simpler Backup, DR, Caching, Reorg, Clone

Hybrid Columnar Compression

VENDOR_ID   VEND_NAME    STATE  VNDR_RATING  VENDOR_TYPE
==========  ===========  =====  ===========  ===========
100         ACME ONE     MI     100          DIRECT
101         ACME ONE     CA     90           DIRECT
102         NORTON       IA     95           INDIRECT
103         WINGDINGS    MI     96           INDIRECT
104         WINGDINGS    GA     96           INDIRECT
Hybrid Columnar Compression

Uncompressed (block header, then rows stored one after another, then free space):
100ACME ONEMI100DIRECT|
101ACME ONECA90DIRECT|
102NORTONIA95INDIRECT|
103WINGDINGSMS96INDIRECT|
104WINGDINGSGA96INDIRECT

Logical Compression Unit (CU header, then values stored column by column):
VENDOR_ID, VEND_NAME, STATE, VNDR_RATING, VENDOR_TYPE, COL6, COL7, COL8, COL9, COL10

Hybrid Columnar Compression


Performance Benefits
If queries select a single column or a subset of columns, Oracle only needs to read the blocks on which those columns exist
This is different from other types of compression and from uncompressed tables

Not only is space saved, but also IO
Saving IO means better performance!
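The column-by-column organization can be sketched with the vendor rows from the earlier slide. This is a minimal illustration under stated assumptions: the function names are hypothetical, and run-length encoding is only a stand-in to show why grouped column values compress well; it is not the algorithm HCC actually uses.

```python
rows = [
    (100, "ACME ONE", "MI", 100, "DIRECT"),
    (101, "ACME ONE", "CA", 90, "DIRECT"),
    (102, "NORTON", "IA", 95, "INDIRECT"),
    (103, "WINGDINGS", "MI", 96, "INDIRECT"),
    (104, "WINGDINGS", "GA", 96, "INDIRECT"),
]
column_names = ["VENDOR_ID", "VEND_NAME", "STATE", "VNDR_RATING", "VENDOR_TYPE"]


def to_compression_unit(rows, names):
    """Pivot row tuples into one list per column, the way a CU groups column values."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def run_length_encode(values):
    """Collapse runs of equal adjacent values into [value, count] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded


cu = to_compression_unit(rows, column_names)

# A query on VENDOR_TYPE only needs this one column's values, not whole rows.
print(run_length_encode(cu["VENDOR_TYPE"]))  # [['DIRECT', 2], ['INDIRECT', 3]]
```

Five row values collapse into two pairs once similar values sit next to each other, and a query that touches only VENDOR_TYPE never has to look at the other columns.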


HCC: Why Not?
HCC requires direct path loads
Conventional inserts use OLTP compression

Deletes against HCC tables lock entire CU

When updating HCC tables:
The updated row is migrated (i.e., deleted + re-inserted into a new block, leaving a pointer behind)
New row is OLTP-compressed
Locks impact entire CU, not just row!

DML on HCC tables is very expensive!

HCC Use Cases


Use OLTP compression for DW tables by default, and then use HCC compression when
Data is direct path loaded (CTAS, Insert /*+ APPEND */)
Data is not updated
Or rarely updated and truncated and reloaded periodically

Partition tables with different compression ratios
Updated Data = OLTP compression
Heavily Queried Data = Query / Archive Low compression
Cold / Archive Data = Archive High compression

Use compression advisor to preview compress ratio
DBMS_COMPRESSION.GET_COMPRESSION_RATIO
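The use-case guidance above can be written down as a small decision helper. This is a sketch of the slide's rules only: the function name, the argument names and the category strings are hypothetical, and a real decision should also weigh the compression advisor's estimates.

```python
def choose_compression(direct_path_loaded, update_frequency, access_pattern):
    """Suggest a compression type for a table or partition, following the slide.

    update_frequency: "never", "rare", or "frequent" (hypothetical labels)
    access_pattern:   "hot" for heavily queried data, "cold" for archive data
    """
    # HCC needs direct path loads and does not tolerate regular DML well.
    if not direct_path_loaded or update_frequency == "frequent":
        return "OLTP"
    if access_pattern == "cold":
        return "ARCHIVE HIGH"
    return "QUERY / ARCHIVE LOW"


# Example: per-partition choices for a fact table loaded with INSERT /*+ APPEND */.
print(choose_compression(True, "frequent", "hot"))   # current, still-updated partition
print(choose_compression(True, "never", "hot"))      # recent, heavily queried partition
print(choose_compression(True, "never", "cold"))     # old archive partition
```

Applying the helper partition by partition matches the slide's advice to mix compression levels within one partitioned table.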

Avoid the 3X Club


Tune for Smart Scans
Wisely use Parallelism
Compress with HCC where appropriate
Invoke Resource Management (IORM)
Still follow Data Warehouse Best Practices

IORM
IO Resource Management (IORM) governs and meters IO from different workloads in the Exadata Storage Servers
A common challenge with shared storage infrastructure is that of competing IO workloads
Batch vs. OLTP
Warehouse vs. OLTP
Production vs. Test and Development

Competing priorities can be mitigated by over-provisioning storage, but this becomes expensive
Exadata addresses this challenge with IORM

IORM and DBRM


Oracle DBRM allows managing CPU and other internal DB resources, e.g. parallelism, among competing workloads in a single database
DBRM is not Exadata specific

With Exadata IORM integration, IO resources are also controlled by DBRM
A DBRM resource plan is also called an intra-database resource plan

IORM Plans
Approaches for managing resource allocations
Intra-database resource plans manage multiple workloads in a single database
If only one database on the Exadata machine, only an intra-database resource plan is needed

Inter-database resource plans manage resources among multiple databases on Exadata
Specifies allocations to databases, not consumer groups
Category plans allow resource control across databases by the type of workload
An IORM plan is the combination of an inter-database plan and a category plan

IORM and DBRM
DBRM Example
Database DBM consumer groups: OM OLTP, Other OLTP, Reporting
Database XBM consumer groups: Online query, Batch query

IORM and DBRM
Category Plan Example
Interactive category: OM OLTP (DBM), Other OLTP (DBM), Online query (XBM)
Batch category: Reporting (DBM), Batch query (XBM)

IORM Example
All User IO = 100%
Category Plan: 70% Interactive, 30% Batch
Inter-database Plan: 60% DBM, 40% XBM (within each category)
Intra-database Plan: 30%, 70%, 20%, 30%, 50% (consumer group shares)

IORM Allocation
DBM: OM OLTP 26.25%
DBM: Other OLTP 15.75%
XBM: Online Query 28.00%
DBM: Reporting 18.00%
XBM: Batch Query 12.00%
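The final allocation percentages can be reproduced by multiplying the three plan levels together. The sketch below is a simplified model, not the cell's scheduling logic. The mapping of the intra-database shares to consumer groups (DBM: 50/30/20, XBM: 70/30) is an assumption inferred from the results on the slide, and the share within a database is taken relative to the other groups of the same category.

```python
from fractions import Fraction

category_plan = {"INTERACTIVE": 70, "BATCH": 30}
interdatabase_plan = {"DBM": 60, "XBM": 40}

# (database, consumer group) -> (category, intra-database share).
# The share-to-group mapping is inferred, not stated on the slide.
intradatabase_plan = {
    ("DBM", "OM OLTP"): ("INTERACTIVE", 50),
    ("DBM", "OTHER OLTP"): ("INTERACTIVE", 30),
    ("DBM", "REPORTING"): ("BATCH", 20),
    ("XBM", "ONLINE QUERY"): ("INTERACTIVE", 70),
    ("XBM", "BATCH QUERY"): ("BATCH", 30),
}


def iorm_allocation():
    """Return the percent of total user IO for each (database, consumer group)."""
    result = {}
    for (db, group), (category, share) in intradatabase_plan.items():
        # Combined share of all groups in the same database and category.
        peers = sum(
            s
            for (d, _), (c, s) in intradatabase_plan.items()
            if d == db and c == category
        )
        fraction = (
            Fraction(category_plan[category], 100)
            * Fraction(interdatabase_plan[db], 100)
            * Fraction(share, peers)
        )
        result[(db, group)] = float(fraction * 100)
    return result


for (db, group), pct in iorm_allocation().items():
    print(f"{db}: {group} {pct:.2f}%")
```

For example, DBM's OM OLTP group gets 70% (Interactive) x 60% (DBM) x 50/80 (its share among DBM's interactive groups) = 26.25%, matching the slide.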

IORM Rules
IORM is only engaged when needed
Leftover disk allocation is made available to other workloads in relation to the configured resource plans
Max limits can be set

Background IO is prioritized relative to user IO
Redo and control file writes always take precedence
DBWR writes are scheduled at the same priority as user IO

If no intra-database plan is set, all non-background IO requests are grouped into the default OTHER_GROUPS consumer group

IORM Plan Syntax


IORM plans created using CELLCLI / DCLI


IORM Monitoring
IORM Metrics using CELLCLI / DCLI

Metric Name                        Meaning
DB_IO_RQ_SM, DB_IO_RQ_LG           Total number of IO requests issued by the database since any resource plan was set
DB_IO_RQ_SM_SEC, DB_IO_RQ_LG_SEC   IO requests per second issued by the database in the last minute
DB_IO_WT_SM, DB_IO_WT_LG           Total number of seconds that IO requests issued by the database waited to be scheduled

Metric IORM script
See Oracle Support Note: Tool for Gathering I/O Resource Manager Metrics: metric_iorm.pl [ID 1337265.1]

OEM (Grid Control) Exadata plugin

IORM
Unless you only have one database with a single type of workload on Exadata, you should use IORM
In other words
Everyone using Exadata should use IORM!

IORM Benefits
EDW for Large Organization in Salt Lake valley
3.5 days before and after enabling IORM/DBRM plans

Avoid the 3X Club


Tune for Smart Scans
Wisely use Parallelism
Compress with HCC where appropriate
Invoke Resource Management (IORM)
Still follow Data Warehouse Best Practices

Follow DW Best Practices
Oracle data warehousing on Exadata is still data warehousing on Oracle
(With a few incredible innovations :)

So
Data Warehouse Best Practices still apply!

Follow DW Best Practices
Key Best Practices
Dimensional Model (Star Schema)
Well-written SQL
Table Partitioning (particularly fact tables)
Partition by load frequency, subpartition by join hash
Partition Exchange loading

Parallel, Direct-Path (possibly NOLOGGING) Data Loading
Including Constraint and Index management

Query Rewrite
Materialized Views and OLAP cubes

Star Transformation Joins

Getting the Most Out of Your Exadata DW
Smart Scans (Storage Offloading)
Parallelism
Hybrid Columnar Compression
DW Best Practices

Questions?
