Anda di halaman 1dari 59

You Can Do It Datawarehouse: Beginner to Advanced In Two Hours

by Arup Nanda Manager Database Systems Starwood Hotels & Resorts International White Plains, NY

Objectives
Exploring DW Techniques in Oracle Case Study Oracle 10G Additions

DB1 Cust1

DB2 Cust11 ? Cust10

Cust2

Cust9

Cust3 Datawarehouse

Cust8

Cust4

Cust7 Cust5 Cust6 DB5 DB6

DB3

DB4

A Real Life Case


Claims Datawarehouse Several Customers/Sources Several Quarters Data Volume Was High Irregular Frequency Data Comes Often Late Near Real Time Requirements

Problem of Irregular Data


Detail Table

Summary Table Detail Table DBMS_MVIEW.REFRESH () CUST2

Problems
Incoming Data Irregular Summary Tables Need Refreshing Quarters Added Continuously Archival Requirements Vary Across Customers Quick Retrieval of Archival Needed

Problems contd.
Summary on Summary Tables as Materialized Views Need Refresh Whenever New Data Arrives Or When Data is Purged/Reinstated Customers Added and Deleted Frequently

Objective
To Minimize Downtime for Refreshes
Incrementally Refresh Partitioning Techniques

To Add Customers Easily To Add Quarters Easily To Archive Off and Purge Easily and Atomically To Restore Archives Quickly

Objective contd.
To have an ETL Setup for Easy Addition of Objects Such As Tables, Indexes, Mat Views. Use Only Available Oracle and Unix Tools
PL/SQL Unix Shell Scripts SQL*Plus

Design
Varying Dimensions
Customer Quarter

Composite Partitioning
Range (for Quarters) List (for Customers)

Local Indexes

Partitioning
Partitioned on CLAIM_DATE
RANGE Partitioned named YyyQq
Storage Clauses Not Defined

Supartitioned on CUST_NAME
LIST Named YyyQq_CustName, e.g. Y03Q3_CUST1

Indexing
All Indexes Local
CREATE INDEX IN_CLAIM_SUM_01 LOCAL ON SUMTAB1 (COL1, COL2)

No Indexes UNIQUE and GLOBAL

Storage
Each Subpartition of Index or Table is kept in separate tablespaces named in the format Y<Year>Q<Qtr>_<CustName>_DATA e.g. Y02Q2_CUST1_DATA Y02Q2_CUST2_DATA Y03Q3_CUST1_DATA

Table

Index

Customers

Cust3 Y03 Q3

Cust3 Y03 Q3

Quarter In Tablespace Y03Q3_CUST3_DATA In Tablespace Y03Q3_CUST3_INDX

Tablespace
create tablespace y03q3_cust1_data datafile /oradata/y03q3_cust1_data_01.dbf size 500m autoextend on next 500m extent management local segment space management auto

CREATE TABLE TAB1 ( ) PARTITION BY RANGE (CLAIM_DATE) SUBPARTITION BY LIST (CUST_NAME) ( PARTITION Y03Q1 VALUES LESS THAN (TO_DATE(2003/04/01,YYYY/MM/DD)), ( SUBPARTITION Y03Q1_CUST1 VALUES (CUST1) TABLESPACE Y03Q1_CUST1_DATA, SUBPARTITION Y03Q1_CUST2 VALUES (CUST2) TABLESPACE Y03Q1_CUST2_DATA,

Table DDL

and so on for all subpartitions


SUBPARTITION Y03Q1_DEF VALUES (DEFAULT) TABLESPACE USER_DATA ), PARTITION Y03Q2 VALUES LESS THAN (TO_DATE(2003/07/01,YYYY/MM/DD)), ( SUBPARTITION Y03Q2_CUST1 VALUES (CUST1) TABLESPACE Y03Q2_CUST1_DATA, SUBPARTITION Y03Q2_CUST2 VALUES (CUST2) TABLESPACE Y03Q2_CUST2_DATA,

and so on for all subpartitions


SUBPARTITION Y03Q2_DEF VALUES (DEFAULT) TABLESPACE USER_DATA ), and so on for all the partitions PARTITION DEF VALUES LESS THAN (MAXVALUE), ( SUBPARTITION DEF_CUST1 VALUES (CUST1) TABLESPACE USER_DATA, SUBPARTITION DEF_CUST2 VALUES (CUST2) TABLESPACE USER_DATA,

and so on for all subpartitions


SUBPARTITION DEF_DEF VALUES (DEFAULT) TABLESPACE USER_DATA

)
)

CREATE INDEX IN_TAB1_01 ON TAB1 (COL1) LOCAL NOLOGGING ( PARTITION Y03Q1 ( SUBPARTITION Y03Q1_CUST1 TABLESPACE Y03Q1_CUST1_INDX, SUBPARTITION Y03Q1_CUST2 TABLESPACE Y03Q1_CUST2_INDX, and so on for all subpartitions SUBPARTITION Y03Q1_DEF TABLESPACE USER_DATA ), PARTITION Y03Q2 ( SUBPARTITION Y03Q2_CUST1 TABLESPACE Y03Q2_CUST1_INDX, SUBPARTITION Y03Q2_CUST2 TABLESPACE Y03Q2_CUST2_INDX, and so on for all subpartitions SUBPARTITION Y03Q2_DEF TABLESPACE USER_DATA ), and so on for all the partitions PARTITION DEF ( SUBPARTITION DEF_CUST1 TABLESPACE USER_DATA, SUBPARTITION DEF_CUST2 TABLESPACE USER_DATA, and so on for all subpartitions SUBPARTITION DEF_DEF TABLESPACE USER_DATA ) )

Index DDL

Creating DDLs
Static Part
create table tab1 () DDL to Create Table Variable Part partition y03q1 ( subpartition y03q1_cust1 tablespace )

Constraints
Constraints defined as DISABLE NOVALIDATE RELY

ALTER TABLE ADD CONSTRAINT RELY DISABLE NOVALIDATE;

Constraint
VALIDATE/NOVALIDATE
Table TAB1 (Column: STATUS) Current Values A, I, F Check Constraint: STATUS IN (A,I)

ENABLE/DISABLE
New Value F

RELY

RELY
Reasons To Include Relation Information to the Metadata To Enable Query Rewrite

Summary Tab and View


On DW

Summary Table View


CUST_NAME CLAIM_DATE PROVIDER_ID NUM_CLAIMS NUM_LINES

On Source

SELECT CUST1 AS CUST_NAME, CLAIM_DATE, PROVIDER_ID, COUNT(DISTINCT CLAIM_ID) AS NUM_CLAIMS, COUNT(*) AS NUM_LINES FROM . GROUP BY

Casting
SELECT CAST (CUST_NAME AS VARCHAR2(20)) AS CUST_NAME FROM <viewname> CAST (column_name AS datatype (precision))

cust1

Owned by Cust Schema

Index of Temporary Table

View

INDEX

Filter: Where CLAIM_DATE is in that quarter

Temporary Table

TABLE Summary Table


For Customer Cust1 and Quarter Q1

Massaging Analyzing

DW

cust
Old Sub Partition

View

INDEX

Old Sub Partition

TABLE

ALTER TABLE EXCHANGE SUBPARTITION subpartname WITH TEMPTABLE INCLUDING INDEXES

DW

Technique
Not Using DBMS_MVIEW.REFRESH MV is always STALE

Temp Table
CREATE TABLE T1_Y03Q1_CUST1 TABLESPACE Y03Q1_CUST1_DATA PARALLEL 8 NOLOGGING AS SELECT FROM CUST1.VIEW1@DB1 WHERE CLAIM_DATE >= add_months(trunc(to_date(03','RR'),'YYYY'), 3*(to_number(1')-1)) and batch_date < last_day(add_months(trunc( to_date(03','RR'),'YYYY'), 3*(to_number(1')) - 1 )) + 1

Script
CREATE TABLE T1_Y&&YY.Q&&Q._&&CUST TABLESPACE Y&&YY.Q&&Q._&&CUST._DATA PARALLEL 8 NOLOGGING AS SELECT FROM &&CUST..VIEW1@&&DBLINK WHERE CLAIM_DATE >= ADD_MONTHS(TRUNC(TO_DATE('&&YY','RR'),'YYYY'), 3*(TO_NUMBER('&&Q')-1)) AND BATCH_DATE < LAST_DAY(ADD_MONTHS(TRUNC( TO_DATE('&&YY','RR'),'YYYY'), 3*(TO_NUMBER('&&Q')) -1 )) + 1

External Table
Reason Source is a non-Oracle DB, e.g. DB2 Source is External, no DB Link Allowed Fixed Format vs- Delimited Fixed Format Faster, Easier More Space Delimited Less Space Slower, Slightly More Complex

Massaging
Removing NOT NULL Constraints Making Datatypes Consistent
The CAST operation converts NUMBER(m,n) to NUMBER cast(col1 as number(10,2)) as col1_m COL1 NUMBER(5,2) COL1_M NUMBER

Analyzing
Using DBMS_STATS.GATHER_TABLE_STATS PARALLEL Degree
dbms_stats.gather_table_stats ( ownname => DWOWNER', tabname => '&&TABNAME', estimate_percent => dbms_stats.auto_sample_size, method_opt => 'FOR ALL INDEXED COLUMNS SIZE AUTO', degree => dbms_stats.default_degree, cascade => TRUE );

Mat Views
MVs Created as Tables CREATE TABLE MV_SUMMTAB1 Storage clauses just like the underlying table CREATE MATERIALIZED VIEW MV_SUMMTAB1 ON PREBUILT TABLE AS SELECT
http://www.proligence.com/painless_alter.pdf

Query Rewrite
Table SUM_CLAIMS PROVIDER_ID, STATE, TYPE, TOT_AMT Table MV_SUM_CLAIMS PROVIDER_ID, STATE, SUM(TOT_AMT) TOT_AMT GROUP BY PROVIDER_ID, STATE

SELECT SUM(TOT_AMT) FROM SUM_CLAIMS SELECT SUM(TOT_AMT) FROM MV_SUM_CLAIMS

Query Rewrite
Init.ora Parameters query_rewrite_enabled='TRUE' query_rewrite_integrity='STALE_TOLERATED ENFORCED Rewrite only if guaranteed TRUSTED Uses only if RELY STALE_TOLERATED Even if not RELY

Checking QR
dbms_mview.explain_rewrite ( select cust_name, count(*) from summtab1 group by cust_name );

select message from rewrite_table;


QSM-01033: query rewritten with materialized view, MV_SUMMTAB1 QSM-01101: rollup(s) took place on mv, MV_SUMMTAB1

Design
MV_* subpartitions are on the same tablespace as the parents. Subparts of MV_SUMMTAB1_0? are in the same TS as SUMMTAB1 Subparts of MV_SUMMTAB2_0? in SUMMTAB2

Quarter

MV2

MV1 PARENT
TableSpace1 TableSpace2

MV and Parents
Partition Pruning Partition-wise Joins Partition Independence

Adding Quarters/Customers
Partition
Default Partition VALUES LESS THAN (MAXVALUE)

Subpartition
Default Subpartition VALUES (DEFAULT)

Cust1
Cust2

Cust3
DEF

Cust1
Cust2

Cust3
DEF

Cust1
Cust2

Cust3
Cust4

DEF

alter table split subpartition

Cust1

Cust2
Cust3 DEF

Cust1

Cust2
Cust3 DEF

alter table split partition

Backup/Restore
Backup
ALTER TABLESPACE <TSName> READ ONLY Copy the files to tape/CD.

Restore
Copy the file back into the directory ALTER TABLESPACE <TSName> RECOVER

Archival/Purge
Table SP1 SP2 SP3 SP4
Table4

Table SP1

SP2

SP3

SP4

Table4

Table SP1

SP2

SP3

Table4

Archival/Purge
CREATE TABLE S1_Y<yy>Q<q>_<CustName> TABLESPACE Y<yy>Q<q>_<CustName>_<TSType> AS SELECT * FROM SUMMTAB1 WHERE 1=2 / CREATE INDEXES, CONSTRAINTS, etc. / ALTER TABLE SUMMTAB1 EXCHANGE SUBPARTITION Y<yy>Q<q>_<CustName> WITH TABLE Y<yy>Q<q>_<CustName> INCLUDING INDEXES /

Check TTS
ALTER TABLESPACE Y<yy>Q<q>_<CustName>_<TSType> READ ONLY; DBMS_TTS.TRANSPORT_SET_CHECK ( <DataTS>,<IndexTS>) ; SELECT * FROM TRANSPORT_SET_VIOLATIONS;

Transport TS
Export Parameter File
TRANSPORT_TABLESPACE=y TTS_FULLCHECK=Y FILE=<FileLocation>/exp<TS>.dmp TABLESPACES=(<DataTS>, <IndexTS>)

Copy the exp.dmp and Datafiles to tape/CD.

Purge
Drop Subpartition Drop the Tablespace
DROP TABLESPACE <TSName> INCLUDING CONTENTS AND DATAFILES;

Restore
ALTER TABLE SPLIT SUBPARTITION <DefaultSP> Copy Datafiles & Export Dump Files from CD/Tape Import Parameter File
TRANSPORT_TABLESPACES=Y TABLESPACES=(<DataTS>,<IndexTS>) DATAFILES=()

Minimizing Refresh Unit


Months instead of quarters refreshed at a time. Last Quarter Split into a Subpartition per Month Naming Convention
YyyQqMmm Y03Q3M09

Merge Subpartition

Merging Subpartitions
Index Subpartitions Created in Users Default Tablespace Subpartition Template
ALTER TABLE SUMTAB1 ADD SUBPARTITION TEMPLATE

Resumable Statement
When? Running Large Report Jobs Creating Large Indexes ALTER SESSION ENABLE RESUMABLE NAME Job1; View DBA_RESUMABLE NAME Name specified in ALTER SESSION COORD_SESSION_ID Coord Session in PQ SQL_TEXT The text of the SQL STATUS - RUNNING, SUSPENDED, ABORTED, ABORTING, TIMEOUT ERROR_NUMBER/ERROR_MSG

Objectives Revisited
To Minimize Downtime for Refreshes
Incrementally Refresh Partitioning Techniques

To Add Customers Easily To Add Quarters Easily To Archive Off and Purge Easily and Atomically To Restore Archives Quickly

Oracle 10G
Transportable Tablespaces Can Be Reinstated At a Different Operating System
Can be used for Restoring to a Different OS

Tablespaces Can Be Renamed


Restoring Tablespace of the Same Name

Multiple Temporary Tablespace


For Large Index Creation, Sorting, etc.

Oracle 10G contd.


Partition Change Tracking Support for List Partitioning Query Rewrites Can Use Multiple MVs OEM Shows All Partitioning Features Data Pump
Export/Import on Steroids Parallel Operation

Oracle 10G contd.


External Table Download
A Utility to Create File from Table Data
CREATE TABLE ORGANIZATION EXTERNAL AS SELECT * FROM <a query>

Platform Independent File Can Be Used In External Tables

Thank You!
Updated Copy Can Be Found In

www.proligence.com
arup@proligence.com

Anda mungkin juga menyukai