3/15/12
Agenda
Goals of Data Warehouse
Components of Data Warehouse
Dimensional Modeling
Case Study: Retail Business
Designing the Dimensional Model
Dimensional Table Attributes
o Date Dimension
o Product Dimension
o Store Dimension
The data warehouse must be a secure bastion that protects our information assets
The data warehouse must serve as the foundation for improved decision making
Dimensional Modeling
Fact Table
Dimension Table
Discretely valued description that is more or less constant and participates in constraints
It implements the user interface to the data warehouse
Simplicity and symmetry
Highly recognizable to business users
High performance benefits
Retail Business
100 grocery stores spread over a five-state area
Each store has a full complement of departments, including grocery, frozen foods, dairy, meat and health/beauty aids
Each store has roughly 60,000 SKUs on its shelves
About 55,000 of the SKUs come from outside manufacturers and have Universal Product Codes (UPCs) imprinted on the product package
The remaining 5,000 SKUs come from
Retail Business
Data collection happens at:
o Cash registers (POS systems)
o The back door, where vendors make deliveries
Step 2. Declare the Grain
Granularity, atomic data
o Provides maximum flexibility
o Can support all possibilities of user requests
The most granular data is an individual line item on a POS transaction
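To illustrate why the line-item grain keeps every roll-up possible, here is a minimal sketch using Python's built-in sqlite3 module. The table name pos_sales_fact, its columns, and the sample rows are assumptions made for illustration, not the case study's actual schema.

```python
import sqlite3

# Hypothetical, simplified fact table: one row per line item on a POS transaction.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE pos_sales_fact (
        date_key               INTEGER,
        product_key            INTEGER,
        store_key              INTEGER,
        pos_transaction_number TEXT,
        sales_quantity         INTEGER,
        sales_dollar_amount    REAL
    )
""")
conn.executemany(
    "INSERT INTO pos_sales_fact VALUES (?, ?, ?, ?, ?, ?)",
    [
        (20120315, 1, 10, "T001", 2, 5.00),
        (20120315, 2, 10, "T001", 1, 3.50),
        (20120315, 1, 10, "T002", 1, 2.50),
        (20120316, 1, 11, "T003", 4, 10.00),
    ],
)

def quantity_by(column):
    """Roll the atomic line items up to any single key column."""
    allowed = {"date_key", "product_key", "store_key", "pos_transaction_number"}
    if column not in allowed:
        raise ValueError(f"unknown column: {column}")
    sql = (
        f"SELECT {column}, SUM(sales_quantity) FROM pos_sales_fact "
        f"GROUP BY {column} ORDER BY {column}"
    )
    return conn.execute(sql).fetchall()

daily = quantity_by("date_key")          # totals per day
by_product = quantity_by("product_key")  # totals per product
```

Because nothing was summarized before loading, the same rows answer questions by day, by product, by store, or by transaction.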
Date Dimension
It is present in every data mart as a data
Date Dimension
Unlike other dimension tables, the date dimension can be built in advance
Attributes of date dimension:
o Day Number
o Month Number
o Holiday Indicator
o Weekday Indicator
o Selling Season
Date dimension table columns: Date Key (PK), Date Description, Full Date, Day of Week, Day Number in Epoch, Week Number in Epoch, Month Number in Epoch, Day Number in Calendar Month, Day Number in Calendar Year
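Since the calendar is known ahead of time, the date dimension rows can be generated before any sales data arrives. The sketch below shows one possible way to do this in Python. The function name build_date_dimension, the attribute names, and the holiday set are illustrative assumptions, and only a subset of the attributes listed above is produced.

```python
from datetime import date, timedelta

# Explicit names avoid any dependence on the system locale.
DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")

def build_date_dimension(start, end, holidays=frozenset()):
    """Return one dictionary per calendar day from start to end inclusive."""
    rows = []
    day = start
    key = 1  # sequential integer key, assigned in calendar order
    while day <= end:
        rows.append({
            "date_key": key,
            "full_date": day.isoformat(),
            "day_of_week": DAY_NAMES[day.weekday()],
            "day_number_in_calendar_month": day.day,
            "day_number_in_calendar_year": day.timetuple().tm_yday,
            "month_number": day.month,
            "weekday_indicator": "Weekday" if day.weekday() < 5 else "Weekend",
            "holiday_indicator": "Holiday" if day in holidays else "Non-Holiday",
        })
        day += timedelta(days=1)
        key += 1
    return rows

# Example: a full year, with one assumed holiday.
date_dim = build_date_dimension(
    date(2012, 1, 1), date(2012, 12, 31), holidays={date(2012, 12, 25)}
)
```

Attributes such as selling season or fiscal period would come from business rules and could be added to each row in the same way.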
The average business user is not versed in SQL date semantics, so he or she would be unable to directly leverage inherent capabilities associated with a date data type
SQL date functions do not support filtering by attributes such as weekdays versus weekends, holidays, fiscal periods, seasons, or major events
Presuming that the business needs to slice data by these nonstandard date attributes, an explicit date dimension table is essential
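As a rough illustration of slicing by a business attribute of the date, the sketch below joins a small fact table to a date dimension and filters on indicator columns. All table names, column names, and rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE date_dim (
        date_key          INTEGER PRIMARY KEY,
        full_date         TEXT,
        weekday_indicator TEXT,
        holiday_indicator TEXT
    );
    CREATE TABLE sales_fact (date_key INTEGER, sales_quantity INTEGER);
    -- Illustrative rows only.
    INSERT INTO date_dim VALUES
        (1, '2012-12-24', 'Weekday', 'Non-Holiday'),
        (2, '2012-12-25', 'Weekday', 'Holiday'),
        (3, '2012-12-29', 'Weekend', 'Non-Holiday');
    INSERT INTO sales_fact VALUES (1, 10), (2, 4), (2, 6), (3, 7);
""")

def quantity_where(attribute, value):
    """Total sales quantity for dates whose dimension attribute equals value."""
    allowed = {"weekday_indicator", "holiday_indicator"}
    if attribute not in allowed:
        raise ValueError(f"unknown attribute: {attribute}")
    sql = (
        "SELECT COALESCE(SUM(f.sales_quantity), 0) "
        "FROM sales_fact f JOIN date_dim d ON d.date_key = f.date_key "
        f"WHERE d.{attribute} = ?"
    )
    return conn.execute(sql, (value,)).fetchone()[0]

holiday_quantity = quantity_where("holiday_indicator", "Holiday")
weekend_quantity = quantity_where("weekday_indicator", "Weekend")
```

The filter is a plain equality test on a descriptive column, so the user needs no knowledge of date arithmetic.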
Product Dimension
The product dimension describes every SKU
Product master files are maintained at headquarters, and a subset of the file is downloaded to each store's POS system at frequent intervals
The product master holds many descriptive attributes of each SKU
The merchandise hierarchy is an important
Attributes in the product dimension table that are not part of the merchandise hierarchy can be combined in a query with a constraint on a merchandise hierarchy attribute
The product dimension is one of the two or three primary dimensions in nearly every data mart
Department      Brand         Sales Quantity
Bakery          Baked Well
Bakery          Fluffy
Bakery          Light
Frozen Foods    QuickFreeze
Frozen Foods    Freshlike
Frozen Foods    Frigid
Frozen Foods    Icy
Frozen Foods    QuickFreeze
Frozen Foods    Freshlike
A robust and complete set of dimension attributes translates into user capabilities for robust and complete analysis
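Drilling down by department and then by brand can be pictured as adding one more dimension attribute to the grouping. The following is a minimal sketch under assumed table names, column names, and quantities. Only the department and brand names come from the table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_dim (
        product_key INTEGER PRIMARY KEY,
        department  TEXT,
        brand       TEXT
    );
    CREATE TABLE sales_fact (product_key INTEGER, sales_quantity INTEGER);
    -- Quantities are invented for the example.
    INSERT INTO product_dim VALUES
        (1, 'Bakery', 'Baked Well'),
        (2, 'Bakery', 'Fluffy'),
        (3, 'Frozen Foods', 'Freshlike'),
        (4, 'Frozen Foods', 'Icy');
    INSERT INTO sales_fact VALUES (1, 5), (2, 3), (3, 4), (3, 2), (4, 1);
""")

def sales_quantity_by(*attributes):
    """Group sales by the given product attributes; more attributes means a deeper drill-down."""
    allowed = {"department", "brand"}
    if not attributes or not set(attributes) <= allowed:
        raise ValueError("attributes must be chosen from: department, brand")
    cols = ", ".join(f"p.{a}" for a in attributes)
    sql = (
        f"SELECT {cols}, SUM(f.sales_quantity) "
        "FROM sales_fact f JOIN product_dim p ON p.product_key = f.product_key "
        f"GROUP BY {cols} ORDER BY {cols}"
    )
    return conn.execute(sql).fetchall()

by_department = sales_quantity_by("department")
by_department_and_brand = sales_quantity_by("department", "brand")
```

Any other descriptive attribute in the dimension, such as package type, could be added to the allowed set and used in the same way.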
Store Dimension
The store dimension describes every store in our grocery chain
It is the primary geographic dimension in our case study
Each store can be thought of as a location. As a result, we can roll stores up to any geographic attribute, such as ZIP code or county
Store dimension attributes:
o Store Key (PK)
o Store Name
o Store Number (Natural Key)
o Store Street Address
o Store City
o Store County
o Store State
o Store Zip Code
o Store Manager
o Store District
o Store Region
o Floor Plan Type
o Photo Processing Type
o Financial Service Type
o Selling Square Footage
o Total Square Footage
o First Open Date
o Last Remodel Date
o and more
Promotion Dimension
It describes the promotion conditions under which a product was sold
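Promotions may transfer sales from regularly priced products to temporarily reduced-priced products
Promoted products may show an increase in sales, but sales decrease in nearby products on the shelf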
The tradeoffs
o Because the promotion conditions are highly correlated, the combined single dimension is not much larger than any one of the separated dimensions would be
o The combined single dimension can be browsed efficiently, but it only shows the possible combinations. Browsing in the dimension table does not reveal which stores or products were affected by the promotion. This information is found in the fact table
The promotion coverage fact table holds one row for each product on promotion in a store each day, regardless of whether the product sold or not.
It is a factless fact table as it has no measurement metrics; it merely captures the relationship between the involved keys
Determining what was on promotion but didn't sell requires a two-step process
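The two-step process can be sketched as two queries followed by a set difference: first find what was on promotion, then find what sold, and keep the products that appear only in the first result. The table names, columns, and rows below are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Factless fact table: keys only, no measurements.
    CREATE TABLE promotion_coverage_fact (
        date_key INTEGER, product_key INTEGER,
        store_key INTEGER, promotion_key INTEGER
    );
    CREATE TABLE sales_fact (
        date_key INTEGER, product_key INTEGER,
        store_key INTEGER, promotion_key INTEGER,
        sales_quantity INTEGER
    );
    INSERT INTO promotion_coverage_fact VALUES
        (1, 100, 10, 5), (1, 101, 10, 5), (1, 102, 10, 5);
    INSERT INTO sales_fact VALUES
        (1, 100, 10, 5, 3), (1, 102, 10, 5, 1);
""")

def promoted_but_unsold(date_key, store_key):
    """Products on promotion in a store on a day that recorded no sales."""
    params = (date_key, store_key)
    # Step 1: products on promotion, from the coverage table.
    on_promotion = {row[0] for row in conn.execute(
        "SELECT product_key FROM promotion_coverage_fact "
        "WHERE date_key = ? AND store_key = ?", params)}
    # Step 2: products that actually sold, from the sales fact table.
    sold = {row[0] for row in conn.execute(
        "SELECT DISTINCT product_key FROM sales_fact "
        "WHERE date_key = ? AND store_key = ?", params)}
    return sorted(on_promotion - sold)

unsold = promoted_but_unsold(1, 10)
```

The same difference could also be expressed in a single SQL statement, for example with EXCEPT. The two explicit steps are kept here to mirror the description.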
The POS transaction number is the key to the transaction header record, containing all the information valid for the transaction as a whole, such as the transaction date and store identifier
This header information is already extracted into other dimensions
The transaction number is still useful, as it serves as the grouping key for pulling together all the products purchased in a single transaction
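The grouping role of the transaction number can be shown with a short query. In this sketch the number is stored directly in the fact table with no dimension table of its own, which is what Kimball calls a degenerate dimension. The table layout and rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pos_sales_fact (
        date_key               INTEGER,
        product_key            INTEGER,
        store_key              INTEGER,
        pos_transaction_number TEXT,   -- kept in the fact table itself
        sales_quantity         INTEGER
    );
    INSERT INTO pos_sales_fact VALUES
        (1, 100, 10, 'T001', 2),
        (1, 101, 10, 'T001', 1),
        (1, 100, 10, 'T002', 3);
""")

def transaction_summary():
    """Line-item count and total quantity for each POS transaction."""
    return conn.execute(
        "SELECT pos_transaction_number, COUNT(*), SUM(sales_quantity) "
        "FROM pos_sales_fact "
        "GROUP BY pos_transaction_number "
        "ORDER BY pos_transaction_number"
    ).fetchall()

summary = transaction_summary()
```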
Retail Schema
A frequent shopper dimension table is created, and another foreign key is added to the fact table, to see the exact purchases of frequent shoppers on a weekly basis
Retail Schema
The original schema gracefully extends to accommodate these new dimensions largely because we chose to model the POS transaction data at its most granular level
Dimension Normalization
Redundant attributes are removed from the flat, denormalized dimension table and placed in normalized secondary dimension tables
Perceived benefits of Dimension Normalization
o This design saves space as we're only storing cryptic codes
o The normalized design for the dimension tables is easier to maintain
However:
o The multitude of snowflaked tables makes for a much more complex presentation
Surrogate Key
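Surrogate keys are integers that are assigned sequentially as needed to populate a dimension
Surrogate keys are used in dimensional models rather than relying on operational production codes
Reasons to avoid the operational code:
o Intelligence should not be embedded in the data warehouse keys because any assumptions that we make eventually may be invalidated
o Queries and data access applications should not have any built-in dependency on the keys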
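A minimal sketch of surrogate key assignment follows. It maps each operational code to a meaningless sequential integer and reuses the existing integer when a code has been seen before. The function name assign_surrogate_keys and the sample SKU codes are hypothetical.

```python
def assign_surrogate_keys(operational_codes, key_map=None):
    """Return a mapping from operational code to a sequential integer key.

    Codes already present in key_map keep their existing key; new codes
    receive the next unused integer. The integers carry no meaning.
    """
    key_map = dict(key_map or {})
    next_key = max(key_map.values(), default=0) + 1
    for code in operational_codes:
        if code not in key_map:
            key_map[code] = next_key
            next_key += 1
    return key_map

# First load: duplicates of the same code share one key.
keys = assign_surrogate_keys(["SKU-0042", "SKU-0007", "SKU-0042"])
# Later load: existing keys are preserved, the new code gets the next integer.
keys = assign_surrogate_keys(["SKU-0100", "SKU-0007"], keys)
```

Because the warehouse joins only on the integers, a change in the format of the operational codes would affect this lookup and not the fact table.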
Market basket analysis (MBA) gives the retailer insights about how to merchandise various combinations of items
The retail sales fact table cannot be used easily to perform MBA, as SQL was never designed to constrain and group across line item fact rows
Data mining tools and some OLAP products can assist with market basket analysis. However, in the absence of these tools, a more direct approach is used
o The market basket fact table is a periodic snapshot representing the pairs of products purchased together during a specified time period
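One way to picture how the rows of a market basket fact table might be derived is to group line items by transaction and count each pair of products that appears together. The sketch below does this in plain Python. The function name, the sample line items, and the choice to count baskets (not quantities or dollars) are assumptions.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical line items for one period: (transaction number, product).
line_items = [
    ("T001", "Bread"), ("T001", "Milk"), ("T001", "Eggs"),
    ("T002", "Bread"), ("T002", "Milk"),
    ("T003", "Milk"),
]

def market_basket_counts(items):
    """Count the baskets in which each unordered pair of products appears together."""
    baskets = defaultdict(set)
    for transaction, product in items:
        baskets[transaction].add(product)
    pair_counts = Counter()
    for products in baskets.values():
        # Sorting makes (Bread, Milk) and (Milk, Bread) the same pair.
        for pair in combinations(sorted(products), 2):
            pair_counts[pair] += 1
    return pair_counts

pair_counts = market_basket_counts(line_items)
```

Each entry of pair_counts would correspond to one row of the periodic snapshot, keyed by the two products and the time period.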
Thank You
References
The Data Warehouse Toolkit, Ralph Kimball
Wikipedia.org