DATA WAREHOUSE

BUILDING BLOCKS

What is a Data Warehouse?


A data warehouse is a subject-oriented, integrated, non-volatile, and time-variant collection of data in support of management's decisions. Data warehousing is a decision support system. It has the following characteristics:

1. A central database that is loaded from multiple operational databases for the purpose of end-user access and decision support.

2. A data warehouse differs from an operational system in that the data it contains is normally static and updated in a scheduled manner through massive loading procedures.

3. A data warehouse is developed to accommodate random, ad hoc queries and to allow users to drill down to minute levels of detail.

Definition
Bill Inmon defines a central data warehouse as a database that is:

1. Subject Oriented
The warehouse is organized around the major subjects of the enterprise (e.g. customers, products, and sales) rather than the major application areas (e.g. customer invoicing, stock control, and product sales). This is reflected in the need to store decision-support data rather than application-oriented data.

2. Integrated

The data warehouse integrates corporate application-oriented data from different source systems, which often includes data that is inconsistent.

The integrated data source must be made consistent to present a unified view of the data to the users.
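
As a minimal sketch of what making integrated data consistent can involve, the Python snippet below maps two source-specific encodings of the same attribute onto a single warehouse convention. The source system names (`billing`, `crm`), the field names, and the code tables are assumptions made for illustration; they do not come from the original material.

```python
# Hypothetical example: two source systems encode gender differently.
# The warehouse standardizes on "M" / "F" / "U" (unknown).
GENDER_MAPS = {
    "billing": {"0": "M", "1": "F"},        # assumed numeric codes
    "crm": {"male": "M", "female": "F"},    # assumed text labels
}


def unify_gender(source: str, raw_value: str) -> str:
    """Translate a source-specific code into the warehouse convention."""
    mapping = GENDER_MAPS[source]
    # Unrecognized values fall back to "U" instead of failing the load.
    return mapping.get(raw_value.strip().lower(), "U")


def integrate(records):
    """Return records with a consistent 'gender' field."""
    return [
        {
            "customer_id": r["customer_id"],
            "source": r["source"],
            "gender": unify_gender(r["source"], r["gender"]),
        }
        for r in records
    ]


rows = [
    {"customer_id": 1, "source": "billing", "gender": "1"},
    {"customer_id": 2, "source": "crm", "gender": "Male"},
    {"customer_id": 3, "source": "crm", "gender": "n/a"},
]
unified = integrate(rows)
```

After integration, every record uses the same code set regardless of which source it came from, which is the unified view the users need.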

3. Time Variant
Data in the warehouse is only accurate and valid at some point in time or over some time interval.

Time-variance is also shown in the extended time that the data is held, the implicit or explicit association of time with all data, and the fact that the data represents a series of snapshots.

4. Non-Volatile Data
Data in the warehouse is not updated in real-time but is refreshed from operational systems on a regular basis. New data is always added as a supplement to the database, rather than a replacement.
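
The time-variant and non-volatile properties can be illustrated together with a small append-only store. This is only a sketch under assumed names (`SnapshotTable`, `load`, `as_of` are hypothetical), not a description of any particular warehouse product: each scheduled load adds rows stamped with a snapshot date, and earlier rows are never overwritten.

```python
from datetime import date


class SnapshotTable:
    """Append-only store: each load adds rows stamped with a snapshot date."""

    def __init__(self):
        self._rows = []

    def load(self, snapshot_date: date, records):
        # New data supplements what is already stored; nothing is replaced.
        for rec in records:
            self._rows.append({"snapshot_date": snapshot_date, **rec})

    def as_of(self, when: date):
        """Return the most recent snapshot taken on or before `when`."""
        dates = [r["snapshot_date"] for r in self._rows
                 if r["snapshot_date"] <= when]
        if not dates:
            return []
        latest = max(dates)
        return [r for r in self._rows if r["snapshot_date"] == latest]

    def __len__(self):
        return len(self._rows)


table = SnapshotTable()
table.load(date(2024, 1, 31), [{"product": "A", "stock": 10}])
table.load(date(2024, 2, 29), [{"product": "A", "stock": 7}])
```

Because both loads are retained, a query can ask what the data looked like at any earlier point, for example `table.as_of(date(2024, 2, 15))`.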

5. Data Granularity
Data in the warehouse is summarized at different levels. Granularity levels are based on the data types and the expected system performance for queries.
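
To make the idea of granularity levels concrete, the following sketch rolls the same detail rows up to day, month, or year level. The sample data and the `roll_up` helper are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical detail-level rows: (date as "YYYY-MM-DD", product, amount)
daily_sales = [
    ("2024-01-05", "A", 100.0),
    ("2024-01-20", "A", 50.0),
    ("2024-02-02", "A", 70.0),
    ("2024-01-11", "B", 30.0),
]


def roll_up(rows, level):
    """Summarize amounts by (period, product); level is 'day', 'month', or 'year'."""
    # Number of leading characters of the ISO date that identify the period.
    width = {"day": 10, "month": 7, "year": 4}[level]
    totals = defaultdict(float)
    for day, product, amount in rows:
        totals[(day[:width], product)] += amount
    return dict(totals)


monthly = roll_up(daily_sales, "month")
yearly = roll_up(daily_sales, "year")
```

A coarser level means fewer rows to scan per query, at the cost of losing detail; that is the trade-off behind choosing granularity by expected query performance.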

The Benefits of Data Warehouse


Enable workers to make better and wiser decisions
A data warehouse is specifically developed to give users the ability to explore data in an unlimited number of ways, accommodating essentially any query a manager could dream up and providing access to the data sources behind the results. For example, information gleaned from a data warehouse can lead to changes in pricing.

Identify hidden business opportunities
A data warehouse performs a second, very valuable function by searching data for trends and abnormalities that users may not know to look for. Examples include helping companies spot sales trends and detect erroneous or fraudulent billings.

Bending with the customer
A data warehouse can help companies understand who their customers are and which services they use. For example, by collecting and analyzing click stream data from an internet portal, companies can build extensive user profiles and boost profits through their sales channels.

Precision Marketing
A data warehouse can aid in detecting segments of the marketplace (geographically and demographically) which remain untapped, and help show the best way to reach out to these potential customers (rapid response to market and technology trends).

Data Warehouse and Data Marts


Before deciding to build a data warehouse for an organization, you need to ask the following basic and fundamental questions and address the relevant issues:
Top-down or bottom-up approach?
Enterprise-wide or departmental?
Which first: data warehouse or data mart?
Build pilot or go with a full-fledged implementation?
Dependent or independent data marts?

Data Warehouse vs. Data Mart in Terms of Data Granularity

Data Warehouse
Corporate/Enterprise-wide
Union of all data marts
Data received from staging area
Queries on presentation source
Structure for corporate view of data
Organized on E-R Model

Data Mart
Departmental
A single business process
Star-join (facts & dimensions)
Technology optimal for data access and analysis
Structure to suit the departmental view of data

Top-Down Approach

Advantages
1. A truly corporate effort, an enterprise view of data
2. Inherently architected, not a union of disparate data marts
3. Single, central storage of data about the content
4. Centralized rules and control
5. May see quick results if implemented with iterations

Disadvantages
1. Takes longer to build even with an iterative method
2. High exposure/risk to failure
3. Needs high level of cross-functional skills
4. High outlay without proof of concept

Bottom-Up Approach

Advantages
1. Faster and easier implementation of manageable pieces
2. Favorable return on investment and proof of concept
3. Less risk of failure
4. Inherently incremental; can schedule important data marts first
5. Allows project team to learn and grow

Disadvantages
1. Each data mart has its own narrow view of data
2. Permeates redundant data in every data mart
3. Perpetuates inconsistent and irreconcilable data
4. Proliferates unmanageable interfaces

Typical Architecture of A Data warehouse

1. Source Data Component Operational Data Store

An operational data store (ODS) provides the basis for operational processing and may be used to feed the data warehouse. It consists of the following:
Production data
Internal data
Archived data
External data

2. Data Staging Component


Three major functions need to be performed for getting the data ready. You have to extract the data, transform the data, and then load the data into the data warehouse storage.

Data staging provides a place and an area with a set of functions to clean, change, combine, convert, deduplicate, and prepare source data for storage and use in the data warehouse.
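
The three staging functions described above can be sketched as three small Python functions chained together. This is a toy illustration under assumed field names (`order_id`, `customer`, `amount`) and an in-memory list standing in for warehouse storage; a real staging area would use dedicated tools and persistent storage.

```python
def extract(source_rows):
    """Extract: copy raw rows out of a (hypothetical) operational source."""
    return [dict(row) for row in source_rows]


def transform(rows):
    """Transform: clean names, convert amounts to numbers, drop duplicate orders."""
    seen_ids = set()
    result = []
    for row in rows:
        if row["order_id"] in seen_ids:
            continue  # deduplicate on the assumed business key
        seen_ids.add(row["order_id"])
        result.append({
            "order_id": row["order_id"],
            "customer": row["customer"].strip().title(),  # clean
            "amount": float(row["amount"]),                # convert text to number
        })
    return result


def load(warehouse, rows):
    """Load: append the prepared rows to warehouse storage."""
    warehouse.extend(rows)
    return len(rows)


raw_orders = [
    {"order_id": 1, "customer": "  alice smith ", "amount": "19.5"},
    {"order_id": 1, "customer": "Alice Smith", "amount": "19.5"},  # duplicate
    {"order_id": 2, "customer": "BOB", "amount": "5"},
]
warehouse_orders = []
loaded = load(warehouse_orders, transform(extract(raw_orders)))
```

Keeping the three steps as separate functions mirrors the separation of concerns in the staging component: the source systems are only touched by `extract`, and the warehouse only by `load`.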

3. Data Storage Component


The foundation of the warehouse consists of detailed data at its most basic level. This component stores all the detailed data in the database schema. In most cases, the detailed data is not stored online but is aggregated to the next level of detail. On a regular basis, detailed data is added to the warehouse to supplement the aggregated data.

4. Information Delivery Component

What's Metadata?

The data warehouse provides a means for implementing an effective decision support environment by building on existing data from disparate sources scattered all over an organization. Metadata (the meta model) can be compared to an information directory, containing the yellow pages and a road map for navigating a data warehouse.

Types of Metadata
Extraction and Transformation Metadata: in the extraction and loading process, metadata is used to map data sources to a common view of information within the warehouse.
Operational Metadata: in the warehouse management process, metadata is used to automate the production of summary tables.
End-User Metadata: in the query management process, metadata is used to direct a query to the most appropriate data source.
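
As one possible reading of "directing a query to the most appropriate data source", the sketch below keeps a small metadata catalog that records the grain and size of each table, then picks the smallest table that is still detailed enough for the request. The catalog contents, table names, and the `route_query` function are all hypothetical.

```python
# Hypothetical metadata catalog: each entry describes one warehouse table.
GRAIN_ORDER = ["day", "month", "year"]  # finest to coarsest

CATALOG = [
    {"table": "sales_detail", "grain": "day", "rows": 5_000_000},
    {"table": "sales_monthly", "grain": "month", "rows": 60_000},
    {"table": "sales_yearly", "grain": "year", "rows": 5_000},
]


def route_query(requested_grain: str) -> str:
    """Pick the smallest table whose grain is fine enough for the request."""
    needed = GRAIN_ORDER.index(requested_grain)
    # A table qualifies if its grain is at least as fine as what was requested.
    candidates = [t for t in CATALOG
                  if GRAIN_ORDER.index(t["grain"]) <= needed]
    return min(candidates, key=lambda t: t["rows"])["table"]
```

For example, `route_query("month")` returns `"sales_monthly"`: the detail table could also answer the question, but the metadata shows that the summary table is far smaller.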

Special Significance of Metadata


Why is metadata especially important in a data warehouse?
1. It acts as the glue that connects all parts of the data warehouse.
2. It provides information about the contents and structures to the developers.
3. It opens the door to the end-users and makes the contents recognizable in their own terms.

THANK YOU
