Dimensional modeling

Dimensional modeling (DM) is the name of a set of techniques and concepts used in
data warehouse design. It is considered to be different from entity-relationship modeling
(ER). Dimensional modeling does not necessarily involve a relational database. The
same modeling approach, at the logical level, can be used for any physical form, such as
a multidimensional database or even flat files. According to Dr. Kimball,[1] DM is a design
technique for databases intended to support end-user queries in a data warehouse. It is
oriented around understandability and performance. According to him, although
transaction-oriented ER is very useful for transaction capture, it should be avoided for
end-user delivery.

Dimensional modeling always uses the concepts of facts (measures) and dimensions
(context). Facts are typically (but not always) numeric values that can be aggregated, and
dimensions are groups of hierarchies and descriptors that define the facts. For example,
sales amount is a fact; timestamp, product, register#, store#, etc. are elements of
dimensions. Dimensional models are built by business process area, e.g. store sales,
inventory, claims, etc. Because the different business process areas share some but not all
dimensions, efficiency in design and operation, as well as consistency, is achieved using
conformed dimensions, i.e. using one copy of the shared dimension across subject areas.
The term "conformed dimensions" was coined by Ralph Kimball.

Dimensional modeling structure:[2][3]

The dimensional model is built on a star-like schema, with dimensions surrounding the
fact table. To build the schema, the following design model is used:

1. Choose the business process
2. Declare the grain
3. Identify the dimensions
4. Identify the facts

CHOOSE THE BUSINESS PROCESS

The process of dimensional modeling builds on a 4-step design method that helps to
ensure the usability of the dimensional model and the use of the data warehouse. The
design starts from the actual business process that the data warehouse should
cover. Therefore the first step is to describe the business process on which the
model builds. This could for instance be a sales situation in a retail store. The
business process can be described in plain text, in basic Business Process
Modeling Notation (BPMN), or with other design guides like the Unified Modeling Language
(UML).

DECLARING THE GRAIN

After describing the business process, the next step in the design is to declare the grain of
the model. The grain is the exact description of what the dimensional model
should be focusing on, that is, what a single fact table row represents. This could for
instance be “An individual line item on a customer slip from a retail store”. To clarify
what the grain means, pick the central process and describe it in one sentence. The grain
(sentence) is what you are going to build your dimensions and fact table from. You might
find it necessary to go back to this step and alter the grain as you learn more about what
your model is supposed to deliver.

IDENTIFY THE DIMENSIONS

The third step in the design process is to define the dimensions of the model. The
dimensions must be defined within the grain from the second step of the 4-step process.
Dimensions are the foundation of the fact table: they supply the context in which each
fact table row is recorded. Typically dimensions are nouns like date, store, inventory, etc.
The dimensions hold the descriptive data. For example, the date dimension could
contain data such as year, month and weekday.

IDENTIFY THE FACTS

After defining the dimensions, the next step in the process is to define the fact table:
its keys refer to the dimensions, and its remaining columns are the numeric facts that
populate each fact table row. This step is closely related to the business users of the
system, since this is where they get access to data stored in the data warehouse. Therefore
most of the fact table columns are numerical, additive figures such as quantity or cost.
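
The four steps above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table names (dim_date, dim_product, dim_store, fact_sales), their columns, and the sample rows are hypothetical and do not come from the source. The grain is the one quoted earlier, one line item on a customer slip.

```python
import sqlite3

# Hypothetical star schema for the retail example; every table name,
# column name and sample row here is an illustrative assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER, weekday TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE dim_store   (store_key INTEGER PRIMARY KEY, store_name TEXT);
-- Grain: one row per line item on a customer slip.
CREATE TABLE fact_sales (
    date_key     INTEGER REFERENCES dim_date(date_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    store_key    INTEGER REFERENCES dim_store(store_key),
    slip_number  INTEGER,
    quantity     INTEGER,   -- additive fact
    sales_amount REAL       -- additive fact
);
""")
conn.execute("INSERT INTO dim_date VALUES (20240105, 2024, 1, 'Friday')")
conn.executemany("INSERT INTO dim_product VALUES (?, ?)", [(1, "Coffee"), (2, "Tea")])
conn.execute("INSERT INTO dim_store VALUES (1, 'Downtown')")
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?, ?)",
    [
        (20240105, 1, 1, 1001, 2, 7.00),
        (20240105, 2, 1, 1001, 1, 2.50),
        (20240105, 1, 1, 1002, 1, 3.50),
    ],
)

# Aggregate the facts along the product dimension.
sales_by_product = conn.execute("""
    SELECT p.product_name, SUM(f.quantity), SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.product_name
    ORDER BY p.product_name
""").fetchall()
print(sales_by_product)
```

Because both facts are additive, the same query shape works for any other dimension by swapping the joined table and the GROUP BY column.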

Dimension Normalization

Dimensional normalization, or snowflaking, removes the redundant attributes that are
found in flat, denormalized dimensions. The dimensions are instead split into
sub-dimensions that are strictly joined together.

Snowflaking has an influence on the data structure that differs from many philosophies of
data warehouses,[3] which favor a single data (fact) table surrounded by multiple
descriptive (dimension) tables.

Developers often don't normalize dimensions, for several reasons:[4]

1. Normalization makes the data structure more complex
2. Performance can be slower, due to the many joins between tables
3. The space savings are minimal
4. Bitmap indexes can't be used
5. Query performance: 3NF databases suffer from performance problems when
aggregating or retrieving the many dimensional values that analysis may require. If
you are only going to do operational reports then you may be able to get by with
3NF, because your operational user will be looking for very fine grain data.

There are some arguments on why normalization can be useful.[3] It can be an advantage
when part of a hierarchy is common to more than one dimension. For example, a
geographic dimension may be reusable because both the customer and supplier
dimensions use it.
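
The shared-geography case can be sketched as follows, again with Python's sqlite3 module. The tables dim_geography, dim_customer and dim_supplier and their contents are hypothetical names chosen for illustration. The sketch shows both sides of the trade-off: the geography rows are stored once and reused, but reading a region now needs an extra join.

```python
import sqlite3

# Hypothetical snowflaked dimensions; all names and rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Shared sub-dimension (the snowflaked part).
CREATE TABLE dim_geography (geo_key INTEGER PRIMARY KEY, country TEXT, region TEXT);
-- Both dimensions point at the same geography rows instead of repeating them.
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, customer_name TEXT,
                           geo_key INTEGER REFERENCES dim_geography(geo_key));
CREATE TABLE dim_supplier (supplier_key INTEGER PRIMARY KEY, supplier_name TEXT,
                           geo_key INTEGER REFERENCES dim_geography(geo_key));
""")
conn.executemany("INSERT INTO dim_geography VALUES (?, ?, ?)",
                 [(1, "Denmark", "Europe"), (2, "Canada", "North America")])
conn.executemany("INSERT INTO dim_customer VALUES (?, ?, ?)",
                 [(10, "Customer A", 1), (11, "Customer B", 2)])
conn.executemany("INSERT INTO dim_supplier VALUES (?, ?, ?)",
                 [(20, "Supplier X", 1)])

# Reading a region costs one extra join compared with a flat
# (denormalized) dimension that stores the region column directly.
customer_regions = conn.execute("""
    SELECT c.customer_name, g.region
    FROM dim_customer AS c
    JOIN dim_geography AS g ON g.geo_key = c.geo_key
    ORDER BY c.customer_name
""").fetchall()
supplier_regions = conn.execute("""
    SELECT s.supplier_name, g.region
    FROM dim_supplier AS s
    JOIN dim_geography AS g ON g.geo_key = s.geo_key
""").fetchall()
print(customer_regions)
print(supplier_regions)
```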

Benefits of dimensional modeling [5]

The benefits of dimensional modeling are the following:

• Understandability - Compared to the normalized model, the dimensional model is
easier to understand and more intuitive. In dimensional models, information is
grouped into coherent business categories or dimensions, which makes it easier to
read and interpret. Simplicity also allows software to navigate databases
efficiently. In normalized models, by contrast, data is divided into many discrete
entities, and even a simple business process might result in dozens of tables that
have to be joined together in a complex way.
• Query performance - Dimensional models are more denormalized and optimized
for data querying, while normalized models seek to eliminate data redundancies
and are optimized for transaction loading and updating. The predictable
framework of a dimensional model allows the database to make strong
assumptions about the data that aid performance. Each dimension is an
equivalent entry point into the fact table, and this symmetrical structure allows
complex queries to be handled effectively. Query optimization for star join
databases is simple, predictable, and controllable.
• Extensibility - Dimensional models are extensible to accommodate unexpected new
data. Existing tables can be changed in place, either by simply adding new data
rows to the table or by executing an SQL ALTER TABLE statement. No queries or other
applications that sit on top of the warehouse need to be reprogrammed to accommodate
the change. Old queries and applications continue to run without yielding different
results. In normalized models, by contrast, each modification should be considered
carefully because of the complex dependencies between database tables.

Facts, Dimensions and Dimension Hierarchy

For every decision support system, it is important that the data model chosen parallels the
business analysts' understanding of the business structure. The model should
hide the technical complexities of an OLTP system (like those in Oracle
HRMS/SCM/Financials or in other enterprise applications) and transform them into an
OLAP model that allows analysts to structure queries in the same intuitive fashion as they
would ask questions.

Dimensional Model
A dimensional model comprises a fact table and many dimension tables and is used for
calculating summarized data. Since Business Intelligence reports are used in measuring
the facts (aggregates) across various dimensions, dimensional data modeling is the preferred
modeling technique in a BI environment.

Measures or facts are typically calculated data like dollar value, Sales or Revenue.
They correspond to the focus of a decision support investigation.

Dimensions define the axes of investigation of a fact. For example, Product, Region and
Time are the axes of investigation of the Sales fact. One such investigation could be a
scenario where the user might want to see the Sales (in dollars) for a particular product in
a given market over a particular period of time. In this case, we are calculating the
fact (Sales) across three dimensions (Product, Region and Time). In simpler terms, I can
further say that dimensions give different views of the facts. They give structure to the
otherwise unstructured facts.
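
A small Python sketch of this idea follows. The rows, the dimension values, and the helper names total_sales and sales_by are made up for illustration. Each fact row carries one value per dimension, and the same Sales measure can be summed for one product, region and period, or viewed along a single axis.

```python
from collections import defaultdict

# Hypothetical Sales fact rows, each tagged with its three dimension values.
sales_facts = [
    {"product": "Laptop", "region": "West", "month": "2024-01", "sales": 1200.0},
    {"product": "Laptop", "region": "West", "month": "2024-02", "sales": 800.0},
    {"product": "Laptop", "region": "East", "month": "2024-01", "sales": 500.0},
    {"product": "Phone",  "region": "West", "month": "2024-01", "sales": 300.0},
]

def total_sales(facts, **filters):
    """Sum the Sales measure over rows matching every given dimension value."""
    return sum(
        row["sales"]
        for row in facts
        if all(row[dim] == value for dim, value in filters.items())
    )

def sales_by(facts, dimension):
    """View the same facts along one axis of investigation."""
    totals = defaultdict(float)
    for row in facts:
        totals[row[dimension]] += row["sales"]
    return dict(totals)

# Sales for one product, in one market, over one period.
print(total_sales(sales_facts, product="Laptop", region="West", month="2024-01"))
# The same facts viewed by region, then by product.
print(sales_by(sales_facts, "region"))
print(sales_by(sales_facts, "product"))
```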

Fact Table

A fact table is a table that contains measures. The measures must be defined in a logical
fact table. Each measure has its own aggregation rule, such as SUM, AVG, MIN or MAX.
Aggregation rules define the way in which the business would like to compare values of a
measure.

In OBIEE, while defining the Business Model and Mapping layer, we can define the
aggregation rule on fact columns/tables. I'll discuss this later, as OBIEE has lots of rules
defined on the use of aggregation rules. The following picture shows how the Sales fact
table is joined in one-to-many relationships with the dimension tables.

Dimension Tables

A business uses facts to measure performance by well-established dimensions. Every
dimension has a set of descriptive attributes. Dimension tables contain attributes that
describe business entities. For example, the Customers dimension can contain attributes
like Region, Subregion, Country, State, Customer.

Dimension Hierarchy

A hierarchy is a set of parent-child relationships between attributes within a dimension.
These hierarchy attributes, called levels, roll up from child to parent. For example,
Customer totals can roll up to Subregion totals, which can further roll up to Region totals.
A better example would be daily sales, which could roll up to weekly sales, which further
roll up to monthly, quarterly and yearly sales. A sample hierarchy in OBIEE is shown below.

There are two database schemas that use the dimensional model: 'Star'
and 'Snowflake'. I'll talk about the 'Star' schema (the preferred schema) in later articles.
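
Returning to the roll-up described above, here is a minimal Python sketch of a time hierarchy. The daily figures and the helper name roll_up are hypothetical. Each level's totals are computed from the level below it. The week level is left out of this sketch because calendar weeks do not nest cleanly inside months.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales totals (the lowest level of the hierarchy).
daily_sales = {
    date(2024, 1, 15): 100.0,
    date(2024, 1, 20): 50.0,
    date(2024, 2, 3): 25.0,
    date(2024, 4, 1): 10.0,
}

def roll_up(child_totals, parent_of):
    """Sum child-level totals into their parent level."""
    parents = defaultdict(float)
    for child, amount in child_totals.items():
        parents[parent_of(child)] += amount
    return dict(parents)

# day -> (year, month) -> (year, quarter) -> year
monthly = roll_up(daily_sales, lambda d: (d.year, d.month))
quarterly = roll_up(monthly, lambda ym: (ym[0], (ym[1] - 1) // 3 + 1))
yearly = roll_up(quarterly, lambda yq: yq[0])

print(monthly)
print(quarterly)
print(yearly)
```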

OBIEE Repository Metadata Development

Process for bringing data into OBIEE

(from remarks by Mike Jelen, BICG, 5/13/2008)

General: Perform an operation, then immediately save and check integrity. This saves a
lot of troubleshooting time.

Dimensions

Physical Layer

• Use File > Import from Server with OCI 10g/11g selected (avoids the need to change
the Connection Pool later)
• TNS Name = Data source name on Connection Pool dialog
o TNS Name = polydata.world

Connection Pool

• Require fully qualified table names on connection pool (needed for proxy account
since it has only read access to tables)

Create Primary Keys

• Verify that primary keys were imported and identified in the tables. If not, add
them.
• Right-click table, select Properties
• Select the Keys tab
• If primary key is not marked or the wrong one is selected, modify the keys
accordingly.

Create the Physical Diagram

• Click the Foreign Join button, then click the Dimension table and drag to the Fact table
• When the fact table contains several DATES, create one alias of the TIME dimension
per date (3 in this example), each with a different alias name, then join each alias to
its own date in the fact table (e.g., withdrawn_date, enrolled_date, dropped_date)
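
The aliasing idea in the last bullet can be illustrated outside OBIEE with plain SQL run through Python's sqlite3 module. This is only an analogy, not the Administration Tool procedure: the tables time_dim and enrollment_fact, the term column, and the sample dates are assumptions made for the sketch. One physical time table is joined three times under three alias names, one per date column.

```python
import sqlite3

# Hypothetical tables; only the three date column names come from the notes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim (date_key TEXT PRIMARY KEY, term TEXT);
CREATE TABLE enrollment_fact (
    student_id     INTEGER,
    enrolled_date  TEXT,
    dropped_date   TEXT,
    withdrawn_date TEXT
);
""")
conn.executemany("INSERT INTO time_dim VALUES (?, ?)", [
    ("2008-01-14", "Winter 2008"),
    ("2008-02-01", "Winter 2008"),
    ("2008-04-07", "Spring 2008"),
])
conn.execute(
    "INSERT INTO enrollment_fact VALUES (1, '2008-01-14', '2008-02-01', '2008-04-07')"
)

# The same dimension table appears three times, once per alias (e, d, w),
# and each alias is joined to a different date column of the fact table.
terms = conn.execute("""
    SELECT e.term, d.term, w.term
    FROM enrollment_fact AS f
    JOIN time_dim AS e ON e.date_key = f.enrolled_date
    JOIN time_dim AS d ON d.date_key = f.dropped_date
    JOIN time_dim AS w ON w.date_key = f.withdrawn_date
""").fetchone()
print(terms)
```

Without separate aliases, a single join could constrain only one of the three dates at a time.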

Business Layer

• Use Complex Joins to join new items created in the Business Layer, if any
• Aggregate items that can be summed
• Use Rename Wizard to rename fields and tables
o One table at a time (in on-line mode)
o Rules must be ordered, or perform one rule at a time
• Group columns with dummy columns, e.g.

--- Program Plan Begin ---

--- Program Plan End ---

o Check "Use existing logical column as the source"
o Enter the number 1 in the formula box

Presentation Layer

• Group columns with subfolders
o Create new Presentation Table as "- [Name]" (that's hyphen space name)
o Creates a subfolder under the parent folder
▪ It is also possible to do this with "->" in the Description, but this is
less desirable because it is not visible in Administrator unless you
are looking at the column's Properties
• For Administrator convenience, use icons to relate items in Presentation and
Business Model layers

Denormalized tables

• Physical layer
o Several subject-related denormalized tables may be pulled in together
o Create an alias of each denormalized table
o In the Physical Diagram, use a simple join from the original table to the
alias
o The alias becomes the "fact" table
• Business Layer
o Delete columns from the "fact" table that are not used for measures (all but
the join column in most cases)
• Presentation Layer
o Delete alias table (assuming it does not contain measures)
o Create different folders (subject areas) for different denormalized tables so
that people don't try using them together. (Answers does not allow this, and the
error message it returns is confusing for the user.)

Repository Documentation Utilities

• BI Administration Tool > Tools menu > Utilities
• For more information, search Help

Rename Wizard

• Rename columns
• Remove underscore character
• Change all column names so that first letter of each word is uppercase and rest is
lowercase.
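
The renaming rules listed above can be mimicked in a few lines of Python. This is not the Rename Wizard itself, only a sketch of the same rule; the function name presentation_name and the sample column names are hypothetical.

```python
def presentation_name(physical_name: str) -> str:
    """Replace underscores with spaces and capitalize each word
    (first letter uppercase, the rest lowercase)."""
    words = physical_name.replace("_", " ").split()
    return " ".join(word.capitalize() for word in words)

print(presentation_name("STUDENT_CLASS_SECTION"))
print(presentation_name("enrolled_date"))
```

Note that a blanket rule like this also lowercases acronyms (for example, GPA becomes Gpa), so such names may need a manual fix afterwards.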

Generating and Deploying a Metadata Dictionary

• Web-based XML file of information in the repository file.

Generating Documentation of Repository Mappings

• Use this tool to document the tables, columns, etc. in the repository file.
• Produces a CSV or tab-delimited file.
• Only columns that are in the Presentation Layer will be included in the report.
• Run the utility after ALL columns are brought to the Presentation Layer and after
running the Rename Wizard.
• Run the utility before fields (e.g., Keys, Extract Date, Maint Date) are deleted from
the Presentation Layer.

Verify Data Type

Before moving tables to the Business Model or Presentation Layer verify that the Data
Type is correct for the columns. OBIEE may define the column as a different data type
than what was intended or used in the Warehouse.

Change the Date Data Type

Most date columns come into OBIEE with a Data Type of DATETIME which produces a
date format that includes the date and the time.

To remove the time from the field change the Data Type to DATE. This change will push
through from the Physical Layer through the Business Model and Presentation Layers.

Dates are brought in as DATETIME.

• DATETIME = 1900/02/25 00:00:00
• DATE = 1900/02/25

Columns defined as Number will come into OBIEE as DOUBLE, which is displayed with two
decimal places. If this is incorrect for the column, select INTEGER.

• DOUBLE = 19000225.00
• INTEGER = 19000225
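
The effect of the two data type changes on displayed values can be reproduced with ordinary Python formatting. This only imitates the display formats listed above; it says nothing about how OBIEE stores or converts the values internally, and the variable names are made up.

```python
from datetime import datetime

# A date column value as it arrives with a time component.
raw = datetime(1900, 2, 25, 0, 0, 0)
as_datetime = raw.strftime("%Y/%m/%d %H:%M:%S")  # DATETIME-style display
as_date = raw.date().strftime("%Y/%m/%d")        # DATE-style display

# A Number column value shown as a double versus as an integer.
number = 19000225.0
as_double = f"{number:.2f}"
as_integer = str(int(number))

print(as_datetime, as_date, as_double, as_integer)
```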

Note: Column Properties which include Style, Column Format, Data Format and
Conditional Format are applied in Answers.

Where are system-wide changes stored?

System-wide changes are stored in the Web Catalog and must be moved from
development to production servers along with requests and dashboards. These files are
stored in folders separate from requests and dashboards.

Create Hierarchies

1. Right-click dimension table and select Create Dimension at the bottom of the shortcut
menu.
2. A Hierarchy is created using the name of the dimension table followed by the word
Dim.
3. A Grand Total level and Detail level will be created. The Detail level will contain all
of the columns of the dimension.
4. Create child levels using names that match your hierarchy. Right-click the Detail
level, select New Object, Child level.
5. Create a child level below the lowest child level (this will contain the dimension key.)
Move the dimension key to this level. Right-click this key column and remove the
checkmark next to Use for drilldown.
6. Move other columns to their respective levels.
7. Delete any unneeded columns from any level.
8. Select columns within each level, right-click and select New Logical Level Key.
Select all columns as keys if you want to see them in drilldown.
9. For any levels that should be skipped in the drill-down, select all columns in the level,
right-click and remove the checkmark next to Use for drilldown.
10. At level two check for a Detail key that was automatically created. (Right-click level
two, Properties, click the Keys tab). Highlight and delete Detail Key.

Hierarchy Aggregation

Work through the hierarchy from the top down. At each level below the Grand
Total level, double-click the level. In the field Number of elements at this level: enter
10 for the first level, and increase the value in increments of 10 for each level further
down the hierarchy.

Hierarchy Errors

OBI Tool Metadata Errors and Solutions


[nQSError: 15001] Could not load navigation space for subject area Student Enrollment.

[nQSError: 15019] Table Student Class Section is functionally dependent upon level
Institution Code, but a more detailed child level has associated columns from that same
table or a more detailed table.

This message occurs when either a key is in the Total level or when the first child level
below the Total level contains two keys. Right-click on the Total level and select
Properties. The Keys tab should be grayed out. If it is selectable, check to see what is
there and delete it. Go to the next level down, right-click, select the Keys tab. Only one
Key should appear. Delete the key that does not belong.

http://www.oracle.com/technology/products/bi/index.html

Building Hierarchies With Columns In Different Tables

This first step is needed if columns for hierarchy reside in multiple dimension tables
or a dimension table and fact table or because the data model contains only one
dimension table and no fact table.

1. Create a new or second dimension table from an already existing table


2. Create an alias of the table in the physical layer.
3. Right-click on table in Physical layer, select New object > Alias.
4. Join the new dimension in the Physical diagram.
5. Select the data model, right-click Physical Diagram > Object(s) and All Joins.
6. Click the New Foreign Join button. Click the new dimension table, then click the Fact
table.
7. If you are joining a new dimension table to itself then create joins between all of the
keys in both tables.
8. Click the first key in the table on the left. Find the matching key in the table on the
right. Hold down the CTRL key and click the next key on the left and the matching
key on the right. Continue until all keys are identified in the Expression box at the
bottom of the Physical Foreign Key window. Click the OK button.
9. A message “a matching table key doesn’t exist…” will appear. Click Yes.
10. Check in changes.
11. Drag the new dimension (Alias) table from the Physical Layer to the Business Model.
12. Select the data model, right-click Business Model Diagram > Whole Diagram.
13. Use the Complex Join button between the two tables. No changes are made on the
Logical Join window. Click OK. Close the Diagram Window.
14. The new dimension table now has a dimension table icon.
15. Check in changes. Save repository.
16. Clean up the tables:

• Remove all “Fact” columns from the new dimension table.
• Remove all dimension columns that don’t belong in the “Fact” table.

Assuming an item in the newly created dimension A needs to be used for a
hierarchy in a different dimension B:

Select items from physical layer table corresponding to A and drag to the B dimension in
the Business Model.

Expand Sources for B table (click + ) then double-click on the B table source.

Click the Add button on the General tab.

In the box below name click the B table then click the Select button. This creates a join
between those two dimensions.

Then move Items that you want in the B dimension from the Physical layer to the A
dimension Source name.
This forces a logical join from the physical join between the dimension table and Fact
table.

Changes Applied Through Answers

Formatting Data Values

Column Properties, which include Style, Column Format, Data Format and Conditional
Format, are applied in Answers. When a certain format is desired for a column every time
the column is used, system-wide changes can be applied. System-wide changes can
be saved at the column level or the data type level.

A perfect example for a system-wide change is the GPA (Grade Point Average). The
field is defined with a Data Type of DOUBLE which gives you two decimal places but
you need three decimal places in Answers and Dashboards.

1. Select the GPA column in Answers.
2. Select the Column Properties button.
3. Select the Data Format tab.
4. Click the Override Default Data Format checkbox.
5. Change the number of Decimal Places.
6. Click the Save button and choose the option as the system-wide default for "Student.GPA".

Request Filters

Protect Filters Passed through Dashboard

On the filter in the Answers Request, select Filter Options button > Protect Filter.

This ensures that the filter used in the request is not lost or overwritten by another filter
or dashboard prompt that may supersede the request.

This option is only available if a value has been specified in the filter. If the filter item is
set to "is prompted", the Protect Filter option is not available.
