Students: Nguyen Ngoc Quang, Bui Nguyen Thang
Instructor: Dr Nguyen Quynh Chi
Table of Contents
1. Introduction
1.1. Objective
1.2. Scope
2. Business Requirement
3. Functional Specification
3.1. Building Data warehouse phase
a) Input
b) Output
3.2. Generating the OLAP report phase
4. Data Warehousing Design
4.1. Translate each ER model into EER model
a) Headquarter Database
b) Sales Databases
4.2. Integrate these 2 models into final model
4.3. Star schema design
a) Design Star schema process
b) Depicting Star schema
5. Data cube Implementation
6. OLAP Report
7. Data Verification
8. Conclusion
1. Introduction
1.1. Objective
This project's objective is to design and implement a data warehouse for a customer order processing system in a company, using MS SQL Server and Oracle.
1.2. Scope
The target of our data warehouse system is an enterprise that consists of a number of stores located in different cities and states. Each store holds a variety of items in various quantities. In addition, the enterprise keeps information about its customers. There are two kinds of customers: walk-in customers, who are led by a tourism guide, and mail-order customers, who are recorded with a post address. The existing system stores the city in which each customer lives, together with the date of the customer's first order.

Each customer lives in one city only, and the enterprise tries to fill the customer's order from the stock currently held in the city where the customer lives. Each customer order can be for any quantity of any number of items, and each order is uniquely identified by an order number.

The location of the stores is also recorded. Each store is located in one city, and there can be many stores in a city. Each city has one headquarter that coordinates all of its stores. The enterprise's goal is to meet all of a customer's requirements from stores located in the customer's city. If a requirement cannot be met there, the company turns to other cities where the item can be found, if there are any.

The relational schemas of the enterprise's current databases are:

Headquarter Database:
Relation Customer (Customer_id, Customer_name, City_id, First_order_date)
Relation Walk-in_customers (*Customer_id, tourism_guide, Time)
Relation Mail_order_customers (*Customer_id, post_address, Time)

Sales Databases:
Relation Headquarters (City_id, City_name, Headquarter_addr, State, Time)
Relation Stores (Store_id, *City_id, Phone, Time)
Relation Items (Item_id, Description, Size, Weight, Unit_price, Time)
Relation Stored_items (*Store_id, *Item_id, Quantity_held, Time)
Relation Order (Order_no, Order_date, Customer_id)
Relation Ordered_item (*Order_no, *Item_id, Quantity_ordered, Ordered_price, Time)

Underlined attributes are primary keys and attributes prefixed with * are foreign keys.
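For readers who want to experiment with the source relations, the following is a minimal sketch that declares them in two in-memory SQLite databases from Python. It is only an illustration under stated assumptions: the report itself targets MS SQL Server and Oracle, the column types are guesses, `Order` is quoted because ORDER is an SQL keyword, and `Walk-in_customers` is renamed `Walk_in_customers` to avoid quoting a hyphen. The helper names `create_source_databases` and `table_names` are hypothetical.

```python
import sqlite3

# Assumed SQLite rendering of the Headquarter Database (types are guesses).
HEADQUARTER_DDL = """
CREATE TABLE Customer (
    Customer_id INTEGER PRIMARY KEY,
    Customer_name TEXT, City_id INTEGER, First_order_date TEXT
);
CREATE TABLE Walk_in_customers (
    Customer_id INTEGER PRIMARY KEY REFERENCES Customer(Customer_id),
    tourism_guide TEXT, Time TEXT
);
CREATE TABLE Mail_order_customers (
    Customer_id INTEGER PRIMARY KEY REFERENCES Customer(Customer_id),
    post_address TEXT, Time TEXT
);
"""

# Assumed SQLite rendering of the Sales Database.
# Quantity_held follows the spelling used for the fact-table measure.
SALES_DDL = """
CREATE TABLE Headquarters (
    City_id INTEGER PRIMARY KEY,
    City_name TEXT, Headquarter_addr TEXT, State TEXT, Time TEXT
);
CREATE TABLE Stores (
    Store_id INTEGER PRIMARY KEY,
    City_id INTEGER REFERENCES Headquarters(City_id),
    Phone TEXT, Time TEXT
);
CREATE TABLE Items (
    Item_id INTEGER PRIMARY KEY,
    Description TEXT, Size TEXT, Weight REAL, Unit_price REAL, Time TEXT
);
CREATE TABLE Stored_items (
    Store_id INTEGER REFERENCES Stores(Store_id),
    Item_id INTEGER REFERENCES Items(Item_id),
    Quantity_held INTEGER, Time TEXT,
    PRIMARY KEY (Store_id, Item_id)
);
CREATE TABLE "Order" (
    Order_no INTEGER PRIMARY KEY, Order_date TEXT, Customer_id INTEGER
);
CREATE TABLE Ordered_item (
    Order_no INTEGER REFERENCES "Order"(Order_no),
    Item_id INTEGER REFERENCES Items(Item_id),
    Quantity_ordered INTEGER, Ordered_price REAL, Time TEXT,
    PRIMARY KEY (Order_no, Item_id)
);
"""

def create_source_databases():
    """Create the two source databases in memory and return both connections."""
    headquarter = sqlite3.connect(":memory:")
    sales = sqlite3.connect(":memory:")
    headquarter.executescript(HEADQUARTER_DDL)
    sales.executescript(SALES_DDL)
    return headquarter, sales

def table_names(conn):
    """Return the sorted table names defined in a connection."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return sorted(name for (name,) in rows)
```

Keeping the two databases in separate connections mirrors the fact that they are separate systems that the data warehouse later integrates.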
2. Business Requirement
The data warehouse system extracts data from the two existing databases and provides online analytical processing (OLAP) with the typical OLAP operations (roll up, drill down, slice and dice) according to the user's selections on the dimension tables. When we construct the data cube, we add a supplementary date dimension. The system needs to generate an OLAP report (the application specification of the data warehouse for the users) for the following tasks:
1. Find all the stores along with city, state, phone, description, size, weight and unit price that hold a particular item of stock.
2. Find all the orders along with customer name and order date that can be fulfilled by a given store.
3. Find all stores along with city name and phone that hold items ordered by a given customer.
4. Find the headquarter address along with city and state of all stores that hold stocks of an item above a particular level.
5. For each customer order, show the items ordered along with description, store id and city name and the stores that hold the items.
6. Find the city and the state in which a given customer lives.
7. Find the stock level of a particular item in all stores in a particular city.
8. Find the items, quantity ordered, customer, store and city of an order.
9. Find the walk-in customers, mail-order customers and dual customers (both walk-in and mail order).
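The roll up, drill down, slice and dice operations named above can be illustrated on a toy fact set. The sketch below is pure Python and is not the report's implementation; the rows, the dimension names in `DIMS`, and the helper names `aggregate`, `slice_` and `dice` are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical fact rows: (state, city, item, quantity_held). Values are made up.
FACTS = [
    ("CA", "Los Angeles", "pen", 10),
    ("CA", "Los Angeles", "ink", 4),
    ("CA", "San Diego", "pen", 6),
    ("NY", "Albany", "pen", 3),
]
DIMS = ("state", "city", "item")

def aggregate(rows, group_by):
    """Sum the measure per group; fewer dimensions rolls up, more drills down."""
    idx = [DIMS.index(d) for d in group_by]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)

def slice_(rows, dim, value):
    """Slice: keep the rows where one dimension equals a single value."""
    i = DIMS.index(dim)
    return [r for r in rows if r[i] == value]

def dice(rows, allowed):
    """Dice: keep the rows whose dimensions fall inside the allowed value sets."""
    return [r for r in rows
            if all(r[DIMS.index(d)] in vals for d, vals in allowed.items())]

by_city = aggregate(FACTS, ("state", "city"))   # finer level (drill down)
by_state = aggregate(FACTS, ("state",))         # coarser level (roll up)
pens = slice_(FACTS, "item", "pen")
ca_pens = dice(FACTS, {"state": {"CA"}, "item": {"pen"}})
```

Rolling up from `by_city` to `by_state` only removes a dimension from the grouping, which is why the same `aggregate` function serves both directions.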
3. Functional Specification
In this part, we define the input and output specification of the data warehouse.
3.1. Building Data warehouse phase
a) Input
The input is the 2 existing, separate databases of the enterprise.

Headquarter Database:
Relation Customer (Customer_id, Customer_name, City_id, First_order_date)
Relation Walk-in_customers (*Customer_id, tourism_guide, Time)
Relation Mail_order_customers (*Customer_id, post_address, Time)

Sales Database:
Relation Headquarters (City_id, City_name, Headquarter_addr, State, Time)
Relation Stores (Store_id, *City_id, Phone, Time)
Relation Items (Item_id, Description, Size, Weight, Unit_price, Time)
Relation Stored_items (*Store_id, *Item_id, Quantity_held, Time)
Relation Order (Order_no, Order_date, Customer_id)
Relation Ordered_item (*Order_no, *Item_id, Quantity_ordered, Ordered_price, Time)
b) Output
The output is a data warehouse for the enterprise, built by integrating the data from the 2 given databases.
3.2. Generating the OLAP report phase
The input and output of this phase can be shown by the following summary table:

| Input | Output |
|---|---|
| Item | All the stores along with city, state, phone, description, size, weight and unit price that hold that particular item. |
| Store | All the orders along with customer name and order date that can be fulfilled by that given store. |
| Customer | All stores along with city name and phone that hold items ordered by the given customer. |
| Item (along with level of the item) | The headquarter address along with city and state of all stores that hold stocks of an item above that particular level. |
| Order | The items ordered along with description, store id and city name and the stores that hold the items. |
| Customer | City and state in which that given customer lives. |
| Item and city | Stock level of that item in all stores in a given city. |
| Order | Items, quantity ordered, customer, store and city of that order. |
| (none) | Walk-in customers, mail-order customers and dual customers (both walk-in and mail order). |
4. Data Warehousing Design
4.1. Translate each ER model into EER model
a) Headquarter Database:
Step 1: Defining each relation, key and field:

| Relation | Relation type | Primary key | KAP | KAG | FKA | NKA |
|---|---|---|---|---|---|---|
| Customer | PR1 | Customer_id | | | | Customer_name, First_order_date |
| Walk-in_customers | PR2 | Customer_id | Customer_id | | | tourism_guide, Time |
| Mail_order_customers | PR2 | Customer_id | Customer_id | | | post_address, Time |
Step 2: Map each PR1 into an entity
Relation Customer (Customer_id, Customer_name, City_id, First_order_date)

Step 3: Map each PR2 into a subclass entity or weak entity
Map PR2:
Relation Walk-in_customers (*Customer_id, tourism_guide, Time)
Relation Mail_order_customers (*Customer_id, post_address, Time)
These belong to case 2:
PR1: Customer (Customer_id, Customer_name, City_id, First_order_date)
PR2: Walk-in_customers (*Customer_id, tourism_guide, Time), Mail_order_customers (*Customer_id, post_address, Time)
We map Walk-in_customers and Mail_order_customers into subclass entities (an overlap generalization, since a customer can be both of them):

Step 4: Map SR1 into binary/n-ary relationship
There is no SR1.

Step 5: Map SR2 into binary/n-ary relationship
There is no SR2.

Step 6: Map each FKA into relationship
There is no FKA.

Step 7: Map each inclusion dependency into semantics (binary/n-ary relationship)

Step 8: Draw EER model
b) Sales Databases:
Step 1: Defining each relation, key and field:

| Relation | Relation type | Primary key | KAP | KAG | FKA | NKA |
|---|---|---|---|---|---|---|
| Headquarters | PR1 | City_id | | | | City_name, Headquarter_addr, State, Time |
| Stores | PR1 | Store_id | | | City_id | Phone, Time |
| Items | PR1 | Item_id | | | | Description, Size, Weight, Unit_price, Time |
| Stored_items | SR1 | Store_id, Item_id | Store_id, Item_id | | | Quantity_held, Time |
| Order | PR1 | Order_no | | | | Order_date, Customer_id |
| Ordered_item | SR1 | Order_no, Item_id | Order_no, Item_id | | | Quantity_ordered, Ordered_price, Time |
Step 2: Map each PR1 into an entity
Relation Headquarters (City_id, City_name, Headquarter_addr, State, Time)
Relation Stores (Store_id, *City_id, Phone, Time)
Relation Items (Item_id, Description, Size, Weight, Unit_price, Time)
Relation Order (Order_no, Order_date, Customer_id)

Step 3: Map each PR2 into a subclass entity or weak entity
There is no PR2.

Step 4: Map SR1 into binary/n-ary relationship
Relational Schema:
Relation Stores (Store_id, *City_id, Phone, Time)
Relation Items (Item_id, Description, Size, Weight, Unit_price, Time)
Relation Stored_items (*Store_id, *Item_id, Quantity_held, Time)

Relation Items (Item_id, Description, Size, Weight, Unit_price, Time)
Relation Order (Order_no, Order_date, Customer_id)
Relation Ordered_item (*Order_no, *Item_id, Quantity_ordered, Ordered_price, Time)

Step 5: Map SR2 into binary/n-ary relationship
There is no SR2.

Step 6: Map each FKA into relationship

Step 7: Map each inclusion dependency into semantics (binary/n-ary relationship)

Step 8: Draw EER model
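The relation types assigned in Step 1 for both databases can be reproduced with a small heuristic. The sketch below is a simplification chosen only to match the PR1, PR2 and SR1 labels used in this report; it is not the full classification procedure of the underlying methodology, and the function name `classify` and the `unclassified` label are hypothetical. Foreign keys are taken from the * markers in the relational schemas.

```python
def classify(relations):
    """Label relations PR1, PR2 or SR1 from their primary and foreign keys.

    relations maps a relation name to (primary_key, foreign_keys), both tuples.
    Simplified heuristic: it only covers the cases that occur in this report.
    """
    labels = {}
    for name, (pk, fks) in relations.items():
        inherited = [attr for attr in pk if attr in fks]
        if not inherited:
            labels[name] = "PR1"           # key contains no key of another relation
        elif len(pk) == 1:
            labels[name] = "PR2"           # whole key is the key of one other relation
        elif len(inherited) == len(pk):
            labels[name] = "SR1"           # key fully formed from other relations' keys
        else:
            labels[name] = "unclassified"  # partial overlap: left for manual review
    return labels

# (primary key, foreign keys) for every relation in the two source databases.
RELATIONS = {
    "Customer": (("Customer_id",), ()),
    "Walk-in_customers": (("Customer_id",), ("Customer_id",)),
    "Mail_order_customers": (("Customer_id",), ("Customer_id",)),
    "Headquarters": (("City_id",), ()),
    "Stores": (("Store_id",), ("City_id",)),
    "Items": (("Item_id",), ()),
    "Stored_items": (("Store_id", "Item_id"), ("Store_id", "Item_id")),
    "Order": (("Order_no",), ()),
    "Ordered_item": (("Order_no", "Item_id"), ("Order_no", "Item_id")),
}

labels = classify(RELATIONS)
```

Note that Stores stays PR1 even though it has a foreign key, because City_id is not part of its primary key; it is recorded as an FKA instead.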
4.2. Integrate these 2 models into final model
Step 1: Resolve conflicts among the EER models

Step 2: Merge entities
We merge entities by implied binary relationship. We see that:
City_id is the primary key of the Headquarter table.
It is also a non-key attribute of the Customer table.
We therefore need to build a relationship between Headquarter and Customer, as in the following diagram:

Similarly, we have:
Customer_id is the primary key of the Customer table.
Customer_id is also a non-key attribute of the Order table.
We therefore need to build a relationship between Order and Customer, as in the following diagram:

Step 3: Merge relationships
From these new relationships, we can build the final integrated EER model for the enterprise's data warehouse, as follows:
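The implied binary relationships found in Step 2 can also be detected mechanically: look for an attribute that is the primary key of an entity in one database and a non-key attribute of a relation in the other. The following sketch assumes each relation is described as (primary key, foreign keys, non-key attributes); the function name `implied_relationships` is hypothetical, and subclass relations (whose key is itself a foreign key) are skipped so that only Customer, not its subclasses, is reported as a parent.

```python
# Each relation: (primary key, foreign keys, non-key attributes).
HEADQUARTER_DB = {
    "Customer": (("Customer_id",), (),
                 ("Customer_name", "City_id", "First_order_date")),
    "Walk-in_customers": (("Customer_id",), ("Customer_id",),
                          ("tourism_guide", "Time")),
    "Mail_order_customers": (("Customer_id",), ("Customer_id",),
                             ("post_address", "Time")),
}
SALES_DB = {
    "Headquarters": (("City_id",), (),
                     ("City_name", "Headquarter_addr", "State", "Time")),
    "Stores": (("Store_id",), ("City_id",), ("City_id", "Phone", "Time")),
    "Items": (("Item_id",), (),
              ("Description", "Size", "Weight", "Unit_price", "Time")),
    "Stored_items": (("Store_id", "Item_id"), ("Store_id", "Item_id"),
                     ("Quantity_held", "Time")),
    "Order": (("Order_no",), (), ("Order_date", "Customer_id")),
    "Ordered_item": (("Order_no", "Item_id"), ("Order_no", "Item_id"),
                     ("Quantity_ordered", "Ordered_price", "Time")),
}

def implied_relationships(schema_a, schema_b):
    """Return (parent, child, attribute) triples linking the two schemas."""
    found = []
    for left, right in ((schema_a, schema_b), (schema_b, schema_a)):
        for parent, (pk, fks, _) in left.items():
            # Skip composite keys and subclass relations whose key is inherited.
            if len(pk) != 1 or pk[0] in fks:
                continue
            key = pk[0]
            for child, (_, _, non_key) in right.items():
                if key in non_key:
                    found.append((parent, child, key))
    return sorted(found)

links = implied_relationships(HEADQUARTER_DB, SALES_DB)
```

On the two source schemas this yields exactly the two relationships described above: Headquarters to Customer through City_id, and Customer to Order through Customer_id.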
4.3. Star schema design
a) Design Star schema process
To build the fact tables, measures and set of dimension tables, we start from the service requirements of the enterprise. In detail, for this problem, we have to generate the OLAP report for the given operations:
1. Find all the stores along with city, state, phone, description, size, weight and unit price that hold a particular item of stock.
2. Find all the orders along with customer name and order date that can be fulfilled by a given store.
3. Find all stores along with city name and phone that hold items ordered by a given customer.
4. Find the headquarter address along with city and state of all stores that hold stocks of an item above a particular level.
5. For each customer order, show the items ordered along with description, store id and city name and the stores that hold the items.
6. Find the city and the state in which a given customer lives.
7. Find the stock level of a particular item in all stores in a particular city.
8. Find the items, quantity ordered, customer, store and city of an order.
9. Find the walk-in customers, mail-order customers and dual customers (both walk-in and mail order).

From these tasks, we divide the tables into 2 fact tables that serve 2 kinds of operation:
- Fact_Store: manages the items held by stores. This fact table is built mainly on the relationship between the Store and Item tables.
- Fact_Sale: manages the transactions made through orders. It is built mainly on the Order and Items tables.
From that, we can identify the measures of each fact table as follows:

| Fact Table | Key | Measure |
|---|---|---|
| Fact_Store | Store_id, City_id, Item_id, Time_id | Quantity_held |
| Fact_Sale | Order_no, Item_id, Store_id, Customer_id, Time_id | Quantity_ordered, Unit_price, Ordered_price |
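And then, we identify the set of dimensions:

| Fact Table | Dimension | Foreign Key |
|---|---|---|
| Fact_Store | Headquarter | City_id |
| Fact_Store | Item | Item_id |
| Fact_Store | Stores | Store_id |
| Fact_Store | Time | Time_id |
| Fact_Sale | Customer | Customer_id |
| Fact_Sale | Items | Item_id |
| Fact_Sale | Order | Order_no |
| Fact_Sale | Stores | Store_id |
| Fact_Sale | Time | Time_id |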
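The two fact tables and their dimensions could be declared roughly as follows. This is a sketch in SQLite via Python rather than the report's MS SQL Server implementation: the keys and measures come from the tables above, while the dimension attributes, the column types, the `Calendar_date` column of the Time dimension, and the function name `build_star_schema` are assumptions made for illustration.

```python
import sqlite3

STAR_DDL = """
-- Dimension tables (attribute lists are assumed, keys follow the report).
CREATE TABLE Headquarter (City_id INTEGER PRIMARY KEY, City_name TEXT,
                          Headquarter_addr TEXT, State TEXT);
CREATE TABLE Stores (Store_id INTEGER PRIMARY KEY, Phone TEXT);
CREATE TABLE Items (Item_id INTEGER PRIMARY KEY, Description TEXT,
                    Size TEXT, Weight REAL);
CREATE TABLE Customer (Customer_id INTEGER PRIMARY KEY, Customer_name TEXT);
CREATE TABLE "Order" (Order_no INTEGER PRIMARY KEY, Order_date TEXT);
CREATE TABLE "Time" (Time_id INTEGER PRIMARY KEY, Calendar_date TEXT);

-- Fact tables: composite keys made of dimension foreign keys, plus measures.
CREATE TABLE Fact_Store (
    Store_id INTEGER REFERENCES Stores(Store_id),
    City_id INTEGER REFERENCES Headquarter(City_id),
    Item_id INTEGER REFERENCES Items(Item_id),
    Time_id INTEGER REFERENCES "Time"(Time_id),
    Quantity_held INTEGER,
    PRIMARY KEY (Store_id, City_id, Item_id, Time_id)
);
CREATE TABLE Fact_Sale (
    Order_no INTEGER REFERENCES "Order"(Order_no),
    Item_id INTEGER REFERENCES Items(Item_id),
    Store_id INTEGER REFERENCES Stores(Store_id),
    Customer_id INTEGER REFERENCES Customer(Customer_id),
    Time_id INTEGER REFERENCES "Time"(Time_id),
    Quantity_ordered INTEGER, Unit_price REAL, Ordered_price REAL,
    PRIMARY KEY (Order_no, Item_id, Store_id, Customer_id, Time_id)
);
"""

def build_star_schema():
    """Create the assumed star schema in memory with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite ignores FKs unless enabled
    conn.executescript(STAR_DDL)
    return conn
```

With foreign keys enforced, a fact row can only be loaded after its dimension rows exist, which is the usual load order for a star schema.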
b) Depicting Star schema:
Fact_Store:
Fact_Sale:
5. Data cube Implementation
In this part, we automate loading the data warehouse into data cubes. Implementing the data cube makes generating the OLAP reports for the enterprise more convenient. To implement the data cube, we use SQL Server Business Intelligence Development Studio (an integration of Visual Studio and advanced SQL Server features).
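Conceptually, a data cube stores the measure aggregated over every combination of dimensions. The sketch below shows that idea in plain Python on made-up Fact_Store rows; it does not describe how Business Intelligence Development Studio builds or stores its cubes, and the function name `build_cube` is hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def build_cube(rows, dims, measure):
    """Return {group-by dims: {dim values: summed measure}} for every subset of dims."""
    cube = {}
    for size in range(len(dims) + 1):
        for group in combinations(dims, size):
            totals = defaultdict(int)
            for row in rows:
                totals[tuple(row[d] for d in group)] += row[measure]
            cube[group] = dict(totals)
    return cube

# Made-up Fact_Store rows for illustration only.
rows = [
    {"Store_id": 1, "Item_id": 10, "Time_id": 1, "Quantity_held": 5},
    {"Store_id": 1, "Item_id": 11, "Time_id": 1, "Quantity_held": 7},
    {"Store_id": 2, "Item_id": 10, "Time_id": 1, "Quantity_held": 3},
]
cube = build_cube(rows, ("Store_id", "Item_id", "Time_id"), "Quantity_held")
```

Three dimensions give eight group-by combinations, from the grand total `cube[()]` down to the full-detail cuboid keyed by all three dimensions.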
6. OLAP Report
We use Crystal Reports to generate the OLAP reports. A report is created only once and can then be run whenever we want the latest status of the queried information, using typical OLAP operations such as drill down or roll up. In short, although the data in the database changes over time, each successive run of the report returns the current results.
And for each given query, we have the following OLAP report by using Crystal Reports:

1. Find all the stores along with city, state, phone, description, size, weight and unit price that hold a particular item of stock.
Choose the item to see report:
Summary report:

2. Find all the orders along with customer name and order date that can be fulfilled by a given store.
Input store:
Summary report:

3. Find all stores along with city name and phone that hold items ordered by given customer.
Enter the customer:
Summary report:

4. Find the headquarter address along with city and state of all stores that hold stocks of an item above a particular level.
Enter item and level:
Summary report:

5. For each customer order, show the items ordered along with description, store id and city name and the stores that hold the items.
Enter the order:
Summary report:

6. Find the city and the state in which a given customer lives.
Enter the customer name:
Report:

7. Find the stock level of a particular item in all stores in a particular city.
Enter the item and city:
Summary item:

8. Find the items, quantity ordered, customer, store and city of an order.
Enter the order:
Summary report:

9. Find the walk in customers, mail order customers and dual customers (both walk-in and mail order).
Dual customer:
Walk in customer:
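The logic behind two of these reports can be sketched directly in SQL. The example below covers query 7 (stock level of an item in a city) and query 9 (walk-in, mail-order and dual customers) using SQLite from Python. It is an illustration under assumptions, not the Crystal Reports definition used in the project: the tables are trimmed to the columns the queries need, the rows are made up, and the function names are hypothetical.

```python
import sqlite3

def demo_database():
    """Build a tiny in-memory database with made-up rows (assumed data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Stores (Store_id INTEGER PRIMARY KEY, City_id INTEGER);
        CREATE TABLE Stored_items (Store_id INTEGER, Item_id INTEGER,
                                   Quantity_held INTEGER);
        CREATE TABLE Walk_in_customers (Customer_id INTEGER PRIMARY KEY);
        CREATE TABLE Mail_order_customers (Customer_id INTEGER PRIMARY KEY);
        INSERT INTO Stores VALUES (1, 100), (2, 100), (3, 200);
        INSERT INTO Stored_items VALUES (1, 7, 40), (2, 7, 15), (3, 7, 9), (1, 8, 2);
        INSERT INTO Walk_in_customers VALUES (1), (2);
        INSERT INTO Mail_order_customers VALUES (2), (3);
    """)
    return conn

def stock_level(conn, item_id, city_id):
    """Query 7: quantity of one item held by each store in one city."""
    sql = """
        SELECT s.Store_id, si.Quantity_held
        FROM Stored_items AS si
        JOIN Stores AS s ON s.Store_id = si.Store_id
        WHERE si.Item_id = ? AND s.City_id = ?
        ORDER BY s.Store_id
    """
    return conn.execute(sql, (item_id, city_id)).fetchall()

def customer_segments(conn):
    """Query 9: walk-in, mail-order and dual (in both tables) customer ids."""
    walk_in = {r[0] for r in conn.execute("SELECT Customer_id FROM Walk_in_customers")}
    mail = {r[0] for r in conn.execute("SELECT Customer_id FROM Mail_order_customers")}
    return {"walk_in": walk_in, "mail_order": mail, "dual": walk_in & mail}

conn = demo_database()
level = stock_level(conn, 7, 100)
segments = customer_segments(conn)
```

A dual customer is simply a Customer_id present in both subclass tables, which matches the overlap generalization chosen in the EER model.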
7. Data Verification
In this part, we verify the OLAP reports against the data in the enterprise's source relational tables.
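One simple way to carry out such a check is to compare aggregated totals from the source tables with the same totals from the warehouse. The report does not spell out its verification procedure, so the sketch below is only one possible approach; the function names and the sample rows are made up.

```python
from collections import defaultdict

def totals_by_key(rows):
    """Sum the quantity per key for rows of the form (key, quantity)."""
    totals = defaultdict(int)
    for key, quantity in rows:
        totals[key] += quantity
    return dict(totals)

def find_mismatches(source_rows, warehouse_rows):
    """Return the keys whose summed quantity differs between source and warehouse."""
    source = totals_by_key(source_rows)
    warehouse = totals_by_key(warehouse_rows)
    return sorted(key for key in source.keys() | warehouse.keys()
                  if source.get(key, 0) != warehouse.get(key, 0))

# Made-up rows of the form ((Store_id, Item_id), Quantity_held).
source_rows = [((1, 7), 40), ((2, 7), 15)]
good_load = [((1, 7), 40), ((2, 7), 15)]
bad_load = [((1, 7), 40), ((2, 7), 10), ((3, 7), 5)]

clean = find_mismatches(source_rows, good_load)
broken = find_mismatches(source_rows, bad_load)
```

An empty result means the warehouse totals agree with the source for every key; any returned key points at a row to inspect by hand.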
8. Conclusion
Data warehouse design not only helps the enterprise store and organize its data in a safe and professional way, but also helps it analyze the data and generate dynamic reports. The OLAP reports built with Crystal Reports retrieve useful information in real time and in a dynamic way through typical OLAP operations such as drill down and roll up.