
BUSINESS ANALYTICS

PRACTICAL FILE
ON
SQL SERVER 2012

SCHOOL OF MANAGEMENT STUDIES,


PUNJABI UNIVERSITY
PATIALA
Submitted to:
Dr. Sahil Raj

Submitted by:

Sapna
MBA-II (sem-3rd)
Roll No-15421184
Section D

1. Creating a Data Mart using SQL Server Data Tools


Creating a schema by hand in SQL Server Management Studio (SSMS) is repetitive and
time-consuming. Therefore, SQL Server Data Tools (SSDT) is most often used instead. It
automatically defines a schema (a star schema by default) and aids in the creation of
dimension and fact tables.
1.1 The first step is to create a database for Toothworks. Click the Windows Start
button, choose All Programs, and open SQL Server Management Studio.

1.2 The following window opens. Here, the server type should be Database Engine.
Select the appropriate name for the target server and click Connect.

1.3 In the left pane, right-click Databases and select New Database.

1.4 In the new database window opened, name the database as Toothworks. Next,
click on Options from the left pane, and change the recovery model to Simple.
Keep the other settings at default. Click OK.

1.5 This will create a database by the name Toothworks which is now visible in
the database drop down list.
1.6 Next close the management studio and open SSDT from Start.

1.7 In the start page of SSDT, click on New Project.

1.8 Select Analysis Services from the Business Intelligence group in the left pane,
and then select Analysis Services Multidimensional and Data Mining Project. Rename
it to ToothworksDM. Check Create directory.

1.9 In Solution Explorer on the right side, right-click Cubes and select New Cube.
This opens the Cube Wizard as shown:

1.10 Click Next on the welcome screen. Then select the Generate tables in the data
source radio button and click Next. In the Define New Measures window, enter Total
Sales as the measure name and Sales Information as the measure group, change the
data type to Currency, and keep the default aggregation. Every measure has to
belong to a measure group.

1.11 The next page allows you to add dimensions to the cube. Uncheck the Time
dimension; we will add it later. Under Add new dimensions, type Product, press
TAB twice, and enter Customer as the second dimension name. Note that each
dimension automatically creates a table with the same name as the dimension
in the database. Click Next.

1.12 The next page lets the user select which dimension relates to which measure
group. Remember that we need to analyse total sales by both the customer and the
product dimension, so check both boxes.

1.13 The next page lists a summary of the cube. Check Generate schema now, which
will enable creation of the tables in the database. Click Finish.

1.14 Click Next on the welcome screen. In the Specify Target window, choose to
create a new data source view. This is where the schema will be generated and the
underlying tables will be created in the database. Click New next to Data
source.

1.15 This opens the data source wizard. In how to define the connection, select
Create data source based on existing or new connection and click on New.

1.16 This opens the connection manager window which connects to the database.
Enter the server name where the database was created. Keep windows
authentication checked. For the database name, select toothworks from drop
down list.

1.17 Click OK. This takes you back to the Data Source Wizard. The connection string
for the toothworks database should now appear. With this string selected, click
Next. In the impersonation window, provide a valid Windows user name and
password. These credentials will be used when deploying or processing the cube.

1.18 Click Next and then Finish to complete the wizard. This takes you back to the
Schema Generation Wizard. The data source Toothworks now appears automatically.
On the next screen, Database Schema Options, uncheck the referential integrity option.

1.19 Keep the default naming conventions and click Finish. The progress of schema
generation is now shown. Once it reports that generation completed successfully,
click Close.

1.20 The schema created for the cube is as shown below. However, it does not
contain the attributes we need. These will be added using SQL Server
Management Studio. As is evident, the ID keys for each dimension have been
generated automatically. Also, the primary key of each dimension is a foreign
key in the fact table by default.
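
The shape of the generated star schema can be sketched in a few lines of Python. This is only an illustration: it uses the standard-library sqlite3 module as a stand-in for SQL Server, and the table and column names (PK_Product, FK_Customer, Sales_Information and so on) are assumptions rather than the exact names the wizard produces.

```python
import sqlite3

# SQLite stands in for SQL Server; all table and column names are assumed.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Product  (PK_Product  INTEGER PRIMARY KEY);
CREATE TABLE Customer (PK_Customer INTEGER PRIMARY KEY);
CREATE TABLE Sales_Information (
    FK_Product  INTEGER NOT NULL REFERENCES Product(PK_Product),
    FK_Customer INTEGER NOT NULL REFERENCES Customer(PK_Customer),
    Total_Sales NUMERIC
);
""")

# List the dimension tables that the fact table points at.
referenced = sorted(
    row[2] for row in conn.execute("PRAGMA foreign_key_list('Sales_Information')")
)
print(referenced)
```

The fact table sits at the centre and holds one foreign key per dimension, which is what makes the layout a star.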

1.21 Now open SSMS and expand the Toothworks database. Under Tables, you will find
the tables for the dimensions we added in the Cube Wizard. Right-click the
Customer table and select Design.

1.22 Add the following fields: Age, int, allow nulls; Marital Status, char (10),
allow nulls.

1.23 In a similar manner, add product name and size as nvarchar(20) to the Product
table. For the sales fact table, add Date of Purchase with data type date, and
click Save. With the Control key held, click FK_Product, FK_Customer and Date of
Purchase, and then click Primary Key on the left side of the ribbon. This will be
saved as a compound primary key.
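
The effect of steps 1.22 and 1.23 can be sketched as follows. SQLite again stands in for SQL Server, and the column names are assumptions. One difference is worth noting: SQLite cannot add a primary key to an existing table, so the compound key is declared when the fact table is created instead of being set afterwards in a designer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (PK_Customer INTEGER PRIMARY KEY);
-- Step 1.22: add nullable attribute columns to the dimension table.
ALTER TABLE Customer ADD COLUMN Age INTEGER;
ALTER TABLE Customer ADD COLUMN Marital_Status TEXT;
-- Step 1.23: compound key over both foreign keys and the purchase date.
CREATE TABLE Sales_Information (
    FK_Product       INTEGER NOT NULL,
    FK_Customer      INTEGER NOT NULL,
    Date_of_Purchase TEXT    NOT NULL,
    Total_Sales      NUMERIC,
    PRIMARY KEY (FK_Product, FK_Customer, Date_of_Purchase)
);
""")

customer_columns = [r[1] for r in conn.execute("PRAGMA table_info(Customer)")]

# The same product, customer and date may appear only once in the fact table.
row = (1, 1, "2015-01-05", 250)
conn.execute("INSERT INTO Sales_Information VALUES (?, ?, ?, ?)", row)
try:
    conn.execute("INSERT INTO Sales_Information VALUES (?, ?, ?, ?)", row)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(customer_columns, duplicate_rejected)
```

A compound key of this kind means a customer can buy the same product again only on a different date, which is a modelling choice of this exercise.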


1.24 Go back to SSDT, double click ToothworksDM data source view and right
click anywhere in the schema tab. Click on refresh and click OK to accept the
changes.

1.25 The process of building a cube with dimensions is complete. Now we need to
load data into the tables of this data mart. This will be done in the hands-on
project of the next chapter on ETL.


2. Extraction, Transformation and Loading (ETL)


Before any effective analysis can be run on Toothworks, we first need to load data
into the multidimensional model created in the previous hands-on exercise. This
hands-on project lets you do just that.
The fact table has a column, Date of Purchase, which can be used for time-based
analysis. For this purpose, we need to add another dimension, Time. After
successfully loading the data, we will build this Time dimension from the fact
table data and add it to our Toothworks cube, using a feature of SQL Server 2012
called Named Calculations.
2.1 Using Integration Services to Load Data Mart
This hands-on exercise will help you understand how to load data into fact and
dimension tables using SQL Server 2012 Integration Services. First, we will load
the dimension tables and then the fact table. The source of the data (extraction) is
a set of .csv files maintained by Toothworks, one for each dimension. They will be
loaded into the SQL Server tables we created in the previous hands-on exercise.
Use the steps listed below to load data into the data mart created in the previous
exercise:
2.1.1 Open a new project in SSDT and select Integration Services from the
Business Intelligence group. Select Integration Services Project as shown.
Rename the project to ToothworksLoad, then uncheck Create directory.


2.1.2 In the Solution Explorer window, rename Package.dtsx under SSIS Packages to
DimensionLoad.dtsx. Here, we will load the data into the dimension tables
from the csv files. From the SSIS Toolbox in the left pane, drag and drop a Data
Flow Task item onto the Control Flow tab.

2.1.3 Rename it to LoadCustomer. Double-click it and the Data Flow tab will
open. We wish to load the Customer table (an SQL Server or OLE DB
destination) from Customer.csv, which is a flat file source. So we need
to create a connection for it. In the connection manager tray at the bottom
of the Data Flow tab, right-click and add a new flat file connection.


2.1.4 Rename the connection manager to Customer flat file and browse to the
location of the customer flat file. Check Column names in the first data row.
You can preview the data in the Columns tab. In the Advanced tab, you need
to make certain changes. Change the data type of age and customer ID to
four-byte signed integer. Keep the string data type for marital status.
Click OK after the changes have been made.
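
What the flat file connection settings in this step amount to can be sketched in Python: the first row is treated as column names, age and customer ID are cast to integers, and marital status stays a string. The sample rows and the header names (CustomerID, Age, MaritalStatus) are hypothetical and stand in for the real Customer.csv.

```python
import csv
import io

# Hypothetical sample standing in for Customer.csv; real headers may differ.
sample = io.StringIO("CustomerID,Age,MaritalStatus\n1,34,Married\n2,27,Single\n")

def read_customers(handle):
    """Read rows using the header line and cast the numeric columns to int."""
    rows = []
    for record in csv.DictReader(handle):
        rows.append({
            "CustomerID": int(record["CustomerID"]),   # four-byte signed integer
            "Age": int(record["Age"]),                 # four-byte signed integer
            "MaritalStatus": record["MaritalStatus"],  # kept as a string
        })
    return rows

customers = read_customers(sample)
print(customers[0])
```

If a value cannot be cast, int() raises an error, which loosely mirrors the way a data flow fails on a bad data type instead of loading the row silently.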

2.1.5 Next, create an OLE DB connection for the destination table. Again, right-click
in the connection manager tray and select New OLE DB Connection.
Select the toothworks database and click OK.


2.1.6 From Other Sources in the SSIS Toolbox, drag and drop the Flat File Source
item onto the Data Flow tab. Double-click it and the Flat File Source Editor
opens. On the Columns page, make sure all the columns are checked. You
can preview the data using the Preview button. Click OK.

2.1.7 Next, drag and drop the OLE DB Destination item from Other Destinations and
connect the blue arrow from the source item to it. Double-click this item.
The destination editor window opens. HP-PC.toothworks (the connection
created in step 2.1.5, where HP-PC is the server name in this example)
should already be selected in the OLE DB connection manager. Select
dbo.Customer under Name of the table or the view. Click Mappings in the left
pane of the destination editor. The mappings should be done automatically
because the names are similar. Otherwise, drag and drop from input onto
output as shown below.


2.1.8 Click OK. Go back to the Control Flow tab, drag and drop another Data Flow
Task item, and rename it to LoadProduct. Drag the green arrow
dangling from the LoadCustomer item onto the LoadProduct item. This indicates
precedence: loading of the product dimension starts only when loading of
the customer dimension has completed successfully.
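
The precedence idea in this step can be sketched as a short Python function: tasks run in sequence, and a task starts only if the one before it succeeded. The function and task names here are hypothetical and the sketch does not use any SSIS API.

```python
def run_in_order(tasks):
    """Run (name, callable) pairs in sequence and stop at the first failure."""
    completed = []
    for name, task in tasks:
        try:
            task()
        except Exception:
            break  # a failed task blocks every task that depends on it
        completed.append(name)
    return completed

def load_customer():
    pass  # placeholder for the customer dimension load

def load_product():
    pass  # placeholder for the product dimension load

def failing_load():
    raise RuntimeError("simulated load failure")

ok = run_in_order([("LoadCustomer", load_customer), ("LoadProduct", load_product)])
stopped = run_in_order([("LoadCustomer", failing_load), ("LoadProduct", load_product)])
print(ok, stopped)
```

In the second call, LoadProduct never runs because LoadCustomer failed, which is the behaviour the green arrow expresses.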

2.1.9 Double-click LoadProduct to open the Data Flow tab. Repeat the steps used
earlier to load the customer dimension. First, create another flat file
connection for Product.csv. Then add a flat file source, change the data types
in the Advanced tab, add the destination item and check the mappings. One
error will be highlighted here: conversion between Unicode and non-Unicode
strings. This is where transformations are applicable.

2.2 Using Data Conversion Transformation to Load Data into Destination


2.2.1 The product name and size columns are defined as nvarchar (Unicode
strings) in the SQL table, but SSIS reads them from the CSV file as
non-Unicode strings. Simply changing the data type in the Advanced tab
does not serve our purpose, so we need to use a Data Conversion
transformation, as shown in the snapshot. Drag and drop the Data Conversion
item onto the Data Flow tab and connect the blue arrow from the source item
to it.

2.2.2 Double-click this item to open the transformation editor window. Here, both
product name and product size have the non-Unicode string data type. So
check the two columns and rename each output alias by prefixing it with nu.
Change the data type to Unicode string [DT_WSTR] and set the length as
required. Click OK.
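
The conversion performed in this step can be pictured in Python as decoding code-page bytes into a Unicode string with a length limit. This is a sketch only: the code page (Windows-1252), the sample value and the function name to_unicode are assumptions, and the 20-character limit is taken from the nvarchar(20) columns defined earlier.

```python
def to_unicode(value: bytes, code_page: str = "cp1252", max_length: int = 20) -> str:
    """Decode a non-Unicode (code page) string and enforce the target length."""
    text = value.decode(code_page)
    if len(text) > max_length:
        raise ValueError(f"value exceeds {max_length} characters")
    return text

# Hypothetical product name as raw code-page bytes; 0xE0 is 'a' with a grave accent.
raw_product_name = b"Brosse \xe0 dents"
nu_product_name = to_unicode(raw_product_name)
print(nu_product_name, len(nu_product_name))
```

The output column gets a new name (here nu_product_name, following the nu prefix used in the step), so the original non-Unicode column remains available alongside the converted one.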


2.2.3 Next, drag and drop an OLE DB Destination item and drop the blue arrow from
the Data Conversion item onto it. Double-click it to open the destination
editor. Select the Product table for the connection manager and check the
mappings. They should be as below. Click OK. Go back to the Control Flow
tab and click Save All.

2.3 Executing ETL Package


2.3.1 Now we need to execute the dimension load package we created. Click
Debug in the ribbon above.


2.3.2 Upon successful execution, the window should appear as shown:

This implies the customer and product dimension have been successfully loaded.
2.3.3 You can verify this in SSMS by right-clicking the table and selecting
Select Top 1000 Rows.

Or create a query as shown in the figure:
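
A verification query of this kind can be sketched as follows, with SQLite standing in for SQL Server and with made-up rows and column names. SQLite uses LIMIT where SQL Server uses TOP, so the query text in SSMS would differ slightly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (PK_Customer INTEGER PRIMARY KEY, Age INTEGER, Marital_Status TEXT)"
)
# Hypothetical rows standing in for the loaded customer dimension.
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [(1, 34, "Married"), (2, 27, "Single")],
)

# Count the loaded rows, then look at the first rows of the table.
row_count = conn.execute("SELECT COUNT(*) FROM Customer").fetchone()[0]
first_rows = conn.execute(
    "SELECT * FROM Customer ORDER BY PK_Customer LIMIT 1000"
).fetchall()
print(row_count, first_rows)
```

Comparing the row count with the number of data rows in the source file is a quick check that nothing was dropped during the load.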


2.4 Loading Fact Table


2.4.1 Right-click SSIS Packages and select New SSIS Package. Rename it to
FactLoad.dtsx. Again, drag and drop a Data Flow Task item onto the Control
Flow tab. Double-click it to open the Data Flow tab.
2.4.2 In the Data Flow tab, right-click in the connection manager tray and select
New Flat File Connection. Name the connection manager sales fact file.
Browse for the file and review the columns.
2.4.3 In the Advanced tab, change the data type of customer id, product id and
total sales to four-byte signed integer, and change the data type of the
date column to date.

2.4.4 Drag and drop a Flat File Source item and double-click it to connect it to
the sales fact file. Review the columns.

2.4.5 Create a new OLE DB connection from the tray. Select the toothworks
database from the editor window.

2.4.6 Next, drag and drop an OLE DB Destination and connect the blue arrow from
the source to it. Double-click it and select the sales information table
in the editor window. Check the mappings; they should be as shown below:

Click OK and then Save All. Again, run Debug as before until the package reports
successful completion.

This part of the exercise shows how to add another dimension, Time, to the
existing cube. This dimension will be created from the field Date of
Purchase contained in the fact table. This is done by first creating named
calculations and then creating the Time dimension based on them.
The attributes of this dimension are year, month and date. The hierarchy of the
Time dimension is as follows:
Year → Month → Date.

As year and month are not directly available, they need to be extracted from the
Date of Purchase field using named calculations in SQL Server 2012.
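
The idea behind these named calculations, deriving year and month columns from a date column with an SQL expression, can be sketched as follows. SQLite stands in for SQL Server, so the expressions use SQLite's strftime; in SQL Server the expressions would typically use date functions such as YEAR() and MONTH(). The table, the column names (PurchaseYear, PurchaseMonth) and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales_Information (Date_of_Purchase TEXT, Total_Sales NUMERIC)")
# Hypothetical fact rows with ISO-formatted purchase dates.
conn.executemany(
    "INSERT INTO Sales_Information VALUES (?, ?)",
    [("2015-01-05", 250), ("2015-02-17", 100), ("2016-02-03", 75)],
)

# Each derived column plays the role of one named calculation.
rows = conn.execute("""
    SELECT Date_of_Purchase,
           CAST(strftime('%Y', Date_of_Purchase) AS INTEGER) AS PurchaseYear,
           CAST(strftime('%m', Date_of_Purchase) AS INTEGER) AS PurchaseMonth
    FROM Sales_Information
    ORDER BY Date_of_Purchase
""").fetchall()
print(rows)
```

The derived year and month columns, together with the date itself, supply the three levels of the Year, Month, Date hierarchy without changing the underlying fact table.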
