Practical :- 1
Data Mining
Act of excavating the data from which patterns can be extracted
Alternative name: Knowledge discovery in databases (KDD)
Multiple disciplines: database, statistics, artificial intelligence
Rapidly maturing technology
Unlimited applicability
[Figure: Data mining process — a model is defined and trained with training data by the Data Mining Management System (DMMS), producing a mining model; predictions are then made by passing prediction input data through the model.]
BITS[CSE] Page
Business Intelligence & Data Mining 100050131008
Sequence Analysis
Time-series Analysis
Text categorization
Advanced insights discovery
Others
The data mining tutorial is designed to walk you through the process of creating
data mining models in Microsoft SQL Server 2005. The data mining algorithms and tools
in SQL Server 2005 make it easy to build a comprehensive solution for a variety of
projects, including market basket analysis, forecasting analysis, and targeted mailing
analysis. The scenarios for these solutions are explained in greater detail later in the
tutorial.
The most visible components in SQL Server 2005 are the workspaces that you use
to create and work with data mining models. The online analytical processing (OLAP)
and data mining tools are consolidated into two working environments: Business
Intelligence Development Studio and SQL Server Management Studio.
Using Business Intelligence Development Studio, you can develop an Analysis
Services project disconnected from the server. When the project is ready, you can deploy
it to the server. You can also work directly against the server. The main function of SQL
Server Management Studio is to manage the server. Each environment is described in
more detail later in this introduction.
All of the data mining tools exist in the data mining editor. Using the editor you
can manage mining models, create new models, view models, compare models, and
create predictions based on existing models.
After you build a mining model, you will want to explore it, looking for
interesting patterns and rules. Each mining model viewer in the editor is customized to
explore models built with a specific algorithm.
Often your project will contain several mining models, so before you can use a
model to create predictions, you need to be able to determine which model is the most
accurate. For this reason, the editor contains a model comparison tool called the Mining
Accuracy Chart tab. Using this tool you can compare the predictive accuracy of your
models and determine the best model.
To create predictions, you will use the Data Mining Extensions (DMX) language.
DMX extends SQL, containing commands to create, modify, and predict against mining
models. Because creating a prediction can be complicated, the data mining editor
contains a tool called Prediction Query Builder, which allows you to build queries using a
graphical interface. You can also view the DMX code that is generated by the query
builder.
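The queries that the Prediction Query Builder generates are plain DMX. As a rough illustration, a singleton prediction query has the following shape; the model name [TM Decision Tree] and its input columns are hypothetical, not taken from this tutorial:

```sql
-- Singleton DMX prediction query (illustrative names).
-- Supplies one input case inline and asks the model for a prediction.
SELECT
    Predict([Bike Buyer]),             -- predicted value
    PredictProbability([Bike Buyer])   -- probability of that prediction
FROM [TM Decision Tree]
NATURAL PREDICTION JOIN
(SELECT 35 AS [Age], 'Professional' AS [Occupation]) AS t
```

The NATURAL PREDICTION JOIN matches input columns to model columns by name, which is the form the query builder produces for a singleton query.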
The key to creating a mining model is the data mining algorithm. The algorithm finds patterns in the data that you pass it and translates them into a mining model; it is the engine behind the process. SQL Server 2005 includes nine algorithms:
1. Microsoft Decision Trees
2. Microsoft Clustering
3. Microsoft Naive Bayes
4. Microsoft Sequence Clustering
5. Microsoft Time Series
6. Microsoft Association
7. Microsoft Neural Network
8. Microsoft Linear Regression
9. Microsoft Logistic Regression
Using a combination of these nine algorithms, you can create solutions to
common business problems. Some of the most important steps in creating a data mining
solution are consolidating, cleaning, and preparing the data to be used to create the
mining models. SQL Server 2005 includes the Data Transformation Services (DTS)
working environment, which contains tools that you can use to clean, validate, and
prepare your data. The audience for this tutorial is business analysts, developers, and
database administrators who have used data mining tools before and are familiar with
data mining concepts.
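To make the role of the algorithm concrete, here is a sketch of a DMX model definition that selects one of the nine algorithms and tunes it with a parameter; the model and column names are illustrative, not part of the tutorial:

```sql
-- Define a mining model and choose its algorithm (illustrative names).
CREATE MINING MODEL [BuyerModel]
(
    [Customer Key] LONG KEY,              -- identifies each case
    [Age]          LONG CONTINUOUS,       -- input column
    [Bike Buyer]   LONG DISCRETE PREDICT  -- predictable column
)
USING Microsoft_Decision_Trees (COMPLEXITY_PENALTY = 0.5)
```

COMPLEXITY_PENALTY is one of the parameters the Microsoft Decision Trees algorithm accepts; swapping the USING clause swaps the engine behind the model.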
You can integrate your Analysis Services project with a variety of other
business intelligence projects encapsulating your entire solution into a single
view.
create prediction queries against it. Analysis Services can use datasets from both
relational and OLAP databases, and includes a variety of algorithms that you can use to
investigate that data.
SQL Server 2005 provides different environments and tools that you can use for
data mining. The following sections outline a typical process for creating a data mining
solution, and identify the resources to use for each step.
Creating an Analysis Services Project
To create a data mining solution, you must first create a new Analysis Services
project, and then add and configure a data source and a data source view for the project.
The data source defines the connection string and authentication information with which
to connect to the data source on which to base the mining model. The data source view
provides an abstraction of the data source, which you can use to modify the structure of
the data to make it more relevant to your project.
Adding Mining Structures to an Analysis Services Project
After you have created an Analysis Services project, you can add mining
structures, and one or more mining models that are based on each structure. A mining
structure, including tables and columns, is derived from an existing data source view or
OLAP cube in the project. Adding a new mining structure starts the Data Mining Wizard,
which you use to define the structure and to specify an algorithm and training data for use
in creating an initial model based on that structure.
You can use the Mining Structure tab of Data Mining Designer to modify existing
mining structures, including adding columns and nested tables.
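In DMX terms, the structure and its models are separate objects; a sketch, with illustrative names:

```sql
-- Create a mining structure, then add a model based on it.
CREATE MINING STRUCTURE [CustomerStructure]
(
    [Customer Key] LONG KEY,
    [Age]          LONG CONTINUOUS,
    [Bike Buyer]   LONG DISCRETE
)

ALTER MINING STRUCTURE [CustomerStructure]
ADD MINING MODEL [CustomerTrees]
(
    [Customer Key],
    [Age],
    [Bike Buyer] PREDICT   -- mark the predictable column
)
USING Microsoft_Decision_Trees
```

Several models, each with a different USING clause, can share one structure.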
Working with Data Mining Models
Before you can use the mining models you define, you must process them so that
Analysis Services can pass the training data through the algorithms to fill the models.
Analysis Services provides several options for processing mining model objects,
including the ability to control which objects are processed and how they are processed.
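Processing (training) can also be expressed in DMX; a sketch, assuming a data source named [Adventure Works DW] and an illustrative source view:

```sql
-- Train the structure and its models by feeding them source rows.
INSERT INTO MINING STRUCTURE [CustomerStructure]
(
    [Customer Key], [Age], [Bike Buyer]
)
OPENQUERY([Adventure Works DW],
    'SELECT CustomerKey, Age, BikeBuyer FROM dbo.vTargetMail')
```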
After you have processed the models, you can investigate the results and make
decisions about which models perform the best. Analysis Services provides viewers for
each mining model type, within the Mining Model Viewer tab in Data Mining Designer,
which you can use to explore the mining models. Analysis Services also provides tools, in
the Mining Accuracy Chart tab of the designer, that you can use to directly compare
mining models and to choose the mining model that works best for your purpose. These
tools include a lift chart, a profit chart, and a classification matrix.
Creating Predictions
The main goal of most data mining projects is to use a mining model to create
predictions. After you explore and compare mining models, you can use one of several
tools to create predictions. Analysis Services provides a query language called Data
Mining Extensions (DMX) that is the basis for creating predictions. To help you build
DMX prediction queries, SQL Server provides a query builder, available in SQL Server
Management Studio and Business Intelligence Development Studio, and DMX templates
for the query editor in Management Studio. Within BI Development Studio, you access
the query builder from the Mining Model Prediction tab of Data Mining Designer.
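A batch prediction query built this way joins a trained model to new cases; a sketch with illustrative model, data source, and table names:

```sql
-- Predict a value for every row returned by the source query.
SELECT
    t.[CustomerKey],
    Predict([CustomerTrees].[Bike Buyer]) AS [Predicted Buyer]
FROM [CustomerTrees]
PREDICTION JOIN
OPENQUERY([Adventure Works DW],
    'SELECT CustomerKey, Age FROM dbo.ProspectiveBuyer') AS t
ON [CustomerTrees].[Age] = t.[Age]   -- map input columns to model columns
```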
SQL Server Management Studio
After you have used BI Development Studio to build mining models for your data
mining project, you can manage and work with the models and create predictions in
Management Studio.
SQL Server Reporting Services
After you create a mining model, you may want to distribute the results to a wider
audience. You can use Report Designer in Microsoft SQL Server 2005 Reporting
Services (SSRS) to create reports, which you can use to present the information that a
mining model contains. You can use the result of any DMX query as the basis of a report,
and can take advantage of the parameterization and formatting features that are available
in Reporting Services.
Working Programmatically with Data Mining
Analysis Services provides several tools that you can use to programmatically
work with data mining. The Data Mining Extensions (DMX) language provides
statements that you can use to create, train, and use data mining models. You can also
perform these tasks by using a combination of XML for Analysis (XMLA) and Analysis
Services Scripting Language (ASSL), or by using Analysis Management Objects (AMO).
You can access all the metadata that is associated with data mining by using data mining
schema rowsets. For example, you can use schema rowsets to determine the data types
that an algorithm supports, or the model names that exist in a database.
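For example, a schema rowset query that lists the models in a database can be issued like an ordinary DMX SELECT:

```sql
-- Schema rowset query: model names and the algorithm each one uses.
SELECT MODEL_NAME, SERVICE_NAME
FROM $system.DMSCHEMA_MINING_MODELS
```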
Data Mining Concepts
Data mining is frequently described as "the process of extracting valid, authentic,
and actionable information from large databases." In other words, data mining derives
patterns and trends that exist in data. These patterns and trends can be collected together
and defined as a mining model. Mining models can be applied to specific business
scenarios, such as:
Forecasting sales.
Targeting mailings toward specific customers.
Determining which products are likely to be sold together.
Finding sequences in the order that customers add products to a shopping cart.
An important concept is that building a mining model is part of a larger process
that includes everything from defining the basic problem that the model will solve, to
deploying the model into a working environment. This process can be defined by using
the following six basic steps:
1. Defining the Problem
2. Preparing Data
3. Exploring Data
4. Building Models
5. Exploring and Validating Models
6. Deploying and Updating Models
The following diagram describes the relationships between each step in the
process, and the technologies in Microsoft SQL Server 2005 that you can use to complete
each step.
Although the process that is illustrated in the diagram is circular, each step does
not necessarily lead directly to the next step. Creating a data mining model is a dynamic
and iterative process. After you explore the data, you may find that the data is insufficient
to create the appropriate mining models, and that you therefore have to look for more
data. You may build several models and realize that they do not answer the problem
posed when you defined the problem, and that you therefore must redefine the problem.
You may have to update the models after they have been deployed because more data has
become available. It is therefore important to understand that creating a data mining
model is a process, and that each step in the process may be repeated as many times as
needed to create a good model.
SQL Server 2005 provides an integrated environment for creating and working
with data mining models, called Business Intelligence Development Studio. The
environment includes data mining algorithms and tools that make it easy to build a
comprehensive solution for a variety of projects.
Defining the Problem
The first step in the data mining process, as highlighted in the following diagram,
is to clearly define the business problem.
This step includes analyzing business requirements, defining the scope of the
problem, defining the metrics by which the model will be evaluated, and defining the
final objective for the data mining project. These tasks translate into questions such as the
following:
What are you looking for?
Which attribute of the dataset do you want to try to predict?
What types of relationships are you trying to find?
Do you want to make predictions from the data mining model or just look for
interesting patterns and associations?
How is the data distributed?
How are the columns related, or if there are multiple tables, how are the tables
related?
To answer these questions, you may have to conduct a data availability study, to
investigate the needs of the business users with regard to the available data. If the data
does not support the needs of the users, you may have to redefine the project.
Preparing Data
The second step in the data mining process, as highlighted in the following
diagram, is to consolidate and clean the data that was identified in the Defining the
Problem step.
Microsoft SQL Server 2005 Integration Services (SSIS) contains all the tools that
you need to complete this step, including transforms to automate data cleaning and
consolidation.
Data can be scattered across a company and stored in different formats, or may
contain inconsistencies such as flawed or missing entries. For example, the data might
show that a customer bought a product before that customer was actually even born, or
that the customer shops regularly at a store located 2,000 miles from her home. Before
you start to build models, you must fix these problems. Typically, you are working with a
very large dataset and cannot look through every transaction. Therefore, you have to use
some form of automation, such as in Integration Services, to explore the data and find the
inconsistencies.
Exploring Data
The third step in the data mining process, as highlighted in the following diagram,
is to explore the prepared data.
You must understand the data in order to make appropriate decisions when you
create the models. Exploration techniques include calculating the minimum and
maximum values, calculating mean and standard deviations, and looking at the
distribution of the data. After you explore the data, you can decide if the dataset contains
flawed data, and then you can devise a strategy for fixing the problems.
Data Source View Designer in BI Development Studio contains several tools that you can
use to explore data.
Building Models
The fourth step in the data mining process, as highlighted in the following
diagram, is to build the mining models.
Before you build a model, you must randomly separate the prepared data into
separate training and testing datasets. You use the training dataset to build the model, and
the testing dataset to test the accuracy of the model by creating prediction queries. You
can use the Percentage Sampling Transformation in Integration Services to split the
dataset.
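If you prefer to split the data in the relational source instead, one T-SQL sketch (table names are illustrative) assigns each row a random bucket once and then partitions on it:

```sql
-- Assign each row a random bucket 0-99, then split roughly 70/30.
SELECT *, ABS(CHECKSUM(NEWID())) % 100 AS Bucket
INTO dbo.SampledData
FROM dbo.vTargetMail;

SELECT * INTO dbo.TrainingSet FROM dbo.SampledData WHERE Bucket < 70;
SELECT * INTO dbo.TestingSet  FROM dbo.SampledData WHERE Bucket >= 70;
```

Assigning the bucket once keeps the two sets disjoint; filtering with NEWID() in two separate queries would not.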
You will use the knowledge that you gain from the Exploring Data step to help
define and create a mining model. A model typically contains input columns, an
identifying column, and a predictable column. You can then define these columns in a
new model by using the Data Mining Extensions (DMX) language, or the Data Mining
Wizard in BI Development Studio.
After you define the structure of the mining model, you process it, populating the
empty structure with the patterns that describe the model. This is known as training the
model. Patterns are found by passing the original data through a mathematical algorithm.
SQL Server 2005 contains a different algorithm for each type of model that you can
build. You can use parameters to adjust each algorithm.
A mining model is defined by a data mining structure object, a data mining model
object, and a data mining algorithm.
Microsoft SQL Server 2005 Analysis Services (SSAS) includes the following algorithms:
Microsoft Decision Trees Algorithm
Microsoft Clustering Algorithm
Microsoft Naive Bayes Algorithm
Microsoft Association Algorithm
Microsoft Sequence Clustering Algorithm
Microsoft Time Series Algorithm
Microsoft Neural Network Algorithm (SSAS)
Microsoft Logistic Regression Algorithm
Microsoft Linear Regression Algorithm
You do not want to deploy a model into a production environment without first
testing how well the model performs. Also, you may have created several models and will
have to decide which model will perform the best. If none of the models that you created
in the Building Models step perform well, you may have to return to a previous step in
the process, either by redefining the problem or by reinvestigating the data in the original
dataset.
You can explore the trends and patterns that the algorithms discover by using the
viewers in Data Mining Designer in BI Development Studio. You can also test how well
the models create predictions by using tools in the designer such as the lift chart and
classification matrix. These tools require the testing data that you separated from the
original dataset in the model-building step.
Deploying and Updating Models
The last step in the data mining process, as highlighted in the following diagram,
is to deploy to a production environment the models that performed the best.
After the mining models exist in a production environment, you can perform
many tasks, depending on your needs. Following are some of the tasks you can perform:
Use the models to create predictions, which you can then use to make business
decisions. SQL Server provides the DMX language that you can use to create
prediction queries, and Prediction Query Builder to help you build the queries.
Embed data mining functionality directly into an application. You can include
Analysis Management Objects (AMO) or an assembly that contains a set of
objects that your application can use to create, alter, process, and delete mining
structures and mining models. Alternatively, you can send XML for Analysis
(XMLA) messages directly to an instance of Analysis Services.
Use Integration Services to create a package in which a mining model is used to
intelligently separate incoming data into multiple tables. For example, if a
database is continually updated with potential customers, you could use a mining
model together with Integration Services to split the incoming data into customers
who are likely to purchase a product and customers who are likely to not purchase
a product.
Create a report that lets users directly query against an existing mining model.
Updating the model is part of the deployment strategy. As more data comes into
the organization, you must reprocess the models, thereby improving their
effectiveness.
Practical :- 2
Aim: Design and create a cube by identifying measures and dimensions for a star schema.
Software Required: Analysis Services, SQL Server 2005.
Knowledge Required: Data cube
Theory/Logic:
8. Wait for the Cube Wizard to analyze the data and then click Next.
9. The Wizard will get most of the analysis right, but you can fine-tune it a
bit. Select DimTime in the Time Dimension combo box. Uncheck the Fact
checkbox on the line for the dbo.DimTime table. This will allow you to
analyze this dimension using standard time periods.
10. Click Next.
11. On the Select Time Periods page, use the combo boxes to match time
property names to time columns according to Table 2-1.
The Cube Browser is a drag-and-drop environment. If you've worked with pivot tables in Microsoft Excel, you should have no trouble using the Cube Browser. The pane to the left includes all of the measures and dimensions in your cube, and the pane to the right gives you drop targets for these measures and dimensions. Among other operations, you can:
Drop a measure in the Totals/Detail area to see the aggregated data for that
measure.
Drop a dimension or level in the Row Fields area to summarize by that level or
dimension on rows.
Drop a dimension or level in the Column Fields area to summarize by that level or dimension on columns.
Drop a dimension or level in the Filter Fields area to enable filtering by members
of that dimension or level.
Use the controls at the top of the report area to select additional filtering
expressions.
To see the data in the cube you just created, follow these steps:
1. Right-click on the cube in Solution Explorer and select Browse.
2. Expand the Measures node in the metadata panel (the area at the left of the user
interface).
3. Expand the Fact Finance node.
4. Drag the Amount measure and drop it on the Totals/Detail area.
5. Expand the Dim Account node in the metadata panel.
6. Drag the Account Description property and drop it on the Row Fields area.
7. Expand the Dim Time node in the metadata panel.
8. Drag the Calendar Year-Calendar Quarter-Month Number of Year hierarchy and
drop it on the Column Fields area.
9. Click the + sign next to year 2001 and then the + sign next to quarter 3.
10. Expand the Dim Scenario node in the metadata panel.
11. Drag the Scenario Name property and drop it on the Filter Fields area.
12. Click the dropdown arrow next to scenario name. Uncheck all of the checkboxes
except for the one next to the Budget name.
Figure 2-7 shows the result. The Cube Browser displays month-by-month budgets by account for the third quarter of 2001. Although you could have written queries to extract this information from the original source data, it's much easier to let Analysis Services do the heavy lifting for you.
Practical :- 3
Aim: Design and create a cube by identifying measures and dimensions for a snowflake schema.
Software Required: Analysis Services, SQL Server 2005.
Knowledge Required: Data cube
Theory/Logic:
18. Select the AdventureWorksDW database. Figure 3-2 shows the filled-in
Connection Manager dialog box.
19. Click OK to dismiss the Connection Manager dialog box.
20. Click Next.
The pane to the left includes all of the measures and dimensions in your cube, and the pane to the right gives
you drop targets for these measures and dimensions. Among other operations, you can:
Drop a measure in the Totals/Detail area to see the aggregated data for that
measure.
Drop a dimension or level in the Row Fields area to summarize by that level or
dimension on rows.
Drop a dimension or level in the Column Fields area to summarize by that level or dimension on columns.
Drop a dimension or level in the Filter Fields area to enable filtering by members
of that dimension or level.
Use the controls at the top of the report area to select additional filtering
expressions.
To see the data in the cube you just created, follow these steps:
13. Right-click on the cube in Solution Explorer and select Browse.
14. Expand the Measures node in the metadata panel (the area at the left of the user
interface).
15. Expand the Fact Finance node.
16. Drag the Amount measure and drop it on the Totals/Detail area.
17. Expand the Dim Account node in the metadata panel.
18. Drag the Account Description property and drop it on the Row Fields area.
19. Expand the Dim Time node in the metadata panel.
20. Drag the Calendar Year-Calendar Quarter-Month Number of Year hierarchy and
drop it on the Column Fields area.
21. Click the + sign next to year 2001 and then the + sign next to quarter 3.
22. Expand the Dim Scenario node in the metadata panel.
23. Drag the Scenario Name property and drop it on the Filter Fields area.
24. Click the dropdown arrow next to scenario name. Uncheck all of the checkboxes
except for the one next to the Budget name.
Figure 3-7 shows the result. The Cube Browser displays month-by-month budgets by account for the third quarter of 2001. Although you could have written queries to extract this information from the original source data, it's much easier to let Analysis Services do the heavy lifting for you.
Practical :- 4
Aim: Design storage for a cube by identifying measures and dimensions, using the MOLAP, ROLAP, and HOLAP storage modes.
Software Required: Analysis Services, SQL Server 2005.
Knowledge Required: MOLAP, ROLAP, HOLAP
Theory/Logic:
Usage-Based Optimization Wizard: You can select a storage mode and optimize aggregation design based on queries that have been sent to the cube.
MOLAP
The MOLAP storage mode causes the aggregations of the partition and a copy of its
source data to be stored in a multidimensional structure in Analysis Services that is
highly optimized to maximize query performance. This structure can be stored on the
computer where the partition is defined or on another Analysis Services computer.
Storing data on the computer where the partition is defined creates a local partition.
Storing data on another Analysis Services computer creates a remote partition. The
multidimensional structure that stores the partition's data is located in a subfolder of the
Data folder of the Analysis Services program files or another location specified during
setup of Analysis Services.
Because a copy of the source data resides in the Analysis Services data folder, queries can
be resolved without accessing the partition's source data even when the results cannot be
obtained from the partition's aggregations. The MOLAP storage mode provides the most
rapid query response times, even without aggregations, and response times can be
improved substantially through the use of aggregations.
As the source data changes, objects in MOLAP storage must be processed periodically to
incorporate those changes. The time between one processing and the next creates a
latency period during which data in OLAP objects may not match the current data. You
can incrementally update objects in MOLAP storage without downtime. However, there
may be some downtime required to process certain changes to OLAP objects, such as
structural changes. You can minimize the downtime required to update MOLAP storage
by updating and processing cubes on a staging server and using database synchronization
to copy the processed objects to the production server. You can also use proactive caching
to minimize latency and maximize availability while retaining much of the performance
advantage of MOLAP storage.
ROLAP
The ROLAP storage mode causes the aggregations of the partition to be stored in tables
in the relational database specified in the partition's data source. Unlike the MOLAP
storage mode, ROLAP does not cause a copy of the source data to be stored in the
Analysis Services data folders. When results cannot be derived from the aggregations or
query cache, the fact table in the data source is accessed to answer queries. With the
ROLAP storage mode, query response is generally slower than with the MOLAP or
HOLAP storage modes. Processing time is also typically slower. Real-time ROLAP is
typically used when clients need to see changes immediately. No aggregations are stored
with real-time ROLAP. ROLAP is also used to save storage space for large datasets that
are infrequently queried, such as purely historical data.
Note: When using ROLAP, Analysis Services may return incorrect information related
to the unknown member if a join is combined with a group by, which eliminates
relational integrity errors rather than returning the unknown member value.
If a partition uses the ROLAP storage mode and its source data is stored in SQL Server
2005, Analysis Services attempts to create indexed views to contain aggregations of the
partition. If Analysis Services cannot create indexed views, it does not create aggregation
tables. While Analysis Services handles the session requirements for creating indexed
views on SQL Server 2005, the creation and use of indexed views for aggregations
requires the following conditions
to be met by the ROLAP partition and the tables in its schema:
The partition cannot contain measures that use the Min or Max aggregate
functions.
Each table in the schema of the ROLAP partition must be used only once. For
example, the schema cannot contain "dbo"."address" AS "Customer Address" and
"dbo"."address" AS "SalesRep Address".
Each table must be a table, not a view.
All table names in the partition's schema must be qualified with the owner name,
for example, "dbo"."customer".
All tables in the partition's schema must have the same owner; for example, you
cannot have a FROM clause like "tk"."customer", "john"."store", or
"dave"."sales_fact_2004".
The source columns of the partition's measures must not be nullable.
All tables used in the view must have been created with the following options set
to ON:
o ANSI_NULLS
o QUOTED_IDENTIFIER
The total size of the index key, in SQL Server 2005, cannot exceed 900 bytes.
SQL Server 2005 will assert this condition based on the fixed length key columns
when the CREATE INDEX statement is processed. However, if there are variable
length columns in the index key, SQL Server 2005 will also assert this condition
for every update to the base tables. Because different aggregations have different
view definitions, ROLAP processing using indexed views can succeed or fail
depending on the aggregation design.
The session creating the indexed view must have the following options on:
ARITHABORT, CONCAT_NULL_YIELDS_NULL, QUOTED_IDENTIFIER,
ANSI_NULLS, ANSI_PADDING, and ANSI_WARNINGS. This setting can be
made in SQL Server Management Studio.
The session creating the indexed view must have the following option off:
NUMERIC_ROUNDABORT. This setting can be made in SQL Server
Management Studio.
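The kind of indexed view these conditions make possible can be sketched in T-SQL as follows; the fact and dimension names are illustrative, and Analysis Services generates its own view definitions rather than these:

```sql
-- Sketch of an aggregation stored as an indexed view (illustrative names).
SET ANSI_NULLS ON;
SET QUOTED_IDENTIFIER ON;
GO
CREATE VIEW dbo.AggSalesByYear
WITH SCHEMABINDING            -- required for an indexed view
AS
SELECT t.CalendarYear,
       SUM(f.SalesAmount) AS TotalSales,
       COUNT_BIG(*)       AS RowCnt   -- COUNT_BIG(*) is required with GROUP BY
FROM dbo.FactSales AS f
JOIN dbo.DimTime   AS t ON f.TimeKey = t.TimeKey
GROUP BY t.CalendarYear;
GO
CREATE UNIQUE CLUSTERED INDEX IX_AggSalesByYear
ON dbo.AggSalesByYear (CalendarYear);
```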
HOLAP
The HOLAP storage mode combines attributes of both MOLAP and ROLAP. Like
MOLAP, HOLAP causes the aggregations of the partition to be stored in a
multidimensional structure on an Analysis Services server computer. HOLAP does not
cause a copy of the source data to be stored. For queries that access only summary data
contained in the aggregations of a partition, HOLAP is the equivalent of MOLAP.
Queries that access source data, such as a drilldown to an atomic cube cell for which
there is no aggregation data, must retrieve data from the relational database and will not
be as fast as if the source data were stored in the MOLAP structure.
Partitions stored as HOLAP are smaller than equivalent MOLAP partitions and respond
faster than ROLAP partitions for queries involving summary data. HOLAP storage mode
is generally suitable for partitions in cubes that require rapid query response for
summaries based on a large amount of source data. However, where users generate
queries that must touch leaf level data, such as for calculating median values, MOLAP is
generally a better choice.
Steps:
1. In the Analysis Services object explorer tree pane, expand the Cubes folder, right-
click the created cube, and then click Properties.
2. In the Properties dialog, select Proactive Caching, and then click the Options button.
3. Select MOLAP/HOLAP/ROLAP as your data storage type, and then click Next.
Practical :- 6
Aim: Create and use an Excel PivotTable report based on a data cube.
Software Required: Analysis Services, SQL Server 2005.
Knowledge Required: Data Mining Concepts
Theory/Logic:
1. Start Microsoft Excel.
2. When the blank spreadsheet appears, on the Data menu, select From Other Sources
-> From Analysis Services.
3. The Data Connection Wizard opens; enter the server name and login credentials,
then click Next.
4. Select the database that contains the data you want. Click Next, and then click Finish.
5. In the Import Data dialog box, select PivotTable and PivotChart Report, and select
the portion of the spreadsheet where you want to place the data.
6. You are returned to the Excel spreadsheet, where you can drag dimensions into
columns and rows and analyze the data.