Sandro Bimonte
Pascal Wehrle
A tour of Mondrian+JPivot
• Introduction
• Installation and configuration
• How to design a Cube in Mondrian
• Aggregates and Caching
• Mondrian and XMLA
• BIOLAP
• Pentaho
Introduction
3-tier architecture
Functionality – presentation tier
• Web interface in HTML, rendered by the browser
• JavaScript and HTML forms for interaction
• Managed by the Web Component Framework (WCF) on the server
Functionality – application logic tier
• Pivot tables and OLAP operations are managed by JPivot
• MDX queries are executed by Mondrian
• Both are hosted by the Tomcat Servlet/JSP container
Functionality – data tier
• A relational DBMS stores the data according to the ROLAP storage model
• SQL queries generated by Mondrian are executed by the DBMS
• Aggregates are computed by the DBMS as part of the query
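The division of labor in the data tier can be illustrated with a minimal Java sketch that assembles the kind of grouping query a ROLAP engine might send to the DBMS. This is not the SQL Mondrian actually emits: the table and column names sales_fact_1997, time_by_day, time_id and the_year are borrowed from the schema examples in these slides, while the measure column unit_sales, the aliases f and d, and the class and method names are assumptions made for illustration.

```java
final class RolapSqlSketch {

    /**
     * Builds one aggregate query: join the fact table to a dimension table,
     * group by a level column, and let the DBMS sum the measure.
     */
    static String aggregateQuery(String factTable, String dimTable, String joinKey,
                                 String levelColumn, String measureColumn) {
        return "SELECT d." + levelColumn + ", SUM(f." + measureColumn + ") AS total"
            + " FROM " + factTable + " f"
            + " JOIN " + dimTable + " d ON f." + joinKey + " = d." + joinKey
            + " GROUP BY d." + levelColumn;
    }

    public static void main(String[] args) {
        // unit_sales is an assumed measure column, not named in the slides.
        System.out.println(aggregateQuery(
            "sales_fact_1997", "time_by_day", "time_id", "the_year", "unit_sales"));
    }
}
```

The point of the sketch is that the aggregation (SUM with GROUP BY) happens inside the DBMS; the application tier only builds the statement and reads back one row per level member.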
Functionality – Features
• Mondrian:
– Manages the data warehouse’s metadata
– Caches computed results for future use
– Uses pre-computed aggregates
• JPivot/WCF:
– Provides advanced OLAP operations on warehouse data
– Visualizes warehouse data using charts
History behind Mondrian+JPivot
• Mondrian was started as an open-source project by Julian Hyde, who also works on the Eigenbase Project (www.eigenbase.org), an open-source platform for building data management systems
• JPivot was started by developers working for Tonbeller® AG Business Intelligence and Financial Solutions (www.tonbeller.com)
Installation and configuration
DBMS: PostgreSQL - Installation
• Download from:
http://www.postgresql.org
• Installed version: 8.1.2-1
• Installation type:
– Local standalone server (run as a service)
– Allow only local connections
– JDBC driver for communication with Java applications
• Operating System:
Microsoft Windows XP Professional SP2
DBMS: PostgreSQL - Configuration
• Create a dedicated user account
– Create an unprivileged user “foodmarti”
• Create an example database
– Add a database “Foodmart” owned by foodmarti
• Load example data into the database
– Use the provided MondrianFoodMartLoader to load the data warehouse into the example database Foodmart
DBMS: PostgreSQL - Configuration
• The easiest way to use MondrianFoodMartLoader:
– Download & unzip the Eclipse IDE (the special WebTools package, which is useful later) from http://www.eclipse.org/webtools/
– Download & unzip Mondrian (2.0.1)
• Unzip the mondrian.war file in mondrian-2.0.1\lib
DBMS: PostgreSQL - Configuration
• Start Eclipse and create a new Java project from existing sources, using the mondrian-2.0.1 folder as the root
DBMS: PostgreSQL - Configuration
• Add the following jars to the build path:
– PostgreSQL JDBC Driver
– Apache log4j
– Eigenbase XOM
– Eigenbase properties
DBMS: PostgreSQL - Configuration
• Finally, run:
mondrian.test.loader.MondrianFoodMartLoader
-verbose -tables -data -indexes
-jdbcDrivers=org.postgresql.Driver
-outputJdbcURL=jdbc:postgresql://localhost/Foodmart
-outputJdbcUser=foodmarti
-outputJdbcPassword=footest
-inputFile=demo/FoodMartCreateData.sql
Tomcat Servlet/JSP container - Installation
• Download from:
http://tomcat.apache.org
• Installed version: 5.5.15
• Installation type:
– standard server (run as a service)
– Integrated with Eclipse WebTools
• Operating System:
Microsoft Windows XP Professional SP2
Tomcat Servlet/JSP container - Configuration
• Create a new Eclipse project of type “Server” and follow the instructions
• Specify the server type (Apache Tomcat 5.5), the host (localhost) and the runtime configuration
Mondrian+JPivot - Installation
• Download from:
http://jpivot.sourceforge.net
• Installed version: 1.5.0
• Installation type:
– Import of deployment package as Eclipse
project
– Use Mondrian included with JPivot package
Mondrian+JPivot - Installation
• Download & unzip jpivot-1.5.0.zip
• In Eclipse, select File->Import->WAR File
• Select jpivot-1.5.0\jpivot.war as input file
Mondrian+JPivot - Configuration
• Edit WebContent\WEB-INF\queries\mondrian.jsp
• Add JDBC connection parameters to the query
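As a rough illustration of which connection parameters are involved, the following Java sketch collects the values used during the PostgreSQL setup and joins them into a Mondrian-style connect string. The driver class, JDBC URL, user and password are the ones passed to MondrianFoodMartLoader earlier; the catalog path and the class and method names are assumptions made for this example, and the sketch does not reproduce the contents of mondrian.jsp itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class FoodmartConnectString {

    /** Joins key=value pairs with ';', keeping insertion order. */
    static String build(Map<String, String> props) {
        StringJoiner joiner = new StringJoiner(";");
        for (Map.Entry<String, String> entry : props.entrySet()) {
            joiner.add(entry.getKey() + "=" + entry.getValue());
        }
        return joiner.toString();
    }

    /** Connection parameters for the Foodmart example database. */
    static Map<String, String> foodmartProperties() {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("Provider", "mondrian");
        props.put("Jdbc", "jdbc:postgresql://localhost/Foodmart");
        props.put("JdbcDrivers", "org.postgresql.Driver");
        props.put("JdbcUser", "foodmarti");
        props.put("JdbcPassword", "footest");
        // Assumption: hypothetical location of the schema file inside the web app.
        props.put("Catalog", "/WEB-INF/queries/FoodMart.xml");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(build(foodmartProperties()));
    }
}
```

Whatever form the parameters take in the JSP, the same four pieces of information are needed: driver class, JDBC URL, credentials, and the location of the schema file.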
Mondrian+JPivot - Configuration
• Run the JPivot web project on the server
and enjoy…
How to design a Cube in Mondrian
Outline
• Cube
• Measure
• Dimension
– Multiple Hierarchies
– Snowflake schema
– Shared dimensions
– Parent-child hierarchies
• Calculated members
• User-defined functions
• Named Set
• Aggregate Table
• Access-control
MDX
SELECT
  {[Measures].[0], [Measures].[1], [Measures].[2]} ON COLUMNS
FROM Sales
Cube
[Figure: Time dimension with the levels year, quarter, week and day_of_week]
<Dimension name="Time" foreignKey="time_id">
  <Hierarchy hasAll="false" primaryKey="time_id">
    <Table name="time_by_day"/>
    <Level name="Year" column="the_year" type="Numeric" uniqueMembers="true"/>
    <Level name="Quarter" column="quarter" type="Numeric" uniqueMembers="false"/>
    <Level name="Month" column="month_of_year" type="Numeric" uniqueMembers="false"/>
  </Hierarchy>
  <Hierarchy name="Time Weekly" hasAll="false" primaryKey="time_id">
    <Table name="time_by_week"/>
    <Level name="Year" column="the_year" type="Numeric" uniqueMembers="true"/>
    <Level name="Week" column="week" uniqueMembers="false"/>
    <Level name="Day" column="day_of_week" type="String" uniqueMembers="false"/>
  </Hierarchy>
</Dimension>
Snowflake schema
• The fact table joins to "product" (via the foreign key "product_id")
• "product" is joined to "product_class" (via the foreign key "product_class_id")
• "product_class" is joined to "product_type" (via the foreign key
Shared dimensions
<Dimension name="Store Type">
  <Hierarchy hasAll="true" primaryKey="store_id">
    <Table name="store"/>
    <Level name="Store Type" column="store_type" uniqueMembers="true"/>
  </Hierarchy>
</Dimension>
<Cube name="Sales">
  <Table name="sales_fact_1997"/>
  ...
  <DimensionUsage name="Store Type" source="Store Type" foreignKey="store_id"/>
</Cube>
<Cube name="Warehouse">
  <Table name="warehouse"/>
  ...
  <DimensionUsage name="Store Type" source="Store Type" foreignKey="warehouse_store_id"/>
</Cube>
[Figure: the Sales and Warehouse cubes both use the shared Store Type dimension]
Parent-child hierarchies (1)
Table employee:
supervisor_id   employee_id   full_name
0               1             Frank
1               2             Bill
2               3             Eric
1               4             Jane
3               5             Mark
2               6             Carla
…
[Figure: Employee hierarchy tree (All, Frank, Bill, Jane, Eric, …)]
Parent-child hierarchies (2)
<Dimension name="Employees" foreignKey="employee_id">
  <Hierarchy hasAll="true" allMemberName="All Employees" primaryKey="employee_id">
    <Table name="employee"/>
    <Level name="Employee Id" uniqueMembers="true" type="Numeric"
        column="employee_id" nameColumn="full_name"
        parentColumn="supervisor_id" nullParentValue="0">
      <Property name="Marital Status" column="marital_status"/>
      <Property name="Position Title" column="position_title"/>
      <Property name="Gender" column="gender"/>
      <Property name="Salary" column="salary"/>
      <Property name="Education Level" column="education_level"/>
      <Property name="Management Role" column="management_role"/>
    </Level>
  </Hierarchy>
</Dimension>
• The parentColumn attribute names the column that links a member to its parent member
• The nullParentValue attribute gives the value that indicates that a member has no parent
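How these two attributes turn a flat table into a tree can be sketched in plain Java, using the six employee rows shown on the previous slide. This is only an illustration of the idea, not Mondrian's implementation; the class and method names are hypothetical, and the sketch assumes the data contains no cycles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

final class ParentChildHierarchy {

    /** Plays the role of nullParentValue="0": a member with this parent id is a root. */
    static final int NULL_PARENT_VALUE = 0;

    /** One row of the employee table: supervisor_id, employee_id, full_name. */
    static final class Row {
        final int supervisorId;
        final int employeeId;
        final String fullName;

        Row(int supervisorId, int employeeId, String fullName) {
            this.supervisorId = supervisorId;
            this.employeeId = employeeId;
            this.fullName = fullName;
        }
    }

    /** The example rows from the slide. */
    static List<Row> sampleRows() {
        return Arrays.asList(
            new Row(0, 1, "Frank"), new Row(1, 2, "Bill"), new Row(2, 3, "Eric"),
            new Row(1, 4, "Jane"), new Row(3, 5, "Mark"), new Row(2, 6, "Carla"));
    }

    /** Names of the direct children of parentId, in table order. */
    static List<String> childrenOf(List<Row> rows, int parentId) {
        List<String> names = new ArrayList<>();
        for (Row row : rows) {
            if (row.supervisorId == parentId) {
                names.add(row.fullName);
            }
        }
        return names;
    }

    /** Follows the parent column upwards and returns the names from the root down to the member. */
    static List<String> pathFromRoot(List<Row> rows, int employeeId) {
        Map<Integer, Row> byId = new HashMap<>();
        for (Row row : rows) {
            byId.put(row.employeeId, row);
        }
        LinkedList<String> path = new LinkedList<>();
        Row current = byId.get(employeeId);
        while (current != null) {
            path.addFirst(current.fullName);
            current = current.supervisorId == NULL_PARENT_VALUE
                ? null
                : byId.get(current.supervisorId);
        }
        return path;
    }

    public static void main(String[] args) {
        List<Row> rows = sampleRows();
        System.out.println(childrenOf(rows, NULL_PARENT_VALUE)); // root members
        System.out.println(pathFromRoot(rows, 5));               // ancestors of Mark, then Mark
    }
}
```

With the sample rows, Frank is the only root because his supervisor_id equals the null-parent value, and the depth of the hierarchy is determined by the data rather than by a fixed list of levels.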
<Schema>
  ...
  <UserDefinedFunction name="PlusOne"
      class="com.acme.PlusOneUdf"/>
</Schema>
SELECT
{[Measures].[Warehouse Sales]} ON COLUMNS,
{[Top Sellers]} ON ROWS
FROM [Warehouse]
Aggregates and Caching
Aggregate Tables
• An aggregate table contains pre-aggregated measures built from the fact table
Aggregate Tables : Use Case
STAR SCHEMA
Aggregate Tables: Schema
• <AggName name="…"> gives the name of the aggregate table, which is associated with the levels specified in <AggLevel name="…">
• <AggLevel name="xxxx" column="xxx"/>
– column indicates which column is associated with the level given in the name attribute
• <AggFactCount column="…"/> is mandatory
• <AggMeasure name="xxx" column="xxx"/>
– column indicates which column is associated with the measure given in the name attribute
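What the three kinds of columns hold can be shown with a small Java sketch that builds an in-memory "aggregate table" from fact rows: one level column (the grouping key), one measure column (the pre-computed sum), and a fact count. The column names region_code and value_read come from the pollution example in these slides; the extra day field, the sample values, and all class and method names are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class AggregateTableSketch {

    /** One fact row at the finest grain (day is a hypothetical extra dimension key). */
    static final class FactRow {
        final String regionCode;
        final String day;
        final double valueRead;

        FactRow(String regionCode, String day, double valueRead) {
            this.regionCode = regionCode;
            this.day = day;
            this.valueRead = valueRead;
        }
    }

    /** One aggregate row: the summed measure plus the number of fact rows behind it. */
    static final class AggRow {
        double valueRead; // corresponds to an AggMeasure column
        int factCount;    // corresponds to the AggFactCount column
    }

    /** Collapses the day grain and keeps one row per region_code (the AggLevel column). */
    static Map<String, AggRow> aggregateByRegion(List<FactRow> facts) {
        Map<String, AggRow> aggregate = new LinkedHashMap<>();
        for (FactRow fact : facts) {
            AggRow row = aggregate.computeIfAbsent(fact.regionCode, key -> new AggRow());
            row.valueRead += fact.valueRead;
            row.factCount++;
        }
        return aggregate;
    }
}
```

A query at region level can then be answered from the much smaller aggregate instead of scanning every fact row. The fact count records how many fact rows each aggregate row summarizes, which is needed, for example, to derive an average from stored sums.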
Aggregate Tables: Rules
• In the example, the aggregate table has the default name agg_l_pollution and the same column names as the fact table: value_read, region_code…
• This allows Mondrian to recognize the table as an aggregate table by default
• Rules can be set in an XML file referenced by a property
– <TableMatch id="ta" posttemplate="_agg_.+" />
– _agg_l_pollution
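Name-based recognition of this kind boils down to a regular-expression match. The Java sketch below assumes that the default convention is the prefix agg_, an arbitrary infix, and then the fact table name, and that the fact table in the example is called pollution; neither is stated explicitly on the slides, and the class and method names are hypothetical. It does not reproduce Mondrian's own matching code.

```java
import java.util.regex.Pattern;

final class AggTableNameMatcher {

    /** Assumed default convention: agg_<anything>_<fact table name>. */
    static Pattern defaultPattern(String factTableName) {
        return Pattern.compile("agg_.+_" + Pattern.quote(factTableName));
    }

    /** True if tableName follows the assumed naming convention for factTableName. */
    static boolean looksLikeAggregateOf(String tableName, String factTableName) {
        return defaultPattern(factTableName).matcher(tableName).matches();
    }

    public static void main(String[] args) {
        // "pollution" as the fact table name is an assumption.
        System.out.println(looksLikeAggregateOf("agg_l_pollution", "pollution"));
        System.out.println(looksLikeAggregateOf("pollution", "pollution"));
    }
}
```

A custom rule file changes the template that is matched, not the principle: the table name is tested against a pattern derived from the fact table name.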
Aggregate Tables: properties
[Table of aggregate table properties; columns: Property, Type, Default Value, Description]
Access-control
• Mondrian also provides rules to control access to cubes…
XMLA Query in JPivot
<jp:xmlaQuery
    id="query01"
    uri="http://localhost:8080/jpivot/xmla"
    catalog="mortalityEU">
select {[Measures].[Ndeaths]} on columns,
  {([Countries], [diseases])} on rows
from mortalityEU
where ([temps].[2000])
</jp:xmlaQuery>
BIOLAP
• BIOLAP is an extended version of Mondrian that supports biological data
[Figure: Mondrian aggregators (sum…, SeqMin) and the cube XML]
Pentaho : Overview
• Open Source BI application suite made
from free component applications
• Reporting: Eclipse BIRT (Business
Intelligence and Reporting Tools)
• Analysis: Mondrian, JPivot
• Data Mining: Weka (University of Waikato
Machine Learning Project)
• Workflow: Enhydra Shark, Enhydra JaWE
Pentaho : Architecture
Pentaho: Analysis
• Another skin for JPivot?!
Pentaho: Analysis
• But there's also this (using Apache Batik)...
Pentaho: Analysis
• ...and this!
Pentaho, the future of Mondrian