Business Intelligence
Week 3-4
BI Architecture
Data Warehouse (DW) Topics

• Definition of a DW
• Characteristics of a DW
• Data marts
• ODS, EDW, metadata
• DW framework
• DW architecture
• ETL process
• DW development
• DW issues
What is a Data Warehouse?

• A physical repository in which relational data is managed to provide quality data in a standardized format at enterprise scale.

• “A data warehouse is a collection of integrated, subject-oriented databases designed to support DSS functions, where each unit of data is non-volatile and relevant to some moment in time.”
DW Characteristics

• Subject oriented
• Integrated
• Time-variant
• Nonvolatile
• Summarized
• Not normalized
• Metadata
• Relational / multidimensional
• Client / server
Subject Oriented
1. A data warehouse is organized around subjects, for example: Sales, Products, and Customers.

2. It focuses on modeling and analyzing data for decision makers.

3. It excludes data that is not useful in the decision support process.
Integrated

1. A data warehouse is constructed by integrating a number of different data sources.

2. Data preprocessing is applied to ensure data consistency.
Integrated (cont’d)

Typical inconsistencies across operational applications:
• Savings: same data, different name
• Loans: different data, same name
• Trust: data found here, nowhere else
• Credit card: different keys, same data
Integrated (cont’d)

Integration issues:
• Encoding structures
• Measurement of attributes
• Multiple sources
• Data type formats
Integrated (cont’d)
Data Warehouse

appl A - m,f
appl B - 1,0
appl C - x,y
appl D - male, female

appl A - pipeline - cm
appl B - pipeline - in
appl C - pipeline - feet
appl D - pipeline - yds

appl A - balance
appl B - bal
appl C - currbal
appl D - balcurr
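
The mappings above can be sketched in code. This is a minimal Python illustration, not a prescribed method: the slide lists only the raw code pairs, so which code means "male" in appl B and appl C is an assumption, as are the choice of centimeters as the warehouse unit and `balance` as the warehouse field name. All function and dictionary names are hypothetical.

```python
# Hypothetical per-application rules; the slide lists only the raw encodings,
# so which code means "male" in appl B and appl C is an assumption.
GENDER_CODES = {
    "A": {"m": "male", "f": "female"},
    "B": {"1": "male", "0": "female"},
    "C": {"x": "male", "y": "female"},
    "D": {"male": "male", "female": "female"},
}

# Conversion factors to centimeters (assumed warehouse unit for pipeline length).
TO_CM = {"A": 1.0, "B": 2.54, "C": 30.48, "D": 91.44}  # cm, in, feet, yds

# Each application's name for the same balance attribute.
BALANCE_FIELD = {"A": "balance", "B": "bal", "C": "currbal", "D": "balcurr"}


def standardize(app: str, record: dict) -> dict:
    """Convert one source record into the warehouse's single format."""
    return {
        "gender": GENDER_CODES[app][str(record["gender"]).lower()],
        "pipeline_cm": round(record["pipeline"] * TO_CM[app], 2),
        "balance": record[BALANCE_FIELD[app]],
    }


row_b = standardize("B", {"gender": 1, "pipeline": 10, "bal": 250.0})
row_d = standardize("D", {"gender": "Female", "pipeline": 2, "balcurr": 90.5})
```

Both records end up with the same field names, the same gender vocabulary, and lengths in the same unit, regardless of which application produced them.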
Integrated (cont’d)

Data needs to be standardized:

              Sales                   Inventory               Sales Transactions
Format        Key: Text               Key: Integer            Key: Yes/No
Description   Customer name:          Customer name:          Customer name:
              U.N.I.J.O.Y.O           UNIPAHIT                Universitas majapahit
Unit          Height: centimeter      Height: meter           Height: inch
Encoding      Sex: Yes = male,        Sex: L = male,          Sex: 1 = male,
              No = female             P = female              0 = female
Time-Variant

1. Provides information from a historical perspective (e.g. the past 5-10 years).

2. Every key structure contains an element of time.
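
One way to picture a key that contains an element of time is a snapshot table keyed by both an entity and a date. The sketch below is only an illustration under that assumption; the customer ID, dates, balances, and the `balance_as_of` helper are all made up.

```python
from datetime import date

# Each warehouse key includes a time element: (customer_id, snapshot_date).
# The customer ID, dates, and balances below are made-up illustration data.
balance_history = {
    ("C001", date(2021, 12, 31)): 1200.0,
    ("C001", date(2022, 12, 31)): 1500.0,
    ("C001", date(2023, 12, 31)): 900.0,
}


def balance_as_of(customer_id: str, as_of: date):
    """Return the most recent snapshot value on or before `as_of`, or None."""
    candidates = [
        (snap_date, value)
        for (cust, snap_date), value in balance_history.items()
        if cust == customer_id and snap_date <= as_of
    ]
    return max(candidates)[1] if candidates else None


past_value = balance_as_of("C001", date(2023, 6, 30))
```

Because older snapshots are kept rather than overwritten, a question about mid-2023 is answered from the end-of-2022 snapshot.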


Time-Variant (cont’d)
Operational:
• Current data (current value)

Data warehouse:
• Analysis of past data
• Current information
• Forecasts for the future

Nonvolatile

1. Once data has been recorded, it cannot be updated.

2. A data warehouse requires two data access operations:

a. Initial loading of data

b. Access of data
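
The two operations above can be illustrated with an append-only table that offers loading and reading but no update or delete. This is a minimal sketch; the `WarehouseTable` class and the sample rows are hypothetical.

```python
class WarehouseTable:
    """Append-only table: rows can be loaded and read, never updated or deleted."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        """Loading operation: append a batch of new rows."""
        self._rows.extend(dict(row) for row in rows)

    def query(self, predicate=lambda row: True):
        """Access operation: return copies so callers cannot alter stored rows."""
        return [dict(row) for row in self._rows if predicate(row)]


sales = WarehouseTable()
sales.load([{"id": 1, "amount": 100}, {"id": 2, "amount": 250}])

result = sales.query(lambda row: row["amount"] > 150)
result[0]["amount"] = 0  # changes only the returned copy
unchanged = sales.query(lambda row: row["id"] == 2)
```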
Nonvolatile (cont’d)
Operational:
• Data in the operational system is added, changed, and deleted in real time as each transaction occurs

Data warehouse:
• Updated only when needed, possibly on a periodic basis

Data in the DW is dedicated to querying and data analysis

Data Mart

A smaller data warehouse (per department) that stores only the data relevant to a particular area

• Dependent data mart
A subset derived directly from the data warehouse

• Independent data mart
A small data warehouse designed specifically for a business unit or department
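
A dependent data mart can be pictured as a filtered subset of the warehouse. The sketch below assumes the warehouse is a simple list of rows tagged with a department; the rows, column names, and `build_dependent_mart` helper are hypothetical.

```python
# Made-up warehouse rows; the department and column names are hypothetical.
warehouse = [
    {"dept": "Marketing", "campaign": "Spring", "spend": 5000},
    {"dept": "Finance", "account": "Payroll", "spend": 42000},
    {"dept": "Marketing", "campaign": "Autumn", "spend": 7500},
]


def build_dependent_mart(rows, dept):
    """Derive a data mart as the subset of warehouse rows for one department."""
    return [row for row in rows if row["dept"] == dept]


marketing_mart = build_dependent_mart(warehouse, "Marketing")
total_marketing_spend = sum(row["spend"] for row in marketing_mart)
```

An independent data mart, by contrast, would be loaded from the source systems directly rather than derived from `warehouse`.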
DW Terms

• Operational data store (ODS)
A type of database typically used as a temporary staging area for the data warehouse (short-term memory)
• Enterprise data warehouse (EDW)
An enterprise-scale data warehouse used to support decision making.
• Metadata
Data about data. In a data warehouse, metadata describes the contents of the warehouse and helps convert data into information/knowledge
DW Framework

[Figure: data warehouse framework]
• Data sources: ERP, legacy, POS, other OLTP/Web, external data
• ETL process: select, extract, transform, integrate, load
• Enterprise data warehouse, with metadata and replication
• Data marts: Marketing, Engineering, Finance, (...); a "no data marts" option is also shown
• Access: API / middleware
• Applications (visualization): routine business reporting, data/text mining, OLAP, dashboard, Web, custom-built applications
DW Architecture

• Three-tier architecture
1. Data acquisition software (back end)
2. The data warehouse, which contains the data and software
3. Client (front-end) software that lets users access and analyze data from the DW
• Two-tier architecture
The first two tiers of the three-tier architecture are combined into one
DW Architecture

Three-tier:
Tier 1: Client workstation | Tier 2: Application server | Tier 3: Database server

Two-tier:
Tier 1: Client workstation | Tier 2: Application & database server
Web-based DW Architecture

[Figure: components shown are the client (Web browser), the Internet/intranet/extranet, the Web server with its Web pages, the application server, and the data warehouse]
DW Architecture

• Points to consider when choosing an architecture:
- Which database management system (DBMS) will be used?
- Will data migration tools be used to load the data warehouse?
- Which tools will be used to support data retrieval and analysis?
Alternative DW Architectures

(a) Independent Data Marts Architecture
Source systems -> ETL / staging area -> Independent data marts (atomic/summarized data) -> End user access and applications

(b) Data Mart Bus Architecture with Linked Dimensional Data Marts
Source systems -> ETL / staging area -> Dimensionalized data marts linked by conformed dimensions (atomic/summarized data) -> End user access and applications

(c) Hub and Spoke Architecture (Corporate Information Factory)
Source systems -> ETL / staging area -> Normalized relational warehouse (atomic data) -> Dependent data marts (summarized/some atomic data) -> End user access and applications

(d) Centralized Data Warehouse Architecture
Source systems -> ETL / staging area -> Normalized relational warehouse (atomic/some summarized data) -> End user access and applications

(e) Federated Architecture
Existing data warehouses, data marts, and legacy systems -> Data mapping / metadata: logical/physical integration of common data elements -> End user access and applications
Data Integration

• Data integration
Integration that comprises three major processes: data access, data federation, and change capture
• Enterprise application integration (EAI)
A technology that provides a vehicle for pushing data from source systems into a data warehouse
• Enterprise information integration (EII)
An evolving tool space that promises real-time data integration from a variety of sources, such as relational databases, Web services, and multidimensional databases
Extraction, Transformation, and Load (ETL)

[Figure: ETL process]
• Sources: packaged application, legacy system, other internal applications
• Steps: Extract -> Transform -> Cleanse -> Load
• Targets: data warehouse, data mart
• Also shown: transient data source
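
The extract, transform, cleanse, and load steps can be sketched as four small functions chained together. This is an illustration only, using an in-memory SQLite database as a stand-in for the warehouse; the CSV content, column names, and cleansing rule are all made up.

```python
import csv
import io
import sqlite3

# Made-up source extract; in practice this would come from a legacy system
# or packaged application.
SOURCE_CSV = """customer,sex,height_in
 Alice ,F,65
Bob,M,70
,M,68
"""


def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: standardize encoding and units (inches to centimeters)."""
    sex_map = {"M": "male", "F": "female"}
    return [
        {
            "customer": row["customer"].strip(),
            "sex": sex_map.get(row["sex"]),
            "height_cm": round(float(row["height_in"]) * 2.54, 1),
        }
        for row in rows
    ]


def cleanse(rows):
    """Cleanse: drop rows that lack a customer name."""
    return [row for row in rows if row["customer"]]


def load(rows, conn):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute("CREATE TABLE customer (customer TEXT, sex TEXT, height_cm REAL)")
    conn.executemany(
        "INSERT INTO customer VALUES (:customer, :sex, :height_cm)", rows
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
load(cleanse(transform(extract(SOURCE_CSV))), conn)
loaded = conn.execute(
    "SELECT customer, sex, height_cm FROM customer ORDER BY customer"
).fetchall()
```

Keeping each step as its own function mirrors the figure: a step can be replaced or tested without touching the others.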
ETL

• Issues affecting the purchase of an ETL tool
- Data transformation tools are expensive
- Data transformation tools may have a long learning curve
• Important criteria in selecting an ETL tool
- Ability to read from and write to an unlimited number of data sources/architectures
- Automatic capturing and delivery of metadata
- A history of conforming to open standards
- An easy-to-use interface for the developer and the functional user
Data Warehouse Development

• Data warehouse development approaches
- Inmon model: EDW approach (top-down)
- Kimball model: data mart approach (bottom-up)
• Which model is best?
- There is no one-size-fits-all strategy for DW
- One alternative is the hosted warehouse
• Data warehouse structure:
- The star schema vs. relational
• Real-time data warehousing?
Data Representation in the DW

• Dimensional modeling: a retrieval-based system that supports high-volume query access
• Star schema: the most commonly used and the simplest style of dimensional modeling
- Contains a fact table surrounded by and connected to several dimension tables
- The fact table contains the numerical measures needed to perform decision analysis and query reporting, plus foreign keys to the dimension tables
- Dimension tables contain classification and aggregation information about the values in the fact table
• Snowflake schema: an extension of the star schema in which dimension tables are normalized into additional related tables, so the diagram resembles a snowflake in shape
• For more detail, see: https://www.guru99.com/star-snowflake-data-warehousing.html
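
A star schema can be sketched with one fact table and two dimension tables. The example below uses Python's built-in `sqlite3` module with an in-memory database; the table names, columns, and rows are made up for illustration and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: classification information about the facts.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, name TEXT);

-- Fact table: numeric measures plus a foreign key to each dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    region_id  INTEGER REFERENCES dim_region(region_id),
    amount     REAL
);

INSERT INTO dim_product VALUES (1, 'Laptop', 'Hardware'), (2, 'Mouse', 'Hardware'),
                               (3, 'Antivirus', 'Software');
INSERT INTO dim_region VALUES (1, 'East'), (2, 'West');
INSERT INTO fact_sales VALUES (1, 1, 1000), (2, 1, 50), (3, 2, 200), (1, 2, 1200);
""")

# Aggregate the fact table's measure by an attribute of one dimension.
sales_by_category = conn.execute("""
    SELECT p.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
```

In a snowflake variant, `category` would move out of `dim_product` into its own table referenced by a key, at the cost of one more join per query.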
Multidimensionality

• Multidimensionality
The ability to organize, present, and analyze data by several dimensions, such as sales by region, by product, by salesperson, and by time (four dimensions)
• Multidimensional presentation
- Dimensions: products, salespeople, market segments, business units, geographical locations, distribution channels, country, or industry
- Measures: money, sales volume, head count, inventory profit, actual versus forecast
- Time: daily, weekly, monthly, quarterly, or yearly
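
Analyzing one measure by several dimensions can be sketched as grouping records on any chosen combination of dimension fields. The records, names, and the `total_by` helper below are hypothetical illustration data, not part of the slides.

```python
from collections import defaultdict

# Made-up sales records with four dimensions and one measure (amount).
sales = [
    {"region": "East", "product": "Laptop", "salesperson": "Ani", "quarter": "Q1", "amount": 1000},
    {"region": "East", "product": "Mouse", "salesperson": "Budi", "quarter": "Q1", "amount": 50},
    {"region": "West", "product": "Laptop", "salesperson": "Ani", "quarter": "Q2", "amount": 1200},
    {"region": "West", "product": "Mouse", "salesperson": "Budi", "quarter": "Q2", "amount": 80},
]


def total_by(records, *dimensions):
    """Sum the `amount` measure grouped by any combination of dimensions."""
    totals = defaultdict(int)
    for rec in records:
        key = tuple(rec[dim] for dim in dimensions)
        totals[key] += rec["amount"]
    return dict(totals)


by_region = total_by(sales, "region")
by_product_quarter = total_by(sales, "product", "quarter")
```

The same records answer "sales by region" and "sales by product and quarter" without being restructured; only the grouping key changes.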
