Transaction:
An Enterprise Data Warehouse is a relational DB which is specially designed for analyzing the business, making decisions to achieve the business goals and responding to business problems, but it is not designed for business transaction processing.
A data warehouse is a concept of consolidating the data from multiple OLTP databases.
Relational databases are classified into three ranges:
1.Low-range DB:
Example: MS Access
2.Mid-range DB:
Recommended for data warehousing for small and medium scale enterprises, with storage capacity in gigabytes.
3.High-range DB:
Can organize and manage terabytes and petabytes of information.
Example: Teradata, Netezza, Greenplum, Hadoop
There are two types of data storage patterns which are supported by relational DBs:
1.Shared-everything architecture (processors share a common memory and disk)
2.Shared-nothing architecture (every processor has dedicated memory and disk that is not shared by another processor)
Characteristics of shared-nothing (high-range) databases:
1.Unlimited scalability
2.Designed only for building enterprise data warehouses, not for OLTP
Example: Teradata, Netezza, Hadoop, Greenplum
Enterprise DWH Database Evaluation:
1.Database that supports enormous storage capacity (billions of rows and terabytes of data)
2.Database that supports mature optimizers to handle complex SQL queries (runs the queries faster with less system resource usage)
3.100% data availability, without data loss, even when software/hardware components are down
4.Database that supports low TCO (Total Cost of Ownership): easy to set up, administer and manage
5.A single DB server that can provide access to hundreds of users concurrently
Data Acquisition:
It is the process of extracting the data from multiple source systems, transforming the data into a consistent format and loading it into a target system. To implement the ETL process we need ETL tools.
ETL applications are developed using a simple graphical user interface with point-and-click features.
Data Cleansing:
It is the process of identifying and correcting or removing inaccurate and inconsistent data before loading it into the target.
Data Merging:
1.Join
2.Union
Data warehouse:
1.A data warehouse is a relational DB that is used to store the historical data for query and analysis.
SOR --> System of Record:
A computer system that stores time-sensitive, transaction-related data that is processed immediately and is always kept current.
A data warehouse contains two types of tables:
1. Dimension Table
2. Fact Table
1. Dimension Table:
Examples: Customer, Product, Stores, Employees, Promotions, Time
2. Fact Table:
Examples: Sales, Purchase, Inventory
Banking examples of fact tables:
1. SA_LoanTransaction Fact
2. CC_Transaction Fact
3. CC_Statement Fact
A fact table consists of keys and measures; together the keys form a composite primary key.
Store Key   Prod Key   Date Key   Revenue
S1          P1         D1         3000
S1          P2         D1         2000
S2          P1         D1         2000
(Store Key, Prod Key and Date Key together form the composite primary key; Revenue is the measure.)
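As a sketch, such a fact table could be declared in Oracle with a composite primary key (the table and column names here are illustrative):
CREATE TABLE SALES_FACT(
STORE_KEY VARCHAR2(10),
PROD_KEY  VARCHAR2(10),
DATE_KEY  VARCHAR2(10),
REVENUE   NUMBER(10,2),
-- the three dimension keys together form the composite primary key
CONSTRAINT PK_SALES_FACT PRIMARY KEY (STORE_KEY, PROD_KEY, DATE_KEY)
);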
3.Factless Fact table: acts as a bridge between the dimension tables.
Example of a factless fact table: Employee Attendance factless fact.
Dimension Tables:
Auditorium: Aud Id, Aud Name, Aud Type, Aud Mgr, Aud Address
Sponsors: Sponsor Id, Sponsor Name, Contribution, Address
Time: Date Key, Month Key, Qtr, Year
Participant: Participant Id, Participant Name, Gender, Address
Events: Event Id, Event Name, Event Type, Event Desc
Fact Table:
Aud Id   Sponsor Id   Participant Id   Event Id
A1       S1           P1               E1
A1       S1           P2               E1
A2       S1           P3               E1
2. Snapshot Fact table:
It consists of semi-additive facts and non-additive facts; it describes the state of things at a particular instant of time.
Types of Facts:
1. Additive Facts
2. Semi-Additive Facts
3. Non-Additive Facts
1.Additive Fact: business measurements in a fact table that can be summed up across all of the dimension keys.
Fact Table
Store Key Prod Key Date Key Revenue
S1 P1 12-Jan-15 600
S1 P2 12-Jan-15 400
S2 P2 12-Jan-15 800
S2 P3 13-Jan-15 500
S3 P1 13-Jan-15 700
S3 P3 14-Jan-15 900
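Because Revenue is additive, it can be rolled up along any dimension key; for example (an illustrative query, assuming the fact table above is named SALES_FACT):
-- total revenue per store, summed across products and dates
SELECT STORE_KEY, SUM(REVENUE) AS TOTAL_REVENUE
FROM SALES_FACT
GROUP BY STORE_KEY;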
Semi-Additive Fact: business measurements in a fact table that can be summed up across only some of the dimension keys.
Acct Id   Transaction Date   Balance   Profit Margin
21653     12-Jan-15          700000    -
21654     12-Jan-15          400000    -
21653     13-Jan-15          900000    -
21654     13-Jan-15          600000    -
Reports:
Balance by Acct Id (summing Balance across dates gives a wrong result; the meaningful value is the latest balance):
Acct Id   SUM(Balance)   Latest Balance
21653     1600000        900000
21654     1000000        600000
Balance by Date (summing Balance across accounts is valid):
Date Key    Balance
12-Jan-15   1100000
13-Jan-15   1500000
The above example shows a semi-additive fact: Balance is additive across accounts but not across time.
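In SQL terms (an illustrative sketch, assuming the table above is named ACCT_BALANCE with columns ACCT_ID, TRAN_DATE and BALANCE), only the first of these two roll-ups is meaningful:
-- valid: balance is additive across accounts for a given date
SELECT TRAN_DATE, SUM(BALANCE) FROM ACCT_BALANCE GROUP BY TRAN_DATE;
-- not meaningful: summing an account's balance across dates
SELECT ACCT_ID, SUM(BALANCE) FROM ACCT_BALANCE GROUP BY ACCT_ID;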
3.Non-Additive Fact: business measurements in a fact table that cannot be summed up across any dimension keys.
Note: In a fact table, percentages are always non-additive.
SEM1    80%
SEM2    60%
TOTAL   140%  (wrong)
Types of Dimensions:
1. Conformed Dimension
2. Degenerate Dimension
3. Shrunken Dimension
4. Junk Dimension
5. Dirty Dimension
Conformed Dimension: a dimension that is shared across multiple fact tables is called a conformed dimension; it is also the dimension used to join data marts.
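For example, one conformed Time dimension can serve two different fact tables or data marts (a hypothetical sketch; the table names are illustrative):
-- the same TIME_DIM joins both the sales and the inventory facts
SELECT T.YEAR, SUM(S.REVENUE)
FROM TIME_DIM T JOIN SALES_FACT S ON S.DATE_KEY = T.DATE_KEY
GROUP BY T.YEAR;
SELECT T.YEAR, SUM(I.QTY)
FROM TIME_DIM T JOIN INVENTORY_FACT I ON I.DATE_KEY = T.DATE_KEY
GROUP BY T.YEAR;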
Banking Domain example: a Customer or Time dimension shared by the SA_LoanTransaction, CC_Transaction and CC_Statement fact tables shown earlier is a conformed dimension.
Degenerate Dimension:
If a fact table acts as a dimension and is shared with another fact table, or maintains a foreign key in another fact table, such a table is called a degenerate dimension.
Shrunken Dimension:
A shrunken dimension is a subset of a conformed dimension (fewer rows and/or columns).
Cardinality:
Cardinality is the number of unique values in a column; alternatively, cardinality expresses the minimum and maximum number of instances of an entity 'B' that can be associated with an instance of entity 'A'.
Dirty Dimension:
If a record occurs more than once in a table, differing only in non-key attributes, such a table is called a dirty dimension.
Orders:
Order Id   Order Date   Id   Order Amount
111        -            1    -
112        -            1    -
113        -            2    -
114        -            1    -
115        -            1    -
116        -            3    -
117        -            1    -
Slowly Changing Dimensions (SCD):
1.SCD Type-I
2.SCD Type-II
3.SCD Type-III
1. SCD Type-I: maintains only the current data; changed attributes are overwritten, so no history is kept.
SCD Type-II: maintains the current data along with the complete history of changes.
SCD Type-II example:
PRODUCTS
PID   PNAME   PRICE   EFF_DATE
11    ABC     300     12-JAN-10
12    PQR     270     15-JAN-10
(The price of product 12 changed to 199 on 27-AUG-11.)
SCD Type-III example (current and previous location columns):
Target after the initial load:
CKEY   CID   CNAME   CURR_LOC   PREV_LOC
101    11    BEN     HYD        -
102    12    TOM     CHE        -
Source with changes (TOM moves from CHE to BNG):
CID   CNAME   LOC
11    BEN     HYD
12    TOM     BNG
Target after the first incremental load:
CKEY   CID   CNAME   CURR_LOC   PREV_LOC
101    11    BEN     HYD        -
102    12    TOM     BNG        CHE
Target after the next change (BEN moves from HYD to KER):
CKEY   CID   CNAME   CURR_LOC   PREV_LOC
101    11    BEN     KER        HYD
102    12    TOM     BNG        CHE
Role Play Dimension: a dimension that is recycled for multiple applications within the same database (e.g., a Date dimension reused as Order Date and Ship Date).
Data Modeling:
Model: a business presentation of the structure of the data in one or more databases.
OLTP: the ER model is used; the model is normalized.
Types of Schema:
1.Star Schema
2.Snowflake Schema
3.Galaxy Schema
1.Star Schema: in a star schema, the centre of the star is a fact table and the corners are dimension tables.
Customer
Cid Cname Gender Geoid
11 C1 1 111
12 C2 1 111
13 C3 0 112
14 C4 1 111
Geography
Geoid City State Country Region
111 Hyd Ts India Asia
112 VSP Ap India Asia
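A query against this schema resolves the customer's geography through the GEOID key (an illustrative join):
SELECT C.CNAME, G.CITY, G.STATE, G.COUNTRY
FROM CUSTOMER C
JOIN GEOGRAPHY G ON C.GEOID = G.GEOID;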
Types of Indexes:
1.B*Tree Index (suited to high-cardinality columns)
2.Bitmap Index (suited to low-cardinality columns such as Gender)
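As a sketch, in Oracle a bitmap index fits a low-cardinality column such as GENDER in the Customer table above, while a B*Tree index fits a high-cardinality key (the index names are illustrative):
CREATE BITMAP INDEX IDX_CUST_GENDER ON CUSTOMER(GENDER);
CREATE INDEX IDX_CUST_GEOID ON CUSTOMER(GEOID);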
STEP1:
=============================================================================
STATE_ID,STATE_NAME,COUNTRY_ID
250.00,Rio Negro,111
251.00,Buenos Aires,111
252.00,Victoria,115
253.00,South Australia,115
254.00,Queensland,115
255.00,Northern Territory,115
258.00,Sao Paulo,110
259.00,Santa Catarina,110
260.00,Rio de Janeiro,110
=============================================================================
C:\SOURCE\States.txt
STEP2:
Create table in the oracle data base using the script given below
CREATE TABLE STATES(STATE_ID NUMBER(5,2),STATE_NAME VARCHAR2(25),COUNTRY_ID
NUMBER(3));
STEP3:
Create a control file (saved, for example, as C:\SOURCE\States.ctl) with the script below:
LOAD DATA
INFILE 'C:\SOURCE\States.txt'
INTO TABLE STATES
FIELDS TERMINATED BY ','
(STATE_ID, STATE_NAME, COUNTRY_ID)
STEP4:
Run SQL*Loader from the Oracle BIN directory, then exit:
C:\oracle\product\10.2.0\db_1\BIN>exit
STEP5:
Verify that the data was loaded into the STATES table, as sketched below.
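A typical invocation for these two steps might look as follows (assuming the control file above is saved as C:\SOURCE\States.ctl and the demo scott/tiger account is used):
C:\oracle\product\10.2.0\db_1\BIN>sqlldr userid=scott/tiger control=C:\SOURCE\States.ctl
SQL> SELECT COUNT(*) FROM STATES;   -- should report the 9 loaded rows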
Informatica PowerCenter is a data integration tool from Informatica Corporation, which was founded in 1993 in Redwood City, California.
Informatica PowerCenter is a GUI-based data integration platform which can access the data from various types of source systems and transform the data into a universal format.
Informatica Products:
1.Power Center Designer: a GUI-based client component which allows you to design ETL applications, known as mappings. It is used to create the following objects:
1.Source Definition
2.Target Definition
3.Mapping
2.Power Center Workflow Manager: a GUI-based client component which allows you to create the following objects:
1.Session
2.Work flow
3.Schedulers
Session:
A session is a set of instructions that tells the Integration Service how and when to move the data from a source to a target.
Work flow:
A workflow is a set of instructions that tells the server how to execute the session tasks.
1.Sequential batch process: the workflow executes the session tasks one after another; this is recommended when there is a dependency between data loads.
2.Concurrent batch process: the workflow executes the session tasks all at once; this is recommended when there is no dependency between data loads.
3.Power Center Workflow Monitor:
a) It is a GUI-based client component which allows you to monitor the execution of sessions and workflows on the ETL server, reporting:
Number of records extracted
Number of records rejected
Number of records loaded
Throughput: the efficiency, or the rate at which records are extracted and loaded per second.
Step3: Design Mapping (ETL application with or without business rules)
4.Power Center Repository Manager: a GUI-based administrative client which is used to perform the following tasks.
Repository:
A repository is the brain of the ETL system; it stores the ETL objects, i.e. metadata.
Repository Service:
A Power Center client component connects to the repository DB using the Repository Service.
Integration Service:
It provides the run-time environment where ETL objects are executed; the Integration Service creates logs and saves them in the repository database through the Repository Service.
The Integration Service uses three components to run a session:
1.Reader
2.DTM
3.Writer
Reader: connects to the source and extracts the data from tables, files, etc.
Data Transformation Manager (DTM): processes the data according to the business rules that you configured in the mapping.
Writer: connects to the target system and loads the data into the tables or files.
Note: The log is created by the Integration Service and saved in the repository; that log can be accessed from the Workflow Monitor.
1.Informatica PowerCenter has the ability to scale services and share resources across multiple machines.
2.The Power Center domain is the primary unit for managing and administrating application services (PCRS, PCIS).
3.A Power Center domain is a collection of one or more nodes.
4.The node which hosts the domain is known as the primary node or master gateway node.
5.If the master gateway node fails, users' requests can't be processed.
6.Hence it is recommended to configure more than one node as a master gateway node.
7.If a worker node fails, the requests can be distributed to the other nodes [High Availability].
Administration Console:
1.It is an administrative web client which is used to manage and administrate the Power Center domain:
a) creation of users and groups
Create User:
SQL> SHOW USER
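A minimal user-creation sketch in SQL*Plus (run as a DBA; the user name BATCH7AM is illustrative, chosen to match the connection names used later):
SQL> CREATE USER BATCH7AM IDENTIFIED BY BATCH7AM;
SQL> GRANT CONNECT, RESOURCE TO BATCH7AM;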
A Target Definition is created using the Target Designer tool in the Designer client component.
3.Creation of Mapping: a mapping defines the ETL flow from
a) Source (E)
b) Business Rule (T)
c) Target (L)
4.Creation of Session:
It is created using the Task Developer tool in the Workflow Manager client component, where you configure:
a) Source Connection
b) Target Connection
c) Load Type
Creation of Reader Connection (Oracle):
From the client Power Center Workflow Manager, select the Connections menu, click on Relational, select the type Oracle, click on New and enter the following details.
Creation of Writer Connection (Oracle):
From the client Power Center Workflow Manager, select the Connections menu, click on Relational, select the type Oracle, click on New and enter the following details.
1.Double-click the session and select the Mapping tab.
2.From the left window select the source; from the Connections section click on the down arrow to open the relational connection browser and select the connection ORACLE_SCOTT_DB.
3.From the left window select the target; from the Connections section click on the down arrow to open the relational connection browser, select the connection ORACLE_BATCH7AM_DB and click OK.
4.From the Properties section select Target Load Type = Normal, click Apply and click OK.
From the client Power Center Workflow Manager, select the Tools menu and click on Workflow Designer.
2.From the Workflows menu select Create and enter the workflow name w_s_flatmapping_oracle.
3.From the left window drag the session and drop it in the Workflow Designer.
1.Open the client Power Center Designer; from the Tools menu select Target Designer.
3.Drop the source definition [EMP] into the Target Designer workspace.
5.Select the Columns tab; from the toolbar click on Cut to delete columns.
6.From the toolbar click on Add a new column; click Apply and click OK.
Select Create Table, click on Generate & Execute and click OK; the generated SQL is stored in a file named MKTABLES.SQL.
Transformations & Types of Transformations:
A transformation is a Power Center object which allows you to develop the business rules to process the data into the desired business format.
1.Active Transformation
2.Passive Transformation
1.Active Transformation:
A transformation that can affect (change) the number of rows is known as an active transformation.
The following active transformations are used to process the data:
1.Source Qualifier Transformation
2.Filter Transformation
3.Rank Transformation
4.Sorter Transformation
5.Transaction Control Transformation
6.Update Strategy Transformation
7.Normalizer Transformation
8.Aggregator Transformation
9.Joiner Transformation
10.Union Transformation
11.Router Transformation
12.SQL Transformation
13.Java Transformation
A transformation that doesn't affect (change) the number of rows is known as a passive transformation.
The following passive transformations are used to process the data:
1.Expression Transformation
2.Lookup Transformation
3.Stored Procedure Transformation
4.Sequence Generator Transformation
Ports:
1.Input Port (I)
2.Output Port (O)
Input Port (I): a port which can receive the data is known as an input port.
Output Port (O): a port which can provide the data is known as an output port.
Connected Transformation:
A transformation which is part of the mapping in the data-flow direction is known as a connected transformation.
2.It is connected to the source side and to the target side of the pipeline.
3.A connected transformation can receive multiple input ports and can return multiple output ports.
Note: All active and passive transformations can be configured as connected transformations.
Unconnected Transformation:
A transformation which is not part of the data-flow direction, neither connected to the source nor connected to the target, is known as an unconnected transformation.
2.It can receive multiple input ports but it always returns a single output port.
Example: the Lookup Transformation (and the Stored Procedure Transformation) can be configured as unconnected.
Filter Transformation:
1.It is an active transformation that can filter the records based on a given condition.
4.TRUE indicates that the records are allowed for further processing or for loading into the target.
5.FALSE indicates that the records are rejected by the filter transformation.
Performance Tip:
1.Keep the filter transformation as close to the Source Qualifier transformation as possible to filter the rows early in the data flow; as a result we reduce the number of rows for further processing.
Expression Transformation:
1.It is a passive transformation which allows you to calculate expressions for each row.
Ports: Input, Output, Variable
6.Variable ports are recommended to simplify complex expressions and to reuse expressions.
Scenario1:
Calculate the tax for each employee who belongs to the sales department: if SAL is greater than 5000 then calculate the tax as SAL*0.17, else calculate the tax as SAL*0.13.
Logic:
Expression transformation
SAL [I]
TAX[O] (IIF(SAL>5000,SAL*0.17,SAL*0.13))
LOAD_DATE[O] (SYSDATE)
Scenario2:
Calculate the total salary for each employee based on SAL and COMM:
Total sal = Sal + Comm
Logic:
Expression transformation
TOTSAL[O] (IIF(ISNULL(COMM),SAL,SAL+COMM))
Scenario3:
Implement the LIKE operator using a filter transformation: in the JOB column of the EMP table, 'SALESMAN' is represented in 3 different formats (a condition sketch follows the list):
SALESMAN
SALES-MAN
PRE-SALES
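One possible filter condition (a sketch using the INSTR function, which returns the position of a substring and therefore matches all three formats):
Filter condition: INSTR(JOB, 'SALES') > 0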
Variable Port:
A port which can store data temporarily is known as a variable port (V).
2.Variable ports are created to simplify complex expressions and to reuse expressions across several output ports.
6.The default value for a variable port with data type string is a space.
7.Variable ports are not visible in the normal view of a transformation, only in the edit view.
Router Transformation:
A Router transformation is an active transformation which allows you to create multiple conditions and pass the data to multiple targets. It has two types of groups:
1.Input Group
2.Output Groups
Input Group:
Only the input group can receive the data from the source pipeline.
Output Groups:
1.User-defined groups (one per condition)
2.Default group: receives the rows that satisfy none of the user-defined group conditions.
The Router transformation has a performance advantage over multiple Filter transformations: a row is read once into the input group but evaluated multiple times based on the number of groups, whereas using multiple Filter transformations requires the same data to be duplicated for each Filter transformation. An illustrative group setup follows.
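As an illustration, a single Router reading EMP could split departments with groups like these (the group names and conditions are illustrative):
DEPT10 group condition: DEPTNO = 10
DEPT20 group condition: DEPTNO = 20
DEFAULT group: receives every row that satisfies neither condition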
Source Qualifier Transformation:
1.An active transformation that can read the data from relational sources and flat files. Its main properties are:
a) SQL Query (SELECT)
b) User Defined Join
c) Source Filter
d) Number of Sorted Ports
e) Tracing Level
f) Select Distinct (DISTINCT)
User Defined Join:
1.Joins separate sources using a where clause; it can join any number of source tables.
3.Supports standard SQL joins like inner joins (equi joins), as sketched below.
a.Inner Join
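For example, a user-defined join of the standard SCOTT tables EMP and DEPT generates SQL of this form (an illustrative sketch):
SELECT EMP.EMPNO, EMP.ENAME, DEPT.DNAME
FROM EMP, DEPT
WHERE EMP.DEPTNO = DEPT.DEPTNO;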
Source Qualifier - Advantages:
1.May reduce the volume of data on the network (when we write a where clause in the SQL statement).
Disadvantage:
1.Can affect performance on the source database (when the source database has little cache memory).
Joiner vs. Source Qualifier:
Ports in Aggregator:
Input Port, Output Port, Variable Port
1.By default the Integration Service returns the last record from each group; if no group-by port is specified, the Integration Service considers the entire data set as a single group and returns the last record.
Single-level aggregate function: MIN(SAL)
Nested aggregate function: MIN(AVG(SAL))
3.The Aggregator doesn't support mixing single-level and nested aggregate functions within a single Aggregator transformation.
Aggregate Expressions:
MIN, MAX, AVG, FIRST, LAST, SUM, COUNT
An illustrative configuration follows.
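For instance, an Aggregator with DEPTNO as the group-by port and an output port MIN_SAL behaves like the following (the port names are illustrative):
Group-by port: DEPTNO
MIN_SAL[O] (MIN(SAL))
Equivalent SQL: SELECT DEPTNO, MIN(SAL) FROM EMP GROUP BY DEPTNO;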
Sorted Aggregator:
An Aggregator transformation with the Sorted Input option enabled is called a sorted aggregator.
2.If multiple ports are selected for group by, perform the sorting on all of those ports, in the same order.
3.If Sorted Input is enabled but unsorted data is provided, the Integration Service fails the session.
Performance Tuning:
Sorted input improves Aggregator performance, since the Integration Service does not need to cache the entire data set before aggregating each group.
Union Transformation:
This is an active transformation which combines similar data sources into a single result set (data set).
4.The Union transformation supports heterogeneous data sources, i.e. different databases.
Sorter Transformation:
6.Use the key port to specify the column on which sorting has to be performed.
8.If more than one port is selected as a key port, the Integration Service performs the sorting on all those columns in sequential order from top to bottom, in the order the ports appear in the Sorter transformation.
9.Within a single Sorter transformation both ascending and descending sort orders can be configured.
10.If a Sorter transformation is configured without a key port and without the Distinct option, the Integration Service marks the mapping as invalid.
11.If the Distinct option is selected, the Sorter transformation eliminates duplicate records; hence the Sorter is called an active transformation.
Sorter Cache:
2.When a session starts, the Integration Service caches the data in the Sorter transformation before performing the sort operation.
3.The Integration Service performs the sort operation inside cache memory, based on the key column and sort order, and returns the sorted records.
4.If cache memory is smaller than the memory space required to perform the sorting, the Integration Service writes the data to disk.
5.The process of writing data to disk and swapping data between cache memory and disk is called paging.
7.To improve the performance of the sort operation, increase the cache memory.
Connected Stored Procedure:
1.This is a passive transformation that can import a stored procedure from the database.
2.It can receive multiple input ports and can return multiple output ports.
Unconnected Stored Procedure:
2.It can receive multiple input ports but returns a single output port.
Uses:
PL SQL Program:
CREATE OR REPLACE PROCEDURE STG_CALC_PROCS(
V_EMPNO IN NUMBER,
TOTSAL OUT NUMBER,
TAX OUT NUMBER,
HRA OUT NUMBER)
IS
BEGIN
SELECT SAL+NVL(COMM,0), SAL*0.1, SAL*0.4
INTO TOTSAL, TAX, HRA
FROM EMP
WHERE EMPNO = V_EMPNO;
END;
/
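The procedure can be smoke-tested from SQL*Plus before importing it into the Designer (EMPNO 7788 is one of the standard SCOTT.EMP rows; the bind-variable names are illustrative):
SQL> VARIABLE TOTSAL NUMBER
SQL> VARIABLE TAX NUMBER
SQL> VARIABLE HRA NUMBER
SQL> EXEC STG_CALC_PROCS(7788, :TOTSAL, :TAX, :HRA);
SQL> PRINT TOTSAL TAX HRA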
Procedure for Implementing the Stored Procedure:
Source: table [EMP]
Target: table [EMPNO, ENAME, SAL, COMM, TOTSAL, TAX, HRA]
Mapping: M_STG_EMP_ST_PROC
1.Drag and drop the source definition and target definition into the Mapping Designer workspace.
2.From the Transformations menu select Create, select the transformation type Stored Procedure, enter the name and click on Create; connect to the DB with the following details:
Username:
Owner Name:
Password:
Then click on Connect, select the procedure named STG_CALC_PROCS from the scott user and click on OK.
Connect EMPNO -> V_EMPNO.
3.In Informatica PowerCenter the transactions can be controlled in two different ways:
a) Mapping level
b) Session level
To configure a user-defined commit at the session level, set the commit type to User Defined.
TCL Variables:
To configure a user-defined commit type, the Transaction Control transformation provides the following variables:
1.TC_CONTINUE_TRANSACTION
2.TC_COMMIT_BEFORE
3.TC_COMMIT_AFTER
4.TC_ROLLBACK_BEFORE
5.TC_ROLLBACK_AFTER
Transaction Control - Mapping:
If we want to control the transaction based on a given condition, then create a Transaction Control transformation in the mapping.
Ex: IIF(SAL>8000, TC_COMMIT_AFTER, TC_ROLLBACK_AFTER)
Transaction Control - Session:
3.A commit interval is the number of rows that you want to use as a basis for commits. The commit type can be:
a) Source
b) Target
c) User Defined
Rank Transformation ports:
1.I/P port
2.O/P port
3.V port (Variable)
4.R port (Rank)
Rank Port:
The port on which the ranks are calculated; only one port can be designated as the rank port.
Variable Port:
A port which allows you to develop an expression to store data temporarily for the rank calculation is known as a variable port; variable ports support writing the expressions that are required for the rank calculation.
Properties:
1.Top (or) Bottom = Top
2.Number of Ranks = 3
Example source columns (EMP): EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO
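With Top/Bottom = Top and Number of Ranks = 3 on the SAL port, the Rank transformation returns the three highest-paid employees, roughly equivalent to this Oracle query (illustrative):
SELECT * FROM (SELECT * FROM EMP ORDER BY SAL DESC) WHERE ROWNUM <= 3;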
2.A mapplet is created using the Mapplet Designer tool in the Designer client component.
a) Active Mapplet
b) Passive Mapplet
1.Active Mapplet:
A mapplet created with at least one active transformation is known as an active mapplet.
2.Passive Mapplet:
A mapplet created with all passive transformations is known as a passive mapplet.
Mapplet Limitations:
Keep the following instructions in mind while creating a mapplet:
1.If you want to use a Stored Procedure transformation, you should create the stored procedure with type Normal.
2.If you want to use a Sequence Generator transformation, you should use a reusable Sequence Generator transformation.
3.The following objects cannot be used in a mapplet:
a) Normalizer T/R
Advantages:
A mapplet lets a set of transformations (reusable business logic) be reused across multiple mappings.
Reusable Transformation:
A reusable transformation is a reusable object created with business rules using a single transformation.
Limitations:
1.A reusable transformation can be built from a single transformation only.
Procedure:
Open the client Power Center Designer; from the Tools menu select Transformation Developer; from the Transformation menu select Create, select the transformation type Sequence Generator, enter the name "DW_KEY", click on Create and then Done.
To convert an existing transformation: select a mapping in the Mapping Designer workspace, select the transformation which you want to convert to reusable, double-click the transformation, and from the Transformation tab select Make Reusable; click on Yes, click Apply and click OK.
Key Points:
When we drag a reusable transformation into the Mapping Designer workspace it is created as an instance; you can modify the instance properties, and that does not reflect on the original object.
From the left window select the User Defined Functions sub-folder; from the Tools menu select "User Defined Function", click on "New", provide the function name "TRIM", select the type "Public", then click on New Argument, and then click on Launch Editor.
Business Purpose:
Loading data into snowflake dimensions which are related through primary key and foreign key relationships.
Create a session, double-click the session and select the Config Object tab.
Procedure:
Select the Mappings menu, click on "Target Load Plan", change the load order using the up and down arrows, click OK and save the mapping.
4.If the [R] (return) port is not checked, the mapping is valid, but the session created for that mapping will fail at run time.
5.An unconnected lookup is very commonly used when the lookup is not needed for every input record.
7.The lookup function can be called within any transformation that supports expressions (e.g. an Expression transformation).
9.The condition is evaluated for each record, but the lookup function is called only if the condition evaluates to TRUE.
Syntax: :LKP.LookupName(arguments)
Business Purpose:
1.A source table or file may have a percentage of records with incomplete data; the holes in the data can be filled by performing a lookup to another table or tables.
2.As only a percentage of the rows are affected, it is better to perform the lookup on only those rows that need it and not on the entire data set, as in the sketch below.
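A typical call pattern inside an Expression transformation (a sketch; the lookup name LKP_CUSTOMER and the ports are illustrative):
CUST_NAME[O] (IIF(ISNULL(CUST_NAME), :LKP.LKP_CUSTOMER(CUST_ID), CUST_NAME))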
SCD-Type1:
SCD Type-1 is used to maintain only the current state of the data in the dimension table.
Business Functionality:
1.INSERT the new records coming from the source.
2.UPDATE the existing records that come with changes from the source.
EXAMPLE DATA:
EMPNO ENAME JOB SAL
INITIAL LOAD
3.Drag and drop the EMP source definition and two instances of the target EMP_SCD1 into the Mapping Designer workspace.
6.Select the EMPNO port from the Source Qualifier and connect it to the Lookup T/R.
7.Double-click on the Lookup T/R, go to the Condition tab and enter the condition below:
EMPNO = EMPNO1
10.Create an Expression T/R; select the EMP_SURR_KEY, JOB and SAL ports from the Lookup T/R and connect them to the Expression T/R.
11.Select all ports from the Source Qualifier and connect them to the Expression T/R.
12.Double-click on the Expression T/R, go to the Ports tab, add the two O/P ports INSERT and UPDATE, and enter the expression as shown below:
INSERT = IIF(ISNULL(EMP_SURR_KEY),'TRUE','FALSE')
15.Double-click on the Router T/R, go to the Groups tab and add two groups (INSERT and UPDATE).
2.Double-click on the Update Strategy T/R, go to the Properties tab and enter the update strategy expression value as DD_UPDATE or 1.
4.Save the mapping, create a session with the name S_M_EMP_SCD1, create a workflow with the name W_S_M_EMP_SCD1 and execute the workflow.
SCD-Type2:
The SCD Type-2 method is used to maintain the current data along with the complete history in the dimension tables.
Business Functionality:
SOURCE
TARGET
2.Create a mapping with the name M_EMP_SCD2; drag and drop the EMP source definition and two instances of the EMP_SCD2 target into the Mapping Designer workspace.
6.Double-click on the Lookup T/R, go to the Ports tab and configure the ports; in the Condition tab enter:
EMPNO = EMPNO1
8.Go to the Properties tab and enter the query below for the Lookup SQL Override attribute (selecting only the current records):
SELECT EMP_SCD2.EMP_SURR_KEY AS EMP_SURR_KEY,
EMP_SCD2.JOB AS JOB,
EMP_SCD2.SAL AS SAL,
EMP_SCD2.EMPNO AS EMPNO
FROM EMP_SCD2
WHERE EMP_SCD2.IND = 'Y'
10.Create an Expression T/R; select all ports from the Lookup T/R and the Source Qualifier and connect them to the Expression T/R.
11.Create a Router T/R; select all ports from the Expression T/R and connect them to the Router T/R.
12.Double-click on the Router T/R, add two groups, INSERT and UPDATE, and enter the expressions below for the INSERT and UPDATE groups:
INSERT = ISNULL(EMP_SURR_KEY)
OR (JOB != JOB1)
OR (SAL != SAL1)
UPDATE = NOT ISNULL(EMP_SURR_KEY)
AND (JOB != JOB1 OR SAL != SAL1)
2.Double-click on the Expression transformation, go to the Ports tab and add two O/P ports, STARTDATE and IND (e.g. STARTDATE = SYSDATE, IND = 'Y').
4.Click on Apply and click on OK.
5.Select all ports from the Expression T/R and connect them to the EMP_SCD2_INSERT target instance.
6.Create a Sequence Generator T/R and connect the NEXTVAL port to the EMP_SURR_KEY column in the EMP_SCD2_INSERT target instance.
1.Create an Expression T/R; select the EMP_SURR_KEY port from the UPDATE group of the Router T/R and connect it to the Expression T/R.
4.Create an Update Strategy T/R; select all ports from the Expression T/R and connect them to the Update Strategy T/R.
5.Double-click on the Update Strategy T/R, go to the Properties tab, enter DD_UPDATE or 1 as the update strategy expression, click on Apply and click on OK.
6.Select all ports from the Update Strategy T/R and connect them to the respective ports in the EMP_SCD2_UPDATE target instance.
1.Repository objects such as mappings, sessions, etc. can be exported into a metadata file format called .XML.
Procedure:
5.Select the file directory, enter the file name "Dev_20150512" and click on Save.
Repository objects such as workflows, sessions, mappings, etc. can be imported from a metadata file called .XML.
Procedure: Create a new folder; from the left window select the new folder (Repository Manager client).
4.Click on Next.
6.Click on Next.
7.Choose the destination folder, click on Next, again click on Next and click on Import.
8.Click on Done.
1.In a delimited flat file each column is separated by a special character like a comma (,), pipe (|), dollar ($), etc.
File List:
1.A file list is the method used for loading the data from multiple files into a single target using a single source definition.
3.A file list can be applied only to files that have similar metadata (the same structure), as in the example below.
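For example, the file list (indirect file) itself contains only the paths of the data files (the file names here are illustrative); in the session, set the source filetype to Indirect and point the source filename at this list:
C:\SOURCE\EMP_FILELIST.txt:
C:\SOURCE\emp_jan.txt
C:\SOURCE\emp_feb.txt
C:\SOURCE\emp_mar.txt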
Mapping Parameter:
A parameter represents a constant value which is defined before the mapping runs.
1.A mapping parameter is a constant value, which means the value remains the same throughout the session run.
Advantages:
1.Mapping parameters are created to standardize the business logic and increase flexibility in development.
3.Mapping parameters are defined with constant values in a parameter file, which is saved with the extension .txt or .prm. The parameter file format is:
[FOLDER.WF:WORKFLOW.ST:SESSION]
$$Param1=const value
$$Param2=const value
$$Param3=const value
Procedure:
8.Click OK.
9.From the Source Qualifier copy the required ports (EMPNO, ENAME, JOB, SAL, DEPTNO) to the Expression transformation.
10.Double-click the Source Qualifier, select the Properties tab and set the SQL Query attribute to:
SELECT EMP.EMPNO, EMP.ENAME, EMP.JOB, EMP.SAL, EMP.DEPTNO
FROM EMP
WHERE EMP.DEPTNO = $$DNO
14.Save the mapping.
Create the parameter file C:\Param\Test.txt with the following contents:
[BATCH7AM.WF:W_S_M_PARAM.ST:S_M_PARAM]
$$DNO=20
$$PERCENT=0.25
In the session properties, set the attribute "Parameter Filename" to the value C:\Param\Test.txt.
9.Run the workflow.
2.It is used to process SQL queries in the midstream of a pipeline; we can insert, update, delete and retrieve rows from the database at run time using the SQL transformation.
3.The SQL transformation processes external SQL scripts or SQL queries created in the SQL editor.
Script mode: the SQL transformation runs SQL scripts that are located externally; we pass a script name to the transformation with each input row, and the SQL transformation outputs one row for each input row.
Query mode: the SQL transformation executes a query that we define in the query editor; we can pass strings or parameters to the query to define dynamic queries or change the selection parameters, and we can output multiple rows when the query has a SELECT statement.
Example:
SELECT Q.DISTRICT, COUNT(Q.OFFLINEFIR)
FROM (SELECT DISTRICT,
CASE WHEN INSERT_DATE = UPDATE_DATE THEN 1 ELSE NULL END AS OFFLINEFIR
FROM FIR) Q
GROUP BY Q.DISTRICT
Save it as C:\SOURCE\FIR_SCRIPT.txt.
Also create one more notepad text file and save it as C:\SOURCE\SCRIPT_ADD.txt.
PC Designer:
Source Analyzer (SA): import C:\SOURCE\SCRIPT_ADD.TXT
Target Designer (TD): FIR_RES (flat file) with column STATUS STRING(10)
Mapping Designer (MD): MAP_SQL
Source: CUST
CID   NAME   DOB
13    john   13-FEB-85
11    ben    12-JAN-67
12    alex   15-JAN-62
Target: CUST_TYP1 (CKEY, CID, NAME, DOB)
Mapping Designer: M_SCDTYPE1
Source: CUST
CID   NAME   LOC
11    BEN    CHE
12    ALEN   MUM
13    RAM    PUN
Target: CUST_TYPE2
CREATE TABLE CUST_TYPE2(
CKEY number,
CID number,
NAME varchar2(20),
LOC varchar2(20),
FLAG number
);
Source after changes:
CID   NAME   LOC
11    BEN    BLG
12    ALEN   CHE
13    RAM    PUN
Source: CUST
CID   NAME   LOC
11    BEN    CHE
12    ALEN   MUM
13    RAM    PUN
Target: CUST_TYPE3
CREATE TABLE CUST_TYPE3(
CKEY number,
CID number(15),
NAME varchar2(20),
CLOC varchar2(20),
PLOC varchar2(10)
);
Mapping Designer: M_SCDTYPE3
Create a session with the name S_M_SCDTYPE3 and create a workflow with the name Wf_S_M_SCDTYPE3.
Source after changes:
CID   NAME   LOC
11    BEN    BNG
12    ALEN   VJY
13    RAM    PUN
Source: CUST
CID   NAME   LOC
11    BEN    CHE
12    ALEN   MUM
13    RAM    PUN
Target: CUST_TYPE2_VRSN
CREATE TABLE CUST_TYPE2_VRSN(
CID number,
NAME varchar2(20),
LOC varchar2(20),
VERSION number
);
Before starting the workflow, check the data in the source table and the target table:
Source rows:
11    BEN    CHE
12    ALEN   MUM
13    RAM    PUN
Target: no rows selected
COMMIT;
Project Life Cycle:
1.Kick-off Meetings
2.Analysis Phase
3.Design Phase
4.Coding Phase
5.Reviews
6.Testing Phase
7.Go-Live Phase
8.Support
1.Analysis Phase:
The Business Analyst gathers and analyzes the requirements.
Outcome:
1.Business requirements document
2.DB tool to be used
2.Design Phase:
The Data Warehouse Architect / ETL Architect provides the solution to build the DW or the data marts.
Outcome: High-Level Design document, consisting of:
1.Summary Information
2.Project Architecture
3.System Architecture
4.Source
5.DB
6.Data Model
7.Mapping Details
Outcome: Low-Level Design document, which consists of the source and target object details.
ETL Team:
3.Coding Phase:
Code Review: the code review checks the business logic and whether the naming standards are followed or not.
Peer Review: a team member reviews the same points mentioned above; if everything is OK, testing follows.
4.Testing Phase:
1.Unit Testing: mappings are tested by the individual developers, using the Debugger or by enabling Test Load to test the mapping with limited test data.
2.SIT (System Integration Testing): mappings are tested according to their dependencies.
3.UAT (User Acceptance Testing): mappings are tested in the presence of the onsite users.
5.Production (Go-Live) Phase:
Project Architecture:
Hi, good morning.
Coming to my education details, I completed my MCA from a college which is affiliated to JNTU University, Hyderabad.
My first project was AP GBS DATA MART, where we developed data marts from a global data warehouse. After signing off from that project, I was mapped to another project in the same company. In my second project we worked for Prudential Insurance, where we developed a data warehouse for their proposed system. I worked as an Associate Software Engineer at IBM from Jun 2014 to Oct 2016.
Later I was selected by IBM India Private Limited, Pune. At IBM I am working for the client GE, for whom we developed a data warehouse for their internal billing system. A few days back I was relieved from the project and they are trying to map me into another project. I have been working at IBM from June 2014 till date.
In my 6 years of experience I have been involved in many activities like developing mappings, performance tuning, data profiling, etc.
In these 6 years of experience I have gained hands-on experience with the tools Informatica, DataStage, Information Analyzer, QualityStage, Trillium Discovery and Oracle, and some knowledge of PL/SQL and the UNIX environment.
Documented By
Mail:abreddy2003@gmail.com
Mobile:9948047694