
SDLC

Informatica
Unix
Project
Functionality
Development of Project
Simple Dimension Loading
Complex Dimension Loading
Simple Fact table loading
Complex Fact table Loading

Aggregates
cdc
src stg wh

100 1 100
1

date parameters
CDC Tools
power exchange
Dev
citrix dwhserver
unix applebigdata target olap
File

Dev_OLTP
Integration Service
oltpserver Dev_OLAP
oltp

Repository
oltp Service
Putty
RepDev
OPB* winftp/wins
cp

Client(de.....)
applebigdata Client(de.....)

Client(de.....)

SIT

Integration Service
target

Repository
Service
sit_OLTP
sit_OLAP
OPB*

oltp
Integration Service
target

Repository
Service

OPB*

oltp
dwhserver

applebigdata
RepDev

Client(de.....)

Requirements Gathering
Scope

Functional Design
High level design document (HLD)

Data Modeling
Conceptual
Logical
Physical
S-T Mapping (Mapping Document)
Example: CreditLimitAmount = total credit limit of active accounts only
sum(case when accnt_status='Active' then cred_lmt else 0 end)

Technical Design Document (Architect)
Low level design
Tech Spec
5000

Work Assignment
Development
Issue Tracker/closer
Prepare unit test cases
Execute unit test cases
Peer Review

SIT (System Integration Testing)
SAT (System Acceptance Testing)
UAT (User Acceptance Testing)
PAT
Support Testing (defect/bug)

Production Support
Production
Monitoring jobs
verify logs
correct data issues
re-run jobs
communicate to support team about production issue

Delivery Manager
Sr Project Manager/Manager 8>
Team Leads (TL) 5>
Module Lead
Sr ETL Developer 3-6
ETL dev 2-3
Trainees 0
SE

Onsite (US): client, meeting, TL
Offshore: TL
SURR_CUST_ID
CUST_ID
NAME
GENDER
INCOME
1) SDLC
2) Unix
3) Unknown Inf
4) Project Overview
5) Dimension: Simple, Medium, Complex
6) Fact: Simple, Medium, Complex
ETL Archi
Unix
Inform
AIX
SCO
Solaris
Linux
Rhel

putty

/ (root)

pwd display current directory


cd <path> change current directory to a particular folder
cd change the dir to home dir
cd ~ change the dir to home dir
cd .. change dir to one step back
date display current date
echo "Welcome" display message on to screen
who list out all the currently logged users
whoami display current user name
uname -a display platform information
ls display list of files and directories
ls -l display list of files and directories along with all attributes

first character represents the type ("-" or "d")
- file
d folder
next 9 characters represent permissions
first 3: permissions of user (owner)
r read
w write
x execute
next 3: permissions of group
r read
w write
x execute
next 3: permissions of others
r read
w write
x execute
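
A small sketch of reading the permission string described above. Note that chmod is not covered in these notes; it is used here only as an assumption to set a known permission, and the file name demo.txt is hypothetical.

```shell
#!/bin/sh
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
touch "$workdir/demo.txt"

# Assumption: chmod 640 = user read+write, group read, others nothing.
chmod 640 "$workdir/demo.txt"

# First 10 characters of `ls -l`: type flag followed by the 9 permission characters.
perm=$(ls -l "$workdir/demo.txt" | cut -c1-10)
echo "$perm"

# For a directory the first character is "d".
dtype=$(ls -ld "$workdir" | cut -c1)
echo "$dtype"

rm -r "$workdir"
```

Reading the string left to right: type, then user, group and others in groups of three.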

ls -ltr
t -time based(order by time)
r - reverse order

ls -a display hidden files


ls -lh display files in human readable format

wildcard characters
* any character match
? Single character match

mkdir create dir


rmdir remove dir
rm remove files
rm -r recursively delete folders and their contents
rm -f force delete

touch zero.txt
create zero byte file
also changes the file modifications time

vi editor

i insert mode
Esc takes to normal mode
when we are in Esc (normal) mode
we can perform file operations like
:w save file
:w filename save file with filename
:q quit file
:wq save and quit
! force quit or force write, used as
:w!
:q!

A add content at end of line

I add content at beginning of line
yy copy line
p paste
dd delete current line
<n>dd delete n lines

^ go to first position of line


$ go to end of line

cat display content of file


cat filename.txt
cat *.txt
cat *.txt > final.txt

Redirection operators

> redirects the output to the file on the right side
< takes the input for the command from the file on the right side
ls *.txt > indirect.txt
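
A minimal sketch of both redirection operators, assuming two hypothetical files a.txt and b.txt. The merged file is given a different extension (final.out) so that a later run of the same glob does not pick up its own output.

```shell
#!/bin/sh
workdir=$(mktemp -d)
echo "alpha" > "$workdir/a.txt"
echo "beta" > "$workdir/b.txt"

# `>` writes the combined output of cat into a new file.
cat "$workdir"/*.txt > "$workdir/final.out"
merged=$(cat "$workdir/final.out")

# `<` feeds the file to the command as standard input.
# tr strips the padding some wc implementations add.
line_count=$(wc -l < "$workdir/final.out" | tr -d ' ')
echo "$line_count"

rm -r "$workdir"
```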

head
tail c1 a
c2 x
c1 a c1 a
c1 a c2 x
c2 x c2 y
c2 y
c2 y

sort sorts data of a file


sort -u remove duplicates(row dupes)

uniq remove duplicates if data is already sorted

s1 ameerpet
s1 hyd
s1 ap

sort filename | uniq
display unique records only
sort filename | uniq -d
display only duplicate records
sort filename | uniq -dc
display only duplicate records with count
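
The three sort/uniq pipelines can be tried on a small sample. This is a sketch; the file rows.txt and its contents are made up for illustration.

```shell
#!/bin/sh
# Fixed locale so the sort order is predictable.
LC_ALL=C
export LC_ALL

workdir=$(mktemp -d)
printf 'c1 a\nc2 x\nc1 a\nc2 y\nc2 y\nc2 y\n' > "$workdir/rows.txt"

# One copy of every row.
uniq_rows=$(sort "$workdir/rows.txt" | uniq)

# Only the rows that occur more than once.
dupes=$(sort "$workdir/rows.txt" | uniq -d)

# Same, prefixed with the number of occurrences (padding differs between systems).
dupes_with_count=$(sort "$workdir/rows.txt" | uniq -dc)

echo "$uniq_rows"
echo "$dupes"
echo "$dupes_with_count"

rm -r "$workdir"
```

uniq compares adjacent lines only, which is why sort comes first in every pipeline.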

head -2 filename
display first 2 lines
tail -2 filename
display bottom 2 lines

tail -2 filename | head -1
display 2nd line from bottom
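
A short sketch of combining head and tail to pick a single line, using a hypothetical five-line file. The last command (nth line from the top) is an extra variation not listed in the notes.

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf 'line1\nline2\nline3\nline4\nline5\n' > "$workdir/sample.txt"

first_two=$(head -2 "$workdir/sample.txt")
last_two=$(tail -2 "$workdir/sample.txt")

# Bottom 2 lines, then the first of those = 2nd line from the bottom.
second_from_bottom=$(tail -2 "$workdir/sample.txt" | head -1)

# Variation: top 3 lines, then the last of those = 3rd line from the top.
third_from_top=$(head -3 "$workdir/sample.txt" | tail -1)

echo "$second_from_bottom"
echo "$third_from_top"

rm -r "$workdir"
```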

grep -nH e1 f1.txt f2.txt

H file name
n line number
v negative search (lines that do not match)
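
A sketch of the grep options above on two hypothetical files. The file contents are invented for illustration; -H is supported by GNU and BSD grep but is not part of POSIX.

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf 'e1 xyz\ne2 abc\n' > "$workdir/f1.txt"
printf 'e3 def\ne1 pqr\n' > "$workdir/f2.txt"

# -n adds the line number, -H adds the file name: file:line:text
matches=$(cd "$workdir" && grep -nH e1 f1.txt f2.txt)
echo "$matches"

# -v prints the lines that do NOT contain the pattern.
non_matches=$(grep -v e1 "$workdir/f1.txt")
echo "$non_matches"

rm -r "$workdir"
```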

cut -d',' -f1,2 filename

d represents delimiter character
f represents field numbers

cut -c1-4 filename
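
Both forms of cut on a small comma-delimited sample. This is a sketch; the file emp.csv and its rows are hypothetical.

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf 'e1,xyz,10,clerk\ne2,abc,20,manager\n' > "$workdir/emp.csv"

# Field based: split on "," and keep fields 1 and 2.
fields=$(cut -d',' -f1,2 "$workdir/emp.csv")
echo "$fields"

# Character based: keep characters 1 to 4 of every line.
chars=$(cut -c1-4 "$workdir/emp.csv")
echo "$chars"

rm -r "$workdir"
```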

du
display file sizes of given path
-h human readable format
-s summarises the output

df
filesystem utilisation
-h human readable format

find find the files in a directory
-mtime modified time based search
-size size based search
-type file type search
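
A sketch of the three find filters on a scratch directory. The directory layout and file names are made up for illustration.

```shell
#!/bin/sh
workdir=$(mktemp -d)
mkdir "$workdir/logs"
printf 'started\n' > "$workdir/logs/today.log"
printf 'some content\n' > "$workdir/data.txt"
: > "$workdir/empty.txt"          # zero byte file

# -type d: directories only (the scratch directory itself and logs).
dir_count=$(find "$workdir" -type d | wc -l | tr -d ' ')

# -type f with -mtime -1: regular files modified less than one day ago.
recent_count=$(find "$workdir" -type f -mtime -1 | wc -l | tr -d ' ')

# -size 0: files that occupy zero blocks, i.e. empty files.
empty_count=$(find "$workdir" -type f -size 0 | wc -l | tr -d ' ')

echo "$dir_count $recent_count $empty_count"

rm -r "$workdir"
```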

simple scripts
execute shell
sh a.sh
$# no of input params
$0 file name
$1 first param

$? exit status of the last executed unix command
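
A minimal sketch of a script that uses these variables. The script body and the exit code 2 are assumptions chosen for the example, not part of the notes.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Write a small script a.sh that reports its own parameters.
cat > "$workdir/a.sh" <<'EOF'
#!/bin/sh
# $0 = script name, $# = number of parameters, $1 = first parameter
echo "script=$0 count=$# first=$1"
# Hypothetical rule: exit with status 2 when no parameter is given.
[ "$#" -ge 1 ] || exit 2
EOF

# Run with two parameters; $? holds the exit status of the last command.
out=$(sh "$workdir/a.sh" hyd ap)
status_ok=$?

# Run without parameters and capture the non-zero status.
status_fail=0
sh "$workdir/a.sh" > /dev/null || status_fail=$?

echo "$out"
echo "$status_ok $status_fail"

rm -r "$workdir"
```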

1)
awk -F'|' '{print $2}' file_pipe.txt

awk -F'|' '{print $2 "|" $1 "|" $9}' file_pipe.txt|grep -e "A|GBR" -e "C|GBR" -e "9|GBR"

A | GBR | G 1901-01-01
C | GBR | G 1901-01-01
9 | GBR | J 1901-01-01

2)

cat file_pipe.txt | awk 'BEGIN{FS="|"};{print NF}'

3) Remove NON Ascii characters

tr -cd '\11\12\15\40-\176' < file.txt > new_file.txt


s1 ameerpet,hyd,ap
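
The row above looks like the result of collapsing the rows "s1 ameerpet", "s1 hyd", "s1 ap" into one row per key. One possible way to do that with awk is sketched below; the notes do not say which command was used, and the extra row "s2 delhi" is hypothetical.

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf 's1 ameerpet\ns1 hyd\ns1 ap\ns2 delhi\n' > "$workdir/rows.txt"

# Collect the 2nd column per key (1st column), keeping first-seen key order.
grouped=$(awk '
{
    if ($1 in vals) {
        vals[$1] = vals[$1] "," $2     # key seen before: append value
    } else {
        vals[$1] = $2                  # new key: start the list
        order[++n] = $1                # remember the order keys appeared in
    }
}
END {
    for (i = 1; i <= n; i++) print order[i], vals[order[i]]
}' "$workdir/rows.txt")

echo "$grouped"

rm -r "$workdir"
```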
Architecture

Informatica is an SOA-enabled architecture from 8.x onwards
It runs as services now; earlier it used to run as a server

Server Client Repository

Domain is a logical group/unit which controls all other services
we will have one repository to manage, i.e. the domain repository

node is a logical name of a physical machine

Repository Service is a service process to run the repository using the repository database
Integration Service is a service to run/execute code (workflows/sessions)
Grid is a group of nodes to utilise more hardware processing
Domain
node1: Repository Service, Repdatabase
node2: Integration Service
Inf Client

mapping target Load Order

src1 tgt1

src2 tgt2

constraint based loading


100

lkp
lkp

0
exp tgt

src

Source Qualifier

7369 APPLE 1234 10


7369 SMITH CLERK 7902 ### 800 90
7499 ALLEN SALESMAN 7698 ### 1600 300 30
7521 WARD SALESMAN 7698 ### 1250 500 30
7566 JONES MANAGER 7839 2-Apr-81 2975 20
7654 MARTIN SALESMAN 7698 ### 1250 1400 30
7698 BLAKE MANAGER 7839 1-May-81 2850 30
7782 CLARK MANAGER 7839 9-Jun-81 2450 10
7788 SCOTT ANALYST 7566 19-Apr-87 3000 20
7839 KING PRESIDENT ### 5000 10
7844 TURNER SALESMAN 7698 8-Sep-81 1500 0 30
7876 ADAMS CLERK 7788 ### 1100 20
7900 JAMES CLERK 7698 3-Dec-81 950 30
7902 FORD ANALYST 7566 3-Dec-81 3000 20
7934 MILLER CLERK 7782 23-Jan-82 1300 10
9999 BANANA 9876 10

FF
1MILL
Filter
in(deptno,10,20)=1

Router
empno ename sal
e1 xyz 10 cond1

cond
Exp

def

Sort
trn3
(passive)

trn1 trn2

trn4
(Active)

default

1m
filt
(deptno=10)

1m SQ

filt
(deptno=20)

1m
filt
(deptno=20)
1m
filt
SQ (deptno=10)
deptno=20

Exp

1 rec rec

EMPNO_Prev_v EMPNO_v 0 e1 e1
EMPNO_Prev_Out EMPNO_Prev_v 0 e1 e1
ENAME ENAME xyz xyz abc
EMPNO EMPNO e1 e1 e2
.
.
EMPNO_v EMPNO e1 e1 e2
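
The port table above keeps the previous row's EMPNO in a variable port so that each row can be compared with the one before it. The same idea can be sketched with awk on sorted input. This is only an analogy, not Informatica expression syntax; the variable names and the "new"/"dup" flag are assumptions for the example.

```shell
#!/bin/sh
# Rows e1, e1, e2 as in the table; input must already be sorted on the key.
flagged=$(printf 'e1 xyz\ne1 xyz\ne2 abc\n' | awk '
{
    empno_prev_v = empno_v      # value carried over from the previous row
    empno_v = $1                # value of the current row
    flag = (NR > 1 && empno_v == empno_prev_v) ? "dup" : "new"
    print $1, $2, flag
}')

echo "$flagged"
```

The order of the two assignments matters: the previous value is saved before the current value overwrites it.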

Sort
will remove only row duplicates

e1

e1
src exp tgt
1m
e1

src sort tgt


e1

Joiner
SQ

Lookup

Joiner Lookup
inner/outer..
condition "="
Master
Detail
Cartesian/cross join

Sorted Input

seq
Lookup
Connectivity
Connected part of pipeline, can return multiple ports
UnConnected not part of pipeline, returns only one port

Cache
No Cache
use when lookup table is getting updated while mapping is running
not good for performance as it always checks data on the database

c1 xyz
c1 abc

static Cache
index data
condition column data other columns data

Lookup policy on multiple match


c1 xyz
c1 abc

1 c1 xyz n
2 c1 abc y

1
2
c1 abc

The process of writing custom SQL is called Lookup override

whenever there is an order by clause in Lookup override keep "--" at end of the SQL
lookup by default generates one order by on its own. "--" will make sure to ignore informatica order by

Always use inner join of source and target SQL in Lookup Override to reduce the cache in Lookup

c1

c1
c9 c2
c1 c3
c4
c5
c6
c7
c8

Dynamic
Cache is updated during mapping execution
New Lookup row port indicates the status of record (new/old/unchanged)
Associated port to compare columns from lookup to source
we can also ignore some columns in comparison

this will solve same day multiple changes of a record

use dynamic lookup rather than no cache in this scenario
(when there are duplicates in cache, Dynamic will fail)

update else insert


insert else update

synchronise
along with updating data in cache this will also update data in lookup table

Output Old Value On Update

it will return old value from lookup even though there are updates/inserts
output equals static cache

Indicates if a cache is persistent or non-persistent

it keeps/preserves the cache
it helps to reuse the cache by other mappings

Cache File Name Prefix


to be used when cache has to be saved(persistent)

Re-cache from lookup source

refresh the lookup cache from lookup table

Lookup as Active
this is available from 9.x
this can be used as a join
when there are multiple matches it will match with all records

Target
update strategy
SP

changing connections through param


pmcmd
sal<=10
e1 xyz 10
sal<=20
e1 xyz 10

trn2

trn4
(Active)

dept_t10

dept_t20
dept_t10

dept_t20

e1

tgt
tgt

good for per fact loading

