Abhisek
Scenario:
Objective: The interface aims to consolidate the round-trip ticket cost of passengers.
Description: The source file is a .txt file, train_route_src.txt, listing trains from a particular source to a destination. The source file contains many reverse duplicates (e.g. BLR --> BBS and BBS --> BLR). Remove the reverse duplicates and maintain the target in...
D Hemakumar
Source ----> Column Generator ----> Transformer ----> Target
Generate two columns, col1 and col2, and generate sequence numbers for both columns. In the Transformer, write the constraint col1 > col2, and take the source, destination and distance/fare columns in the output link.
I hope this may work.
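The col1 > col2 constraint keeps only one direction of each city pair. Outside DataStage, the same reverse-duplicate removal can be sketched in plain Python; the routes and fares below are made-up sample data, not from the actual source file:

```python
# Sketch (not DataStage): drop reverse duplicates by keeping one
# canonical direction per city pair, mirroring the col1 > col2 idea.
def drop_reverse_duplicates(rows):
    seen = set()
    out = []
    for source, dest, fare in rows:
        pair = frozenset((source, dest))   # BLR->BBS and BBS->BLR collapse
        if pair not in seen:
            seen.add(pair)
            out.append((source, dest, fare))
    return out

rows = [("BLR", "BBS", 1500), ("BBS", "BLR", 1500), ("BLR", "MAA", 900)]
print(drop_reverse_duplicates(rows))  # -> [('BLR', 'BBS', 1500), ('BLR', 'MAA', 900)]
```

The frozenset makes the pair order-insensitive, which is exactly what treating (source, dest) and (dest, source) as the same route requires.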
can anyone please solve this in DataStage?
The source column is:
A
A
B
B
B
C
C
D
I want the output columns (2 cols) as:
A1
A2
B1
B2
B3
C1
C2
D1
Thanks in advance.
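For what it's worth, the stage-variable logic behind this (compare the current key with the previous one; reset or increment a counter) can be sketched in plain Python, assuming the input is already sorted on the key as in the sample:

```python
# Sketch: append a running count within each key group, like stage
# variables that compare the current key with the previous one.
def number_within_group(values):
    counts = {}
    out = []
    for v in values:
        counts[v] = counts.get(v, 0) + 1   # counter per distinct key
        out.append(f"{v}{counts[v]}")
    return out

print(number_within_group(["A", "A", "B", "B", "B", "C", "C", "D"]))
# -> ['A1', 'A2', 'B1', 'B2', 'B3', 'C1', 'C2', 'D1']
```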
premox5
My input has a unique column id with the values 10, 20, 30, ... How can I get the first record in one o/p file, the last record in another o/p file, and the rest of the records in a 3rd o/p file?
hari
Take three stage variables, stgA, stgB and stgC: stgA -- columnname; stgB -- if columnname = ... then A, else if (@INROWNUM > 0 and currentdate = sysdate) then A else B; stgC -- stgB -- if columnname = ...
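Setting the stage-variable details aside, the first/last/rest split the question asks for reduces to this Python sketch (in a real job a Head stage, a Tail stage and a filtered link would be the usual design):

```python
# Sketch: route the first record, the last record, and everything
# in between to three separate outputs.
def split_first_last_rest(records):
    if not records:
        return [], [], []
    first = [records[0]]
    last = [records[-1]]       # with a single record, first and last coincide
    rest = records[1:-1]
    return first, last, rest

print(split_first_last_rest([10, 20, 30, 40]))  # -> ([10], [40], [20, 30])
```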
sai3689
Hi,
Can you please solve this in DataStage?
My input is:
name,city
johnson,newyork
johnson,mumbai
johnson,delhi
I want the output as:
name,city1,city2,city3
johnson,newyork,mumbai,delhi
Please explain with steps. Thanks in advance.
sudip
Seq stage ----> Sort stage ----> Transformer stage ----> Target. 1. Read the data in a Sequential File stage. 2. In the Sort stage, enable Key Column Change = True. 3. In the Transformer, define a stage variable with the logic: if key change = 1 then...
mohanreddy
1. Is the hash file active or passive if we take it as a source?
2. Can you take a sequential file as a lookup?
3. In hash file dynamic 30 there are two types, 1) generic and 2) specific: what is the meaning?
4. How do you connect a MERGE stage when the source is two tables?
5. What is the purpose of MERGE?
6. How can a DS job be scheduled in Unix?
7. How do you know how many rows were rejected?
8. What is the use of the Universe stage?
9. What is the SEQ file buffer?
10. diff...
Pravin Patil
For Question 15: How do you decide when to go for a Join or a Lookup?
If a large amount of data is coming from the input source, use Join.
If a small amount of data is coming from the input source, use Lookup.
Pravin Patil
For Question 47
An environment variable is a predefined variable. A user-defined environment variable is a placeholder to store a value that can be used across the entire project.
SRC records
swapna
SRC has 1 record and I want 10 records in the target. How is it possible? Please explain.
Murali
Go to the Transformer stage, use the system variable @ITERATION, and specify the looping condition Loop While @ITERATION <= 10.
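The looping idea - emit the same input row once per iteration until @ITERATION passes 10 - amounts to this small Python sketch:

```python
# Sketch: one input row expanded into 10 output rows,
# mirroring Loop While @ITERATION <= 10 in a Transformer.
def replicate(row, times=10):
    return [dict(row) for _ in range(times)]

out = replicate({"id": 1})
print(len(out))  # -> 10
```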
sai3689
Add one extra column with the same constant value (1 or anything) for every row, then use the vertical Pivot option available in the Pivot Enterprise stage (Stage tab --> Properties --> Pivot Type = Vertical). Again in the stage --> Pivot ...
Sam Geek
I have come across this question many times in interviews. What specifically can I answer? Please help.
rstrainings
We can say there is a staging area to store the data in the form of tables, and then it moves to the ETL stage (here we do all the conversions: removing duplicates, joining, merging, etc.), then di...
shiv
The answer above describes the architecture of DataStage, not the architecture of a project. A project architecture would be like: you have 1 Source --------> 1 Staging Area ----...
sai3689
Hi,
I have a source like:
name,city
johny,newyork
johny,mumbai
johny,delhi
I want output like:
name,city1,city2,city3
johny,newyork,mumbai,delhi
Thanks in advance.
Ruchir
Use the vertical Pivot option available in the Pivot Enterprise stage (Stage tab --> Properties --> Pivot Type = Vertical). Again in the stage --> Pivot Properties tab --> group by on the Name column and pivot using the City column. Then select Array Size = 3.
balu
Himanshu Maheshwari
The environment in which you run your parallel jobs is defined by your system's architecture and hardware resources. All parallel processing environments are categorized as one of: SMP (...
Saurabh Sinha
In SMP, every processor shares a single copy of the operating system (OS). In MPP, each processor uses its own operating system (OS) and memory.
amulas
Source table:
name
A
A
B
B
B
C
C
D
The data in the source table is like this, but I want the target table like this:
name count
A 1
A 2
B 1
B 2
B 3
C 1
C 2
D 1
Scenario in datastage
sravanthi
Input is:
cola
1
2
3
This should be populated at the output as:
cola
1
22
333
satish
Use the @ITERATION system variable in a Transformer loop.
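With a loop that runs once per unit of the column value and re-emits the digit each pass, the transformation reduces to this Python sketch:

```python
# Sketch: repeat each value as many times as its own magnitude,
# i.e. 1 -> 1, 2 -> 22, 3 -> 333 (like Loop While @ITERATION <= cola).
def repeat_by_value(values):
    return [str(v) * v for v in values]

print(repeat_by_value([1, 2, 3]))  # -> ['1', '22', '333']
```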
sai3689
Hi,
I have a source like:
name,city
johny,newyork
johny,mumbai
johny,delhi
I want output like:
name,city1,city2,city3
johny,newyork,mumbai,delhi
Thanks in advance.
sai3689
Thank you, Abhishek.
Abhishek Surkar
Use the Pivot stage --> open the Pivot stage, select the Properties tab and set Pivot Type = Vertical --> select the Pivot Properties tab, check Group By for name and check Pivot for city --> make Array Size = 4 --> map the output columns to the output link and click OK --> compile --> run.
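The vertical pivot groups rows by name and spreads the city values across city1..cityN columns. A rough Python equivalent, with the array size fixed at 3 and short groups padded with empty strings (an assumption, since the stage's padding behavior isn't shown here):

```python
# Sketch: group rows by name and spread city values across columns.
def vertical_pivot(rows, array_size=3):
    groups = {}
    for name, city in rows:
        groups.setdefault(name, []).append(city)
    out = []
    for name, cities in groups.items():
        cities = (cities + [""] * array_size)[:array_size]  # pad/truncate
        out.append([name] + cities)
    return out

rows = [("johny", "newyork"), ("johny", "mumbai"), ("johny", "delhi")]
print(vertical_pivot(rows))  # -> [['johny', 'newyork', 'mumbai', 'delhi']]
```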
Anjaneya Gupta
I would like to have the output like this: 101 90, next line 102 65, next line 102 55, etc...
sravanthi
The source is a flat file with 200 records. These have to be split equally across 4 outputs, 50 records each. The total number of records in the source may vary every day; according to the count, the records are to be split equally across the 4 outputs.
Could someone post an answer for this question? Thanks.
Vinay
We can use the split command in the Filter section of the Sequential File stage:
split -l 50 File.txt Segment
Output: this gives 4 files, named Segmentaa, Segmentab, Segmentac and Segmentad, each of 50 records.
Gandhi
Hi all,
Keep four files as output. Simply give the constraint Mod(col_name, 4) = 0 for the first file, Mod(col_name, 4) = 1 for the second, Mod(col_name, 4) = 2 for the third, and finally Mod(col_name, 4) = 3 for the fourth.
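When the record count varies, "equal" means chunk sizes that differ by at most one. That chunking can be sketched in Python (the designs above approximate it with split -l or a Mod constraint):

```python
# Sketch: split a record list into N contiguous chunks whose
# sizes differ by at most one.
def split_even(records, parts=4):
    base, extra = divmod(len(records), parts)
    out, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # first `extra` chunks get one more
        out.append(records[start:start + size])
        start += size
    return out

print([len(c) for c in split_even(list(range(200)))])  # -> [50, 50, 50, 50]
```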
sravanthi
I have source:
10
20
20
and reference:
10
20
20
Can someone tell me how a Cartesian join is performed using a Join stage in DataStage?
Thanks,
Sravanthi
satjay
Create a dummy key column in both links (e.g. with the value 1) and join on this key column. This is the approach for joining two tables that have no common keys.
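Joining on a constant dummy key makes every left row match every right row, i.e. a Cartesian product. In Python terms:

```python
# Sketch: a dummy-key join degenerates into a cross join -
# every left row pairs with every right row.
def cross_join(left, right):
    return [(l, r) for l in left for r in right]

pairs = cross_join([10, 20, 20], [10, 20, 20])
print(len(pairs))  # -> 9 (3 x 3 rows)
```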
taruna.arya
What is the difference between the Dataset stage and the Sequential File stage? And one more to add here: what is the difference between the Dataset stage and the Fileset stage?
kritika
Dataset: it stores data in DataStage's internal format, so DataStage takes less time to read the data, and it can accommodate a large amount of data. Sequential File stage: it contains data in a readable format, ...
ravikumar a v
A Dataset stage can be used as a reference input link in other job designs, whereas this is not possible with a Sequential File stage.
premox5
Ishu
Use of parameters: 1. Avoid hardcoding. 2. In case the value of some parameter changes in future, we can change it at a single point instead of making changes in every job wherever it is ...
Alekhya ch
If you have multiple files, change the Read Method to File Pattern; then it runs in parallel mode.
Balu
By changing the "number of readers per node" option to more than one, you can read a sequential file in parallel mode.
Job sequencing
amulas
I have 3 jobs, A, B & C, which are dependent on each other. I want to run jobs A & C daily and job B only on Sunday. How can I do it?
Sreelesh
Create a single sequence that runs on all days. Add a stage that calls a routine to check whether the date is a Sunday. If it is Sunday, call job B; otherwise the sequence calls only A & C.
Kartik Dharia
You can create a new job to test whether the current day is Sunday or not. If it is true then create a file, else create a 0 KB file. Then you can create only one sequence with jobs A and C one after the...
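Either way, the routine's only real job is the Sunday test; in Python that check is a one-liner (date.weekday() numbers Monday as 0):

```python
from datetime import date

# Sketch: the "is today Sunday?" test such a sequence routine performs.
def is_sunday(d):
    return d.weekday() == 6   # Monday = 0 ... Sunday = 6

print(is_sunday(date(2015, 3, 22)))  # -> True (22 Mar 2015 was a Sunday)
```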
saikumar
Hi, check for a valid date as below: 1. If the length of the string = 8 and the date < sysdate, then valid, else use a default date. 2. We use a) the "Iconv" function - internal conversion, b) the "Oconv" function - exter...
Data Granularity
manju_thanneeru
A high level of detail means a low level of granularity, and a low level of detail means a high level of granularity.
vidyasagarvuna
Given a single input row in the source, I want the input repeated in the target column using only a Copy stage.
taruna.arya
10
10
20
20
20
30
30
40
40
50
60
70
I want three outputs from the above input file. These outputs would be:
1) having only unique records, no duplicates. Like:
10
20
30
40
50
60
70
2) having only duplicate records,...
Mihir
Use a Sort stage --> define the key field and, in Properties, set Key Change Column to True. Then use a Transformer with the constraints KeyChange = 1 for the unique-record output and KeyChange = 0 for the duplicate output.
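The key-change trick reduces to "compare each value with the previous one" over key-sorted input; sketched in Python:

```python
# Sketch: first occurrence of each key (keyChange = 1) goes to the
# unique output, later occurrences (keyChange = 0) to the duplicate output.
def split_by_key_change(sorted_values):
    uniques, dupes = [], []
    prev = object()                  # sentinel that matches nothing
    for v in sorted_values:
        if v != prev:                # keyChange = 1
            uniques.append(v)
        else:                        # keyChange = 0
            dupes.append(v)
        prev = v
    return uniques, dupes

u, d = split_by_key_change([10, 10, 20, 20, 20, 30])
print(u, d)  # -> [10, 20, 30] [10, 20, 20]
```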
Gopan.P
OSH (Orchestrate shell) is the answer: when we develop any parallel job, it is converted into an OSH script in the background. C++ is also involved, because when we use a Transformer stage the code generated in the background is C++.
nareshketepalli
Increasing nodes will not always increase speed; it might create locks, for example in a mass update.
Indudhara V
A node is just a process. It is a logical construct used to increase the efficiency of jobs by running them in parallel, just like multiprocessing in an operating system. Node processes may run on the same processor or on different processors.
taruna.arya
Say I have 5 rows in a source table, for each row 10 matching rows in a lookup table, and my lookup range is 9 to 99. What will be the row count in the output table?
barak
That depends: if you set up "more than one row match", you will get a Cartesian product.
ramireddy
You will get the same number of records as in the source, even if there are duplicates in the lookup data.
Design a job
Alekhya ch
I have a table (Emp) with the columns Eid, Ename, Sal, month(sal), year(sal) and DOB (say 15th-Jan-1981). Design a job such that the output displays Ename, year(sal), tot(sal) and current age, e.g. 18 yrs.
barak
Using a Transformer, take the days between DOB and today and divide by 365; we can check for leap years with 366 if that's a big deal.
taruna.arya
Suppose I have a source file and 3 output tables, and I want the first row written to the first table, the second row to the second table, the third row to the third table, and so on. How can we achieve this in DataStage without using partitioning?
Dheerendra
Use SEQ ---> Transformer stage ---> 3 SEQ files. In the Transformer stage, add the constraints Mod(@INROWNUM, 3) = 1, Mod(@INROWNUM, 3) = 2 and Mod(@INROWNUM, 3) = 0.
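Since @INROWNUM starts at 1, Mod(@INROWNUM, 3) cycles 1, 2, 0 and routes rows round-robin; the same logic in Python:

```python
# Sketch: round-robin routing by row number, mirroring the
# Mod(@INROWNUM, 3) constraints (row numbers start at 1).
def route_by_row_number(records):
    tables = {0: [], 1: [], 2: []}
    for rownum, rec in enumerate(records, start=1):
        tables[rownum % 3].append(rec)
    # remainder 1 -> table 1, remainder 2 -> table 2, remainder 0 -> table 3
    return tables[1], tables[2], tables[0]

t1, t2, t3 = route_by_row_number(["r1", "r2", "r3", "r4", "r5", "r6"])
print(t1, t2, t3)  # -> ['r1', 'r4'] ['r2', 'r5'] ['r3', 'r6']
```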
taruna.arya
How can we identify updated records in DataStage - only the updated records, without any row-id or date column available?
Vikram Singh
I believe you can use the Change Capture stage, which shows you the data before and after an update was made. It should work if there are no other constraints.
naveen.chinthala
How do I get the top five rows in DataStage? I tried the @INROWNUM and @OUTROWNUM system variables in a Transformer, but they do not give unique sequential numbers for every row. Please help! Thanks in advance!
Vikram Singh
You can use the Head stage; that would be the most convenient way of getting the top N rows from a dataset.
Poorna
You can restrict the data at the source stage itself using the filter option. Apply in the filter: head -5
taruna.arya
SCD (Slowly Changing Dimension) is a common problem particular to databases. It applies to cases where the attributes of a record vary over time. It can be handled in 3 ways: Type 1, Type 2 and Type 3.
taruna.arya
How do we read a comma-delimited file in the Sequential File stage, and how can we remove the header and footer from a comma-delimited file?
deepakatsit
For removing the header you can use the UNIX sed command.
deepakatsit
In the Sequential File stage there is an option "First Row Column Name"; set it to True so the header is removed, and select the appropriate File End Type through which the footer is removed. For the comma delimiter use: Field Defaults -> Delimiter = Comma.
Datastage job
hema123
I have a sequence of jobs in DataStage which is taking more than 4 hrs when it is supposed to complete in less than 1 hr. What could be the possible reasons for it taking much longer than expected?
RamyaSujith
Check if any stage is reading/processing data sequentially when it could have been done in parallel.
Raveena Mittal
Sometimes DML statements are executed in the DB and not committed; that can also keep the job running for long hours.
Boopathy Srinivasan
Hello guys, I would solve this using the Change Capture stage. First, I use source A and reference B, both connected to the Change Capture stage. From the Change Capture stage it connects to a Filter stage and then to targets X, Y and Z. In the Filter stage: change code = 2 goes to X [1,2,3,4,5], change code = 0 goes to Y [6,7,8,9,10], change code = 1 goes to Z [11,12,13,14,15]. Revert to me, please.
Jithin
Do a full outer join between the two files, and from a Transformer draw three output links:
1st link --> wherever the left side is null
2nd link --> wherever the right side is null
3rd link --> wherever there is a match
ghost
Create one PX job. Source file = seq1 (1,2,3,4,5,6,7,8,9,10); 1st lookup = seq2 (6,7,8,9,10,11,12,13,14,15); output: matching records to o/p 1 (6,7,8,9,10), non-matching records to o/p 2 (1,2,3,4,5); 2nd lookup: s...
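All three designs compute the same three sets - records only in A, records in both, and records only in B. With the sample data it is plain set arithmetic:

```python
# Sketch: the matched / left-only / right-only split behind the
# Change Capture, full-outer-join and double-lookup designs.
def compare_files(a, b):
    sa, sb = set(a), set(b)
    only_a = sorted(sa - sb)     # e.g. records to delete
    matched = sorted(sa & sb)    # unchanged records
    only_b = sorted(sb - sa)     # e.g. records to insert
    return only_a, matched, only_b

x, y, z = compare_files(range(1, 11), range(6, 16))
print(x, y, z)  # -> [1..5], [6..10], [11..15]
```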
anunath
pundlik
SAS DI Studio is best when compared to Informatica and DataStage, as it generates SAS code at the back end. SAS is highly flexible compared to other BI solutions.
Kiranchandra
How do you get files from different servers onto one server in DataStage by using a Unix command?
Kiranchandra
Display files date wise, like Aug 18th, 19th and 29th data files, by using a Unix cmd?
Ashok
You can display files date wise with a normal ls -latr command.
Kiranchandra
How can we load three different flat files (file 1 .txt, file 2 .csv, file 3 .xml) into a sequential file at one time?
Devesh Ojha
If the metadata is the same then we can load them by doing a union operation; if the metadata is different, first sync the metadata and then load them.
naveen.chinthala
I have explored all the available functions in the Transformer stage but could not find the exact function to get the last day of the current month. Can you please show me which function is available for this logic?
Arunjith B Indivar
DaysInMonth(CurrentDate())
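DaysInMonth gives the day count of the month, which is also the day number of the month's last day. The same idea in Python, using the standard library's calendar.monthrange:

```python
import calendar
from datetime import date

# Sketch: build the last-day-of-month date from the month's day count.
def last_day_of_month(d):
    days = calendar.monthrange(d.year, d.month)[1]  # (first weekday, day count)
    return date(d.year, d.month, days)

print(last_day_of_month(date(2015, 2, 10)))  # -> 2015-02-28
```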
What are Routines, where/how are they written, and have you written any routines before?
DataStage Interview Questions
Routines are stored in the Routines branch of the DataStage Repository, where you can create, view or edit them. The following are the different types of routines: 1) transform functions, 2) before/after job subroutines, 3) job control routines.
Chalapathirao Maddali
Jul 11th, 2014
DataStage has 2 types of routines. 1. Before/After subroutines, 2. Transformer routines/functions. Before/After subroutines are built-in routines which can be called in...
bvrp
Routines are stored in the Routines branch of the DataStage Repository, where you can create, view, or edit them using the Routine dialog box. The following program components are classified as ...
Ramyapriya Sudhakar
Jul 9th, 2014
Operational data store: unlike a real EDW, its data is refreshed in near real time and used for routine business activity. It is used as an interim logical area for the data warehouse. This is the pla...
Dharmendra
An operational data store (or "ODS") is a database designed to integrate data from multiple sources to facilitate operations, analysis and reporting. Because the data originates from multiple sources, ...
google_yahoo
What is the purpose of using user-defined environment variables and parameter sets? I am a little bit confused. Could anyone explain it to me in detail?
Charmi
Hi,
A parameter set is used when you want a set of user-defined variables to be used many times in a project. For example, variables like server name, user id and password can be added to a parameter set, and that set can then be used across jobs instead of including the three variables every time.
karthick
Write a server job routine that takes the file as input and reads the parameters from it. If the file contains more than one parameter, each on a separate line, then your routine should concatenate them...
Sam Geek
Keep the previous value in one stage variable, stg_v1, and the present value in stg_v2. Compare the two; if greater, then set stg_v1 = stg_v2 and move to the next value, else loop.
srinivas
How do you delete the header and footer of a source sequential file, and how do you create a header and footer on a target sequential file, using DataStage?
Kalai
"Output --> Properties --> Option --> Filter --> add the sed command here" to delete the header and footer records.
leelasankar.pr
By using the UNIX sed command we can delete the header and footer, i.e. sed '1d' for the header and sed '$d' for the footer.
NaveenKrish
A sequence calls activity 1, activity 2 and activity 3. While running, activities 1 and 2 finished but 3 got aborted. How can I design the sequence so that it runs from activity 2 when I restart the sequence?
Mallikarjuna_G
To make the job re-run from activity 3, we need to introduce restartability in the sequence job. For this, the points below have to be taken care of in the job sequence. Adding checkpoints: checkpoints have t...
Ritwik
You have to check the "Do not checkpoint run" checkbox for activity 2. If you set the checkbox for a job, that job will be re-run whenever any job later in the sequence fails and the sequence is restarted.
How to separate two different datatypes and load them into two files?
premox5
I think this question is meant to confuse the job aspirant by mentioning datatypes and all. It is very simple: File1 --> 2 columns, 1. NO (Integer), 2. DEPT (Char). Target1: NO (Integer); Target2: DEPT (Char). Take ...
Lubna Khan
In the Transformer stage there are functions IsInteger and IsChar; we can identify: if IsInteger(column name) then file1 else file2.
Phantom
vij
Hope the points below help you. Join stage: 1) it has n input links (one primary and the rest secondary), one output link, and no reject link; 2) it has 4 join operations: inner...
goodfriendsri
The main differences between 8.1 and 8.5: 8.5 has input looping and output looping; in 8.5 saving, editing and compiling are 40% faster; and 8.5 has functions like LastRow, LastInGroup and the iteration system...
Rupesh Agrawal
Nov 8th, 2013
kishore
Itishree
You can use either of the options below: Seq File -> Sort -> Remove Duplicates -> O/P (in the Remove Duplicates stage, choose the key column and the duplicate-to-retain = last property), or Seq File -> Remove Duplicates -> O/...
Murat Nur
11
21
32 ...
amulas
How can we retrieve particular rows in a dataset by using the orchadmin command?
Sushils13
3
6
6
bhargav
First take a Copy stage, then from the copy take 2 stages at a time, a Lookup and an Aggregator. In the Aggregator take count -> id; after filter1, the count = 1 output goes to the Lookup as a reference link while the stream link comes from the Copy stage; then take filter2 -> count = 1 to target1 and count > 1 to target2.
Rohit K A
Define 3 stage variables in the Transformer stage. StageVar1 holds the input field ID values, StageVar2 holds the StageVar1 value, and then write a condition in StageVar3 -> If StageVar1 = StageVar2 then "Repea...
singh6
vinay
Name cleaning
Address cleaning
Ram_1104
101, 1, 5, 1
Records are incremented based on the Qty value. If the Qty value is 4 then the o/p will be as below:
Num, SeqNo, Ln, Qty
101, 1...
dwhnovice
Hi guys,
I have 3 yrs of experience in DataStage, though not much practical experience due to various reasons. Now I have been asked this question in 4 of my interviews and I always flounder at it. I have tried different approaches: telling the truth, telling about a real situation I faced (which actually was not that difficult), but I always seem to flounder at this question...
Kamalakar Kalidindi
I too am fed up with this question. I gave an answer like this: every new job is difficult; when we are building a job for the first time it will be difficult. Among those, implementing SCD Type 2 (ins...
DataStage Interview Questions
We use a) the "Iconv" function - internal conversion, and b) the "Oconv" function - external conversion. The function to convert mm/dd/yyyy format to yyyy-dd-mm is Oconv(Iconv(Fieldname, "D/MDY[2,2,4]"), "D-MDY[2,2,4]").
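The Iconv/Oconv pair parses the date in one format and re-emits it in another. The Python analogue uses strptime/strftime (shown here converting to ISO yyyy-mm-dd for illustration):

```python
from datetime import datetime

# Sketch: parse mm/dd/yyyy, re-emit in another format -
# the same parse-then-format idea as Oconv(Iconv(...)).
def convert(mdy):
    return datetime.strptime(mdy, "%m/%d/%Y").strftime("%Y-%m-%d")

print(convert("06/27/2005"))  # -> 2005-06-27
```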
ARJUN REDDY
Sushils13
Oracle Call Interface (OCI) is a set of low-level APIs used to interact with Oracle databases. It allows one to use operations like logon, execute, parse, etc., from a C or C++ program.
yaoliang
Sushil
Basically, an environment variable is a predefined variable that we can use while creating a DS job. We can set it either at the project level or the job level. Once we set a specific variable, that variable is available in the project/job.
glaciya
Environment variables in DataStage are mostly paths that the system can use as shortcuts when running programs, instead of repeating the same settings everywhere. In most cases, environment variables are defined when the software is installed.
G_G_GOUD
How do you perform a left outer join and a right outer join in the Lookup stage?
sivaksa
You need a right outer join in Lookup: change the link order (left to right, right to left) and set the lookup failure condition to Continue...
mallika_chaithu
In the Lookup stage properties you have a Constraints option. If you click the Constraints button you get options like Continue, Drop, Fail and Reject. If you select the option Continue, it means l...
DataStage Interview Questions
Sequencers are job control programs that execute other jobs with preset job parameters.
A sequencer allows you to synchronize the control flow of multiple activities in a job sequence. It can have multiple input triggers as well as multiple output triggers. The sequencer operates in two modes: ALL mode, in which all of the inputs to the sequencer must be TRUE for any of the sequencer outputs to fire; and ANY mode, in which output triggers can be fired if any of the sequencer inputs are TRUE.
Mallikarjuna_G
The Sequencer activity stage is used to control and coordinate the flow of execution in a job sequence. It can accept multiple input triggers and multiple output triggers. It has two modes...
Bimal Pradhan
As the name suggests, it basically executes jobs in sequence. For example, if you have 5 jobs which are interdependent, you would call them from a sequencer; the execution of the 2nd job is dependent on, and will be triggered only after, the execution of the 1st job.
DataStage Interview Questions
In almost all cases we have to manually delete the data inserted by the aborted run from the DB, fix the job, and then run the job again.
Have you set the compilation options for the sequence so that, in case a job aborts, you need not run it from the first job? By selecting that option you can run the aborted sequence from the point where it was aborted. For example, if you have 10 jobs (job1, job2, job3, etc.) in a sequence and job 5 aborts, then by checking "Add checkpoints so sequence is restartable on failure" and "Automatically handle activities that fail" you can restart the sequence from job 5 only; it will not run jobs 1, 2, 3 and 4. Please check these options in your sequence. Hope this helps.
Mallikarjuna_G
Two things need to be handled when a job sequence aborts: 1. we must have exception-handling code that notifies us about the failure; 2. when we re-run the sequence after fixing it, ...
Riten
To handle an aborted sequence, these are the steps to be taken: first add a Terminator to the job sequence and choose the Terminator with the "Other" option (trigger in the job sequence); if a job fails it will go to the Other lin...
How to connect two stages which do not have any common columns between them?
Praveen
Mallikarjuna_G
If those two stages are sources, and the columns from the two sources have different names but are actually the same, then use a Copy stage in front of one source stage and rename the columns as pe...
sharmilas
How do you fetch the last row of a particular column? The input file may be a sequential file...
Mallikarjuna_G
There are multiple ways to do this, given that the input is a sequential file. Two of them:
1. Use the "Filter" option available in the Sequential File stage and specify a Unix command like tail -1 (recommended).
There are 2 ways to fetch the last row of any file when the number of records is not known before run time:
1) use the Tail stage and run it in sequential mode;
2) use the LastRow() function in the Transformer stage.
rameshkk
Use a Copy stage instead of a Transformer for simple operations like: a placeholder between stages, renaming columns, dropping columns, and implicit (default) type conversions. Use stage variab...
Prabhakar Achyuta
1. First filter, then extract - don't extract and then filter. Use SQL instead of the table method when extracting. Say 1 million records are coming from the input table but there is a filter condition (Acct_Type...
siva3me
What are the uses of the Copy stage besides copying the input link to output link datasets? Does it have any other purposes? Please send me an example.
Mallikarjuna_G
Besides making copies of the input, the Copy stage is also helpful for 1) dropping columns between stages, 2) changing column names, 3) scenarios where the job flow should end directly f...
Mallikarjuna_G
In a Slowly Changing Dimension (SCD) implementation we use a surrogate key, as the primary key gets duplicated for the sake of keeping history data for records with the same PK.
Datastage partition
sreereddi
For Join, Merge and Remove Duplicates, have the data on the links hash-partitioned and sorted on the specified key columns. For Lookup, the primary link needs to be hash-partitioned and sorted, and the reference link has to use the Entire partition method.
Shaik
Key partitioning is required, and the data should be sorted, before all of these stages.
Mallikarjuna_G
It depends on the job design and requirement. When you specify SAME partitioning, DataStage uses the partitioning method defined in the previous stage and will not perform any partitioning in the current...
Muralidhar
Same partitioning. The reason is that it keeps the previous partitioning and sends it to the output as is.
No of processors in DataStage
naveen.chinthala
How do we know the number of processors in a job? Is there any specific calculation for this?
Mallikarjuna_G
There will not be any processors in the job as such; your question is probably how to find the number of nodes/processors on which the job runs. The answer: go to Director and open the log for the ...
Kumaresh
Run the job, go to the Director log, and check the APT config data displayed there. That will show the number of processors/nodes.
sunitha.gummudu
Hi,
Job1 runs for 10 minutes the first time, and the same job1 runs for 15 minutes the second time, the load being the same for both. Could someone explain?
Mallikarjuna_G
One reason could be that the first time, the job is just loading/inserting data into the target. When you run the same job again it takes more time, as from the second run onward it tries to update. Upda...
Raju Nath
Hi, you can use a delete-and-then-load approach during loading; then it will take the same time each run. The first time you load, the dataset/table is empty, which is why it takes less time, bu...
Answer Question Select Best Answer
JUN
262013
08:34 AM
4286
Views
3
Ans
Count in Dataset
o
rameshkk
How to get the dataset record count without using the orchadmin command?
Mallikarjuna_G
Browse the path where the data set files are stored and run wc -l filename
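One caveat with the answer above: a parallel Data Set is stored as binary segment files, so wc -l on them is not reliable. In a DataStage environment the count is usually taken from the engine utilities instead. A sketch, assuming the engine environment can be sourced from /.dshome and /data/work/mydata.ds is a hypothetical dataset path:

```shell
# Source the engine environment so the utilities are on the PATH
cd `cat /.dshome` && . ./dsenv

# Option 1: dsrecords reports the record count of a dataset directly
dsrecords /data/work/mydata.ds

# Option 2: dump the records as text and count the lines
orchadmin dump /data/work/mydata.ds | wc -l
```

Since the question asked for a way without orchadmin, dsrecords (option 1) is the usual alternative.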
Answer Question Select Best Answer
MAR
282005
06:18 PM
4951
Views
4
Ans
Data modelling is broadly classified into 2 types: a) E-R diagrams (Entity-Relationship); b) Dimensional
modelling.
BHANU
o
Dimensional modelling is a technique for conceptualizing and visualizing data. There are
two types of dimensional models:
1. Snowflake schema
2. Star schema
cyberdiptikanta
OCT
042008
05:25 AM
9032
Views
3
Ans
Vanshika248
What are these terms used for in Datastage - Downstream and Upstream?
mohan
o
Could you please cite examples of upstream and downstream applications?
Answer Question Select Best Answer
DEC
062005
05:28 PM
6977
Views
8
Ans
vishut
ss
o
1 conductor process
3 section leader processes
3 player processes (modify and filter are combined)
1 source sequential file
1 target sequential file
Total : 9
Answer Question Select Best Answer
MAR
042013
11:47 AM
4566
Views
2
Ans
goutam421
Your database size is the sum of all datafiles, tempfiles and redo logs, so check the sum from
dba_data_files, dba_temp_files and v$logfile.
Harikrishna Chidrala
o
Mar 15th, 2013
It depends on the environment. For example, my Development database is 6 TB, the UAT server
is 12 TB, and Production is ~25 TB.
Answer Question Select Best Answer
MAR
062013
02:17 AM
2456
Views
1
Ans
skjilani29
ysubba
o
OCT
202011
11:05 PM
4970
Views
1
Ans
Datastage 8.1
o
rachel797.ds
In DataStage 8.1, what is the limit on file size? Is there a limit on the number of rows and number of fields for
a file extract to be fed into the Profile stage?
varun khare
In 8.0/8.1 it uses a parser which requires the entire XML document to be loaded into memory,
so you're limited by the amount of available memory. In 8.5 it uses a really clever
streaming approach.
Answer Question Select Best Answer
JAN
252008
06:47 AM
4746
Views
4
Ans
Dataset in UNIX
o
manoharkolukula
How to see the data in a Dataset in UNIX? What command do we have to use to see the data in a Dataset in
UNIX?
karn khera
o
The command to use is orchadmin dump datasetname (without quotes). But
before that, run the command cd `cat /.dshome` and source dsenv. The reason: cd
`cat /.dshome` -> this will cha...
Saravanan Mani
o
Jun 20th, 2012
Orchadmin rm datasetname (note: rm removes the dataset; use orchadmin dump to view the data)
Answer Question Select Best Answer
DEC
282012
12:08 AM
4479
Views
1
Ans
google_yahoo
Hi all, pls let me know the purpose of using User defined environment variables and parameters sets
MallikarjunaG
A user-defined environment variable is a placeholder to store a value that can be used across the
entire project. A parameter set is a time-saving feature added in DataStage 8.x; a parameter
set is a set of jo...
Answer Question Select Best Answer
MAR
132009
07:37 AM
7896
Views
11
Ans
rajivkumar23us
A sequential file has 8 records with one column; the values, separated by spaces, are: 1 1 2 2 3 4 5 6.
In a parallel job, after reading the sequential file, 2 more sequential files should be created,
one with the duplicate records and the other without duplicates. File 1 records, separated by space: 1 1 2 2. File
2 records, separated by space: 3 4 5 6. How will you do it?
hussy
o
It's very simple: 1. Introduce a Sort stage right after the sequential file. 2. Select the
Key Change Column property in the Sort stage; it assigns 0 for unique or 1 for duplicate (or vice versa),
as you wish. ...
Hemant Kanthed
o
Feb 19th, 2012
After the source sequential file we can use a Sort stage with the key-change column, in which 0 is assigned to
duplicate records and 1 is assigned to non-duplicate records; after the Sort stage we can use a
Transformer stage in whic...
Answer Question Select Best Answer
OCT
302009
12:16 AM
5127
Views
3
Ans
srikanth.ds
If you are given a list of .txt files and asked to read only the first 3 files using the Sequential File stage, how will you do it?
matan
o
In the Sequential File stage we can read a single file using the Specific File option.
But we can read more than one file using File Pattern with different file names.
The metadata must be the same.
narra satish
Hi, in the Sequential File stage we can read a single file by using the Specific File option, but we can
read more than one file using File Pattern with different file names. The metadata must be the
same.
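When the File Pattern cannot express "first three", a common approach is to build the file list outside the job and pass it in as a parameter (or concatenate the files upstream). A plain-shell sketch with hypothetical file names:

```shell
# Set up four sample .txt files in a scratch directory
mkdir -p src && for f in a b c d; do echo "$f" > "src/$f.txt"; done

# Take the first three files in alphabetical order; this list could be
# passed to the job as a parameter value
first3=$(ls src/*.txt | sort | head -3)
echo "$first3"
```

Note the metadata of all three files must still match, as the answers above point out.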
Answer Question Select Best Answer
AUG
112010
04:34 AM
14215
Views
9
Ans
greek143
If you have numeric+character data in the source, how will you load only the character data to the target?
Which functions will you use in the Transformer stage?
raj
o
Example: raje123ndh456ar
Convert("0123456789", "", InputLink.Col) = rajendhar
The digits are removed, so only the character data is loaded.
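Outside the Transformer, the same digit-stripping that Convert performs can be checked from the shell with tr:

```shell
# Delete every digit, leaving only the character data
echo 'raje123ndh456ar' | tr -d '0-9'
```

which prints rajendhar, the expected character-only value.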
Vinay Sharma
o
AUG
212008
01:44 AM
4095
Views
8
Ans
rajeshdhannawat1
What is
1.Stage variables
2.Constraints
3.Derivations
AMIT
o
Order of execution
1. Stage variable
2. Constraints
3. Derivations
Answer Question Select Best Answer
MAY
292012
08:14 AM
7777
Views
2
Ans
Jothi D
In case of only dropping columns (without any transformations or business rules), we can go for the Copy stage
instead of a Transformer. But can anyone tell me exactly why the Copy stage performs better than a
Transformer?
Atangel
The Transformer is a heavier stage: apart from the orchestrate operator, we also have the C++ operators
for the various derivations. From compilation to execution, a Transformer will always take
more time than a Copy stage, which is a simple, straightforward, passive (if we can call it that) stage.
kkreddy
o
The Copy stage is a passive stage and the Transformer is active and involves more processing, which
Copy does not. So it's good to go with Copy, since it takes less time than a Transformer to
propagate the columns over the link.
Answer Question Select Best Answer
AUG
242011
07:58 AM
7595
Views
3
Ans
dronadula
The source Sequential File stage has 10 records that move to a Transformer stage; one output link gets 2
records and the reject link gets 5 records. I want the remaining 3 records — how do I capture them?
Shirisha
o
You can choose the Otherwise (O.W.) option in the constraints; that way you will get the remaining records.
anil_k_nayaka
In the Transformer stage constraints we can define which records go to which link, so that
the missing records can be caught.
Answer Question Select Best Answer
JUL
042008
04:33 AM
6642
Views
13
Ans
Sequencer Scenario
o
pavan.daddanalla
Scenario: suppose we have 3 jobs in a sequence; while running, if job1 fails, we still have to run job2
and job3. How can we do that? Please answer; thanks in advance.
Rekha Ramakrishnan
In the first job activity's trigger there are options like Conditional, Unconditional,
Failed etc. You can select Unconditional so that even if the first job fails or succeeds, it
automatically moves on to the next job.
Gopi N
o
If the scenario is that the 1st job aborts and we then trigger the 2nd and 3rd, do the following: set the
link trigger condition to "job aborted" on the links going to the 2nd
and 3rd j...
Answer Question Select Best Answer
FEB
232008
08:45 AM
7263
Views
7
Ans
datastage8
If you have a huge volume of data to be referenced, which stage will you use? Join or Lookup stage?
Why?
Gopi N
o
If we have huge data on the reference link, we should definitely go for the Join stage, because Lookup
takes much time to process (it uses the Entire partition method), whereas the Join stage takes
sorted data and handles this case better than a Lookup.
arjunreddy
o
Look up stage
Answer Question Select Best Answer
APR
112012
12:07 AM
3314
Views
1
Ans
upendarkm
Gopi N
o
We can filter the data in a hash file based on the key column. If we have duplicate keys, we can
reject them into files as well by writing a SQL statement.
Sowmya
joel
o
The above answer is wrong. Please follow the steps below to load the Excel file in DataStage: first open the Excel sheet, then File -> Save As -> save the file with a .csv extension; while
importing you have to s...
srinu5077
I did the same as step 1, but when I import the metadata it asks for a username and password;
when I press the OK button it shows NO MATCHES FOUND. Please help me.
Thank you
Regards
Srinivas
Answer Question Select Best Answer
JAN
082007
10:48 AM
2509
Views
1
Ans
izack
Shiv
o
1. Dynamic RDBMS is the only stage that supports N inputs and N outputs.
2. Using Dynamic RDBMS we can read multiple tables independently.
Answer Question Select Best Answer
AUG
292005
08:10 AM
3771
Views
3
Ans
mcrao1
DataStage job run from the Unix command line: I am running a DataStage job from the Unix
command line with job-level parameters and the job is aborting. Can someone tell me if
there is any syntax problem in t...
Pavan
o
We can call a DataStage batch job from the command prompt using 'dsjob'. We can also pass all
the parameters from the command prompt, and then call this shell script from any of the
schedulers available in the market. The second option is to schedule these jobs using the DataStage Director.
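A minimal sketch of such a dsjob invocation from a shell script, assuming the engine environment can be sourced from /.dshome; the project, job and parameter names below are placeholders:

```shell
# Source the engine environment so dsjob is on the PATH
cd `cat /.dshome` && . ./dsenv

# Run the job with two job-level parameters; -jobstatus makes the
# command wait and return an exit code reflecting the job's final status
dsjob -run -mode NORMAL \
      -param pLoadDate=2024-01-31 \
      -param pDBUser=etl_user \
      -jobstatus MyProject MyJob
```

Checking $? after the call is what lets an external scheduler decide whether the job succeeded.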
Answer Question Select Best Answer
APR
112012
12:35 AM
4811
Views
1
Ans
upendarkm
The errors can be handled in different way, will list few of the methods which we can
achieve via ETL If it is a critical fact table (Merchandise, location and time (weekly)) then
Non-Critical Data...
Answer Question Select Best Answer
APR
262012
09:38 AM
3818
Views
2
Ans
Constaints
o
upendarkm
Constraints are used to filter the data, as in the Transformer stage. In SQL there are different types of
constraints: null, not null, primary key, foreign key and unique. A unique
constraint does not allow duplicate values, and a not-null constraint means the column must contain
data.
suresh
o
A constraint is nothing but a condition; it might be any type of condition. As for unique and
not null: unique maintains one record and does not accept duplicate
records; not null m...
Answer Question Select Best Answer
MAR
132006
03:15 PM
8921
Views
7
Ans
mallikharjuna reddy
praveen
o
Data Set and File Set are both file stages used to stage extracted data. The Data Set stage does not
have a reject link; the File Set stage has a reject link.
arjun
o
The fundamental concept of the Orchestrate framework is the Data Set. Data Sets are the
inputs and outputs of Orchestrate operators. As a concept a Data Set is like a database
table, in so far as it...
Answer Question Select Best Answer
AUG
252005
06:32 AM
2475
Views
8
Ans
pradeep
o
Actually, DataStage Director is a GUI-based client component.
The main use of Director is to view logs; we can see the log and the status of a running job. By
using multiple instances we can run the job repeatedly for testing
purposes, and we can schedule jobs.
abhijeet
APR
112012
12:30 AM
4030
Views
1
Ans
upendarkm
pradeep
o
To the best of my knowledge, partitioning techniques may be used to increase performance,
along with selecting the relevant stage for developing the transformation.
Answer Question Select Best Answer
MAY
242005
05:52 PM
2010
Views
3
Ans
DS 7.0.2/6.0/5.2
svr
o
Datastage 7.5x2
Answer Question Select Best Answer
APR
262007
04:46 AM
2269
Views
2
Ans
sandy123123
Seio
I'm interested in this answer because, for some reason, my jobs appear uncompiled after I
begin execution, and I need an automatic process that recompiles the jobs when it
finds them uncompiled.
Thanks.
nikhilanshuman
If the job failed to compile, there must be some problem with the job. If the problem is not
fixed, then no matter how many times it is recompiled, the compilation will fail each
time. An automatic recompile option does not seem to exist in DataStage.
Answer Question Select Best Answer
MAY
242005
05:47 PM
4010
Views
2
Ans
KRISHNA
o
Explain the interviewer about your project, its architecture and your role in that project. The
role can be developer, Tester, BA, lead etc.. Also, explain what you did in that project like
coding, preparing UTC, contributing effectively with testers to complete the UAT phase of
the project etc...
JUN
212010
10:10 PM
3844
Views
2
Ans
gantaravindranath
How do you perform commit and rollback in loading jobs? What happens if the job fails in between, and
what will you do?
Sivaramakrishna
o
Feb 23rd, 2012
There is an option in the RDBMS stage, the Transaction Isolation tab, that controls whether data is
committed or uncommitted.
Gokul21
There is an option called "Transaction Grouping" available. You can specify the condition
there to commit or roll back if the job fails. This option is available in Teradata.
Answer Question Select Best Answer
FEB
172011
12:10 PM
3751
Views
4
Ans
nag_sree
If you want the rejected records also, you need to use an outer join so that all the records are
carried forward, and then you'll need to use a Transformer as the next stage. In the Transformer,
you can use an appropriate condition in the constraints and redirect the unwanted records to a
sequential file.
Bharath
o
After the Join stage, you can put a Filter to filter out the records having null values in the
columns coming from the right side of the join.
Answer Question Select Best Answer
AUG
122008
10:39 AM
3548
Views
3
Ans
rajashekar kuraku
Datastage allows you to create your own stages with custom properties. There are three
types of custom stages: Custom: This is Orchestrate based custom stage. Orchestrate
operators are used for def...
MArcus_Datastage
FEB
082012
05:25 AM
2691
Views
0
Ans
MrReddy
Hi,
I have installed DataStage 7.5 on the XP operating system and also installed the Visual Studio
compiler. But when trying to compile a DataStage job which contains a Transformer stage, I am unable
to compile.
I am getting the warning below:
FEB
262011
06:08 AM
3208
Views
1
Ans
praveen.bollu
What happens if we run a 7.5-version job in version 8? What is the error?
Mohammadsadiq
o
Feb 2nd, 2012
If we compile a DataStage 7.5x2 job in compatibility mode then it will run; if you don't compile it
and run it in 8.x, then it will pop up some error.
Answer Question Select Best Answer
JUN
012010
12:29 AM
2209
Views
4
Ans
Nodes
o
gantaravindranath
A node is nothing but a logical processing unit; the configuration file identifies the number of nodes
on which parallel jobs can run.
j.padma89
OCT
232007
03:44 AM
3653
Views
4
Ans
amarnreddy09
vjviji86
I work on DataStage 8.1. Why do most clients' frameworks prefer the Connect:Direct protocol
when compared to SFTP?
nikhilanshuman
In DataStage there are client components like Designer, Manager and Director. When you
open these applications, you are asked for the user id, password and the project name to
which you want to connect. The...
Answer Question Select Best Answer
NOV
072005
07:16 AM
2658
Views
1
Ans
ramireddy
anil_k_nayaka
UTF-8 is a character encoding in which DataStage jobs can read files.
Such a file may contain characters not only in English but also in other languages.
Answer Question Select Best Answer
FEB
282007
02:36 AM
7465
Views
3
Ans
infinity
kaps3157
o
Both stages have the same functionality and responsibilities, but they differ in the way they
execute. In the Filter stage we can give multiple conditions on multiple columns, but every
time the data ...
nikhilanshuman
A Switch stage can have a maximum of 128 output links. A Filter stage can have any number of
output links.
murali.d
How do you extract flat files using the Sequential File stage or a Data Set? From where do you get these flat files as source?
manish parashar
o
Jan 10th, 2012
Flat files can be accessed from Unix by using the Sequential File stage. For example, if we
have a CSV file, we can give the path to this file in the Sequential File stage and access
the file directly.
Answer Question Select Best Answer
NOV
142011
01:58 AM
2957
Views
2
Ans
Gina Ying
I can only see the name of the stage variables but cannot see the definition, where can I find them?
manish_toy
If it is not visible in the transformer stage then the definition has not been given. You have to
provide the definition for the stage variable by double clicking to the left of the variable, then
you can give the definition as per your requirement.
Answer Question Select Best Answer
MAR
232006
12:25 PM
1478
Views
1
Ans
James
glaciya
Please check your language setting; the language of your computer may not match your
server's language.
Answer Question Select Best Answer
AUG
112010
05:05 AM
4229
Views
3
Ans
greek143
What are the initial values of stage variables? How can we set these values?
glaciya
In my understanding, the initial values are the values the stage variables hold before the first
row is processed; in other words, they are the default values, which you get the
chance to set when you create and configure the variables used in a stage.
G. Venu
In Transformer stage properties we can set the stage variable initial value.
Answer Question Select Best Answer
NOV
142011
01:51 AM
2825
Views
0
Ans
Gina Ying
I can see the names of newly defined variables in the transformer stage property, but I cannot see the
definition and the logic under it. Is it normal? Where can I find the definition of the stage variables?
Answer Question
OCT
202011
03:49 AM
4983
Views
0
Ans
chandukommuri
Could anyone please explain how we can use the Change Capture stage for SCD types 1, 2 and 3?
Answer Question
APR
222010
01:28 PM
1918
Views
2
Ans
Lookup Stage
o
sujan544
The Lookup stage is similar to the Join stage: whatever records go from source to target we can
process using either Lookup or Join. But the Lookup stage has a reject link option and Join doesn't
have a reject link option.
j.padma89
Look up stage is used to join dataset which has similar functionality of join stage with some
extras.
the difference between join and look up is here
http://www.geekinterview.com/question_details/82179
Hope it helps !
Answer Question Select Best Answer
JAN
262008
07:22 AM
5475
Views
4
Ans
pradeep.dwh
In how many ways can we delete a dataset? If a record is duplicated 3 times, how do we get the middle
duplicated record? Is it advisable to use the BASIC Transformer in parallel jobs?
chhavis928
We can use filter stage here to get middle record from 3 duplicate records.
srkreddy111
o
First open the Data Set stage, click on Partitioning, select hash partitioning, enable
Perform Sort, tick Unique, then click OK; compile and run the
job. Open the target output: the duplicate records are removed.
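For the "middle of three duplicates" part of the question, the usual DataStage approach is a sorted input plus a Transformer stage variable that counts rows within each key and keeps the 2nd one. That row-numbering logic can be sketched in shell with awk (the sample data and file name are made up):

```shell
# Sorted input with each key repeated three times
printf '%s\n' A A A B B B > trip.txt

# c[$1] numbers the occurrences of each key; keeping only the 2nd
# occurrence selects the middle record of each group of three
awk '{c[$1]++} c[$1] == 2' trip.txt
```

This prints A and B, the middle occurrence of each key.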
Answer Question Select Best Answer
AUG
032011
02:18 AM
8583
Views
4
Ans
Mohit
Hi, I'm not sure which version of DataStage you are using; this is pretty simple in
DataStage 8.5 by using the DTS stage.
Thanks and regards,
Venkat Duvvuri
Answer Question Select Best Answer
AUG
272007
03:13 AM
4302
Views
3
Ans
rangagopi
Venkat Duvvuri
o
Sep 9th, 2011
I agree with Hari's answer. The max size of the Dataset stage depends entirely upon
the size of the resource disk space which we have specified in the config file.
Regards,
Venkat Duvvuri
harishsj
The Max size of a Dataset is equal to the summation of the space available in the Resource
disk specified in the configuration file.
Answer Question Select Best Answer
AUG
262011
10:37 AM
3272
Views
0
Ans
shyam
Can anybody tell in detail why we use Rate Routing in Repetitive Mfg. instead Normal Standard Routing?
Answer Question
MAY
242005
05:50 PM
2619
Views
3
Ans
Did you parameterize the job or hard-code the values in the jobs?
Always parameterized the job. Either the values come from Job Properties or from a Parameter
Manager, a third-party tool. There is no way you should hardcode parameters in your jobs. The
most often parameterized variables in a job are: DB DSN name, username, password, and the dates
against which the data is to be looked up.
venkat
o
Never hardcode parameters in your jobs; always parameterize the job. The
values can come from Job Properties or a parameter set.
venkat
Bimal Pradhan
Never hard-code unless it is absolutely required. Always parameterize the job. If the input
parameters are dynamic or not in your control, they should be supplied by the scheduling jobs.
Answer Question Select Best Answer
MAR
282005
06:18 PM
7746
Views
2
Ans
You can use the dsjob executable from Unix or the command line. The previous post was
correct. Hanu.
Answer Question Select Best Answer
FEB
172011
12:09 PM
2498
Views
2
Ans
InvocationId
nag_sree
What is an Invocation ID? How do we find the Invocation ID? What is the need for an Invocation ID?
tapan8984
The Invocation ID is the unique identifier for distinguishing the instances of a multiple-instance job.
It is needed to uniquely identify the instances. While running the job there is a field for
the Invocation ID in the run window (pop-up).
narra satish
The Invocation ID is about instances. Normally a job runs as only one instance; by enabling multiple
instances it can run as several, as long as each running Invocation ID is unique.
Answer Question Select Best Answer
FEB
262011
06:11 AM
2232
Views
1
Ans
praveen.bollu
tapan8984
We can use the Number of Readers per Node property to enhance parallelism in the Sequential
File stage, hence enhancing the performance.
Answer Question Select Best Answer
OCT
202005
05:48 AM
7435
Views
4
Ans
sreedhar
mahesh
o
With NLS enabled, datastage can process data in a wide range of languages & accept data
in any character set.
Anwar
o
NLS is basically the local language setting (character set). Once you install DataStage you get NLS.
Just log in to the Administrator and you can set the NLS of your project based on your project
requirement. Just ne...
Answer Question Select Best Answer
MAY
232011
03:17 PM
7014
Views
1
Ans
dhora9999
SinhaS
1) First check that the Driver has been installed and the library env variable has been set
correctly to point to it by the UNIX admin who installed it ($LD_LIBRARY_PATH). 2) Go to
$DSHOME and update ...
Answer Question Select Best Answer
JUL
132011
02:02 AM
3347
Views
1
Ans
saratgunji
There is no such thing as "base partitioning". The partitioning method of a stage changes based
on the requirement. Mostly we follow Auto, i.e. DataStage will take Round Robin
by default (Auto). The Same partitioning method preserves the partitioning of the previous stage and can
improve the performance characteristics of a DataStage job.
Answer Question Select Best Answer
JUN
232011
06:17 AM
4867
Views
1
Ans
rupam
How many number of ways that you can implement SCD2 ? Explain them
RAJESH
o
NOV
142007
05:47 AM
2526
Views
4
Ans
Datastage Questions
o
Nagoor
What are multiple instances? How do you trigger the job in Windows? How can you see and delete a dataset in Unix and
Windows? What is the function of a shared container? Why don't we use odd numbers of nodes to run the job? What are
normal views and materialized views? When do you use a separate Sort stage versus the inbuilt sort utility in a stage,
and what is the difference? If a job is aborted, can we run the job without compilation? How do you commit...
j.padma89
How to check the no. of nodes while running the job in UNIX? Answer: go to the directory
path where the apt_config_file is located; there you can see the no. of nodes and other config
information like ...
j.padma89
FEB
212011
06:59 AM
1575
Views
1
Ans
Job Endtime
blueboys.dsdw
FEB
262011
06:12 AM
3436
Views
2
Ans
praveen.bollu
j.padma89
To run 10 (or any number of) jobs one after the other, or in any order, one can use job
sequences.
Each job activity in the sequence is assigned one job, so with 10 activities you can run all of them
from a single sequence job.
Using sequence jobs the developer can even decide what a job should do if it aborts.
amulas
By using multiple job compile from ds director, we can run multiple jobs at a time.....
Answer Question Select Best Answer
JUL
132011
02:36 PM
4425
Views
1
Ans
Incremental loading
amulas
Divya
o
There are ways to perform the incremental load. While performing incremental load Pls take
care of below points: 1. Retrieve changed records from last load run from source by using
appropriate extract...
Answer Question Select Best Answer
APR
092007
09:57 AM
3305
Views
4
Ans
Gnaneshwar
saratgunji
Both server jobs and parallel jobs run on the DataStage server, but server job
performance is slower than parallel jobs because parallel jobs run on SMP and MPP systems.
Parallel jobs have partition...
nikhilanshuman
DataStage parallel jobs can run in parallel on multiple nodes; server jobs do not run on
multiple nodes. Parallel jobs support partition parallelism (round robin, hash, modulus, etc.);
server jobs don...
Answer Question Select Best Answer
JUN
232011
07:50 AM
3536
Views
1
Ans
amulas
How do we pass parameters from UNIX? & How do we pass parameters by using UNIX shell scripting?
mcrao1
AUG
112006
08:08 AM
3587
Views
5
Ans
madycool
dhora9999
Yes, you are right. If you do not know the answer, please do not cut and paste here. Please
answer the questions properly; do not confuse anybody. Thanks, D!!
rajani
o
In the Filter option, you can use a UNIX command like this to filter the input records: head -3
input_file.dat
Answer Question Select Best Answer
SEP
052005
06:44 AM
5562
Views
8
Ans
o
DataStage Interview Questions
A surrogate key is a primary key for a dimension table. Its main advantage is that it is independent of
the underlying database, i.e. a surrogate key is not affected by changes going on in the database.
Datastage Etl
Surrogate Key is used to produce the Sequence numbers. So that, based on the
Surrogate Key generated, we can identify the Unique Id in any column. Surrogate
Key is mainly implemented in the...
ASHOK1324
A primary key does not allow duplicates in the actual source data, so we cannot maintain
the history of each record using the PK. The SID acts as a primary key in target warehouse systems;
it allows duplicates and maintains complete historical data along with the current data.
Answer Question Select Best Answer
FEB
262011
06:03 AM
4038
Views
2
Ans
praveen.bollu
The Copy stage is used to send the data to multiple outputs. Here we can change the
column names. Rather than using a Transformer stage, wherever possible we can use
the Copy stage. Like this, c...
narra satish
By using the Copy stage we can increase performance. In this stage we can rename columns and
remove unwanted columns.
Answer Question Select Best Answer
MAR
282005
06:18 PM
12229
Views
12
Ans
o
DataStage Interview Questions
Stage Variable - an intermediate processing variable that retains its value during a read and doesn't pass the
value to a target column. Derivation - an expression that specifies the value to be passed on to the target
column. Constraint - a condition, either true or false, that controls the flow of data on a link.
narra satish
Hi. Stage variables: these are used in the Transformer for holding a value from the input without
affecting the output data; they are applicable to all the output links. Derivation: these are the values
which pass...
amit101here
Read below from the IBM pdf: 1. Any before-stage subroutine is executed; if ErrorCode is non-zero, the job aborts. 2. A row is obtained from the stream input link. 3. For each reference
input li...
Answer Question Select Best Answer
MAY
192005
11:01 AM
2808
Views
7
Ans
o
DataStage Interview Questions
Amit_Mishra
The following is the order of execution done internally in the Transformer: stage
variables, then constraints, then derivations or expressions. Note: we can't change this order of
execution by using the Link Ordering option; Link Ordering is in no way related to this.
Answer Question Select Best Answer
MAY
242005
05:58 PM
2415
Views
1
Ans
o
DataStage Interview Questions
Most of the times the data was sent to us in the form of flat files. The data is dumped and sent to us. In
some cases were we need to connect to DB2 for look-ups as an instance then we used ODBC drivers to
connect to DB2 (or) DB2-UDB depending the situation and availability. Certainly DB2-UDB is better in
terms of performance as you know the native drivers are always better than ODBC drivers. 'iSeries...
Read Best Answer
MAY
242005
05:57 PM
4846
Views
1
Ans
What are other Performance tunings you have done in your last
project to increase the performance of slowly running jobs?
o
DataStage Interview Questions
Staged the data coming from ODBC/OCI/DB2UDB stages or any database on the server using
Hash/Sequential files for optimum performance and also for data recovery in case the job aborts.
Tuned the OCI stage for 'Array Size' and 'Rows per Transaction' numerical values for faster inserts, updates and
selects. Tuned the 'Project Tunables' in Administrator for better performance. Used sorted data for the
Aggregator. Sorted...
Read Best Answer
1. Minimise the usage of the Transformer (instead use Copy, Modify, Filter, Row Generator).
2. Use SQL code while extracting the data.
3. Handle the nulls.
4. Minimise the warnings.
5. Reduce the number of lookups in a job design.
6. Use not more than 20 stages in a job.
7. Use the IPC stage between two passive stages; it reduces processing time.
8. Drop indexes before data loading and recreate them after loading data into tables.
9. Generally we cannot avoid lookups if our requirements make lookups compulsory.
10. There is no hard limit on the number of stages like 20 or 30, but we can break the job into small jobs and then use
Data Set stages to store the intermediate data.
11. The IPC stage is provided in server jobs, not in parallel jobs.
12. Check the write cache of the hash file if the same hash file is used for lookup as well as target.
13. If possible, break the input into multiple threads and run multiple instances of the job.
14. Staged the data coming from ODBC/OCI/DB2UDB stages or any database on the server using
Hash/Sequential files for optimum performance and also for data recovery in case the job aborts.
15. Tuned the OCI stage for 'Array Size' and 'Rows per Transaction' numerical values for faster inserts,
updates and selects.
16. Removed the data not used from the source as early as possible in the job.
17. Worked with the DB admin to create appropriate indexes on tables for better performance of DS queries.
18. Converted some of the complex joins/business logic in DS to stored procedures on DS for faster
jobs, run in parallel.
19. Before writing a routine or a transform, make sure that the required functionality is not already available in one of
the standard routines supplied.
20. Make every attempt to use the bulk loader for your particular database. Bulk loaders are generally
faster than using ODBC or OLE.
sistlasatish
Minimise the usage of the Transformer (instead use Copy, Modify, Filter, Row
Generator). Use SQL code while extracting the data. Handle the nulls. Minimise the
warnings. Reduce the number of lookups in a ...
Improve Answer
MAY
242005
05:55 PM
1972
Views
1
Ans
o
DataStage Interview Questions
Link Partitioner - Used for partitioning the data.Link Collector - Used for collecting the partitioned data.
Read Best Answer
The Link Partitioner and Link Collector are basically used to introduce data parallelism in server
jobs. The Link Partitioner splits the data onto many links; once the data is processed, the Link Collector
collects the data and passes it to a single link. These are used in server jobs. In DataStage
parallel jobs, these things are built in and automatically taken care of.
Improve Answer
MAY
242005
05:54 PM
6129
Views
1
Ans
What are OConv () and Iconv () functions and where are they used?
o
o
Iconv is used to convert a date into the internal format, i.e. one that only DataStage can understand.
Example: a date coming in mm/dd/yyyy format.
DataStage will convert this date into a day number such as 740.
You can then derive the date in your own format from that number by using Oconv.
Suppose you want to change mm/dd/yyyy to dd/mm/yyyy:
you use Iconv and Oconv together, for example:
Oconv(Iconv(DateFromInput, "D/MDY[2,2,4]"), "D/DMY[2,2,4]")
(see the help for the exact Iconv/Oconv format codes).
Improve Answer
MAY
242005
05:54 PM
1406
Views
1
Ans
DataStage Interview Questions
MAY 24, 2005 05:53 PM | 4167 Views | 6 Ans
What is Metastage?
MAR 28, 2005 06:18 PM | 2254 Views | 3 Ans
DataStage Interview Questions
The config file consists of the following: a) the number of processes or nodes; b) the actual disk storage locations.
Venkat Poonati
The biggest strength of DataStage PX is its configuration file. The DataStage engine learns about the size and shape of the system by reading the configuration file; it contains the nodes and resources available for running the job.
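As a sketch, a minimal one-node parallel configuration file looks like the following (the server name and paths here are hypothetical):

```
{
  node "node1"
  {
    fastname "etlserver"
    pools ""
    resource disk "/data/datasets" {pools ""}
    resource scratchdisk "/data/scratch" {pools ""}
  }
}
```

Adding more node blocks increases the degree of parallelism without changing the job design.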
AUG 09, 2010 09:54 AM | 2666 Views | 4 Ans
nagarjuna900
By using the Lookup stage we can combine data from multiple tables, and by using the Join stage we can also combine data from multiple tables. Then what is the need for the Lookup stage?
Venkat Poonati
There are three things to consider here: 1. Memory usage: with Join, fewer rows have to be in memory at any time; this is not so with Lookup. 2. Treatment of rows with unmatched keys: Join w...
narra satish
Join and Lookup differ in terms of memory usage. Join is lighter weight than the Lookup stage and won't use much system resource, while performing a lookup uses more: the Lookup stage fetches the whole data from the reference link into RAM and then performs the lookup.
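The memory difference described above can be sketched in Python. This is an illustrative analogy only (assuming unique keys on the reference input), not DataStage internals: the lookup-style function materialises the whole reference link in RAM, while the join-style function streams two pre-sorted inputs and only looks at the current key.

```python
# Lookup-style: the entire reference link is pulled into memory first.
def lookup(stream, reference):
    ref = {r["key"]: r for r in reference}        # whole reference in RAM
    return [{**row, **ref[row["key"]]} for row in stream if row["key"] in ref]

# Join-style: both inputs are pre-sorted on the key, so only the
# current position in each input needs to be held (merge join).
def sorted_join(left, right):
    out, i = [], 0
    for row in left:                              # left sorted on key
        while i < len(right) and right[i]["key"] < row["key"]:
            i += 1                                # advance past smaller keys
        if i < len(right) and right[i]["key"] == row["key"]:
            out.append({**row, **right[i]})
    return out
```

Both produce the same matched rows; the difference is that `lookup` holds all of `reference` at once, which is why a huge reference dataset favours the Join stage.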
OCT 12, 2006 03:48 AM | 2400 Views | 5 Ans
praveen
narra satish
Hi, by using a Transformer we can generate the surrogate key: define a stage variable and write the derivation there.
neeraj82
You can generate it through a built-in routine in the Transformer whose name starts with KeyMgt (you don't have to remember the full name), and in parallel jobs you can generate it through the Surrogate Key Generator stage.
FEB 26, 2011 06:05 AM | 4656 Views | 3 Ans
praveen.bollu
What are the ways to read multiple files through the Sequential File stage if the files are different?
reddiraja
By enabling the read-from-multiple-nodes option, the Sequential File stage can read multiple files in parallel.
bupesh
This can be achieved by selecting the File Pattern read method and giving the path pattern of the files in the Sequential File stage.
subhalaxmipanda
How to change the attribute of a .dsx format file to make it read only?
narra satish
First export the job in .dsx format and change the attribute there (0 = editable, 1 = read-only), then import it again and overwrite the existing job.
JAN 29, 2006 08:54 PM | 7317 Views | 3 Ans
Amar
madhavsandireddi
In a sequence job, if a job aborts and we do not have a Terminator activity, then control goes to the Exception Handler, where we can check the reason the job aborted. Thanks, Madhav
nikhilanshuman
In a DataStage sequence there is an "Exception Handler" activity. When you are calling your jobs from a sequence you should do the following: Step 1: go to the properties of the master sequence and ...
NOV 03, 2007 03:39 AM | 2866 Views | 3 Ans
What is the exact difference between the Lookup stage, Join stage and Merge stage?
vijay
madhavsandireddi
The major differences between the Lookup stage, Join stage and Merge stage: Lookup & Join: if the reference data in the Lookup stage is huge compared to the primary data and there is no reject requiremen...
Pavan Batchu
Lookup stage: 1. Can return multiple matching rows from only one reference. 2. Can reject rows based on a constraint. 3. Can be set to fail. 4. Does not need partitioned and sorted input. Merge stage: 1. Can...
JAN 16, 2011 09:53 PM | 3522 Views | 1 Ans
AlamAlam
How can we pass parameters from one job to another job by using the command line prompt? Please send the answer... thanks in advance. Alam
cevvavijay
We can pass parameters to a job in two ways: using the dsjob command line (with the -param flag) or from a sequencer. Another way would be to configure a single parameter set (version 8.0 onwards) and use the same in both th...
OCT 29, 2010 08:56 AM | 3775 Views | 3 Ans
na.sreedhar
I want to create a new user in DataStage 7.5; I don't know how to do it.
manoj kumar ganji
User creation in 7.5 is OS-dependent, i.e. you create the users in the operating system. For example, when you load DataStage on Windows XP, create the new user under User Accounts in the Control Panel.
tisha24
The user created in and for Windows will be a user for DataStage by default. You can't create a user from the DataStage environment; if you want to create a user, create it through the Control Panel... The...
AUG 08, 2010 12:47 AM | 1725 Views | 2 Ans
Job Unlock
skyboyfli
$ ps -ef (to find the PID of the job process)
$ kill PID
By using the above two commands in Unix we can kill the job; otherwise use Tools -> Cleanup Resources.
svasu.r
OCT 30, 2009 12:18 AM | 3055 Views | 2 Ans
OCI Stage
srikanth.ds
Mili Jon
Array Size is a property of the OCI stage which helps reduce the number of context switches between DataStage and the Oracle database. We should keep in mind that, although it helps reduce the context switches, ...
DSQuest
I suppose it is called "Array Size"... Array Size is mainly used to increase the buffer during write operations into an Oracle DB in a server job using the Oracle OCI stage. This comes of grea...
MAY 19, 2005 11:01 AM | 1897 Views | 4 Ans
Mili Jon
Job aborts and logs can be seen in the Director client. However, a notification can be set up using the Notification activity to send emails on failure.
gagan8877
You can define a job sequence to send an email using the SMTP (Notification) activity if the job fails, or log the failure to a log file using DSLogFatal/DSLogEvent from a controlling job or an after-job routine, or use dsjob -log from the CLI.
JUL 25, 2008 02:36 AM | 3477 Views | 1 Ans
Orchestrate Schema
nishu_so_005
What is an Orchestrate schema? Distinguish internal data types (Orchestrate schema) vs external data types.
nikhilanshuman
An Orchestrate schema defines the fields and their data types. There are two types of Orchestrate schemas: a) input schemas and b) output schemas. An Orchestrate schema can be compared with a "struct" in C. Example: RollNo - int32
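A minimal sketch of what such a schema looks like in Orchestrate/osh notation (the field names beyond RollNo are hypothetical):

```
record (
  RollNo: int32;
  Name: string[max=30];
  Marks: nullable decimal[5,2];
)
```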
MAY 24, 2005 06:00 PM | 3640 Views | 1 Ans
Tell me one situation from your last project where you faced a problem, and how did you solve it?
A. The jobs in which data is read directly from OCI stages were running extremely slowly. I had to stage the data before sending it to the transformer to make the jobs run faster. B. The job aborts in the middle of loading some 500,000 rows. You have the option of either cleaning/deleting the loaded data and then running the fixed job, or running the job again from the row at which it aborted. To make sure the load...
nikhilanshuman
a) We had a big job with around 40 stages. The job was taking too long to compile and run, so we broke it into 3 smaller jobs. After this, we observed that the performance was slightly improved and maint...
JUN 06, 2005 05:22 AM | 1894 Views | 1 Ans
DWHs are typically read-only and batch-updated on a schedule; ODSs are maintained in more real time, trickle-fed constantly.
nikhilanshuman
An operational data store contains the data that is constantly updated through the course of business operations. An ODS is specially designed so that it can quickly perform relatively simple que...
JUL 11, 2008 04:42 AM | 3023 Views | 1 Ans
DSParams file
arcteetc
Is it so that if we define the values of variables in the DSParams file then there is no need to give the values at job or project level? And how do we configure this file at job level so that we need not hardcode the values...?
nikhilanshuman
Yes. The DSParams file contains all the project-level parameters which are set up in DataStage Administrator. If the values are provided or modified in this file, the changes are automatically reflected in the ...
FEB 15, 2008 04:48 AM | 2333 Views | 1 Ans
etl_bhargavi
How can I load a flat file into the target as fast as I can, assuming that there is no source bottleneck, i.e. no performance issues on the source side?
nikhilanshuman
A flat file can be read using the Sequential File stage. To make the data load faster, try using as few stages as possible in your job and minimal transformations. If you are trying to load the data of a flat file ...
DEC 07, 2005 02:41 PM | 893 Views | 1 Ans
MustageemRaees
nikhilanshuman
JUN 07, 2006 01:49 PM | 992 Views | 1 Ans
ub
nikhilanshuman
"DB2 UDB" is a stage in Datastage using which the connectivity could be made to DB2
databases.It could be used to fetch the data from DB2 databases or to perform DML
operations in DB2(e.g. Insert/update) or Bulk load etc..
AUG 20, 2007 09:18 AM | 1862 Views | 1 Ans
subharatanjena
nikhilanshuman
If the batch is taking more time to execute (10 mins), it may be due to performance issues. In such cases performance-optimization measures should be taken. If, without making any changes, the time taken...
FEB 28, 2007 02:45 AM | 1803 Views | 1 Ans
infinity
nikhilanshuman
DataStage is platform-independent. The same job, when designed properly, can run on SMP as well as MPP systems; partitioning does not need to be changed between SMP and MPP systems.
NOV 22, 2005 09:45 AM | 1777 Views | 1 Ans
What is troubleshooting in server jobs? What are the different kinds of errors encountered while running any job?
Ajju2005
nikhilanshuman
Troubleshooting DataStage server jobs involves monitoring the job log for fatal errors and taking appropriate actions to resolve them. There are various errors which could be encountered while ru...
MAY 24, 2005 05:58 PM | 6286 Views | 1 Ans
Functions like [] (the substring function) and ':' (the concatenation operator). Syntax: string[ [start,] length ] and string[ delimiter, instance, repeats ]
nikhilanshuman
It seems the question asks to explain string functions. Following are some string functions used in DataStage: Compare, Field, Convert, PadString, TrimB, TrimF, TrimLeadingTrailing. Some string conversion functions: StringToDate, StringToDecimal, StringToTime, StringToUstring, StringToTimestamp, etc.
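The two substring forms named above can be illustrated with a Python analogy. This is not DataStage BASIC; it is a sketch of the semantics, assuming 1-based positions as in the DataStage syntax:

```python
def substring(s, start, length):
    """DataStage-style s[start, length]: `start` is 1-based."""
    return s[start - 1:start - 1 + length]

def field(s, delimiter, instance, repeats=1):
    """DataStage-style s[delimiter, instance, repeats]: return `repeats`
    delimited fields starting at the `instance`-th one (1-based)."""
    parts = s.split(delimiter)
    return delimiter.join(parts[instance - 1:instance - 1 + repeats])

substring("DataStage", 5, 5)    # "Stage"
field("a:b:c:d", ":", 2, 2)     # "b:c"
```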
JUL 03, 2006 01:02 AM | 1092 Views | 1 Ans
sreedhar kancherla
nikhilanshuman
Here the basic thing required is to add 1 to the number which was used in the last load. Following is the logic: a) initially take a file and write 0 to it; b) now, in the sequence, create a user vari...