
INTERVIEW QUESTIONS

GENERAL QUESTIONS ABOUT PROJECT:

Name some measures in your fact table?


How many dimension tables did you have in your project, and name some dimensions (columns)?
How many Fact and Dimension tables are there in your project?
How many Data marts are there in your project?
What is the daily data volume (in GB/records)?
What is the size of the data extracted in the extraction process?
What is the size of the database in your project?

1. We currently have several processes that run under 5 minutes in the production environment, and this has
resulted in the repository DB growing quite rapidly. Are there any best practices for handling volume growth
in the repository, or should I go ahead and run SQL scripts to delete historic data from the repository tables?

A. In Repository Manager, select the folder from which you want to delete logs, then use Edit -> Truncate Log. You will
have the option to delete the entire log or to delete entries by date.

2. In a mapping, to run an Oracle stored procedure where the procedure only does updates and inserts: since all
the work is done by the SP, can I use only the source tables and the procedure, with no target table (since the target
table is defined in the SP)?

A. You can have a dummy mapping that targets a flat file, and in the source Pre SQL have "call procedure_name();".
The flat file is only there to complete the mapping. Use a table in the database where the procedure runs as the source
(for example, DUAL). Then create a mapping such as select * from dual and write that to the dummy flat file target. Now
call the procedure before or after the mapping via the Pre SQL or Post SQL of the source in the Mapping tab of Workflow
Manager. It will run the procedure, and the procedure will do whatever it is supposed to do.
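
As a rough sketch of that approach (the procedure name LOAD_TARGET_PROC and the dummy column are assumptions, not
part of any real project):

-- Source Qualifier query: a one-row dummy source, only to give the mapping a pipeline
SELECT 'X' AS dummy_col FROM DUAL;

-- Session Pre SQL (or Post SQL): invoke the stored procedure that actually
-- performs the inserts and updates on its own target table
CALL LOAD_TARGET_PROC();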

3. How to remove spaces in the middle?

A. Use the REPLACECHR function to remove the spaces wherever they occur.

(OR) Use the function REG_REPLACE( email_id, '\s+', '' )
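
For comparison, a hedged database-side equivalent (Oracle syntax; the EMPLOYEES table is an assumption, while
email_id comes from the example above) strips every run of whitespace, including spaces in the middle of the value:

SELECT REGEXP_REPLACE(email_id, '\s+', '') AS email_id_clean
FROM employees;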

4. There is a mapping that deals with close to 50 million rows; it inserts incrementally on each session run. It
has a lookup on a table that has close to 40M rows. When I run the session, it takes forever to insert rows.
Previously it took about 3 hours to load incrementally (it was run once every 15 days and probably loaded 1M rows
per run); I guess the throughput was anywhere between 2000-3000 rows/sec. But lately I have realized that the
throughput is like 5 rows/sec, sometimes 20 or 45 or 100, not really more than 200 rows. What could have
caused the problem? Is there a way to improve the throughput? How long should a normal mapping with just an
Expression and a Filter transformation take to load a few (2-3) million rows?

A. The first step is to identify where the problem is - in the mapping or in the target. For this, make the target a flat file
and run the mapping. If the throughput is the same or close to what you are getting with the relational target, then
you may need to look at your mapping closely to improve performance.

On the other hand, if you see considerably higher throughput with the flat file, you may need to look at your target.
Database maintenance on the target, such as rebuilding indexes and updating statistics, may help.

To improve performance, if you are using normal loading, you may switch to bulk load. Bulk loading may require the
indexes/constraints to be removed or disabled prior to the load (pre-session). You can recreate/enable the
indexes/constraints post-load.
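
A minimal sketch of such pre-/post-session SQL, assuming an illustrative target table SALES_FACT with a primary key
constraint PK_SALES_FACT and a non-unique index IX_SALES_FACT_DATE:

-- Pre-session SQL: relax constraints/indexes before the bulk load
ALTER TABLE sales_fact DISABLE CONSTRAINT pk_sales_fact;
ALTER INDEX ix_sales_fact_date UNUSABLE;

-- Post-session SQL: restore them after the load completes
ALTER INDEX ix_sales_fact_date REBUILD;
ALTER TABLE sales_fact ENABLE CONSTRAINT pk_sales_fact;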

5. Need to 'Load record with latest timestamp'

Scenario:---

My Source:-[This is the source with 2 different timestamps]

Inv_id Inv_Name Inv_upd_time

1234 'abc' 12:00:00

1234 'jkl' 11:00:00

Expected Target:--[I want the record with the latest timestamp]

Inv_id Inv_Name Inv_upd_time

1234 'abc' 12:00:00

A. For this you can use an Aggregator transformation: when no aggregate function is applied, the Aggregator returns the
last row of each group (group by inv_id). Put a Sorter transformation (sorted by Inv_upd_time ascending) before the
Aggregator so that the last row in each group is the one with the latest timestamp (alternatively, sort descending and
use FIRST() on the other ports).
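
The same requirement expressed in plain SQL, as a sketch (the source table name INVOICE_SRC is an assumption; the
columns are from the scenario above):

SELECT inv_id, inv_name, inv_upd_time
FROM (
    SELECT inv_id, inv_name, inv_upd_time,
           ROW_NUMBER() OVER (PARTITION BY inv_id ORDER BY inv_upd_time DESC) AS rn
    FROM invoice_src
)
WHERE rn = 1;  -- keep only the latest row per inv_id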

6. How do you schedule two different sessions from two different folders so that they run one after another, with the
second session running only if the first session succeeded?

A. I. In Workflow1, at the end of the session, create a Command task that calls a shell script containing the simple
command "touch sample.tag", which creates a dummy file sample.tag.

II. In Workflow2, create an Event Wait task that waits for the sample.tag file, which is an indication of the
completion of Workflow1.

7. How to modify Informatica Target commit interval across sessions?

There is no straightforward way to do this, but there is a workaround.

a. Export the workflow to an XML file in workflow manager

b. Open the XML file in wordpad or notepad.

c. Find the occurrence of string

<ATTRIBUTE NAME ="Commit Interval" VALUE ="10000"/>

and replace with

<ATTRIBUTE NAME ="Commit Interval" VALUE ="5000"/>

d. Save the XML and import it back into Informatica.


8. What happens if you increase or decrease the commit interval?

A. If you increase your commit interval to ~25000 rows, the session will run faster, but if the session
fails at the 24000th record you will not have any data in your target.

If you decrease your commit interval to ~10000 rows, the session will be slower than the previous case, but if the
session fails at the 24000th record you will lose only 4000 records.

9. What are the different types of Commit intervals?


A. The different commit intervals are:
Target-based commit. The Informatica Server commits data based on the number of target rows and the key
constraints on the target table. The commit point also depends on the buffer block size and the commit interval.
Source-based commit. The Informatica Server commits data based on the number of source rows. The commit
point is the commit interval you configure in the session properties.

Difference between static cache and dynamic cache?

With a dynamic cache, when you are inserting a new row the server looks at the lookup cache to see if the row
already exists; if not, it inserts the row into the target and into the cache as well. With a static cache, when you are
inserting a new row it checks the cache and writes to the target, but does not update the cache.

If you cache the lookup table, you can choose to use a dynamic or static cache. By default, the lookup cache
remains static and does not change during the session. With a dynamic cache, the Informatica Server inserts or
updates rows in the cache during the session. When you cache the target table as the lookup, you can look up
values in the target and insert them if they do not exist, or update them if they do.

Which transformation should we use to normalize the COBOL and relational sources?
Normalizer Transformation.

When we drag a COBOL source into the Mapping Designer workspace, the Normalizer transformation appears
automatically, creating input and output ports for every column in the source.

What are the join types in joiner transformation?

The join types are Normal Join, Master Outer Join, Detail Outer Join, and Full Outer Join.

What is the look up transformation?

It is used to look up data in a relational table or view.

Lookup is a passive transformation used to look up data in a flat file or a relational table. Generally we use a
Lookup transformation to 1) get a related value from a key column value, 2) check whether a record already
exists in the table, and 3) maintain slowly changing dimension tables.

A Lookup transformation is used for checking matched values from the source or target tables, for
updating slowly changing dimensions, and it can also perform some calculations.
What are the differences between the Joiner transformation and the Source Qualifier transformation?

1. The Source Qualifier operates only on relational sources within the same schema, while the Joiner can have either
heterogeneous sources or relational sources in different schemas. 2. The Source Qualifier requires at least one
matching column to perform a join, whereas the Joiner joins on matching ports. 3. Additionally, the Joiner requires
two separate input pipelines and should not have an Update Strategy or Sequence Generator upstream (this is no
longer true from Infa 7.2). See the sketch after these answers.

1) The Joiner can join relational sources that come from different data sources, whereas in the Source Qualifier the
relational sources should come from the same data source. 2) We need matching keys to join two
relational sources in a Source Qualifier transformation, whereas the Joiner does not need matching keys to join the
two sources, only a join condition on matching ports.
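
To make the first point concrete, the user-defined join in a Source Qualifier is ultimately a single SQL statement
against one connection, for example (EMP and DEPT are the usual illustrative Oracle tables, not from any specific
project):

SELECT e.empno, e.ename, d.dname
FROM emp e, dept d
WHERE e.deptno = d.deptno;  -- the matching column the SQ join relies on

-- A Joiner transformation, by contrast, would bring EMP and DEPT (or a flat file
-- and a table) in through two separate pipelines and apply the condition
-- e.deptno = d.deptno on matching ports inside the mapping.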

How can you improve session performance in the Aggregator transformation?

Use incremental aggregation, or create a Sorter transformation before the Aggregator and enable the Sorted Input option.

Yes, we can use the Sorted Input option to improve performance. Basically the Aggregator transformation reduces
performance because it uses caches.

Can you use the mapping parameters or variables created in one mapping in any other reusable
transformation?

Yes, because a reusable transformation is not contained within any mapplet or mapping.

What is meant by lookup caches?

The session reads all unique rows from the reference table/file to fill the local buffer first; then, for each row
received from the upstream transformation, it tries to match it against the local buffer.

The Informatica Server builds a cache in memory when it processes the first row of a cached Lookup transformation.

- When the server runs a Lookup transformation, it builds a cache in memory when it processes the first row
of data in the transformation. - The server builds the cache and queries it for each row that enters the
transformation. - The server creates index and data cache files in the lookup cache directory and uses the server
code page to create the files. - The index cache contains condition values and the data cache contains output values.

The Informatica Server builds a cache in memory when it processes the first row of data in a cached Lookup
transformation. It allocates memory for the cache based on the amount you configure in the transformation or
session properties. The Informatica Server stores condition values in the index cache and output values in the
data cache.

What is source qualifier transformation?

The SQ is an active transformation. It performs one of the following tasks: joins data from the same source database,
filters rows when PowerCenter reads source data, performs an outer join, or selects only distinct values from
the source.
In the Source Qualifier transformation a user can define join conditions, filter the data, and eliminate duplicates.
The default Source Qualifier query can be overwritten by the above options; this is known as SQL Override (an
example follows these answers).

The source qualifier represents the records that the informatica server reads when it runs a session.

When we add a relational or a flat file source definition to a mapping, we need to connect it to a Source Qualifier
transformation. The Source Qualifier transformation represents the records that the Informatica Server reads when
it runs a session.
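
A hedged example of what such a SQL Override might look like (the CUSTOMERS/ORDERS tables, columns, and date
filter are assumptions for illustration):

SELECT DISTINCT c.customer_id, c.customer_name, o.order_date
FROM customers c, orders o
WHERE o.customer_id = c.customer_id       -- user-defined join condition
  AND o.order_date >= DATE '2024-01-01';  -- filter applied at the source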

How the informatica server increases the session performance through partitioning the source?

Partitioning the session improves session performance by creating multiple connections to sources and targets
and loading data in parallel pipelines.

What are the settings that you use to configure the Joiner transformation?

The master group flow, the detail group flow, the join condition, and the type of join.

Take the table with fewer rows as the master table and the one with more rows as the detail table, and define the join
condition. The Joiner puts all rows from the master table into the cache and checks the condition against the detail
table rows.

1) Master Source 2) Detail Source 3) Type of Join 4) Condition of the Join

What are the rank caches?

The Informatica Server stores group information in an index cache and row data in a data cache.

When the server runs a session with a Rank transformation, it compares each input row with the rows in the data
cache. If the input row out-ranks a stored row, the Informatica Server replaces the stored row with the input row.
During the session, the Informatica Server compares an input row with the rows in the data cache. If the input row
out-ranks a stored row, the Informatica Server replaces the stored row with the input row. The Informatica Server
stores group information in an index cache and row data in a data cache.

How can you create or import flat file definition in to the warehouse designer?

By giving the server connection path.

Create the file in the Warehouse Designer, or import the file from the location where it exists, or modify the source
if the structure is one and the same.

First create it in the Source Analyzer and then drag it into the Warehouse Designer; you can't create a flat file
target definition directly.

There is no way to import a target definition as a file in Informatica Designer. So while creating the target definition
for a file in the Warehouse Designer it is created considering it as a table, and then in the session properties of that
mapping it is specified as a file.

You cannot create or import a flat file definition into the Warehouse Designer directly. Instead you must analyze the
file in the Source Analyzer, then drag it into the Warehouse Designer. When you drag the flat file source definition
into the Warehouse Designer workspace, the Warehouse Designer creates a relational target definition, not a file
definition. If you want to load to a file, configure the session to write to a flat file. When the Informatica Server runs
the session, it creates and loads the flat file.

What is Code Page Compatibility?

When two code pages are compatible, the characters encoded in the two code pages are virtually identical.

Compatibility between code pages is used for accurate data movement when the Informatica Server runs in
Unicode data movement mode. If the code pages are identical, then there will not be any data loss. One code
page can be a subset or superset of another. For accurate data movement, the target code page must be a superset
of the source code page.

What is the aggregate cache in the Aggregator transformation?

Aggregator Index Cache stores group by values from Group-By ports and Data Cache stores aggregate data
based on Group-By ports (variable ports, output ports, non group by ports). When the PowerCenter Server runs a
session with an Aggregator transformation, it stores data in memory until it completes the aggregation. If you
use incremental aggregation, the PowerCenter Server saves the cache files in the cache file directory.

It is a temporary memory used by aggregator in order to improve the performance

The Aggregator transformation contains two caches, namely the data cache and the index cache. The data cache holds
the aggregate values or the detail records, and the index cache holds the grouped column values or unique values of
the records.

When the PowerCenter Server runs a session with an Aggregator transformation, it stores data in the aggregate cache
until it completes the aggregation calculation.

The Aggregator stores data in the aggregate cache until it completes the aggregate calculations. When you run a session
that uses an Aggregator transformation, the Informatica Server creates index and data caches in memory to process
the transformation. If the Informatica Server requires more space, it stores overflow values in cache files.

How can you recognise whether or not the newly added rows in the source get inserted into the target?

In a Type-2 mapping we have three options to recognise the newly added rows: i) Version Number ii) Flag
Value iii) Effective Date Range.

We can add a count aggregator column to the target and generate it before running the mapping (there might be a
couple of different ways to do this), or we can run a SQL query after running the mapping each time to make sure the
new data is inserted (see the SQL sketch after these answers).

From the session, SrcSuccessRows can be compared with TgtSuccessRows.

Check the session log or check the target table.
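
A sketch of the kind of SQL check mentioned above, assuming illustrative SALES_SRC and SALES_TGT tables and a
LOAD_DATE audit column on the target:

SELECT
    (SELECT COUNT(*) FROM sales_src) AS source_rows,
    (SELECT COUNT(*) FROM sales_tgt
      WHERE load_date >= TRUNC(SYSDATE)) AS rows_loaded_today
FROM DUAL;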

What is cube?

Cube is a multidimensional representation of data. It is used for analysis purpose. A cube gives multiple views of
data.
What is Data Modeling? What are the different types of Data Modeling?

Data modeling is the process of creating data models. In other words, it is structuring and organizing data in a
uniform manner where constraints are placed within the structure. The data structures formed are maintained in a
database management system. The different types of data modeling are: 1. Dimensional Modelling 2. E-R
Modelling

What are the different types of OLAP TECHNOLOGY?

Online analytical processing is of three types: MOLAP, HOLAP and ROLAP. MOLAP - Multidimensional
online analytical processing. It is used for fast retrieval of data and also for slicing and dicing operations. It plays a
vital role in easing complex calculations. ROLAP - Relational online analytical processing. It has the ability to
handle large amounts of data. HOLAP - Hybrid online analytical processing. It is a combination of both ROLAP and
MOLAP.

What is the difference between a Database and a Datawarehouse?

A database is a place where data is stored so it can be accessed, retrieved, and loaded, whereas a data
warehouse is a place where application data is managed for analysis and reporting services. A database stores data
in the form of tables and columns. On the contrary, in a data warehouse, data is subject oriented and stored in the
form of dimensions and facts which are used for analysis purposes. In short, we must understand that a
database is used for running an enterprise, while a data warehouse helps in understanding how to run an enterprise.

What is the difference between active transformation and passive transformation?

An active transformation can change the number of rows that pass through it, but a passive transformation can
not change the number of rows that pass through it.

What is the use of control break statements?

They execute a set of statements between the loop and end-loop.

What are the types of loading in Informatica?

There are two types of loading: normal loading and bulk loading. In normal loading, the server loads record by record
and writes a log for each; it takes comparatively longer to load data to the target. In bulk loading, it loads a number
of records at a time to the target database, so it takes less time to load the data.

What is the difference between source qualifier transformation and application source qualifier
transformation?

Source qualifier transformation extracts data from RDBMS or from a single flat file system. Application source
qualifier transformation extracts data from application sources like ERP.
How do we create a primary key only on odd numbers?

To create the primary key, we use a Sequence Generator and set its 'Increment By' property to 2 (with a current value
of 1, this generates 1, 3, 5, ...).
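
The database-side analogue of the same idea, for illustration only (the sequence name is made up):

CREATE SEQUENCE odd_key_seq START WITH 1 INCREMENT BY 2;
SELECT odd_key_seq.NEXTVAL FROM DUAL;  -- returns 1, then 3, 5, 7, ...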

What is authenticator?

It validates user name and password to access the PowerCenter repository.

What is the use of auxiliary mapping?

Auxiliary mapping reflects change in one table whenever there is a change in the other table.

What are Sessions and Batches?

Sessions and batches store information about how and when the Informatica Server moves data through
mappings. You create a session for each mapping you want to run. You can group several sessions together in a
batch. Use the Server Manager to create sessions and batches.

What are Source definitions?

Detailed descriptions of database objects (tables, views, synonyms), flat files, XML files, or Cobol files that
provide source data. For example, a source definition might be the complete structure of the EMPLOYEES
table, including the table name, column names and datatypes, and any constraints applied to these columns, such
as NOT NULL or PRIMARY KEY. Use the Source Analyzer tool in the Designer to import and create source
definitions.

What is Dynamic Data Store?

The need to share data is just as pressing as the need to share metadata. Often, several data marts in the same
organization need the same information. For example, several data marts may need to read the same product data
from operational sources, perform the same profitability calculations, and format this information to make it easy
to review.
If each data mart reads, transforms, and writes this product data separately, the throughput for the entire
organization is lower than it could be. A more efficient approach would be to read, transform, and write the data
to one central data store shared by all data marts. Transformation is a processing-intensive task, so performing
the profitability calculations once saves time.
Therefore, this kind of dynamic data store (DDS) improves throughput at the level of the entire organization,
including all data marts. To improve performance further, you might want to capture incremental changes to
sources. For example, rather than reading all the product data each time you update the DDS, you can improve
performance by capturing only the inserts, deletes, and updates that have occurred in the PRODUCTS table
since the last time you updated the DDS.
The DDS has one additional advantage beyond performance: when you move data into the DDS, you can format
it in a standard fashion. For example, you can prune sensitive employee data that should not be stored in any
data mart. Or you can display date and time values in a standard format. You can perform these and other data
cleansing tasks when you move data into the DDS instead of performing them repeatedly in separate data marts.
What is a Global repository?

The centralized repository in a domain, a group of connected repositories. Each domain can contain one global
repository. The global repository can contain common objects to be shared throughout the domain through
global shortcuts. Once created, you cannot change a global repository to a local repository. You can promote an
existing local repository to a global repository.

What is Local Repository?

Each local repository in the domain can connect to the global repository and use objects in its shared folders. A
folder in a local repository can be copied to other local repositories while keeping all local and global shortcuts
intact.

What are the different types of locks?

There are five kinds of locks on repository objects:


Read lock - Created when you open a repository object in a folder for which you do not have write permission.
Also created when you open an object with an existing write lock.
Write lock - Created when you create or edit a repository object in a folder for which you have write permission.
Execute lock - Created when you start a session or batch, or when the Informatica Server starts a scheduled
session or batch.
Fetch lock - Created when the repository reads information about repository objects from the database.
Save lock - Created when you save information to the repository.

After creating users and user groups, and granting different sets of privileges, I find that none of the
repository users can perform certain tasks, even the Administrator.

Repository privileges are limited by the database privileges granted to the database user who created the
repository. If the database user (one of the default users created in the Administrators group) does not have full
database privileges in the repository database, you need to edit the database user to allow all privileges in the
database.

I do not want a user group to create or edit sessions and batches, but I need them to access the Server
Manager to stop the Informatica Server.

To permit a user to access the Server Manager to stop the Informatica Server, you must grant them both the
Create Sessions and Batches, and Administer Server privileges. To restrict the user from creating or editing
sessions and batches, you must restrict the user's write permissions on a folder level.
Alternatively, the user can use pmcmd to stop the Informatica Server with the Administer Server privilege alone.

I created a new group and removed the Browse Repository privilege from the group. Why does every user
in the group still have that privilege?

Privileges granted to individual users take precedence over any group restrictions. Browse Repository is a
default privilege granted to all new users and groups. Therefore, to remove the privilege from users in a group,
you must remove the privilege from the group, and every user in the group.
How does read permission affect the use of the command line program, pmcmd?

To use pmcmd, you do not need to view a folder before starting a session or batch within the folder. Therefore,
you do not need read permission to start sessions or batches with pmcmd. You must, however, know the exact
name of the session or batch and the folder in which it exists.
With pmcmd, you can start any session or batch in the repository if you have the Session Operator privilege or
execute permission on the folder.

What are the types of metadata that stores in repository?

Database connections, global objects, sources, targets, mappings, mapplets, sessions, shortcuts, transformations.

The repository stores metadata that describes how to transform and load source and target data.

Data about data.

Metadata can include information such as mappings describing how to transform source data, sessions indicating
when you want the Informatica Server to perform the transformations, and connect strings for sources and
targets.

Following are the types of metadata stored in the repository: database connections, global objects, mappings,
mapplets, multidimensional metadata, reusable transformations, sessions and batches, shortcuts, source
definitions, target definitions, and transformations.

What happens if Informatica server doesn't find the session parameter in the parameter file?

Workflow will fail.

Can you access a repository created in previous version of informatica?

We have to migrate the repository from the older version to newer version. Then you can use that repository.

Without using an ETL tool, can you prepare and maintain a Data Warehouse?

Yes, we can do that using PL/SQL or stored procedures when all the data are in the same database. If you have
flat files as sources, you can't do it through PL/SQL or stored procedures.
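
A minimal sketch of that approach when source and target sit in the same database: a single MERGE keeps a dimension
in sync (the staging and dimension table names and columns are assumptions):

MERGE INTO dim_customer d
USING stg_customer s
   ON (d.customer_id = s.customer_id)
WHEN MATCHED THEN
  UPDATE SET d.customer_name = s.customer_name,
             d.city          = s.city
WHEN NOT MATCHED THEN
  INSERT (customer_id, customer_name, city)
  VALUES (s.customer_id, s.customer_name, s.city);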

How do you identify the changed records in operational data?

In my project, the source system itself sends us the new records and the changed records from the last 24 hours.

Why couldn't you go for a Snowflake schema?

Snowflake gives lower performance compared to a star schema, because it involves multiple joins while
retrieving the data.
Snowflake is preferred in two cases:
If you want to load the data into more hierarchical levels of information, for example yearly, quarterly, monthly,
daily, hourly, down to minutes of information, prefer snowflake.
Whenever you find that the input data contains more low-cardinality elements, you should prefer a snowflake schema.
Low cardinality examples: sex, marital status, etc. Low cardinality means the number of distinct values is very small
compared to the total number of records.

What is meant by clustering?

It joins two (or more) tables in a single buffer so that the data can be retrieved easily.

When is a session considered to have a heterogeneous target?

It is considered heterogeneous when the targets do not have a primary key - foreign key relationship (for example,
targets of different types or in different databases).

Under what circumstances can a target definition be edited from the Mapping Designer, within the
mapping where that target definition is being used?

We can't edit the target definition in the Mapping Designer; we can edit the target in the Warehouse Designer only. But
in our projects we haven't edited any of the targets. If any change is required to the target definition, we inform the
DBA to make the change to the target definition, and then we import it again. We don't have permission to edit the
source and target tables.

Can a Source Qualifier be used to perform an outer join when joining 2 databases?

No, we can't join two different databases in a SQL Override.
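
Within a single database, though, the SQL Override can express an outer join; a hedged example (table names are
illustrative):

SELECT c.customer_id, c.customer_name, o.order_id
FROM customers c
LEFT OUTER JOIN orders o
  ON o.customer_id = c.customer_id;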

If your source is a flat file with a delimiter, where do you make the change when you next want to change
that delimiter?

In the session properties, go to the Mappings tab, click on the flat file instance, click Set File Properties, and
change the delimiter option there.

If the index cache size is 2 MB and the data cache is 1 MB, and you enter data requiring 3 MB for the index and
2 MB for the data, what will happen?

Nothing will happen; based on the buffer size that exists on the server, we can change the cache sizes. The maximum
size of a cache is 2 GB.

Difference between the NEXTVAL and CURRVAL ports in the Sequence Generator?

Assume that they are both connected to the input of another transformation.


It will give values like NEXTVAL 1, CURRVAL 0.
How does the dynamic cache handle duplicate rows?

The dynamic cache assigns a flag to each record through the NewLookupRow port as it processes the cache: a record
inserted into the cache is flagged 1, a record that updates the cache is flagged 2, and a record that causes no change
to the cache (such as a duplicate of a row already inserted in the same run) is flagged 0, so duplicates can be filtered
out or routed to reject downstream.

How will you find out whether your mapping is correct or not without creating a session?

Through the debugging option.

If you are using an Aggregator transformation in your mapping, should your source be a dimension
or a fact?

We can use the Aggregator transformation according to the requirements; there is no limitation for the Aggregator.
The source can be either a dimension or a fact.

My input is Oracle and my target is a flat file. Can I load it? How?

Yes. Create a flat file whose structure matches the Oracle table in the Warehouse Designer, then develop the
mapping according to the requirement and map it to that flat file target. The target file is created in the TgtFiles
directory on the server system.

For a session, can I use 3 mappings?

No, for one session there should be only one mapping. We have to create separate session for each mapping.

Type of loading procedures?

At the Informatica level, load procedures are of two types: 1) normal load and 2) bulk load. If you are talking about
project load procedures, they are based on the project requirements - daily loads or weekly loads.

Are you involved in high-level or low-level design? What is meant by high-level design and low-level
design?

Low-Level Design:


The requirements should be in an Excel format which describes field-to-field validations and the business logic that
needs to be present. Mostly the onsite team will do this low-level design.
High-Level Design:
Describes the Informatica flow from source qualifier to target; simply put, the flow chart of the
Informatica mapping. The developer will do this design document.

What are the dimension load methods?

Daily loads or weekly loads, based on the project requirement.


Here, are we using the lookup between source and stage, or between stage and target?

It depends on the requirement. There is no rule that we have to use it in one particular stage only.

How will you do SQL tuning?

We can do SQL tuning using the Oracle optimizer (for example by examining explain plans) and tools such as TOAD.
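
For example, a common first step in Oracle is to look at the execution plan of the slow statement (the query below is
illustrative only):

EXPLAIN PLAN FOR
  SELECT * FROM sales_fact WHERE order_date >= DATE '2024-01-01';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);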

Did you use any other tools for scheduling purposes other than Workflow Manager or pmcmd?

Yes, we used third-party tools like Control-M.

What is the "unbound field" error in the Source Qualifier?

"TE_7020 Unbound field in Source Qualifier" when running a session.


A) Problem Description:
When running a session the session fails with the following error:
TE_7020 Unbound field <field_name> in Source Qualifier <SQ_name>"
Solution:
This error occurs when there is an inconsistency between the Source Qualifier and the source table: either
there is a field in the Source Qualifier that is not in the physical table, or there is a column of the source object
that has no link to the corresponding port in the Source Qualifier. To resolve this, re-import the source definition
into the Source Analyzer in the Designer and bring the new source definition into the mapping. This will also
re-create the Source Qualifier. Connect the new Source Qualifier to the rest of the mapping as before.

Using an unconnected lookup, how do you remove nulls and duplicates?

We can't handle nulls and duplicates in an unconnected lookup. We can handle them in a dynamic connected lookup.

I have 20 lookups, 10 joiners, and 1 normalizer. How will you improve the session performance?

We have to calculate and tune the lookup and joiner cache sizes.

What is version controlling?

It is the method used to differentiate the old build and the new build after changes are made to the existing code;
the old code is v001 and the next time you increase the version number to v002, and so on. In my last company we
didn't use any version controlling within Informatica; we just deleted the old build and replaced it with the new code.
We don't maintain version control in Informatica itself. We maintain the code in VSS (Visual SourceSafe), which is
software that maintains the code with versioning. Whenever the client makes a change request after production starts,
we have to create another build.

How is the Sequence Generator transformation different from other transformations?

The Sequence Generator is unique among all transformations because we cannot add, edit, or delete its default
ports (NEXTVAL and CURRVAL).
Unlike other transformations, we cannot override the Sequence Generator transformation properties at the session
level. This protects the integrity of the sequence values generated.

What are the advantages of Sequence generator? Is it necessary, if so why?

We can make a Sequence Generator reusable, and use it in multiple mappings. We might reuse a Sequence
Generator when we perform multiple loads to a single target.
For example, if we have a large input file that we separate into three sessions running in parallel, we can use a
Sequence Generator to generate primary key values. If we use different Sequence Generators, the Informatica
Server might accidentally generate duplicate key values. Instead, we can use the same reusable Sequence
Generator for all three sessions to provide a unique value for each target row.

What are the uses of a Sequence Generator transformation?

We can perform the following tasks with a Sequence Generator transformation:


* Create keys
* Replace missing values
* Cycle through a sequential range of numbers

What is Sequence Generator Transformation?

The Sequence Generator transformation generates numeric values. We can use the Sequence Generator to create
unique primary key values, replace missing primary keys, or cycle through a sequential range of numbers.

The Sequence Generator transformation is a connected transformation. It contains two output ports that we can
connect to one or more transformations.

What is the difference between connected lookup and unconnected lookup?

Differences between Connected and Unconnected Lookups:


Connected Lookup:
* Receives input values directly from the pipeline.
* Can use a dynamic or static cache.
* Supports user-defined default values.

Unconnected Lookup:
* Receives input values from the result of a :LKP expression in another transformation.
* Can use only a static cache.
* Does not support user-defined default values.

What is a Lookup transformation and what are its uses?

We use a Lookup transformation in our mapping to look up data in a relational table, view or synonym.

We can use the Lookup transformation for the following purposes:

* Get a related value. For example, if our source table includes employee ID, but we want to include the
employee name in our target table to make our summary data easier to read.
* Perform a calculation. Many normalized tables include values used in a calculation, such as gross sales per
invoice or sales tax, but not the calculated value (such as net sales).
* Update slowly changing dimension tables. We can use a Lookup transformation to determine whether records
already exist in the target.

What is a lookup table?

The lookup table can be a single table, or we can join multiple tables in the same database using a lookup query
override. The Informatica Server queries the lookup table or an in-memory cache of the table for all incoming
rows into the Lookup transformation.

If your mapping includes heterogeneous joins, we can use any of the mapping sources or mapping targets as the
lookup table.
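
A hedged example of such a lookup query override, joining two tables in the same database into one lookup source
(table and column names are assumptions, and the exact override syntax should be checked against your Informatica
version):

SELECT o.order_id      AS order_id,
       o.customer_id   AS customer_id,
       c.customer_name AS customer_name
FROM orders o, customers c
WHERE o.customer_id = c.customer_id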

Where do you define update strategy?

We can set the Update strategy at two different levels:


Within a session. When you configure a session, you can instruct the Informatica Server to either treat all
records in the same way (for example, treat all records as inserts), or use instructions coded into the session
mapping to flag records for different database operations.
Within a mapping. Within a mapping, you use the Update Strategy transformation to flag records for insert,
delete, update, or reject.


What is Event-Based Scheduling?

When you use event-based scheduling, the Informatica Server starts a session when it locates the specified
indicator file. To use event-based scheduling, you need a shell command, script, or batch file to create an
indicator file when all sources are available. The file must be created or sent to a directory local to the
Informatica Server. The file can be of any format recognized by the Informatica Server operating system. The
Informatica Server deletes the indicator file once the session starts.
Use the following syntax to ping the Informatica Server on a UNIX system:
pmcmd ping [{user_name | %user_env_var} {password | %password_env_var}] [hostname:]portno
Use the following syntax to start a session or batch on a UNIX system:
pmcmd start {user_name | %user_env_var} {password | %password_env_var} [hostname:]portno
[folder_name:]{session_name | batch_name} [:pf=param_file] session_flag wait_flag
Use the following syntax to stop a session or batch on a UNIX system:
pmcmd stop {user_name | %user_env_var} {password | %password_env_var}
[hostname:]portno [folder_name:]{session_name | batch_name} session_flag
Use the following syntax to stop the Informatica Server on a UNIX system:
pmcmd stopserver {user_name | %user_env_var} {password | %password_env_var} [hostname:]portno
