INTRODUCTION
1.1. Data Warehousing
The standard definition of a data warehouse, given by William Inmon, a pioneer
in this field who popularized the term, is “A subject-oriented, integrated,
non-volatile and time-variant collection of data in support of management decisions”.
A data warehouse is a place where a wide variety of data is prepared,
organized and presented to its users in the best possible way. It helps to
consolidate information stored in heterogeneous business systems. Because it is a
database that does not delete, purge or update records, it is a valuable historical
store.
The success of any business depends on its users. A data warehouse provides
several advantages for a company struggling to provide its business users with an
effective decision support solution. It publishes the organization’s data
assets and provides high-quality information that lets management make timely,
consistent and reliable decisions that impact the business.
9. DATABASE OPTIMISATION
To gain the best Informatica performance, the database tables, stored
procedures and queries used in Informatica should be tuned well.
1. If the source and target are flat files, they should reside on the same
machine as the Informatica server.
2. Increase the network packet size.
3. The performance of the Informatica server depends on its network connections.
Data generally moves across a network at less than 1 MB per second, whereas a
local disk moves data five to twenty times faster. Network connections therefore
often limit session performance, so avoid them wherever possible.
4. Optimize target databases.
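The throughput figures in point 3 can be turned into a rough transfer-time estimate. The sketch below is illustrative only: the 600 MB file size is a hypothetical value, the network rate is taken at the 1 MB per second upper bound quoted above, and the function name is an assumption made for this example.

```python
# Rough estimate only: rates come from the figures quoted in point 3.
NETWORK_MB_PER_S = 1.0          # upper bound for network transfer
DISK_SPEEDUP_RANGE = (5, 20)    # local disk is 5x to 20x faster

def transfer_seconds(file_mb):
    """Return (network, slowest disk, fastest disk) transfer times in seconds."""
    network = file_mb / NETWORK_MB_PER_S
    disk_slow = network / DISK_SPEEDUP_RANGE[0]
    disk_fast = network / DISK_SPEEDUP_RANGE[1]
    return network, disk_slow, disk_fast

# Hypothetical 600 MB flat file.
network_s, disk_slow_s, disk_fast_s = transfer_seconds(600)
print(network_s, disk_slow_s, disk_fast_s)
```

Under these assumptions, reading the file over the network takes at least 600 seconds, against 30 to 120 seconds from a local disk.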
10. OPTIMUM CACHE SIZE IN LOOKUPS
10.1 Calculating Lookup Index Cache
The lookup index cache holds data for the columns used in the lookup condition. For
best session performance, specify the maximum lookup index cache size. Use the
following information to calculate the minimum and maximum lookup index cache for
both connected and unconnected Lookup transformations. Here <Column size> is the
total size, in bytes, of the columns used in the lookup condition.
To calculate the minimum lookup index cache size, use the formula:
Minimum lookup index cache size = 200 * [<Column size> + 16]
To calculate the maximum lookup index cache size, use the formula:
Maximum lookup index cache size = <Number of rows in lookup table> * [<Column size> + 16] * 2
Example:
Suppose the lookup table has lookup values based on the field ITEM_ID. It uses
the lookup condition ITEM_ID = IN_ITEM_ID1.
ITEM_ID has data type integer and size 16, so the total column size is 16.
The table contains 60,000 rows.
Minimum lookup index cache size = 200 * [16 + 16] = 6,400 bytes
Maximum lookup index cache size = 60,000 * [16 + 16] * 2 = 3,840,000 bytes
So this lookup transformation needs an index cache size between 6,400 and
3,840,000 bytes. For best session performance, it needs the maximum,
3,840,000 bytes.
10.2 Calculating Lookup Data Cache
In a connected transformation, the data cache contains data for the connected
output ports, not including ports used in the lookup condition. In an unconnected
transformation, the data cache contains data from the return port.
To calculate the minimum lookup data cache size, use the formula:
Minimum lookup data cache size = <Number of rows in lookup table> * [<Column
size of connected output ports not in lookup condition> + 8]
Example:
Suppose the lookup table has the columns PROMOTION_ID and DISCOUNT, which are
connected output ports not in the lookup condition. The column size of each is 16,
so the total column size is 32. The table contains 60,000 rows.
Minimum lookup data cache size = 60,000 * [32 + 8] = 2,400,000 bytes
So this lookup transformation needs a data cache size of 2,400,000 bytes.
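The data cache calculation can be sketched the same way. The function name and arguments below are hypothetical, chosen only to mirror the formula given in this section.

```python
def lookup_data_cache_min(output_column_sizes, row_count):
    """Return the minimum lookup data cache size in bytes.

    output_column_sizes: sizes of connected output ports that are not
    part of the lookup condition.
    row_count: number of rows in the lookup table.
    """
    # Each cached row carries the output columns plus 8 bytes of overhead.
    return row_count * (sum(output_column_sizes) + 8)

# PROMOTION_ID and DISCOUNT, each of size 16, with 60,000 rows.
data_cache_min = lookup_data_cache_min([16, 16], 60000)
print(data_cache_min)  # 2400000
```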
*********************