
What is Oracle Data Integrator (ODI)?

Oracle acquired Sunopsis in 2006 and with it "Sunopsis Data Integrator".


Oracle Data Integrator (ODI) is an E-LT (Extract, Load and Transform) tool used for high-speed data
movement between disparate systems.
The latest version, Oracle Data Integrator Enterprise Edition (ODI-EE), brings together "Oracle Data
Integrator" and "Oracle Warehouse Builder" as separate components of a single product with a single
licence.
What is E-LT?
E-LT is an innovative approach to extracting, loading and transforming data. Typically, ETL
application vendors have relied on a costly, heavyweight mid-tier server to perform the transformations
required when moving large volumes of data around the enterprise.
ODI delivers unique next-generation Extract, Load and Transform (E-LT) technology that improves
performance and reduces data integration costs, even across heterogeneous systems, by pushing the
required processing down to the typically large and powerful database servers already in place within
the enterprise.
What components make up Oracle Data Integrator?
"Oracle Data Integrator" comprises of:
- Oracle Data Integrator + Topology Manager + Designer + Operator + Agent
- Oracle Data Quality for Data Integrator
- Oracle Data Profiling

What is Oracle Data Integration Suite?


Oracle Data Integration Suite is a set of data management applications for building, deploying, and
managing enterprise data integration solutions:

Oracle Data Integrator Enterprise Edition

Oracle Data Relationship Management

Oracle Service Bus (limited use)

Oracle BPEL (limited use)

Oracle WebLogic Server (limited use)


Additional product options are:

Oracle GoldenGate

Oracle Data Quality for Oracle Data Integrator (Trillium-based DQ)

Oracle Data Profiling (Trillium-based Data Profiling)

ODSI (the former Aqualogic Data Services Platform)

What systems can ODI extract and load data into?


ODI brings true heterogeneous connectivity out of the box: it can connect natively to Oracle, Sybase,
MS SQL Server, MySQL, LDAP, DB2, PostgreSQL and Netezza.
It can also connect to any data source supporting JDBC; it is even possible to use the Oracle BI Server
as a data source using the JDBC driver that ships with BI Publisher.

What are Knowledge Modules?


Knowledge Modules form the basis of 'plug-ins' that allow ODI to generate the relevant execution code,
across technologies, to perform tasks in one of six areas. The six types of knowledge module are:

Reverse-engineering knowledge modules are used for reading the table and other object metadata from source databases

Journalizing knowledge modules record the new and changed data within either a single table or view, or a consistent set of tables or views

Loading knowledge modules are used for efficient extraction of data from source databases for loading into a staging area (database-specific bulk unload utilities can be used where available)

Check knowledge modules are used for detecting errors in source data

Integration knowledge modules are used for efficiently transforming data from the staging area to the target tables, generating the optimized native SQL for the given database

Service knowledge modules provide the ability to expose data as Web services
ODI ships with many knowledge modules out of the box; these are also extensible and can be modified
within the ODI Designer module.

How do 'Contexts' work in ODI?


ODI offers a unique design approach through the use of Contexts and Logical Schemas. Imagine a
development team: within the ODI Topology Manager, a senior developer can define the system
architecture, connections, databases, data servers (tables, etc.) and so forth.
These physical objects are linked through contexts to 'logical' architecture objects that are then used by other
developers to simply create interfaces using the logical objects. At run time, on specification of a
context within which to execute the interfaces, ODI will use the correct physical connections,
databases and tables (source and target) linked to the logical objects used in those interfaces, as defined
within the environment Topology.

Does my ODI infrastructure require an Oracle database?


No. The ODI modular repositories (one Master and one or more Work repositories) can be installed
on any database engine that supports ANSI ISO 89 syntax, such as Oracle, Microsoft SQL Server,
Sybase AS Enterprise, IBM DB2 UDB and IBM DB2/400.
Does ODI support web services?

Yes, ODI is 'SOA'-enabled and its web services can be used in three ways:

The Oracle Data Integrator Public Web Service, which lets you execute a scenario (a published package) from a web service call

Data Services, which provide a web service over an ODI data store (i.e. a table, view or other data source registered in ODI)

The ODIInvokeWebService tool, which you can add to a package to request a response from a web service
Where does ODI sit with my existing OWB implementation(s)?
As mentioned previously, the ODI-EE licence includes both ODI and OWB as separate products; both
tools will converge over time into "Oracle's Unified Data Integration Product".
Oracle released a statement of direction for both products in January 2010.
OWB 11g R2 is the first step from Oracle to bring these two applications together: it is now possible to
use ODI Knowledge Modules within an OWB 11g R2 environment as 'Code Templates'. An Oracle
white paper published in February 2010 describes this in more detail.
Is ODI Used by Oracle in their products?

Yes, there are many Oracle products that utilise ODI; here are just a few:
Oracle Application Integration Architecture (AIA)

Oracle Agile products

Oracle Hyperion Financial Management

Oracle Hyperion Planning

Oracle Fusion Governance, Risk & Compliance

Oracle Business Activity Monitoring


Oracle BI Applications also uses ODI as its core ETL tool in place of Informatica, but only for one
release of OBIA and when using a certain source system.
What are load plans and what types of load plan are there?
A load plan is a process for running or executing multiple scenarios as a sequential, parallel or
condition-based execution. Accordingly, there are three types of load plan: sequential, parallel and
condition-based.
What is a profile in ODI?
A profile is a set of object-wise privileges. Profiles can be assigned to users, and users receive their
privileges from their profiles.

What is the ODI Console?

The ODI Console is a web-based navigator used to access the Designer, Operator and Topology components
through a browser.
How do you write sub-queries in ODI?
Use a yellow interface with the sub-query (sub-select) option to create sub-queries in ODI.
Alternatively, use a database VIEW, or use an ODI procedure to call database queries directly.
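As a rough sketch of what such a sub-query-based load ends up producing (table and column names below are invented purely for illustration):

-- Hypothetical example: load only customers that have at least one order
INSERT INTO stg_active_customers (customer_id, customer_name)
SELECT c.customer_id, c.customer_name
FROM customers c
WHERE c.customer_id IN (SELECT o.customer_id FROM orders o);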
Suppose I have 6 interfaces running and the 3rd one fails. How do I run the remaining interfaces?
If you are running a sequential load, the failure will stop the remaining interfaces; go to Operator,
right-click the failed interface and click Restart. If you are running all the interfaces in parallel,
only the failed interface stops and the other interfaces will finish.
How do you remove duplicates in ODI?
Use the DISTINCT option at IKM level; it will remove duplicate rows while loading into the target.
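Conceptually, with the DISTINCT option enabled the IKM generates an insert of roughly this shape (the staging and target names here are illustrative, not actual ODI-generated names):

-- Illustrative sketch: duplicate rows are dropped on the way into the target
INSERT INTO target_customers (customer_id, customer_name)
SELECT DISTINCT customer_id, customer_name
FROM stg_customers;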
Suppose the source has both unique and duplicate records, and I want to load the unique records into
one table and the duplicates into another table?
Create two interfaces (or one procedure) and use two queries: one for the unique values and one for the
duplicate values.
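A minimal SQL sketch of the two queries (all names are illustrative; in ODI these would be the mappings of two interfaces or the commands of one procedure):

-- Keys that occur exactly once go to the "unique" table
INSERT INTO customers_unique (customer_id, customer_name)
SELECT customer_id, MIN(customer_name)
FROM src_customers
GROUP BY customer_id
HAVING COUNT(*) = 1;

-- Keys that occur more than once go to the "duplicates" table
INSERT INTO customers_duplicates (customer_id, customer_name)
SELECT customer_id, customer_name
FROM src_customers
WHERE customer_id IN (SELECT customer_id
                      FROM src_customers
                      GROUP BY customer_id
                      HAVING COUNT(*) > 1);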
How do you write procedures in ODI?
A procedure is a step-by-step sequence of operations, where each step can contain code in any
supported technology (SQL, operating-system commands, ODI tools and so on).
How will you bring in the different source data into ODI?
You will have to create data servers in the Topology Manager for the different sources that you want to use.
How will you bulk load data?
In ODI there are knowledge modules designed for bulk loading of data, typically using the database's
native bulk load utilities.
How will you bring in files from remote locations?
We can invoke the Service knowledge module in ODI; this will help us to access data through a web
service.
How will you handle data quality in ODI?
There are two ways of handling data quality in ODI: the first method deals with handling incorrect
data using the CKM; the second method uses the Oracle Data Quality tool (for advanced quality
options).

Explain what ODI is. Why is it different from other ETL tools?


ODI stands for Oracle Data Integrator. It is different from other ETL tools in that it uses an E-LT
approach as opposed to an ETL approach. This approach eliminates the need for an exclusive
transformation server between the source and target data servers: the power of the target data server
can be used to transform the data, i.e. the target data server acts as the staging area in addition to its
role of target database.
The transformation logic is implemented while loading the data into the target database (from the
staging area). Also, an appropriate CKM (Check Knowledge Module) can be used at this point to
implement data quality requirements.
How do you implement data validations?
Use filters and the mapping area for transformation-level validations; for data quality checks based on
constraints, use a CKM with flow control.
How do you handle exceptions?
In packages, use the Advanced tab of a step to define behaviour on failure; in load plans, use exception
steps on the Exceptions tab.
In a package, one interface failed. How do you know which interface failed if you have no access to
Operator?
Set up a mail alert, or check the SNP_SESS_LOG tables for the session log details.
How do you implement logic in a procedure so that data deleted on the source side is also removed
from the target table?
Use this query in the Command on Target:
Delete from Target_table where not exists (Select 'X' from Source_table where Source_table.ID = Target_table.ID)
If the source has a total of 15 records, with 2 records updated and 3 records newly inserted, and at the
target side we have to load only the newly changed and inserted records, how do we do it?
Use an IKM Incremental Update knowledge module, which handles both insert and update operations.
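Conceptually, an incremental update IKM behaves like a MERGE on the key; a minimal sketch with invented table and column names:

-- Updates existing rows on the key and inserts new ones in a single pass
MERGE INTO target_customers t
USING (SELECT customer_id, customer_name, city FROM src_customers) s
ON (t.customer_id = s.customer_id)
WHEN MATCHED THEN
  UPDATE SET t.customer_name = s.customer_name,
             t.city = s.city
WHEN NOT MATCHED THEN
  INSERT (customer_id, customer_name, city)
  VALUES (s.customer_id, s.customer_name, s.city);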
Can we implement a package within a package?
Yes, we can call one package from another package.
How do you load data using a join between one flat file and one RDBMS table?
Drag and drop both the file and the table into the source area of the interface and define the join; the
join is executed in the staging area.

If the source and target are Oracle technology, describe the process to achieve this requirement
(interfaces, KMs, models).
Use LKM SQL to SQL or LKM SQL to Oracle for loading, and IKM Oracle Incremental Update or a
Control Append IKM for integration.
What do we specify in the XML data server parameters to connect to an XML file?
Two parameters: the file name with its location (the f parameter) and the schema name (the s parameter).
How do you reverse-engineer views (how do you load data from views)?
In the model, go to the Reverse Engineer tab and select VIEW as the type of object to
reverse-engineer.

ELT Vs ETL
The ability to dynamically manage a staging area
The ability to generate code on source and target systems alike, in the same transformation
The ability to generate native SQL for any database on the market (most ETL tools will generate code
for their own engines, and then translate that code for the database, hence limiting their generation
capacities to their ability to convert proprietary concepts)
The ability to generate DML and DDL, and to orchestrate sequences of operations on heterogeneous
systems

Different Types of Dimensions and Facts in Data Warehouse


Dimension: A dimension table typically has two types of columns: a primary key referenced by fact
tables, and textual/descriptive data.
Fact: A fact table typically has two types of columns: foreign keys to dimension tables, and measures
(columns that contain numeric facts). A fact table can contain fact data at detail or aggregated level.
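A minimal sketch of the two table shapes (the schema is invented purely for illustration):

-- Dimension: a primary key plus textual/descriptive attributes
CREATE TABLE dim_product (
  product_key   NUMBER PRIMARY KEY,
  product_name  VARCHAR2(100),
  category_name VARCHAR2(50)
);

-- Fact: foreign keys to dimensions plus numeric measures
CREATE TABLE fact_sales (
  product_key   NUMBER REFERENCES dim_product (product_key),
  date_key      NUMBER,
  store_key     NUMBER,
  quantity_sold NUMBER,
  sales_amount  NUMBER(12,2)
);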
Types of Dimensions
Slowly Changing Dimensions:
Attributes of a dimension may undergo changes over time. It depends on the business requirement
whether the history of changes for a particular attribute should be preserved in the data warehouse.
Such an attribute is called a Slowly Changing Attribute, and a dimension containing such an attribute
is called a Slowly Changing Dimension.
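For instance, the common Type 2 technique preserves history by closing the current row and inserting a new one. A hand-written sketch follows (ODI provides slowly changing dimension IKMs that automate this pattern; all names and values below are illustrative):

-- Close the current version of the changed customer record
UPDATE dim_customer
SET end_date = SYSDATE,
    current_flag = 'N'
WHERE customer_id = 1001
  AND current_flag = 'Y';

-- Insert the new version with a fresh surrogate key
INSERT INTO dim_customer
  (customer_key, customer_id, customer_name, address, start_date, end_date, current_flag)
VALUES
  (dim_customer_seq.NEXTVAL, 1001, 'Acme Ltd', '12 New Street', SYSDATE, NULL, 'Y');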
Rapidly Changing Dimensions:
A dimension attribute that changes frequently is a Rapidly Changing Attribute. If you don't need
to track the changes, the Rapidly Changing Attribute is no problem, but if you do need to track
the changes, using a standard Slowly Changing Dimension technique can result in a huge
inflation of the size of the dimension. One solution is to move the attribute to its own dimension,
with a separate foreign key in the fact table. This new dimension is called a Rapidly Changing
Dimension.
Junk Dimensions:
A junk dimension is a single table with a combination of different and unrelated attributes to avoid
having a large number of foreign keys in the fact table. Junk dimensions are often created to
manage the foreign keys created by Rapidly Changing Dimensions.
Inferred Dimensions:
While loading fact records, a dimension record may not yet be ready. One solution is to generate a
surrogate key with NULL for all the other attributes. This should technically be called an inferred
member, but is often called an inferred dimension.
Conformed Dimensions:
A Dimension that is used in multiple locations is called a conformed dimension. A conformed
dimension may be used with multiple fact tables in a single database, or across multiple data
marts or data warehouses.
Degenerate Dimensions:
A degenerate dimension is when a dimension attribute is stored as part of the fact table, and not in
a separate dimension table. These are essentially dimension keys for which there are no other
attributes. In a data warehouse, these are often used as the result of a drill through query to
analyze the source of an aggregated number in a report. You can use these values to trace back
to transactions in the OLTP system.
Role Playing Dimensions:
A role-playing dimension is one where the same dimension key along with its associated
attributes can be joined to more than one foreign key in the fact table. For example, a fact
table may include foreign keys for both Ship Date and Delivery Date. But the same date
dimension attributes apply to each foreign key, so you can join the same dimension table to both
foreign keys. Here the date dimension is taking multiple roles to map ship date as well as delivery
date, hence the name Role-Playing Dimension.
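For example, with an invented schema, the same date dimension joins to the fact table twice under different aliases:

-- dim_date plays two roles: ship date and delivery date
SELECT sd.calendar_date AS ship_date,
       dd.calendar_date AS delivery_date,
       f.order_amount
FROM fact_orders f
JOIN dim_date sd ON sd.date_key = f.ship_date_key
JOIN dim_date dd ON dd.date_key = f.delivery_date_key;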
Shrunken Dimensions:
A shrunken dimension is a subset of another dimension. For example, the Orders fact table may
include a foreign key for Product, but the Target fact table may include a foreign key only for
ProductCategory, which is in the Product table, but much less granular. Creating a smaller
dimension table, with ProductCategory as its primary key, is one way of dealing with this situation
of heterogeneous grain. If the Product dimension is snowflaked, there is probably already a
separate table for ProductCategory, which can serve as the Shrunken Dimension.
Static Dimensions:
Static dimensions are not extracted from the original data source, but are created within the context of
the data warehouse. A static dimension can be loaded manually, for example with status codes, or it
can be generated by a procedure, such as a Date or Time dimension.
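For example, a Date dimension can be generated entirely inside the warehouse; a minimal Oracle-style sketch with an assumed dim_date layout:

-- Generate one row per day of 2024 (366 days, leap year)
INSERT INTO dim_date (date_key, calendar_date, month_name, year_number)
SELECT TO_NUMBER(TO_CHAR(d, 'YYYYMMDD')),
       d,
       TO_CHAR(d, 'Month'),
       EXTRACT(YEAR FROM d)
FROM (SELECT DATE '2024-01-01' + LEVEL - 1 AS d
      FROM dual
      CONNECT BY LEVEL <= 366);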
Types of Facts
Additive:
Additive facts are facts that can be summed up through all of the dimensions in the fact table. A
sales fact is a good example for additive fact.
Semi-Additive:
Semi-additive facts are facts that can be summed up for some of the dimensions in the fact table,
but not the others.
Eg: Daily balances fact can be summed up through the customers dimension but not through the
time dimension.
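As a concrete illustration (hypothetical balances fact): summing across customers for a single day is valid, but balances must not be added across days; the usual approach is to take the closing balance of the period instead.

-- Valid: total balance across all customers for one day
SELECT SUM(balance_amount) AS total_balance
FROM fact_daily_balance
WHERE date_key = 20240131;

-- Not valid to SUM across days; take the last day of the period instead
SELECT customer_key, balance_amount
FROM fact_daily_balance
WHERE date_key = (SELECT MAX(date_key)
                  FROM fact_daily_balance
                  WHERE date_key BETWEEN 20240101 AND 20240131);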
Non-Additive:
Non-additive facts are facts that cannot be summed up for any of the dimensions present in the
fact table.
Eg: facts that contain calculated percentages or ratios.

Factless Fact Table:


In the real world, it is possible to have a fact table that contains no measures or facts. These tables are
called factless fact tables.
Eg: A fact table which has only a product key and a date key is a factless fact table. There are no
measures in this table, but you can still get the number of products sold over a period of time.
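For example (names invented), simply counting rows in such a table answers the business question:

-- Number of products sold per day from a fact table with no measure columns
SELECT date_key, COUNT(*) AS products_sold
FROM fact_product_sold   -- holds only product_key and date_key
GROUP BY date_key;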
Based on the above classifications, fact tables are categorized into two types:
Cumulative:
This type of fact table describes what has happened over a period of time. For example, this fact
table may describe the total sales by product by store by day. The facts for this type of fact table are
mostly additive facts; the total sales example just given is a cumulative fact table.
Snapshot:
This type of fact table describes the state of things at a particular instant of time, and usually includes
more semi-additive and non-additive facts. The daily balances fact described under semi-additive facts
is an example of a snapshot fact table.
