
Talend Tutorial Task Aid >

Configuring Joins in tMap

This tutorial uses Talend Open Studio Data Integration version 6

1. Configure the join model


a. In the jointMap Job, to open the tMap component wizard, double-click the tMap_1
component.
Note: Clicking the tMap settings button displays a list of parameters for configuring your
input or output flows. One of the settings available for input flows lets you change the Join
Model from the default Left Outer Join to an Inner Join.
b. To change the Join Model property, click the default setting Left Outer Join, and then click
[...] that appears next to Left Outer Join. In the Options window, click Inner Join, and then
click OK.
Note: When you change the default settings, a red dot with the number 1 on it appears on the
tMap settings icon. This indicates that you have changed one parameter of the default tMap
settings.
c. Close the tMap wizard and run the Job.
In the Job Designer, observe that a total of 1682 rows of data from the left input are
processed by the tMap component. However, only 142 rows appear in the output file. This is
because the inner join found matches for only 142 rows and rejected the other 1540 rows.
You can validate the rejection of the other rows by viewing the moviesComplete output file. In
the file, observe that every movie now has a director name.
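
The difference between the two Join Model settings can be sketched outside Talend. The Python snippet below is only an illustration, not code generated by the Job: the three sample rows, the directorName field, and the join helper are assumptions made for this example.

```python
# Hypothetical sample rows; the real Job reads 1682 movies from a file.
movies = [
    {"movieID": 1, "title": "Movie A", "directorID": 10},
    {"movieID": 2, "title": "Movie B", "directorID": None},  # no directorID
    {"movieID": 3, "title": "Movie C", "directorID": 99},    # unknown director
]
directors = {10: "Director X"}  # lookup keyed by directorID (assumed layout)


def join(main_rows, lookup, inner):
    """Join each main row to the lookup on directorID.

    inner=False mimics a left outer join: every main row is kept and
    unmatched rows get an empty director name.
    inner=True mimics an inner join: unmatched rows are dropped.
    """
    out = []
    for row in main_rows:
        name = lookup.get(row["directorID"])
        if name is None and inner:
            continue  # rejected by the inner join
        out.append({**row, "directorName": name})
    return out


left_outer = join(movies, directors, inner=False)
inner_only = join(movies, directors, inner=True)

print(len(left_outer))  # 3
print(len(inner_only))  # 1
```

With the left outer join, all three sample rows reach the output and two of them have no director name. With the inner join, only the matched row remains, which mirrors the drop from 1682 to 142 rows in the Job.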

2. Create a new output in the tMap component to collect only the inner join rejects

a. Open the tMap_1 component wizard and create a second output component named
joinRejects.
A blank output flow is created.

Talend takes the complexity out of integration


Based on open source Scalable Future-proof Predictable cost
Visit www.talend.com Follow us on Twitter @Talend

b. To add movieID, title, releaseYear, url, and directorID fields to the output component,
select the five fields from the movies component and drop them on the output
component.
c. In the joinRejects output, click the tMap settings icon.
d. To change the Catch lookup inner join reject property, click the default setting false, and
then click [...] that appears next to false. In the Options window, click true, and then click
OK.
Note: By changing the Catch lookup inner join reject property to true, you can catch all the
lines of data that were rejected by the inner join in the new output.
e. Add a tFileOutputDelimited component to the Job Designer and link the joinRejects
output of the tMap_1 component to the tFileOutputDelimited_2 component.
f. To configure the output component, in the Component view of the component, specify
the path and name for the output file. Also, include a header row in the output file and
run the Job.
In the Job Designer, you can observe that out of 1682 rows of the input data, 142 rows appear
in the joinedOutput output, and the 1540 rejected rows are collected in the joinRejects
output.
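
The effect of setting Catch lookup inner join reject to true can be pictured as routing each main row to exactly one of two outputs. The following is a minimal Python sketch under assumed sample data, not Talend's generated code; the route_rows function and the directorName field are hypothetical names used for illustration.

```python
def route_rows(main_rows, lookup):
    """Split main rows into (joined, rejects) for an inner join on directorID."""
    joined, rejects = [], []
    for row in main_rows:
        name = lookup.get(row["directorID"])
        if name is None:
            # No match in the lookup: the row keeps only its main-flow fields
            # and goes to the reject output.
            rejects.append(row)
        else:
            joined.append({**row, "directorName": name})
    return joined, rejects


# Hypothetical sample data standing in for the movies and directors files.
movies = [
    {"movieID": 1, "title": "Movie A", "directorID": 10},
    {"movieID": 2, "title": "Movie B", "directorID": None},
    {"movieID": 3, "title": "Movie C", "directorID": 99},
]
directors = {10: "Director X"}

joined_output, join_rejects = route_rows(movies, directors)

# Every input row lands in exactly one of the two outputs.
print(len(joined_output) + len(join_rejects) == len(movies))  # True
```

The same bookkeeping applies to the tutorial data: 142 joined rows plus 1540 rejected rows account for all 1682 input rows.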
You can also open the joinRejects output file to see every movie that was rejected by the
join. These are the movies that have no directorID in the movies file, plus those whose
directorID does not appear in the directors file.
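
The two kinds of reject described above can be told apart with a simple check. The sketch below is hypothetical: the reject_reason helper and the sample values are assumptions for illustration, and it treats an empty directorID as None.

```python
def reject_reason(row, known_director_ids):
    """Return why a row would be rejected by the inner join, or None if it matches."""
    if row["directorID"] is None:
        return "missing directorID"
    if row["directorID"] not in known_director_ids:
        return "directorID not in directors file"
    return None  # the row has a match and is not a reject


# Hypothetical sample data; the IDs are invented for this example.
known_ids = {10, 11}
sample_rows = [
    {"movieID": 1, "directorID": 10},
    {"movieID": 2, "directorID": None},
    {"movieID": 3, "directorID": 99},
]

reasons = [reject_reason(row, known_ids) for row in sample_rows]
print(reasons)
# [None, 'missing directorID', 'directorID not in directors file']
```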
