Index

What are psets?

Creation of psets.

What is analysis_level parameter?

Achieving data lineage for generic graphs using psets.

Physical datasets and logical EME datasets.

Parameters to handle parallel running jobs calling the same graph.

Capturing job statistics details in the EME when using generic graphs.

02/12/15

by Manish Shekhar - Infosys

What are psets and how are they created

A pset is a set of input parameter and value pairs for a graph.

You create one with the Input Values Editor in the Edit menu. It lets you specify a set of values for the graph's formal parameters and then save it as a separate .pset (parameter set) file in any of the directories under the private sandbox.
Steps:
a. Select Edit > Input Values... from the GDE menu.
   The editor looks the same as the graph parameter editor, with two columns: the parameter name and its value.
b. For each formal parameter, enter the required value in the value field.
c. Select File > Save As and save the value set as <graph name>.pset under the private sandbox's pset directory.
Note: The editor defaults to the project's mp directory as the location of the new .pset file, so you need to navigate to the pset directory in the sandbox.


Along with the existing formal parameters of the generic graph, define a formal parameter called analysis_level and set its value to none.


Check in the generic graph from common sandbox to the EME.


Dependency analysis will not be performed on the generic graph because of the analysis_level parameter's value.


Each separate set of input values you create in this step represents a separate instance of the graph. To enable job tracking of the generic graph for each of these value sets, check the corresponding .pset files into the EME data store.

The graph instance represented by a .pset file is analyzed and saved in the EME data store as a graph object. For a .pset file to be analyzed, set the analysis_level parameter in each parameter set to expand. This was mandatory in Ab Initio V-13.
NOTE: Ab Initio V-14 automatically expands psets when they are checked in.


Achieving data lineage for generic graphs using psets.

Distinct values of logical EME datasets are passed from different psets to the same
generic graph. This is done to achieve data lineage. When psets are checked in they
are expanded and dependency analysis takes place. Different instances of the
generic graph will show up in EME with unique values of logical datasets.


EME view of distinct instances of the generic graph:

As shown above, the two instances of the same graph have different data lineage in the EME.


Physical dataset names overwrite the logical EME dataset names passed from psets. Physical dataset names are set and then passed via the pset when the graph is executed from within the wrapper.
For example, exporting physical datasets:

Calling the graph and passing parameters:
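
As a rough illustration of such a wrapper, the sketch below exports physical dataset names and then builds the call to the graph through its pset. The parameter names (INPUT_DATASET, OUTPUT_DATASET), the paths, and the pset name are hypothetical, and the `air sandbox run` invocation is an assumption about how the pset is executed at your site; the command is printed rather than run.

```shell
# Sketch of a wrapper script. All names and paths here are hypothetical.
SANDBOX="/path/to/sandbox"
PSET="${SANDBOX}/pset/load_customer.pset"

# Physical dataset names for this run. These take the place of the
# logical EME dataset names stored in the pset.
RUN_DATE="$(date +%Y%m%d)"
INPUT_DATASET="/data/in/customer_${RUN_DATE}.dat"
OUTPUT_DATASET="/data/out/customer_${RUN_DATE}.dat"
export INPUT_DATASET OUTPUT_DATASET

# Assumed invocation: the command is only printed, not executed.
RUN_CMD="air sandbox run ${PSET}"
echo "Would run: ${RUN_CMD}"
```

Printing the command keeps the sketch safe to try outside an Ab Initio environment; replace the echo with the actual call used by your wrapper.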


Handling multiple instances of a graph running concurrently

AB_JOB_PREFIX
To avoid problems when multiple instances of a graph run concurrently in the same directory, you can make the AB_JOB value unique by exporting the AB_JOB_PREFIX configuration variable. For example:

AB_JOB_PREFIX should be assigned a dynamic value. In the example above it is assigned the process ID (PID=$$). Alternatively, a date-timestamp in YYYYMMDDHHMISS format can be assigned to it.
With this parameter set, AB_JOB resolves to ${AB_JOB_PREFIX}${AB_JOB}, so the recovery files are also created with different names.
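
A minimal sketch of both options described above, assuming the variable is set in the calling shell script before the graph is started. The helper function name and the trailing underscore separator are illustrative choices, not part of Ab Initio.

```shell
# Hypothetical helper: build a unique prefix for AB_JOB.
# "timestamp" uses the current time as YYYYMMDDHHMISS;
# anything else uses the process id of the calling shell.
make_job_prefix() {
    case "$1" in
        timestamp) printf '%s_' "$(date +%Y%m%d%H%M%S)" ;;
        *)         printf '%s_' "$$" ;;
    esac
}

# Use the process id here; pass "timestamp" for the alternative form.
AB_JOB_PREFIX="$(make_job_prefix pid)"
export AB_JOB_PREFIX
echo "AB_JOB_PREFIX=${AB_JOB_PREFIX}"
```

The process id is unique among jobs running at the same moment on one host. A timestamp only has one-second resolution, so two runs started within the same second would still collide.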


Capturing job statistics details in the EME when using generic graphs
AB_AIR_JOB_GRAPH
Specifies the graph/application being run so that it can be linked to the job object.

- When a generic graph is called, the job statistics are stored in the EME under the name of the generic graph. This causes confusion and discrepancies when tracking statistics in the EME, because a generic graph may be used in multiple projects. The objective is to store job statistics under the pset name so that they can be correlated with the logical use of the generic graph.

- This parameter needs to be set in the calling script/program to have a generic graph reposit tracking to the .graph (pset version) of the graph.

- If the graph is generic, you should set AB_AIR_JOB_GRAPH because you want the job to be associated with the pset instance of the graph, which performs the specific task according to the values passed through the pset.
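
A minimal sketch of setting the variable in the calling script, assuming its value is the EME location of the checked-in pset. The path shown is hypothetical, and the exact form of the value should be confirmed against the EME documentation for your version.

```shell
# Hypothetical EME path of the pset that represents this use of the
# generic graph. The path format is an assumption, not confirmed.
PSET_EME_PATH="/Projects/sales/pset/load_customer.pset"

# Associate the job with the pset instance instead of the generic graph.
AB_AIR_JOB_GRAPH="${PSET_EME_PATH}"
export AB_AIR_JOB_GRAPH
echo "AB_AIR_JOB_GRAPH=${AB_AIR_JOB_GRAPH}"
```

The export has to happen before the graph is started, so that the job inherits the variable from the calling script.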


In Coop Sys 2.14 and above

Benefits
- Job statistics are reposited with the logical use of the graph.
- The statistics are accurately reported by the appropriate job group or project.
- Performance improvement in graph execution time.


Please read the documents below for more detail:

- /opt/abinitio/abinitio-V2-15-5-0/doc/EME_Developer_Guide.pdf
- /opt/abinitio/abinitio-V2-15-5-0/doc/EME_Reference.pdf


THANK YOU
