Wrong and misguided business decisions
Failure to meet regulatory requirements
Negative impact on customer relationships
High cost of rework and fixes
All in all: a negative impact on the bottom line
Data Governance
Technical
Business
High demands from the Data Governance structure and Data Stewards
Potential for them to become a bottleneck or simply be ineffective
Marketing/Sales
EUR 3.4 mil for PI loans only
Est. several million EUR
EUR 1.65 mil for PI loans only
Est. 40% of IT budget
Determine where you are: IQ Assessment
Get used to Data Profiling
Revisit your downstream ETL processes
Automate Data and Business Logic Profiling
Implement MDM Architecture in gradual stages
Enhance your MDM with a more general rule-based engine capable of handling transactions in real time
Adastra IQ Assessment
Thoroughly investigates current data processes
Delivers an objective Information Quality scorecard based on a quantitative assessment of the organization's data and information environment
Allows organizations to compare industry best practices against their company's IQ standards and implementations
Allows organizations to assess improvements in quality over time
Is the first step towards ongoing IQ improvements and the establishment of an IQ-conscious culture
Optional Business Impact Analyses
Source Data
Domain Analysis
Business Rules Analysis
Nulls / Blanks Analysis
Uniqueness Analysis
Relationships Analysis
Pattern / Mask Analysis
Non-Standard Data Analysis
Change Management Processes
Policy and Procedures
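To make the profiling techniques above concrete, here is a minimal sketch in Python using pandas; the file and column names are illustrative assumptions, not part of the assessment methodology:

import re
import pandas as pd

df = pd.read_csv("customers.csv")   # hypothetical source extract

for col in df.columns:
    s = df[col]
    print(f"--- {col} ---")
    blanks = (s.astype(str).str.strip() == "").sum()
    print("nulls/blanks:", s.isna().sum() + blanks)            # Nulls / Blanks Analysis
    print("distinct values:", s.nunique())                     # Uniqueness Analysis
    print("top values:", s.value_counts().head(5).to_dict())   # Domain Analysis

# Pattern / Mask Analysis: map each character to a class (9 = digit, A/a = letter)
def mask(value):
    masked = re.sub(r"\d", "9", str(value))
    masked = re.sub(r"[A-Z]", "A", masked)
    return re.sub(r"[a-z]", "a", masked)

print(df["postal_code"].map(mask).value_counts().head())       # dominant masks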
IQ Processes
Data Stewardship
On-Going Data Validation Analysis
Measurement of Data Quality Improvement
Error Discovery Analysis
Data Reconciliation Analysis
Linkage of Quality and Reward
Project Management Methodologies
Change Management Processes
Policy & Procedures
81-100 (Top Tier): Processes and procedures are aligned with industry best practices to improve and maintain information quality. All methodologies are clearly documented and enforced through data stewards, who are held accountable for information quality.
61-80 (Acceptable): Most processes and procedures are aligned with industry best practices to improve and maintain information quality. All methodologies are clearly documented and enforced through data stewards, who are held accountable for information quality.
41-60: Most processes and procedures are aligned with industry best practices to improve and maintain information quality. All methodologies are clearly documented and enforced through respective owners. Accountability and responsibility for information quality are not firmly assigned.
21-40: Processes and procedures are in place to improve and maintain information quality. Methodologies are sparsely documented and enforced through respective owners. Accountability for information quality is unassigned.
1-20: Few processes and procedures are in place to improve and maintain information quality. No set methodologies have been developed to streamline these processes. Accountability for information quality is unassigned.
Determine where you are: IQ Assessment
Get used to Data Profiling
Revisit your downstream ETL processes
Automate Data and Business Logic Profiling
Implement MDM Architecture in gradual stages
Enhance your MDM with a more general rule-based engine capable of handling transactions in real time
DQ Tools
There are many tools on the market. It makes good sense to pay attention to the following aspects:
Data Connectivity
Data Profiling Features
Validation and Measurement
Reporting Features
Metadata-Specific Capabilities
Performance
Security
Infrastructure Compatibility
Usability
Completeness of Vision
Total Cost of Ownership
Determine where you are: IQ Assessment
Get used to Data Profiling
Revisit your downstream ETL processes
Automate Data and Business Logic Profiling
Implement MDM Architecture in gradual stages
Enhance your MDM with a more general rule-based engine capable of handling transactions in real time
DQ ETL Components
Adastra has developed a comprehensive set of ETL components designed to address DQ-related data processing within ETL workflows. Our ETL component suite integrates with three major ETL vendors (Informatica, IBM, Ab Initio). The list of components includes:
Operational Reconciliation
Data Validation
Error Logging
Operational Reconciliation
As a data extract is received, it is necessary to verify that the data is indeed valid for processing. Our generic components perform the following verifications:
The extract follows the order of execution
The extract is not a repeat of a previous extract
The extract is complete
The extract belongs to the expected period
The extract truly contains the number of records produced by the source system
No transmission errors or inappropriate data manipulations occurred on the extract
These components run immediately upon the landing of the extract, so that any serious data issues can be addressed in a timely manner and outside of the critical processing window.
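A minimal sketch of these checks in Python, assuming the source ships a control file of key,value pairs alongside the extract (the file layout and names are illustrative assumptions, not the actual components):

import csv

def reconcile(extract_path, control_path, expected_period, last_loaded_batch):
    """Verify a landed extract before it enters the critical ETL window."""
    issues = []
    with open(control_path) as f:          # control file rows, e.g. "batch_id,42"
        control = {row[0]: row[1] for row in csv.reader(f)}

    if int(control["batch_id"]) <= last_loaded_batch:
        issues.append("extract repeats or precedes an already loaded batch")
    if control["period"] != expected_period:
        issues.append("extract belongs to a different period")

    with open(extract_path) as f:
        actual_rows = sum(1 for _ in f) - 1        # minus the header row
    if actual_rows != int(control["record_count"]):
        issues.append("record count differs from what the source produced")
    # a checksum comparison would additionally catch transmission errors
    return issues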
Data Validation
Utilizing a Data Profiling tool to generate Data Validation rules saves time and guarantees proper flow and integrity of processes.
In the absence of this capability, we have generic components that are metadata driven and leverage confirmed knowledge obtained from the Data Profiling process.
Alternatively, the same can be achieved with more sophisticated technology (e.g. Ataccama Data Quality Center) in a fully automated, controlled and metadata-driven way outside of the standard ETL process.
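A minimal sketch of the metadata-driven idea, with rules (normally confirmed during Data Profiling and stored in a metadata repository) expressed as plain data; the rule format and column names are illustrative assumptions:

import pandas as pd

rules = [   # would normally be loaded from the metadata repository
    {"column": "cust_id", "check": "not_null"},
    {"column": "age",     "check": "range",  "min": 0, "max": 120},
    {"column": "status",  "check": "domain", "values": {"A", "C", "P"}},
]

def validate(df, rules):
    """Apply metadata-defined rules generically; return one row per violation."""
    errors = []
    for r in rules:
        col = df[r["column"]]
        if r["check"] == "not_null":
            bad = col.isna()
        elif r["check"] == "range":
            bad = ~col.between(r["min"], r["max"])
        elif r["check"] == "domain":
            bad = ~col.isin(r["values"])
        for idx in df.index[bad]:
            errors.append({"row": idx, "column": r["column"],
                           "rule": r["check"], "value": df.at[idx, r["column"]]})
    return pd.DataFrame(errors)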
Error Logging
In the event that any data quality issues are identified in the previous step, they are logged. This is a fundamental step that supports the following aspects:
Document all data quality issues before any data cleansing is performed (e.g. applying a default)
Support measurements and reporting
Support feedback to the source systems for possible data corrections
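A sketch of the logging step, assuming issues arrive as a list of records like those produced by the validation sketch above, and a simple relational error log (the table layout is an assumption). The key point: the original value is recorded before any default is applied.

import datetime
import sqlite3

def log_issues(errors, conn, source_system):
    """Persist DQ issues so that cleansing never hides the original defect."""
    conn.executemany(
        "INSERT INTO dq_error_log "
        "(logged_at, source, row_id, col, rule, original_value) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        [(datetime.datetime.now().isoformat(), source_system,
          e["row"], e["column"], e["rule"], str(e.get("value")))
         for e in errors])
    conn.commit()   # the log later feeds measurement, reporting and source feedback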
Determine where you are: IQ Assessment
Get used to Data Profiling
Revisit your downstream ETL processes
Automate Data and Business Logic Profiling
Implement MDM Architecture in gradual stages
Enhance your MDM with a more general rule-based engine capable of handling transactions in real time
DQ Management Cycle
DQ Reporting Architecture
[Diagram: Data Sources and Packaged Applications feed a Data Quality Engine (Cleanse, Match, Enrich) whose results drive a Scorecard/Monitor]
Exception Reporting is also available: it notifies the respective user groups when specific events occur, such as Data Quality falling below a defined threshold.
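A minimal sketch of such threshold-based notification (the KPI names, threshold values and the notify stand-in are illustrative assumptions):

def notify(message):
    print(message)          # stand-in for an email / messaging notification

def check_thresholds(kpi_scores, thresholds, default_limit=0.95):
    """Alert the responsible group when a DQ KPI drops below its threshold."""
    for kpi, score in kpi_scores.items():
        limit = thresholds.get(kpi, default_limit)
        if score < limit:
            notify(f"DQ alert: {kpi} at {score:.1%}, below threshold {limit:.1%}")

check_thresholds({"address completeness": 0.91}, {"address completeness": 0.97})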
High-level overview of key data areas
A management tool for the DQ Manager/Business Sponsor of the DQ Program
Based on defined business rules
Tracking against defined KPIs
A management tool for the Data Steward/Business Owner of a given data entity
Compound KPIs in selected categories
User-defined KPIs, report structure and target DQ levels
Using pre-defined business rules
An analytical tool for Data Stewards/Technical Analysts
Detailed breakdown of a given KPI
Allows Data Stewards/Data Quality Managers to take action
Allows Data Stewards/Data Quality Managers to take action, e.g.:
Apply validation on input
Change an existing business process
Correct a malfunctioning ETL job/interface from one of the systems, etc.
Ability to support complex, hierarchical business rules
Configurable by users, with no coding
High performance to execute defined rules on tens of millions of records
Ability to execute business rules in real time
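One way to meet the first two requirements is to treat rules as data rather than code, so users can nest and edit them without programming. A minimal sketch (the rule format is an assumption, not a specific product's):

OPS = {"==": lambda a, b: a == b,
       "<=": lambda a, b: a <= b,
       ">=": lambda a, b: a >= b}

def evaluate(rule, record):
    """Recursively evaluate a hierarchical all/any rule against one record."""
    if "all" in rule:
        return all(evaluate(r, record) for r in rule["all"])
    if "any" in rule:
        return any(evaluate(r, record) for r in rule["any"])
    return OPS[rule["op"]](record[rule["field"]], rule["value"])

rule = {"all": [
    {"field": "amount", "op": "<=", "value": 10000},
    {"any": [{"field": "country", "op": "==", "value": "CZ"},
             {"field": "kyc_verified", "op": "==", "value": True}]},
]}

print(evaluate(rule, {"amount": 500, "country": "DE", "kyc_verified": True}))  # True

Scaling this to tens of millions of records or to real-time execution is what dedicated rule engines address, e.g. via compiled predicates or pushdown to the database.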
Determine where you are: IQ Assessment
Get used to Data Profiling
Revisit your downstream ETL processes
Automate Data and Business Logic Profiling
Implement MDM Architecture in gradual stages
Enhance your MDM with a more general rule-based engine capable of handling transactions in real time
Consolidation Approach
[Diagram: transaction systems (Accounting, Billing) feed the MDM Hub Solution, which provides Data Quality services (Cleansing, Standardization, Identification, Unification), Metadata, Dictionaries/Etalons and a Front End]
The master data is physically stored in a central repository.
It is cleansed, standardized, de-duplicated and unified in batch mode.
The master data repository forms a golden record for all downstream systems.
The master data will be as current as the latest batch run.
The operational systems continue to maintain their own version of the master data.
With this approach, only the downstream processes benefit from the master data. These may include reporting, analytics, marketing campaigns, data mining, etc.
Often the consolidation approach is implemented as an extension of the EDW environment.
[Diagram: Data Integration delivers the golden record downstream to Data Marts, CRM and ERP]
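A minimal sketch of the batch consolidation step in Python/pandas; the match key and the survivorship rule (most recently updated wins) are simplifying assumptions:

import pandas as pd

def build_golden_records(sources):
    """Merge master data from several systems into one golden record set."""
    df = pd.concat(sources, ignore_index=True)
    df["name_std"] = df["name"].str.upper().str.strip()                    # standardization
    df["match_key"] = df["name_std"] + "|" + df["birth_date"].astype(str)  # identification
    # survivorship: the most recently updated attributes win
    return (df.sort_values("updated_at")
              .groupby("match_key", as_index=False)
              .last())                                                     # unification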
Registry Approach
[Diagram: transaction systems (Accounting, Billing) feed the MDM Hub Solution, which provides Data Quality services (Cleansing, Standardization, Identification, Unification), Metadata, Dictionaries/Etalons and a Front End]
Only the master data identifiers are stored in the repository, together with their relationships and de-duplication groups.
The rest of the master data is stored in its original location in the operational systems.
The MDM Hub maintains a set of rules for reconstructing and assembling the master data at runtime.
The master data retrieved is always up to date.
Performance may suffer when large amounts of master data are accessed, due to runtime data federation.
The operational systems continue to maintain their own version of the master data.
With this approach, only the downstream processes benefit from the master data. These may include reporting, analytics, marketing campaigns, data mining, etc.
[Diagram: Data Integration delivers the assembled master data downstream to Data Marts, CRM and ERP]
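A sketch of the registry idea: the hub keeps only identifiers and cross-references, and assembles the record at request time. The fetcher stand-ins represent live queries against the operational systems and are illustrative assumptions:

FETCHERS = {   # stand-ins for runtime queries against the source systems
    "billing":    lambda local_id: {"balance": 0.0},
    "accounting": lambda local_id: {"ledger": []},
}

def get_master_record(master_id, registry):
    """Assemble an always-current master record via runtime federation."""
    record = {"master_id": master_id}
    for source, local_id in registry[master_id].items():
        record[source] = FETCHERS[source](local_id)   # one call per source system
    return record

registry = {"M-1": {"billing": "B-17", "accounting": "A-42"}}
print(get_master_record("M-1", registry))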
Coexistence Approach
[Diagram: transaction systems (Accounting, Billing) feed the MDM Hub Solution, which provides Data Quality services (Cleansing, Standardization, Identification, Unification), Metadata, Dictionaries/Etalons and a Front End]
The consolidation approach often evolves into a coexistence approach.
The master data is physically stored in a central repository. It is cleansed, standardized, de-duplicated and unified in batch mode.
The master data will be as current as the latest batch run.
The master data repository forms a golden record for all downstream systems and some upstream systems.
The difference from the consolidation approach is that the data is published, and some of the operational systems may synchronize their data with the master data.
The master data is synchronized across multiple systems.
[Diagram: Data Integration publishes the golden record downstream and synchronizes it back to systems such as ERP]
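A sketch of the publish/synchronize step that distinguishes coexistence from pure consolidation (the subscriber model is an illustrative assumption):

class Subscriber:
    """An operational system that opts in to receive golden-record updates."""
    def __init__(self, name):
        self.name, self.store = name, {}
    def apply_update(self, change):
        self.store[change["master_id"]] = change      # keep the local copy in sync

def publish_golden_changes(changes, subscribers):
    for change in changes:                            # output of the batch run
        for system in subscribers:
            system.apply_update(change)

publish_golden_changes([{"master_id": "M-1", "name": "ACME S.R.O."}],
                       [Subscriber("crm"), Subscriber("erp")])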
Transaction Hub Approach
[Diagram: transaction systems (Accounting) and the MDM Hub with Dictionaries/Etalons]
The master data is physically stored in a central repository. It is cleansed, standardized, de-duplicated and unified in batch mode as well as at runtime.
The master data repository forms a golden record for all downstream systems and some upstream systems.
Some of the upstream systems give up maintenance of the master data to the MDM Hub; they directly access the transaction hub for all master data management.
[Diagram: Data Integration connects the transaction hub to systems such as Billing and ERP]
Determine where you are: IQ Assessment
Get used to Data Profiling
Revisit your downstream ETL processes
Automate Data and Business Logic Profiling
Implement MDM Architecture in gradual stages
Enhance your MDM with a more general rule-based engine capable of handling transactions in real time
Thank You
CANADA
CZECH REPUBLIC
SLOVAKIA
GERMANY
BULGARIA
Adastra Corporation
Le Parc Office Tower, 8500 Leslie St.
Markham, Ontario L3T 7M8, Canada
info@adastracorp.com
www.adastracorp.com

Adastra, s.r.o.
Nile House, Karolinská 654/2
Praha, Czech Republic
info@adastra.cz
www.adastra.cz

Adastra GmbH
Bockenheimer Landstrasse 17/19
Frankfurt a. Main, Germany
info@adastracorp.de
www.adastracorp.de

Adastra Bulgaria EOOD
29 Panayot Volov str., 5th floor
Sofia, Bulgaria
info@adastracorp.com
www.adastracorp.com