
WHITE PAPER

Migrating Off the Mainframe


The Approaches, Techniques, and Tools Organizations Need to Successfully Migrate Data to Open Standard Relational Database Management Systems

This document contains Confidential, Proprietary and Trade Secret Information ("Confidential Information") of Informatica Corporation and may not be copied, distributed, duplicated, or otherwise reproduced in any manner without the prior written consent of Informatica. While every attempt has been made to ensure that the information in this document is accurate and complete, some typographical errors or technical inaccuracies may exist. Informatica does not accept responsibility for any kind of loss resulting from the use of information contained in this document. The information contained in this document is subject to change without notice. The incorporation of the product attributes discussed in these materials into any release or upgrade of any Informatica software product, as well as the timing of any such release or upgrade, is at the sole discretion of Informatica. Protected by one or more of the following U.S. Patents: 6,032,158; 5,794,246; 6,014,670; 6,339,775; 6,044,374; 6,208,990; 6,850,947; 6,895,471; or by the following pending U.S. Patents: 09/644,280; 10/966,046; 10/727,700. This edition published July 2006.


Table of Contents

Executive Summary
The Business Challenges of Mainframe Migration
The Technical Challenges of Mainframe Migration
The Seven Approaches to Mainframe Migration
Data Migration Project Challenges, Methodologies, and Tools
    Mainframe Data Migration Project Challenges
    Data Migration Methodologies and Tools
The Solution: Single, Unified Enterprise Data Integration Platform
    Data Profiling Capabilities for Identifying and Analyzing Source Data
    Universal Data Access Capabilities for Accessing Source Data
    Built-In Data Transformation and Correction Capabilities to Address Data Quality in Legacy Applications
    Single, Unified, Metadata-Based Data Integration Platform to Support the Data Migration Lifecycle
Conclusion and Next Steps


Executive Summary
For more than 40 years, companies have deployed mission-critical business applications on the mainframe. Many of these applications were built for non-relational database management systems (DBMSs), as well as for relational sources of data on the mainframe, such as DB2. Yet recently, according to Gartner, the installed base for pre-relational database management systems has been declining.

The useful life of pre-relational mainframe database management system engines is coming to an end because of a diminishing application and skills base, and increasing costs. Although the installed base for pre-relational DBMSs is shrinking, the market share numbers from Gartner Dataquest show that the revenue is increasing. This is due primarily to increased prices from the vendors, currency conversions and mainframe CPU replacement. In real numbers, the revenue is dropping as the number of customers and licenses decreases.1

Many companies have migrated mission-critical applications off the mainframe onto open standard relational database management systems (RDBMSs) like Oracle for a variety of reasons: limited application support from independent software vendors (ISVs) and a shrinking resource base, for example. Regardless of why your business has elected to move off the mainframe, once the decision has been made, your IT organization needs to know the approaches, techniques, and tools required to migrate successfully to a more modern application landscape or an open standard RDBMS like Oracle. This is where this white paper can help.

This white paper examines both the business and technical challenges of migrating off the mainframe. It outlines the seven mainframe migration approaches that IT organizations can use to develop their migration strategies. It explores common data migration project methodologies and tools, and suggests ways to convert a serial approach to migration into a more effective, iterative process.

Finally, this white paper describes how IT organizations can use Informatica enterprise data integration software to migrate off the mainframe to more modern systems. While most of the practices discussed in this paper apply to migrations from mainframe-based legacy applications to any relational database management system, this paper focuses specifically on migrations to Oracle environments.

1 Mattern, Thomas and Matthias Haendly. "ESA: A 2005 'Business-Savvy' Take on SOAs," Integration Developer News, February 9, 2005.


The Business Challenges of Mainframe Migration


Mainframes typically run mission-critical applications that have been in production for two to three decades. Once the decision has been made to migrate off the mainframe, maintaining this continuity is one of the primary business challenges to address. The mainframe migration strategy should ensure continuity on the new application and, in the event of failure, allow rollback to the mainframe application. This approach requires data in the mainframe application to be synchronized with data in the new application.

Complicating mainframe migration is the fact that while mainframe applications tend to be interdependent, businesses often move applications one at a time to mitigate risk. Many of these applications include hundreds, even thousands, of homegrown programs written in languages such as COBOL, Assembler, PL/1, or Natural. Businesses find it challenging to prioritize the order in which these applications are moved off the mainframe and to ensure that the order both meets business needs and minimizes risk in the migration process.

Once a specific mainframe application is being migrated, the next challenge is deciding which business processes will be migrated. Many analysts argue that introducing new business processes into an organization is much more costly than the migration of the technology itself. Mitigating this risk is an essential component of any migration plan.

Many companies have business processes that reflect the way their systems work. When migrating an application off the mainframe, many business processes do not need to be migrated. Even among the business processes that need to be migrated, some will need to be moved as is, and some will need to be changed to accommodate the new application. During a mainframe migration, many companies take the opportunity to examine the business processes they have followed for many years and modernize them to support their current and future business requirements.

In many cases, legacy processes linger because of technology limitations that, while valid a decade ago, are no longer relevant. A technology migration provides organizations with the opportunity to reevaluate, streamline, and update these business processes.

Data is the foundation of the modernization process. The application, business logic, and work flow can all be migrated, but without a clean migration of the data, companies will not meet their business requirements. A clean data migration involves data that is:
• Organized in a format usable by all modern tools
• Optimized for an Oracle database
• Easy to maintain


The Technical Challenges of Mainframe Migration


The technical challenges of migrating an application off the mainframe reflect both the vast differences in programmatic and data formats and the sheer complexity of systems that have been continually evolving in-house for decades. Most applications that maintain mainframe data (e.g., VSAM, IMS, IDMS, ADABAS) do so in non-relational formats, such as multiple record types in one file, hierarchical structures, networked structures, or a combination of these structures. IT organizations face a variety of technical challenges in migrating off the mainframe, including:

• Incompatible file formats and structures. The complexity and variety of non-relational database management systems represent a daunting challenge for organizations attempting to correlate their format and structure to a relational system. Architectural elements unique to the mainframe (for example, flexible file definitions that allow data files with multiple record formats and multiple record types in the same data set) compound this problem. Correlating these unique formats and vastly different structures to a relational model is typically a tedious process that requires significant labor, as well as extensive expertise in both the legacy and relational environments. While incompatible file formats and structures complicate the migration effort, it is still possible to reconcile them during migration. Organizations need a way to transparently extract both legacy and relational data and to present this data consistently across the enterprise.

• Data and referential integrity. Maintaining referential integrity when moving from a non-relational to a relational model presents considerable challenges. Organizations need to ensure that all of the parent-child relationships across mainframe files and records are appropriately mapped to RDBMS tables. They also need to define how looping data structures and substructures are mapped to relational tables. They need a way to extract non-relational data intact and present it in a graphical design environment so that developers can easily identify, correlate, and map these non-relational links to their relational equivalents, thereby dramatically simplifying the procedure and ensuring data integrity across the move.

• Performance. To maximize performance, IT organizations need to create an Oracle schema that maps mainframe keys to Oracle's concept of primary and secondary keys. Mainframe data may be organized in a key order that makes sense for that particular mainframe system, but would not make sense in the Oracle target system. In addition, on the mainframe, multiple iterations of related repeating fields are denormalized and housed in one record. They are not split out into related normalized tables, as they would be in Oracle. Both of these scenarios could negatively impact performance on an Oracle RDBMS.
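The mapping of repeating fields to normalized tables can be illustrated with a minimal Python sketch. The record layout, field names (`cust_id`, `phones`), and sample values are hypothetical and chosen only for illustration; a real migration would derive them from the COBOL copybook or DBMS schema of the source application.

```python
# Hypothetical denormalized mainframe records: each customer record carries
# three fixed slots of a repeating (phone_type, phone_number) group.
legacy_records = [
    {"cust_id": "000123", "name": "ACME CORP",
     "phones": [("OFFICE", "555-0100"), ("FAX", "555-0101"), ("", "")]},
    {"cust_id": "000124", "name": "GLOBEX",
     "phones": [("OFFICE", "555-0200"), ("", ""), ("", "")]},
]


def normalize(records):
    """Split each denormalized record into one parent row and N child rows."""
    customers, phones = [], []
    for rec in records:
        # Parent table row: the mainframe key becomes the primary key.
        customers.append({"cust_id": rec["cust_id"], "name": rec["name"]})
        for seq, (phone_type, number) in enumerate(rec["phones"], start=1):
            if not number:
                continue  # skip unused occurrences of the repeating group
            phones.append({
                "cust_id": rec["cust_id"],  # foreign key back to the parent
                "seq": seq,                 # slot position keeps the child key unique
                "phone_type": phone_type,
                "phone_number": number,
            })
    return customers, phones


customers, phones = normalize(legacy_records)
```

Here (`cust_id`, `seq`) serves as the child table's primary key and `cust_id` as its foreign key, which preserves the parent-child relationship that was implicit in the original record.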

The Seven Approaches to Mainframe Migration


Since migrating off the mainframe is not an easy process, organizations need a comprehensive mainframe migration strategy to address all the variables. This section presents seven different approaches that organizations can use to develop their mainframe migration strategy. It should be noted that these approaches are not mutually exclusive. In many cases, organizations should employ multiple approaches. Every mainframe migration project is unique and will involve some subset or hybrid of these seven approaches. Regardless of the approach or combination of approaches used, IT organizations need both a robust data migration methodology and data migration toolset to migrate off the mainframe onto the Oracle system.


1. One-Time Bulk Offload. This approach is typically used for testing and staging environments. Data is migrated off the mainframe as a one-time data movement event, often completed during a lean period like a weekend or the early morning hours. This approach requires considerable advance planning, especially since the entire bulk data load has to be moved within a short window of time. Typically, this approach is used twice: for the initial data load that supports testing of the migration, and again after all testing is complete, just before the system moves to production.

2. Incremental Delta Offload. In this approach, data is migrated off the mainframe in batches. After the initial data movement, the goal is to bring over changes made to the mainframe system data on a periodic basis (e.g., daily, weekly, monthly). The challenge of this approach is to identify the changes made on the mainframe and selectively extract just the changes.

3. Bi-directional Replication Synchronization. In this approach, two production systems (the mainframe system and the Oracle system) run in parallel, each holding data and replicating its changes to the other. The challenge of this approach is to support both batch and real-time bi-directional integration, since in many cases both systems will run for years before the mainframe is shut down. It is important to note that business decisions must be made well before implementing this replication scenario to determine the master/slave relationship in each bi-directional transaction. Otherwise, unpredictable updates could occur on the source system, the target system, or both.

4. Physical Federation. This approach involves multiple data sources (e.g., VSAM, DB2, IMS, Datacom, ADABAS, etc.), which must be read and joined to produce a single view of the data inside an Oracle RDBMS. Data is still stored in the respective mainframe data stores, but the Oracle system becomes the single version of the truth. This is a popular approach when the packaged application replacing the mainframe may not have all the functionality of the mainframe, or when parts of the mainframe system are so complicated that they cannot be replaced for years to come. This approach facilitates phased migration of mission-critical infrastructure. Companies keep their mainframe systems, but by pulling the data into an Oracle system, they put in place a service-oriented architecture (SOA) for integration with the rest of the enterprise.


5. Virtual Federation. This approach is identical to the physical federation model, but instead of loading all the data into an Oracle RDBMS, the data from all the mainframe sources is joined virtually to provide a just-in-time single view to the consuming applications or users. This approach is sometimes called Enterprise Information Integration (EII).

6. Oracle Transactions on Mainframe. In this approach, the Oracle system becomes the primary system of record for business process execution, but some critical business functionality still resides on the mainframe. New transactions are first processed on the Oracle system, and then related mainframe system updates are executed by initiating batch jobs, or through online transaction systems such as Customer Information Control System (CICS) or Information Management System/Transaction Manager (IMS/TM), formerly known as IMS/Data Communications (IMS/DC).

7. Mainframe Transactions on Oracle. In this approach, functionality moves slowly to the Oracle system, but the mainframe still remains the primary system of record for business process execution. Since functionality and data have been moved to the Oracle system, there are still CICS, IMS, and/or batch mainframe transactions that need to access data in the Oracle database.
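The central difficulty of the Incremental Delta Offload approach, identifying which records changed since the last extract, can be sketched in Python by comparing two keyed snapshots. This is only one possible technique (snapshot comparison with row fingerprints); log-based change data capture is a different mechanism and is not shown. The snapshot names, keys, and field values below are hypothetical.

```python
import hashlib


def row_digest(row):
    """Return a stable fingerprint of a row's field values."""
    payload = "|".join(f"{key}={row[key]}" for key in sorted(row))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def compute_delta(previous, current):
    """Compare two snapshots keyed by record key.

    Returns three sorted lists of keys: inserts, updates, deletes.
    """
    inserts = sorted(k for k in current if k not in previous)
    deletes = sorted(k for k in previous if k not in current)
    updates = sorted(
        k for k in current
        if k in previous and row_digest(current[k]) != row_digest(previous[k])
    )
    return inserts, updates, deletes


# Hypothetical extracts taken on two consecutive days.
monday = {
    "A1": {"name": "ACME", "city": "Austin"},
    "B2": {"name": "GLOBEX", "city": "Boston"},
    "C3": {"name": "INITECH", "city": "Dallas"},
}
tuesday = {
    "A1": {"name": "ACME", "city": "Austin"},
    "B2": {"name": "GLOBEX", "city": "Chicago"},
    "D4": {"name": "HOOLI", "city": "Denver"},
}

inserts, updates, deletes = compute_delta(monday, tuesday)
```

Only the keys in `inserts`, `updates`, and `deletes` would need to be carried to the target in the periodic batch, rather than the full data set.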

Data Migration Project Challenges, Methodologies, and Tools


While data migration is essential to the success of an Oracle RDBMS implementation, the role of data migration in the project is often overlooked and underestimated. The common assumption is that tools exist to extract the data from the mainframe and move the data into Oracle, or that data migration is something a consulting partner will handle. Often, project teams tasked with data migration focus solely on the timely conversion and movement of data between systems. But data migration is not just about moving the data into Oracle; it is about making the data work once it is in Oracle. This means that the data in the Oracle application must be accurate and trustworthy so that business users can readily transition from their legacy mainframe applications to an Oracle system.

Research has shown that software implementations are put at risk when data migration is not thoroughly considered and planned. According to recent research, more than 80 percent of software implementation projects fail or overrun their budgets and schedules. Of the projects that overrun, half exceed timescales by 75 percent and two-thirds exceed the overall project budgets.

A major reason why these failure rates are so high is that data migration is considered a minor, one-time event during the overall implementation. Migration is not an industry-recognized area of expertise with an established body of knowledge and practices, nor have most companies built up any internal competency from which to draw. Organizations need to understand the unique challenges of migration projects and adopt an appropriate migration methodology to address and overcome these challenges.

Mainframe Data Migration Project Challenges


The top six challenges associated with mainframe data migration projects are:

1. Identifying and analyzing source data. Often there is insufficient understanding of data and source systems. The required data may be spread across multiple source systems, not in the right format, of poor quality, only accessible through little-understood interfaces, poorly documented, missing source code, burdened by superfluous logic, or sometimes missing altogether. Identifying and analyzing source data in mainframes is even more complicated, since mainframes house custom applications developed over decades that often incorporate hundreds, sometimes thousands, of individual programs written in languages such as COBOL, Assembler, PL/1, or Natural.

2. Accessing source data. According to a recent survey of more than 350 firms, the typical organization relies on more than 50 core business applications, and companies with more than $1 billion in annual revenue have as many as 500 systems. Regardless of whether there are five, 50, or 500 source systems to migrate, organizations need to determine how mainframe source data will be accessed before it is migrated. Simple extraction and upload often proves to be unrealistic due to the volume of source systems and the availability of legacy application resources, as well as the quality and the format of the data.

3. Addressing data quality in legacy applications. Data migration teams need to understand and accept that there may be dirty data in the mainframe system. Data quality can be compromised as a result of how the data has been entered, maintained, processed, and/or stored. To address data quality issues when migrating off the mainframe, data migration teams should consider the data's existence, validity, consistency, timeliness, accuracy, and relevance. Assessing relevance, for example, may reveal that data that matters in the mainframe system will not be needed on the target Oracle-based system.

4. Preparing and loading data into the target system. The target system is often under development at the time of data migration, and the requirements often change during the project. Target data validations add further complexity: many target systems have restrictions, constraints, and thresholds on the validity, integrity, and quality of data to be loaded.

5. Supporting the data migration lifecycle. Data migration is not a one-time effort. Legacy mainframe systems are usually kept alive after new systems launch. Synchronization is required between the old and new systems during this hand-off period. Also, long after the migration is completed, companies often have to prove the migration was complete and accurate in order to comply with regulations like Sarbanes-Oxley and Basel II.

6. Accommodating behavioral changes. Technical changes, especially to systems that have been in production for two to three decades, cannot be made in a vacuum; these changes have an impact on the people in your organization. Migration projects that do not accommodate the associated adjustments in the behavior of employees, partners, and customers are prone to failure. Migration teams can accommodate behavioral changes either by ensuring a high degree of user transparency (i.e., user-transparent integration) or by providing for the concurrent synchronization and operation of both legacy and new systems over a period of time (i.e., synchronization) so that the most appropriate behavioral practices can be developed.


In summary, during the upfront analysis of the source mainframe data, most of the assumptions about the data are proved wrong. Since sufficient time is rarely planned or allocated for analysis, any mapping specification from the mainframe to Oracle is hardly more than an intelligent guess. Based on the initial mapping specification, extractions and transformations run into changing target data requirements, requiring additional analysis and changes to the mapping specification. Validating the data according to various integrity and quality constraints also typically poses a challenge. If the validation fails, the project goes back to further analysis and then additional rounds of extractions and transformations. When the data is finally ready to be loaded into the Oracle system, unexpected data scenarios often break the loading process and send the project back for more analysis, more extractions and transformations, and more validations.
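To make the validation step concrete, the following Python sketch checks a record against three of the data quality dimensions named in challenge 3: existence, validity, and consistency. The record layout, field names, and allowed status codes are hypothetical; real rules would come from the target system's constraints.

```python
from datetime import date

# Assumed domain of valid status codes for the hypothetical target system.
ALLOWED_STATUS = {"OPEN", "SHIPPED", "CLOSED"}


def check_record(rec):
    """Return a list of (dimension, message) issues for one order record."""
    issues = []

    # Existence: required fields must be present and non-blank.
    for field in ("order_id", "customer_id", "order_date"):
        value = rec.get(field)
        if value is None or not str(value).strip():
            issues.append(("existence", f"{field} is missing"))

    # Validity: values must fall within an allowed domain.
    if rec.get("status") not in ALLOWED_STATUS:
        issues.append(("validity", f"unknown status {rec.get('status')!r}"))

    # Consistency: related fields must agree with each other.
    order_date, ship_date = rec.get("order_date"), rec.get("ship_date")
    if order_date and ship_date and ship_date < order_date:
        issues.append(("consistency", "ship_date precedes order_date"))

    return issues


good = {"order_id": "1001", "customer_id": "000123", "status": "SHIPPED",
        "order_date": date(2006, 3, 1), "ship_date": date(2006, 3, 4)}
bad = {"order_id": "1002", "customer_id": "   ", "status": "X",
       "order_date": date(2006, 3, 10), "ship_date": date(2006, 3, 2)}

good_issues = check_record(good)
bad_issues = check_record(bad)
```

Records with a non-empty issue list would be routed back to analysis rather than loaded, which is the rework loop described above.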

Data Migration Methodologies and Tools


Migration projects are commonly and mistakenly thought of as a serial, four-stage process:

1. Analyze the source data
2. Extract/transform the data into the target formats
3. Validate/cleanse the data
4. Load the data into the target

The problem with this serial project methodology is that it does not support the iterative nature of migrations. Further complicating the issue is inadequate technology. Often the technology used for data migration consists of general-purpose tools repurposed for each of the four stages. For example, spreadsheets or SQL scripts are used for data analysis, COBOL code for extraction of mainframe data to flat files, transformation or application integration tools to convert the data, and a quality assurance (QA) testing tool to test and load the data. These disconnected or siloed tools only serve to exacerbate an already inappropriate project methodology.

The ideal approach for successfully managing a data migration project is cyclical. It allows IT organizations to analyze the data, extract and transform the data, validate the data, load it into targets, and then, most importantly, repeat the process until the migration is successfully completed. This cyclical methodology allows analysis to be driven by the target, assumptions to be validated, designs to be refined, and best practices to be applied as the project progresses. This agile methodology uses the same four stages (analyze, extract/transform, validate, and load), but the stages are not only repeated but also interconnected with one another. Figure 1 illustrates how migration can be converted from a serial process into an iterative process.
Figure 1: The Data Migration Methodology Should Be Converted from a Serial Process into an Iterative Process. (Serial view: Analyze, Extract/Transform, Validate/Cleanse, Load in sequence. Iterative view: the same four stages arranged in a cycle.)
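The cyclical methodology can be sketched as a loop in which each failed validation feeds the next round of analysis. The Python below is a toy illustration under stated assumptions: the row fields (`state`, `amount`), the validation rules, and the `known_fixes` table are all hypothetical, and real analysis is a human activity rather than a dictionary lookup.

```python
def validate(row):
    """Return the first problem found in a row, or None if the row is loadable."""
    if not row["state"].isupper():
        return "state_case"
    if not isinstance(row["amount"], int):
        return "amount_type"
    return None


# Hypothetical fixes that the analysis stage can add to the mapping.
known_fixes = {
    "state_case": lambda r: {**r, "state": r["state"].upper()},
    "amount_type": lambda r: {**r, "amount": int(str(r["amount"]).replace(",", ""))},
}


def run_migration(rows, max_cycles=5):
    """Repeat analyze, extract/transform, validate, load until the batch is clean."""
    transforms = []
    for cycle in range(1, max_cycles + 1):
        # Extract/transform: apply every transformation learned so far.
        staged = []
        for row in rows:
            for transform in transforms:
                row = transform(row)
            staged.append(row)

        # Validate: collect the reasons for any rejected rows.
        reasons = {validate(row) for row in staged} - {None}
        if not reasons:
            return staged, cycle  # Load: the whole batch passed validation

        # Analyze: refine the mapping based on what validation revealed.
        new_fixes = [known_fixes[r] for r in sorted(reasons)
                     if r in known_fixes and known_fixes[r] not in transforms]
        if not new_fixes:
            raise RuntimeError(f"no known fix for: {sorted(reasons)}")
        transforms.extend(new_fixes)
    raise RuntimeError("migration did not converge")


source_rows = [{"state": "tx", "amount": "1,200"}, {"state": "CA", "amount": 30}]
target_rows, cycles = run_migration(source_rows)
```

In this example the amount problem in the first row only surfaces after the state problem is fixed, so the batch takes three cycles to load. That mirrors the way unexpected data scenarios send a project back for more analysis.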


This iterative approach to data migration is best achieved by using a single, unified toolset or platform that leverages automation and provides functionality that spans all four stages. In an iterative process, there is a big difference between using a different tool for each of the four stages and using one unified toolset across all of them. When IT organizations use one unified toolset, the results of one stage can be easily carried into the next, enabling faster, more frequent, and ultimately fewer iterations, which is a key to success in a migration project. A single platform not only unifies the development team across the project phases, but also unifies the separate teams that may be handling each source system in a multi-source migration project.

The Solution: Single, Unified Enterprise Data Integration Platform


So how do organizations address the business, technical, and methodology challenges associated with migrating off the mainframe and onto an Oracle system? The answer is to use a single, unified enterprise data integration platform for data migration. Informatica provides such a platform, and it is well suited to migrating data off the mainframe into Oracle systems.

Informatica PowerCenter is a single, unified enterprise data integration platform that enables companies and government organizations of all sizes to access and integrate data from virtually any business system, in any format, and deliver that data throughout the enterprise at any speed.

Available with PowerCenter, Informatica PowerExchange provides on-demand access to data in all critical enterprise data systems, including mainframe, midrange, and file-based systems. PowerExchange helps organizations leverage mission-critical operational data by making it available to people and processes without requiring manual coding of data extraction programs. Data migration teams can realize significant benefits from using PowerExchange to access mainframe and legacy data and make it available when they need it: in batch, as incremental updates, or in real time.

Both Informatica products provide powerful capabilities to help overcome the challenges associated with migrating data off the mainframe and into an Oracle DBMS. These capabilities include:
• Data profiling capabilities for identifying and analyzing source data
• Universal data access capabilities for accessing source data
• Built-in transformation and correction capabilities for addressing the quality of data in legacy applications
• A single, unified data integration platform to support the data migration lifecycle

Data Profiling Capabilities for Identifying and Analyzing Source Data


While the objective of moving data from the mainframe to an Oracle system seems straightforward, complications arise when a legacy migration involves any number of distinct business applications running on different platforms and data stores, and when the context and relationships of the data may not meet or match Oracle requirements.

Data profiling is the analysis of data to understand its content, structure, quality, and dependencies. During Oracle implementations, data migration teams typically try to profile legacy data manually. Manual data profiling ranges from spot inspections of actual legacy applications or sample data extracts, to analysis via custom-coded reports or elaborate and intertwined spreadsheets. These data profiling methods typically sample data in a few key fields to get a sense of what the data is like in these columns, but the results are often inaccurate and incomplete.

An inadequate toolset and a manual approach to profiling often lead to a data migration project that underestimates the scope, schedule, and resources required to properly analyze source data systems. Figure 2 shows how a much more even distribution of project resources over the key project phases (e.g., analysis, build, and test) can promote savings. Relying on the build or development phase to identify and fix data issues can increase the cost tenfold.

Figure 2: Proactive Analysis of Source Data Saves Both Time and Money
Typical Project Effort: Analysis 10%, Build 60%, Test 30%
Ideal Project Effort: Analysis 40%, Build 30%, Test 30%

PowerCenter's data profiling capabilities provide comprehensive, accurate information about the content, quality, and structure of data in virtually any operational system. Organizations can automatically assess the initial and ongoing quality of data regardless of its location or type. With its comprehensive data profiling capabilities, PowerCenter:
• Reduces data quality assessment time with easy-to-use wizards and pre-built metric-driven reports that comprise a single interface for the entire profiling process
• Addresses ongoing data quality in legacy applications with Web-based dashboards and reports that illustrate changes in data content, quality, structure, and values over time
• Ensures end user data confidence by automatically and accurately profiling any data accessible to PowerCenter, which covers virtually any and all enterprise data formats

Figure 3 shows an example of a PowerCenter data profiling report. The report shows how PowerCenter automatically infers the primary and foreign key relationships across three tables in a legacy application. It is important to note that PowerCenter data profiling can profile any data source that PowerCenter can natively access, including mainframe tables.


Figure 3: PowerCenter Profiling Report Infers Primary Key and Foreign Key Relationships between Multiple Legacy Application Tables


PowerCenter's data profiling capabilities help migration teams analyze legacy systems much more thoroughly than manual profiling allows. The platform provides the tools to automatically scan all records across all columns and tables in a source system and dynamically generate reports that make it easy to understand the true state of the data. These reports help migration teams determine whether the legacy data has quality issues and how to properly address them. Data profiling is important both before migration (i.e., upfront source system profiling) and after migration (i.e., profiling the converted data for the Oracle application environment). PowerCenter's capabilities enable the profiling of data pre- and post-migration, validating the readiness of the mainframe data for Oracle.
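The kind of analysis that profiling performs (column statistics and inferred key relationships) can be illustrated with a short Python sketch. This is not PowerCenter's algorithm or API; it is a simplified stand-in, and the table contents and column names are hypothetical.

```python
def profile_column(rows, column):
    """Summarize one column: row count, nulls, distinct values, key candidacy."""
    values = [row.get(column) for row in rows]
    non_null = [v for v in values if v not in (None, "")]
    distinct = set(non_null)
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(distinct),
        # A candidate primary key has no nulls and no duplicate values.
        "is_candidate_key": bool(values) and len(distinct) == len(values),
    }


def infer_foreign_key(child_rows, child_col, parent_rows, parent_col):
    """Report whether every non-null child value appears in the parent column."""
    parent_values = {row[parent_col] for row in parent_rows}
    child_values = {row[child_col] for row in child_rows
                    if row.get(child_col) not in (None, "")}
    orphans = sorted(child_values - parent_values)
    return {"is_candidate_fk": bool(child_values) and not orphans,
            "orphans": orphans}


# Hypothetical extracts from two legacy tables.
customers = [{"cust_id": 1}, {"cust_id": 2}, {"cust_id": 3}]
orders = [{"order_id": 10, "cust_id": 1}, {"order_id": 11, "cust_id": 1},
          {"order_id": 12, "cust_id": 2}, {"order_id": 13, "cust_id": 9}]

pk_profile = profile_column(customers, "cust_id")
order_cust_profile = profile_column(orders, "cust_id")
fk_result = infer_foreign_key(orders, "cust_id", customers, "cust_id")
```

The orphaned value reported in `fk_result` is exactly the sort of finding that is cheaper to discover during analysis than during the load into a target with enforced referential integrity.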

Universal Data Access Capabilities for Accessing Source Data


Analysis of legacy data is essential for creating accurate data migration mapping specifications with relevant data conversion requirements. However, a complex, inefficient migration process still lies ahead if the data migration team relies exclusively on manually extracting data from each legacy data source. According to a report from The Data Warehousing Institute (TDWI), organizations extract data from at least 12 distinct data sources on average. This average will inexorably increase over time as organizations expand their enterprise application landscape to support more subject areas and groups in the organization.

Many mature and established applications are still maintained on mainframe platforms. A significant percentage of data for a mainframe migration will need to be extracted from these systems, but because much of the mainframe data is not stored in a relational format, migration teams end up relying exclusively on mainframe developers to extract and replicate data.

In addition to mainframe data formats, a multitude of other data formats are prevalent and significant for enterprise data access. Based on a 2003 TDWI survey of the types of data sources that ETL programs process, enterprise data may reside in XML files, Web-based data sources, payloads from message queues, and unstructured data formats such as Microsoft Excel and Adobe .pdf files, as shown in Figure 4. The ability to readily access all enterprise data (structured, unstructured, and semi-structured) is vital to successful data migration.

[Bar chart of data sources, horizontal axis 0 to 100 percent. Categories as printed: Relational databases; Flat files; Mainframe/legacy systems; Packaged application; Replication or change data capture utilities; EAI/messaging software; Web; XML; Other. Bar values as printed: 4%, 15%, 12%, 15%, 15%, 39%, 65%, 81%, 89%.]
Figure 4: Enterprise Data Resides in a Variety of Sources and Formats


Source Virtually Any and All Data Formats


With PowerCenter, data migration teams can source directly from a mainframe non-relational data source (in addition to getting to DB2 mainframe data) as if it were a relational database. PowerCenter's data access capabilities offer migration teams the flexibility to source these softer forms of data, which traditionally would be left to manual interpretation and processing or, worse, left unaccounted for in the migration process. PowerCenter provides universal data access, allowing the data migration team to source virtually any and all enterprise data formats, including:

• Mainframe data
• Structured data
• Unstructured data (e.g., Microsoft Word documents and Excel spreadsheets, email, binary files, .pdf files, etc.)
• Semi-structured data (e.g., industry-specific formats such as HL7, ACORD, FIXML, SWIFT, etc.)
• Relational data (e.g., DB2, Oracle, Microsoft SQL Server, etc.)
• ERP (e.g., SAP, PeopleSoft, Siebel, etc.) and file data
• Message queues (e.g., TIBCO, IBM MQ Series, JMS, MS MQ, etc.)

Figure 5 shows the breadth of PowerCenter's data access capabilities.

[Diagram: Informatica PowerCenter at the center, connected to the following groups of sources.
Real-Time Data Sources: TIBCO, IBM WebSphere MQ, JMS, SAP, MSMQ, WEBM, Web Services.
Enterprise Software Sources: Mainframe, AS/400, JDE, PeopleSoft, Siebel, SAP, SAS, Essbase, Lotus Notes.
Unstructured Data: PDF, Word, Excel, Vertical Standards (e.g., HL7, SWIFT, ACORD), Print Stream, BLOBs, any proprietary data format/standard.
Open and Relational Data Sources: Oracle, IBM, Microsoft, Sybase, Informix, Teradata, Flat Files, XML, Web Logs.
Remote Data Access: across the firewall/WAN; remote or outsourced business applications.]

Figure 5: PowerCenter Provides Universal Data Access

The flexibility to access all types of enterprise data in a single data integration platform offers significant advantages over hand-coded data migration approaches, including:

• Increased productivity. With the ability to centralize data access and management, PowerCenter frees data migration teams from maintaining, and depending on, a cumbersome, time-consuming process in which programs are developed to extract and stage data for each source of legacy data.
• Reduced risk. Sources of data for Oracle DBMS implementations tend to be dynamic. Extracting data from a client/server-based legacy application today does not insulate the team from future requirements, such as having to migrate mainframe and mid-range applications inherited through a corporate merger or acquisition. PowerCenter reduces the risk of both current and future data migration efforts by providing access to a broad range of enterprise data formats.


Unlock Complex Non-relational Data without Coding


PowerExchange provides both the interface and the engine to ensure successful migration from mainframe data to newer relational systems. PowerExchange provides an intuitive graphical user interface that allows developers to access, manipulate, and better integrate complex non-relational data residing on the mainframe. This functionality includes the ability to import existing metadata (e.g., COBOL or PL/I copybooks, or Natural DDMs) directly from the source to be leveraged in the migration strategy. This interface is both codeless and universal, thereby eliminating the need for lengthy training and implementation, regardless of the source platform.

Once the data has been identified and its relationships interpreted, PowerExchange provides direct access to some or all of the data in the source system without requiring IT staff to manage multiple interfaces, install special drivers, write scripts, install gateways, or implement new communications protocols. PowerExchange delivers the data in batch mode for the initial load, in change data capture mode for incremental updates, or in real time for environments where migration will occur while both systems are in production for months or years. Figure 6 shows how PowerExchange can move data in real-time, change data capture, or bulk modes.
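To illustrate why copybook metadata matters when reading non-relational mainframe records, here is a minimal Python sketch that decodes one fixed-width EBCDIC record containing a packed-decimal (COMP-3) field. It does not represent how PowerExchange works internally. The record layout, the field names, and the choice of code page `cp037` are hypothetical and would in practice come from the copybook and the source system.

```python
from decimal import Decimal

# Hypothetical layout, as it might be described by a COBOL copybook:
#   05 CUST-ID    PIC X(4).             -> bytes 0-3,   EBCDIC text
#   05 CUST-NAME  PIC X(10).            -> bytes 4-13,  EBCDIC text
#   05 BALANCE    PIC S9(5)V99 COMP-3.  -> bytes 14-17, packed decimal


def unpack_comp3(raw, scale):
    """Decode a packed-decimal field: two digits per byte, sign in the last nibble."""
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    sign_nibble = nibbles.pop()
    value = int("".join(str(digit) for digit in nibbles))
    if sign_nibble == 0x0D:  # 0xD marks a negative value; 0xC and 0xF are positive
        value = -value
    # Shift the implied decimal point (the V in the PIC clause) into place.
    return Decimal(value).scaleb(-scale)


def decode_record(raw):
    """Turn one 18-byte record into a dict of typed values."""
    return {
        "cust_id": raw[0:4].decode("cp037"),
        "cust_name": raw[4:14].decode("cp037").rstrip(),
        "balance": unpack_comp3(raw[14:18], scale=2),
    }


# Build a sample record the way it would appear in a mainframe file.
record = (
    "0001".encode("cp037")
    + "JANE DOE".ljust(10).encode("cp037")
    + bytes([0x00, 0x12, 0x34, 0x5D])  # digits 0012345, sign D
)
decoded = decode_record(record)
```

A hand-coded migration needs a routine like this for every record layout, which is the effort that importing copybook metadata into a tool is meant to remove.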

[Diagram: PowerExchange sits between legacy mainframe sources (DB2, IMS, IDMS) and an Oracle target, delivering data in real-time, change, or batch mode.]

Figure 6: PowerExchange Accesses Mainframe and Provides a Choice of Latency to Deliver Data When Needed
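The difference between a bulk load and an incremental update can be sketched in a few lines of Python. The example below uses snapshot comparison, which is a simpler stand-in for change data capture and not the mechanism PowerExchange uses. The function names and the sample data are assumptions for illustration.

```python
def diff_snapshots(previous, current):
    """Compare two snapshots keyed by primary key and list the changes."""
    changes = []
    for key, row in current.items():
        if key not in previous:
            changes.append(("insert", key, row))
        elif previous[key] != row:
            changes.append(("update", key, row))
    for key in previous:
        if key not in current:
            changes.append(("delete", key, None))
    return changes


def apply_changes(target, changes):
    """Apply a list of changes to the target, leaving untouched rows alone."""
    for operation, key, row in changes:
        if operation == "delete":
            target.pop(key, None)
        else:
            target[key] = row
    return target


# Hypothetical snapshots of a legacy file taken on two consecutive days.
day_one = {"0001": {"state": "CA"}, "0002": {"state": "NY"}}
day_two = {"0001": {"state": "TX"}, "0003": {"state": "WA"}}

target = dict(day_one)            # bulk mode: the initial load copies everything
changes = diff_snapshots(day_one, day_two)
apply_changes(target, changes)    # change mode: only the differences are moved
```

After the incremental step the target matches the second snapshot, even though only three change records were moved rather than the whole file.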

Simplify Management of Disparate File Formats and Structures


PowerExchange simplifies the management and organization of disparate file formats and data structures by providing a single platform with ubiquitous and transparent access to numerous newer and legacy systems. This is critical for developers migrating from non-relational to newer relational systems, where the tedious translation of these formats and structures is often cited as the most problematic aspect of the migration project. PowerExchange's navigator console performs seamless extraction of all major mainframe file formats while maintaining their associated structures. Regardless of the source type, this console represents mainframe file formats in a consistent manner. This means that developers can spend time designing the form and function of their new environment instead of interpreting the meaning of their old one.


Built-In Data Transformation and Correction Capabilities to Address Data Quality in Legacy Applications
The Informatica product suite helps data migration teams focus on the data rather than on code. PowerCenter provides a single, unified, scalable enterprise data integration platform with a robust library of transformation and data services capable of handling all data conversion on any mainframe data migration project. By leveraging PowerCenter's codeless and wizard-driven approach for Oracle data conversion, teams can focus more on the business rules and data, and less on the code.

Ensure Data and Referential Integrity


Mainframe migration projects often stall at the interpretation and translation of data and referential integrity. Understanding the parent and child relationships of a mainframe file or set of files is often a tedious and complex undertaking for development teams that may be more familiar with relational tables or may not be well-versed in either approach. PowerExchange automatically identifies all relevant referential relationships in the mainframe data files and represents them in a manner that can be easily understood and maintained across the migration. By automatically identifying the relationships of non-relational mainframe data and intuitively representing them to developers, PowerExchange ensures that even novice developers can maintain data integrity across the migration.
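As an illustration of what maintaining parent and child relationships involves, the following Python sketch splits hierarchical records (a header with a repeating group, as is common in mainframe files) into parent and child rows and then checks for orphaned children. It is a simplified, hand-written example and not a description of PowerExchange. The record structure and the names `normalize` and `find_orphans` are hypothetical.

```python
def normalize(records):
    """Split hierarchical records into parent rows and child rows linked by a key."""
    parents, children = [], []
    for record in records:
        parents.append(
            {"order_id": record["order_id"], "customer": record["customer"]}
        )
        # Each entry in the repeating group becomes its own child row that
        # carries the parent key, which later becomes a foreign key.
        for line_no, item in enumerate(record["items"], start=1):
            children.append(
                {
                    "order_id": record["order_id"],
                    "line_no": line_no,
                    "sku": item["sku"],
                    "qty": item["qty"],
                }
            )
    return parents, children


def find_orphans(parents, children, key):
    """Return child rows whose key has no matching parent row."""
    parent_keys = {row[key] for row in parents}
    return [row for row in children if row[key] not in parent_keys]


# Hypothetical legacy records with a repeating group of order lines.
legacy_orders = [
    {"order_id": "A1", "customer": "0001",
     "items": [{"sku": "X", "qty": 2}, {"sku": "Y", "qty": 1}]},
    {"order_id": "A2", "customer": "0002",
     "items": [{"sku": "Z", "qty": 5}]},
]
parents, children = normalize(legacy_orders)
orphans = find_orphans(parents, children, key="order_id")
```

Running the orphan check again after the load, against rows read back from the target tables, is one way to confirm that referential integrity survived the migration.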

Focus on New System Performance


By simplifying the identification, extraction, integration, and manipulation of disparate sources with an intuitive and universal interface, PowerExchange allows developers to spend their time improving the overall performance of the new system instead of verifying the accuracy of the data, which is a tedious process. Once issues such as the mapping of mainframe keys to Oracle primary and secondary keys have been resolved, for example, developers can focus on designing the most efficient schema instead of trying to ensure basic operation.

Single, Unified, Metadata-Based Data Integration Platform to Support the Data Migration Lifecycle
When data migration projects are driven by teams that focus exclusively on the target system rather than on the end-to-end data migration process, a common outcome is the "code, load, and explode" phenomenon. This occurs when developers code the extraction and conversion logic thought to be required for migration, then attempt to load the data into the target business application, only to discover an unacceptably large number of errors due to unanticipated values in the source data files. They fix the errors and rerun the conversion process, only to find more errors, and so on. This ugly scenario repeats itself until the project deadlines and budgets become imperiled and angry business sponsors halt the project. PowerCenter breaks this "code, load, and explode" cycle. PowerCenter provides all the capabilities that are essential to support the data migration lifecycle from a single, unified platform based on a metadata-driven architecture. Figure 7 shows the flow and transformation of data, using PowerCenter, from the mainframe to an Oracle system.
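One way to picture how unanticipated source values can be caught before the load rather than during it is a pre-load validation pass. The Python sketch below is an illustration only, not a PowerCenter feature. The rules, column names, and date format are assumptions chosen for the example.

```python
from datetime import datetime


def is_date(value):
    """True if the value is a valid calendar date in YYYYMMDD form."""
    try:
        datetime.strptime(value, "%Y%m%d")
        return True
    except (TypeError, ValueError):
        return False


# Hypothetical rules derived from the target table's constraints.
RULES = {
    "cust_id": lambda v: isinstance(v, str) and v.isdigit() and len(v) <= 10,
    "open_date": is_date,
    "state": lambda v: isinstance(v, str) and len(v.strip()) == 2,
}


def validate(rows, rules):
    """Split rows into those safe to load and those rejected, with reasons."""
    accepted, rejected = [], []
    for row in rows:
        failed = [column for column, check in rules.items()
                  if not check(row.get(column))]
        if failed:
            rejected.append((row, failed))
        else:
            accepted.append(row)
    return accepted, rejected


# Hypothetical converted rows; the second carries typical legacy filler values.
rows = [
    {"cust_id": "0001", "open_date": "19990131", "state": "CA"},
    {"cust_id": "00A2", "open_date": "00000000", "state": "CA"},
]
accepted, rejected = validate(rows, RULES)
```

Listing every failed column for every rejected row in a single pass shows the full extent of the problem up front, instead of surfacing one error per load attempt.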

Figure 7: PowerCenter Lineage Diagram Demonstrates the Flow and Transformation of Data From the Mainframe to Oracle RDBMS

The foundation for all of PowerCenter's data integration components is the shared metadata. When changes are made anywhere in the profiling, data access, data conversion, or loading process, PowerCenter enables immediate visibility into those changes. With its metadata-driven architecture, PowerCenter promotes faster and more flexible iterations in the data migration lifecycle. Figure 8 shows how PowerCenter is used for migrating data.

[Diagram: the Informatica Data Integration Platform. Source systems: XML, Messaging, and Web Services; Packaged Applications; Relational and Flat Files; Mainframe and Midrange (some behind a firewall). Stages shown: Access source systems/data; Analyze/Profile; Extract/Transform; Validate/Load; Iterate; Access target/data; Execute Migration; Synchronize. Spanning the stages: Reusability/Team Productivity and Audit/Lineage. Destination: Target Application.]

Figure 8: PowerCenter Is the Ideal Platform for Migrating Data

PowerCenter's metadata management capabilities provide visibility across the entire data migration process, from sourcing legacy applications and cleansing the legacy data to preparing it in the format required for upload into an Oracle DBMS. PowerCenter enables data lineage problems to be traced at the metadata level.
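Tracing lineage at the metadata level can be pictured as walking a graph that records, for each target field, the fields it was derived from and the rule applied. The Python sketch below is a simplified illustration and does not reflect PowerCenter's metadata repository. All field names and rules are hypothetical, and the graph is assumed to contain no cycles.

```python
# Hypothetical lineage metadata: each field maps to (source field, rule) pairs.
LINEAGE = {
    "ORACLE.CUSTOMER.BALANCE": [
        ("STAGE.CUSTOMER.BALANCE", "round to 2 decimals"),
    ],
    "STAGE.CUSTOMER.BALANCE": [
        ("VSAM.CUSTFILE.BALANCE", "unpack COMP-3"),
    ],
    "ORACLE.CUSTOMER.FULL_NAME": [
        ("VSAM.CUSTFILE.FIRST-NAME", "trim and concatenate"),
        ("VSAM.CUSTFILE.LAST-NAME", "trim and concatenate"),
    ],
}


def trace(field, lineage, depth=0):
    """List every upstream step for a field as (depth, source, rule) tuples."""
    steps = []
    for source, rule in lineage.get(field, []):
        steps.append((depth, source, rule))
        steps.extend(trace(source, lineage, depth + 1))
    return steps


def origins(field, lineage):
    """Return the set of original source fields that feed a target field."""
    sources = lineage.get(field)
    if not sources:
        return {field}  # no recorded inputs, so this is an origin
    found = set()
    for source, _rule in sources:
        found |= origins(source, lineage)
    return found


balance_steps = trace("ORACLE.CUSTOMER.BALANCE", LINEAGE)
name_origins = origins("ORACLE.CUSTOMER.FULL_NAME", LINEAGE)
```

With metadata in this shape, a question such as "where did this Oracle column come from, and what was done to it on the way?" becomes a lookup rather than a review of conversion code.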


PowerCenter helps data migration teams trace and prove how data has been converted and moved. The enhanced data visibility and tracking helps organizations comply with reporting requirements. These capabilities also help with user adoption, instilling new Oracle application users with confidence that legacy application data has in fact been converted and moved from the mainframe. Furthermore, PowerCenter alleviates the politics associated with data migration projects. Data migration activities, whether related to legacy mainframe applications or the target Oracle application, can be centralized within a single, unified data integration platform. This promotes effective and productive communication between legacy mainframe and Oracle resources, and between technical and functional resources.

Conclusion and Next Steps


Mainframe data migrations are complex. They should not be approached as a singular event. The top six challenges associated with migrating data off the mainframe are:

1. Identifying and analyzing source data
2. Accessing source data
3. Addressing the quality of the data within the legacy applications
4. Preparing and loading data into the target system
5. Supporting the data migration lifecycle
6. Accommodating behavioral changes

The best way to overcome these challenges is to rely on Informatica enterprise data integration software. Both PowerCenter and PowerExchange offer data migration teams powerful capabilities to meet these data migration challenges. The capabilities include:

• Data profiling capabilities for identifying and analyzing source data
• Universal data access capabilities for accessing source data
• Built-in transformation and correction capabilities for addressing the quality of data in legacy applications
• Single, unified data integration platform to support the data migration lifecycle

Furthermore, PowerCenter and PowerExchange allow data migration teams to leverage all these capabilities from a single, unified data integration platform. This increases productivity, ensures scalability, and reduces risk.

Now that you have a solid understanding of the challenges around mainframe data migration and how Informatica enterprise data integration software can help you overcome them, what is your next step? Contact Informatica to find out how our enterprise data integration software can help your next mainframe migration project. To find out more, please visit us at www.informatica.com or call us at (800) 653-3871.


Worldwide Headquarters, 100 Cardinal Way, Redwood City, CA 94063, USA phone: 650.385.5000 fax: 650.385.5500 toll-free in the US: 1.800.653.3871 www.informatica.com
Informatica Offices Around The Globe: Australia Belgium Canada China France Germany Japan Korea the Netherlands Singapore Switzerland United Kingdom USA
© 2006 Informatica Corporation. All rights reserved. Printed in the U.S.A. Informatica, the Informatica logo, and PowerCenter are trademarks or registered trademarks of Informatica Corporation in the United States and in jurisdictions throughout the world. All other company and product names may be tradenames or trademarks of their respective owners.

J50837 6691 (07/11/06)