High-Level Overview of a Data Migration Approach

Data migrations typically receive a lower priority than other phases of a system implementation.  This is unfortunate, considering data is the lifeblood of any organization.  An enterprise can have the latest technology solutions and the most efficient business processes, but if the information flowing through them is corrupt, the results will be corrupt as well.  To achieve maximum value from data as a business asset, it is necessary to begin with the best possible version of that data, and doing so requires a firm understanding of the data migration process.

The data migration approach represents the life cycle of migrating data from one or more source systems to one or more target systems.  Data migration accompanies the implementation of new target systems and typically differs from steady-state ETL processes, which populate data warehouses or keep heterogeneous system landscapes in sync.

Data migration also usually occurs in cycles, or "mocks," which serve as practice runs for the final Cutover and Go-Live.  Below is a high-level outline of the steps involved.

Analysis and Discovery

The first step requires data team members to work with the functional and technical resources of the project to determine all of the source systems, file systems, applications, and other data sources that will be in scope.  Typically these efforts involve analyzing source system documentation and gathering metadata.  This information is crucial for developing the overall data migration strategy, determining the scope of work, and ensuring that every data source has been identified and that business rules and conversion logic have been established for each.
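As an illustration, here is a minimal sketch of gathering column-level metadata from one relational source using SQLAlchemy's inspector.  The connection string and schema name are hypothetical placeholders, not a prescription for any particular source system.

```python
# Minimal metadata-gathering sketch (SQLAlchemy-compatible source assumed;
# the connection string and schema name are hypothetical placeholders).
from sqlalchemy import create_engine, inspect

engine = create_engine("postgresql://user:pass@legacy-host/erp")  # hypothetical source
inspector = inspect(engine)

# Collect table and column metadata to support the scoping exercise.
catalog = {}
for table in inspector.get_table_names(schema="public"):
    catalog[table] = [
        {"column": col["name"], "type": str(col["type"]), "nullable": col["nullable"]}
        for col in inspector.get_columns(table, schema="public")
    ]

for table, cols in catalog.items():
    print(table, "->", len(cols), "columns")
```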

Extract and Profile

In this step, data team members use data profiling tools to analyze and profile data in the source systems.  This systematic examination typically includes column profiling, column value frequency distribution, and data quality assessment, which together reveal the quality, content, and structure of the source data.  The information revealed by these activities helps the data team understand the effort required to cleanse and conform the data, and it also aids in the development of business and conversion rules.
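A simple pandas-based sketch of the kind of column profiling described above; the file and column names are hypothetical stand-ins for an actual source extract.

```python
# Column-profiling sketch using pandas (file and column names are hypothetical).
import pandas as pd

df = pd.read_csv("customers_extract.csv")  # hypothetical source extract

# One profiling row per source column: type, completeness, and cardinality.
profile = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "non_null": df.notna().sum(),
    "null_pct": (df.isna().mean() * 100).round(2),
    "distinct": df.nunique(),
})
print(profile)

# Value-frequency distribution for a single column of interest.
print(df["country_code"].value_counts(dropna=False).head(10))
```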

Cleanse

Data migration specialists use the results of the Profile step to guide data cleansing, which includes de-duplication, matching, and merging of business partners and products.  Source-to-target field mapping documents developed by business analysts and data team members capture the business rules and conversion logic that drive the cleansing transformations.  The deliverables from the Profile step greatly reduce the chance that ETL developers encounter surprises when they begin their work.
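A minimal de-duplication sketch along these lines, assuming pandas and hypothetical business partner columns; production matching engines apply fuzzier logic, but the match-key and survivor-selection idea is the same.

```python
# De-duplication sketch: normalize a match key, then keep one survivor per
# duplicate group (file and column names are hypothetical).
import pandas as pd

df = pd.read_csv("business_partners.csv")  # hypothetical extract

# Build a simple match key from the normalized name plus postal code.
df["match_key"] = (
    df["partner_name"].str.strip().str.upper().str.replace(r"\s+", " ", regex=True)
    + "|" + df["postal_code"].astype(str).str.strip()
)

# Keep the most recently updated record in each duplicate group.
survivors = (
    df.sort_values("last_updated", ascending=False)
      .drop_duplicates(subset="match_key", keep="first")
)
print(f"{len(df) - len(survivors)} duplicates merged out of {len(df)} records")
```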

Validate

Before data is loaded into the target systems, it must be validated by functional team members and business users to ensure it conforms to all business requirements and is complete and correct.  This process, referred to as pre-load validation, involves generating pre-load validation reports.  The ETL developer, functional team members, and business users conduct review sessions and fix any mapping issues before loading.  Once the business signs off on the pre-load report, the ETL developer is ready to load.
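A hedged sketch of generating a pre-load validation report, assuming pandas and a few hypothetical business rules; in practice the rules come from the source-to-target mapping documents.

```python
# Pre-load validation sketch: apply hypothetical business rules and collect
# failures into a report for the review session.
import pandas as pd

df = pd.read_csv("materials_staged.csv")  # hypothetical staging extract

rules = {
    "material_number_missing": df["material_number"].isna(),
    "invalid_uom": ~df["base_uom"].isin(["EA", "KG", "L"]),  # hypothetical valid set
    "negative_weight": df["net_weight"] < 0,
}

# Summary report: failure count per rule.
report = pd.DataFrame({rule: mask.sum() for rule, mask in rules.items()},
                      index=["failures"]).T
print(report)

# Rows failing any rule go back to the functional team before loading.
failing = df[pd.concat(list(rules.values()), axis=1).any(axis=1)]
failing.to_csv("preload_exceptions.csv", index=False)
```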

Load

After the business has signed off on the pre-load validations, the ETL developer can begin loading data into the target system.  Depending upon the target system, several loading methods are available; for SAP, data can be loaded using LSMW, IDocs, or SAP Data Services.  Data loads must typically occur in accordance with the Cutover and Go-Live phases of the target system.  Master data is loaded first, because dependent downstream objects reference it and cannot be loaded until it exists.  Transactional data is usually loaded during a tight time window to ensure business operations are fully functional when the system goes live.
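A small sketch of the dependency-ordered load sequence described above; the object names and the load_object function are hypothetical placeholders for the actual load mechanism (LSMW, IDocs, or Data Services jobs).

```python
# Load-sequencing sketch: master data objects load before the transactional
# objects that reference them (object names and loader are hypothetical).
LOAD_ORDER = [
    ("customers", "master"),
    ("materials", "master"),
    ("open_purchase_orders", "transactional"),
    ("open_sales_orders", "transactional"),
]

def load_object(name: str) -> int:
    """Placeholder for the real load mechanism; returns an error count."""
    print(f"Loading {name}...")
    return 0  # hypothetical: no record-level errors

for name, kind in LOAD_ORDER:
    errors = load_object(name)
    if errors:
        # Stop before dependent objects load against incomplete master data.
        raise RuntimeError(f"{errors} errors loading {name}; halting sequence")
```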

Reconcile

Once the data has been loaded into the target system, various teams need to verify that the load was successful.  The functional teams enter the target system and execute test scripts to confirm not only that the system has been configured properly but also that data has been loaded correctly.  The data team generates post-load reports and compares them against the pre-load reports.  Statistical samples are also taken and evaluated by team members to ensure records have been loaded accurately.  The business is the ultimate owner of the data and must sign off on the post-load validations.
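A minimal reconciliation sketch comparing pre-load and post-load record counts and drawing a statistical sample for manual review, assuming pandas; the file and column names are hypothetical.

```python
# Reconciliation sketch: compare pre-load and post-load record counts, then
# draw a random sample for field-level review (file names are hypothetical).
import pandas as pd

pre = pd.read_csv("preload_report.csv")    # columns: object, record_count
post = pd.read_csv("postload_report.csv")  # same layout from the target system

recon = pre.merge(post, on="object", suffixes=("_pre", "_post"))
recon["delta"] = recon["record_count_post"] - recon["record_count_pre"]
print(recon[recon["delta"] != 0])  # any mismatch needs investigation

# Statistical sample for manual field-by-field comparison.
loaded = pd.read_csv("customers_postload.csv")
sample = loaded.sample(n=min(50, len(loaded)), random_state=42)
sample.to_csv("reconciliation_sample.csv", index=False)
```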
