Data Can’t Wait: Start Planning Today

Leading Practices for Proactively Mitigating Data Risks in SAP S/4HANA Implementations

Any major packaged software implementation such as SAP inherently includes a certain level of risk. The key is to recognize and identify those risks early and build risk mitigation firmly into the project plan. SAP customers that want to continue running SAP software will all need to adopt the next-generation SAP S/4HANA platform over the next five years and migrate their system data to the new core system, whether on premise or in the cloud.

As I’ve witnessed since the inception of SAP S/4HANA and through interactions with many colleagues throughout the SAP ecosystem, across many different industries and geographies, one of the main risks that often derails the SAP S/4HANA implementation journey centers around preparing, cleansing, converting, and managing the data. In this article, I present some proven and practical approaches that you can leverage during your SAP S/4HANA implementation to significantly reduce program risks associated with the data conversion process.

The essence of my advice can be summed up with these three leading practices:

  • Focus on pre-planning activities well in advance of the initial data conversion.
  • Ensure that data quality, management, and governance foundations are established early and that initiatives are continuously ongoing.
  • Leverage the data conversion phase to address other key requirements such as jumpstarting self-service analytics.

To begin, let’s review why I suggest it’s a leading practice to focus on advance planning and scoping.

Start Data Conversion Pre-Planning Early: Practice Makes Perfect

Empirical data from similar SAP-based transformation programs that Infosys has led strongly supports starting SAP S/4HANA data conversion planning at least six to 12 months before you begin your SAP S/4HANA program. Because the data track has few hard dependencies on the core SAP S/4HANA implementation tracks, there is no penalty for starting the data conversion process as early as possible. As you begin planning for your SAP S/4HANA data conversion, there are three main steps you need to take.

Step 1: Validate the Data Scope

The earlier you can convert and validate historical data, the more likely the blueprint phase, process design workshops, and prototyping activities will yield higher productivity and value because “seeing the correct data” immensely helps functional teams and business process leads to understand the full context of the new, to-be world.

During the planning process, identifying and validating historical data conversion requirements will be important in finalizing data conversion scope. As part of this conversion planning step, there are three main tasks you should complete.

  1. Identify historical conversion requirements for business master data, technical master data, and transactional data. For example, these requirements could include customers that have done business with the company within the last 12 months or have an open financial relationship, all active accounts, all open receivables within the last seven years, or all bankruptcy or red-list accounts.
  2. Set the personally identifiable information (PII) data protection strategy that will need to be considered during the conversion. This PII topic is very complex as it involves people, processes, and technology and has many considerations for how data should be handled as it’s moved to and from SAP applications. From a technology perspective, through the SAP HANA platform and third-party tools, custom data masking, scrambling, and encryption capabilities are readily available including SAP-based authorization and object privileges. You should also consider the people and governance aspects including a complete audit trail, reporting, and governance via flexible data policy configuration management to define what to “mask” by timeframe and business process. Just as importantly, perhaps, you should define processes for how data is accessed before importing it and after exporting it from the SAP system. Similar data masking, access, and work procedures should be in place to address data protection the moment it resides outside the confines of the SAP system.
  3. Define the data archiving requirements now, along with the retention policies that will be in place after the data migration is complete. The less data that must be converted into the new system, the better, so understanding data archiving requirements helps you avoid unnecessary work. For example, if your new SAP S/4HANA system requires the past 10 years of customer transactional history, historical data beyond this time period should not be converted. Understanding key data retention requirements and corresponding policies will help ensure your new SAP S/4HANA system is continuously optimized for performance, database size, storage cost, and backup/recovery cycles. SAP solutions like SAP Information Lifecycle Management (SAP ILM) can help automate business rules and governance processes such as data archiving and data retention. By controlling when SAP system data can be moved from production to a secondary database that can be accessed in the future if needed, SAP ILM enforces statutory- or company-specified data retention periods so that data is not destroyed prematurely and, where legal cases require it, can be retained beyond the normal retention period. In cases where legacy systems need to be decommissioned, data retention and reporting requirements can be enforced on these types of systems as well.
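The example scope and retention rules above can be expressed as simple, testable predicates. The following is a minimal sketch only: the `Customer` fields, the function names, and the 365-day and 10-year thresholds are illustrative assumptions, not part of any SAP tool or of the program described in this article.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Customer:
    # Hypothetical legacy customer record; field names are assumptions.
    customer_id: str
    last_activity: date       # date of the most recent business transaction
    open_balance: float       # outstanding receivable amount
    red_listed: bool = False  # bankruptcy or red-list flag


def in_conversion_scope(c: Customer, as_of: date, active_days: int = 365) -> bool:
    """Return True if the customer meets any of the example scope rules."""
    recently_active = (as_of - c.last_activity) <= timedelta(days=active_days)
    has_open_relationship = c.open_balance != 0
    return recently_active or has_open_relationship or c.red_listed


def within_retention(txn_date: date, as_of: date, years: int = 10) -> bool:
    """Return True if a transaction falls inside the retention window."""
    try:
        cutoff = as_of.replace(year=as_of.year - years)
    except ValueError:
        # as_of is Feb 29 and the cutoff year is not a leap year
        cutoff = as_of.replace(year=as_of.year - years, day=28)
    return txn_date >= cutoff


as_of = date(2024, 6, 30)
customers = [
    Customer("C1", date(2024, 1, 15), 0.0),                  # recently active
    Customer("C2", date(2019, 3, 1), 0.0),                   # dormant, no balance
    Customer("C3", date(2018, 5, 5), 120.50),                # dormant, open receivable
    Customer("C4", date(2015, 1, 1), 0.0, red_listed=True),  # red-list account
]
in_scope = [c.customer_id for c in customers if in_conversion_scope(c, as_of)]
print(in_scope)
```

Writing the rules down this way makes the scope decision reviewable by business process leads before any conversion job is built.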

Also, during the planning stage, you should define and finalize the various templates required for data conversion activities, such as data conversion and data quality strategy, data discovery, data source summary and legacy fit-gap, data profiling and data cleansing activity tracker, data mapping sheets and functional specifications, data conversion results, and data validation and data reconciliation templates. Leveraging pre-defined templates from previously successful programs will help ensure consistency in documenting various data processes, facilitate governance, and ensure all the data artifacts are tied directly to the eventual SAP S/4HANA implementation in terms of lineage and transparency.

Step 2: Ensure Your Conversion Environment Is Ready

In preparation for initial data conversion activities, you should make sure the appropriate environments are set up. The following two points should be considered when setting up your preliminary data conversion environment:

  • Have a target data repository/application available that you can use as an intermediate environment until the final SAP S/4HANA landscape is ready. One of the leading practices that will be discussed in the upcoming section is to set up a scalable data repository that you can use for initial data conversion activities as well as jumpstarting enterprise self-service data and analytics. This is preferable to setting up a temporary, throw-away environment for the purposes of initial data conversion only.
  • Establish the process of getting production data snapshots from the identified source systems. After your target data repository is up and running, it is important to establish a consistent process and cadence for identifying source system data (required tables and fields) and loading it into the repository. Having reliable production data snapshots available daily will facilitate a repeatable process for loading new, updated, and modified records into the database and keeping a detailed audit trail of daily changes. Be sure to complete this foundational step successfully before beginning data conversion activities.
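To illustrate the daily snapshot idea, here is a minimal sketch that compares two snapshots of the same table and records an audit entry. It assumes each snapshot has already been loaded as a dictionary keyed by the record's primary key; the function names and the `CUSTOMER` table name are hypothetical.

```python
from datetime import date


def diff_snapshots(previous: dict, current: dict) -> dict:
    """Classify changes between two snapshots keyed by primary key."""
    new_keys = sorted(current.keys() - previous.keys())
    deleted_keys = sorted(previous.keys() - current.keys())
    modified_keys = sorted(
        k for k in current.keys() & previous.keys() if current[k] != previous[k]
    )
    return {"new": new_keys, "modified": modified_keys, "deleted": deleted_keys}


def audit_entry(snapshot_date: date, table: str, changes: dict) -> dict:
    """Summarize one day's changes as a row for the audit trail."""
    entry = {"snapshot_date": snapshot_date.isoformat(), "table": table}
    for kind, keys in changes.items():
        entry[f"{kind}_count"] = len(keys)
    return entry


# Illustrative data only.
monday = {
    "1001": {"name": "Acme", "city": "Austin"},
    "1002": {"name": "Birch", "city": "Boise"},
}
tuesday = {
    "1001": {"name": "Acme", "city": "Dallas"},     # modified
    "1003": {"name": "Cedar", "city": "Chicago"},   # new
}

changes = diff_snapshots(monday, tuesday)
entry = audit_entry(date(2024, 6, 4), "CUSTOMER", changes)
print(changes)
print(entry)
```

In practice the comparison would run inside the chosen ETL tool or database; the point of the sketch is that each daily load should leave behind both the changed keys and a summary row.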

You should also ensure that proper extract, transform, load (ETL) tools are available and that the right data profiling, quality management, and governance activities are in place. Historically, there have been many different vendor platforms and tools encompassing these capabilities. The recent trend has been to consolidate and standardize these capabilities into a single platform such as SAP HANA, enterprise edition, which reduces the variety of skill sets and resources required and integrates these capabilities more seamlessly to shorten overall development and deployment time. A later section will describe an SAP S/4HANA data program where the SAP Agile Data Preparation application was implemented, which brought together information management and analytics capabilities into a single, integrated SAP solution that spans data extraction, transformation, loading, profiling, quality management, governance, and workflow through to UI5-enabled reporting and dashboards.

Step 3: Start the Data Conversion

Once the target environment is ready to go, you can then start performing actual data conversion activities for select business and technical master data as well as associated transactions to validate the end-to-end data conversion execution framework. Below are the key activities you should perform next:

  1. Perform data extraction and load the data into the target environment’s staging area (Step 2 provided a jumpstart on this process).
  2. Transform the data from the legacy structures to a more normalized data model with proper parent-child relationships, with dimensional and transactional groupings and hierarchies. This activity will be aided by working with business process subject matter experts (SMEs) who also have a good understanding of the underlying data.
  3. Begin documenting entity relationships of business dimensionality (such as products, customers, and regions) and transactions (such as orders, bills, service requests, and maintenance items).
  4. Establish a data audit and controls process to gain visibility into important data statistics (such as record counts from source to target, duplicates, data quality measures, data changes over time, and historical data snapshots).
  5. Make relevant sample data available to business process SMEs for initial validation and eventually to support process workshops during the blueprint phase of the program.
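The audit and controls activity in the list above can start with a very small reconciliation check. The sketch below is an assumption about what such a check might look like, not a description of any SAP tool: it compares record counts from source to target and also flags missing and duplicated keys, because matching counts alone can hide problems.

```python
from collections import Counter


def reconcile(source_rows, target_rows, key):
    """Compare source and target extracts on a key field."""
    source_keys = [row[key] for row in source_rows]
    target_keys = [row[key] for row in target_rows]
    duplicates = sorted(k for k, n in Counter(target_keys).items() if n > 1)
    missing = sorted(set(source_keys) - set(target_keys))
    return {
        "source_count": len(source_keys),
        "target_count": len(target_keys),
        "counts_match": len(source_keys) == len(target_keys),
        "missing_in_target": missing,
        "duplicate_keys_in_target": duplicates,
    }


# Illustrative data: the counts match, yet one record is missing
# and another was loaded twice.
source = [{"id": "A"}, {"id": "B"}, {"id": "C"}]
target = [{"id": "A"}, {"id": "B"}, {"id": "B"}]

report = reconcile(source, target, key="id")
print(report)
```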

The learnings from the blueprint phase will provide validation of initial conversions as well as additional insight into changes required to finalize data conversion routines for the build/development phase. The prior data transformation steps are crucial in preparing the data for final transformations and successfully loading it to the target SAP S/4HANA data model. As an added benefit, they will also jumpstart the building of a logical reporting data model to facilitate self-service data and analytics.

Taken as a whole, the aforementioned five activities form the basis for the mock data conversions that will recur throughout the SAP S/4HANA program life cycle to ensure the desired quality and performance are met during final cutover. To fully realize successful outcomes, you need to lay a strong foundation for data quality, management, and governance, which the next section details.

Laying the Foundation for Data Quality, Management, and Governance

Ensuring that quality data, validation activities, and governance processes are in place to properly manage the data is no small feat. As a leading practice, there are four key considerations for laying a solid foundation for your data conversion process and ongoing data quality, management, and governance.

Consideration #1: Understand the Current State of Your Data

Understanding the current state of data (or data profiling) is one of the most critical activities before actual data conversion can begin. Perform the following data profiling activities to provide the requisite details for more fully understanding your legacy data:

  • Identify and facilitate the documentation of source system data.
  • Discover metadata of the data sources, including value patterns and distributions, overlapping/duplicates, key candidates, foreign-key candidates, business rules/logic, and functional dependencies.
  • Identify and categorize the source system data tables as master or transactional data.
  • Initially identify the data conversion objects within the reports, interfaces, conversions, enhancements, forms, and workflow (RICEFW) object list. Examples include partner, partner relationship, contract account, contract, communication/notification preference, document, bills, payments, work orders, and dunning/collection.
  • Map the source system and its tables to the above-named RICEFW objects.
  • Facilitate the creation of a legacy fit-gap document to identify gaps during the data mapping activity.
  • Discover the state of data quality and specific issues.
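As a rough illustration of what column-level profiling produces, the sketch below computes null counts, distinct counts, a key-candidate flag, and the distribution of value patterns for one column. The pattern notation (letters become `A`, digits become `9`) and the function names are assumptions made for this example; dedicated profiling tools provide far richer output.

```python
import re
from collections import Counter


def value_pattern(value: str) -> str:
    """Reduce a value to its shape: letters become A, digits become 9."""
    return re.sub(r"[0-9]", "9", re.sub(r"[A-Za-z]", "A", value))


def profile_column(values):
    """Return basic profiling statistics for a list of column values."""
    non_null = [v for v in values if v not in (None, "")]
    distinct = set(non_null)
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(distinct),
        # A key candidate has no nulls and no repeated values.
        "key_candidate": len(non_null) == len(values)
        and len(distinct) == len(values),
        "patterns": Counter(value_pattern(v) for v in non_null).most_common(),
    }


# Illustrative zip code column from a legacy extract.
zips = ["78701", "78702", "7870", None, "78701-1234", "ABCDE"]
zip_profile = profile_column(zips)
id_profile = profile_column(["E1", "E2", "E3"])
print(zip_profile)
```

Here the pattern distribution immediately surfaces a truncated zip code, an alphabetic entry, and a mix of five-digit and ZIP+4 formats, which are the kinds of findings that feed the legacy fit-gap document.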

If data profiling isn’t given the care and attention it needs, the SAP S/4HANA data conversion can be adversely affected. Data profiling establishes a tangible baseline in terms of the overall health of the data, helps define concrete activities to improve the data quality, facilitates data mapping, transformation, and modeling efforts, and sets clearly defined goals in terms of your data quality and ongoing data management and governance program.

Consideration #2: Address Data Quality Issues Early — in the Source Systems

As mentioned, data profiling will provide a wealth of information regarding the state of your data quality and help identify specific issues. Examples of common issues that might be uncovered during data profiling include: differing customer naming conventions or format, missing data attributes for customers (such as identification type, account status, and business type), missing address elements (such as city and zip code), address standardization issues, and invalid or incorrect entries.

Once these issues are identified, data quality rules should be created, executed, documented, and validated within data conversion jobs and revalidated when loading additional data sources and new incremental data. For example, say there was a data quality rule to eliminate duplicate customers, but when additional legacy data sources were loaded, duplicate customers reappeared. Detailed analysis uncovered that the data quality rule was not using the combination of customer name, address, and phone number to distinguish a unique customer as it should. Once the fix was implemented, the rule correctly identified unique customers, and the data loaded successfully into the database.
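The duplicate-customer rule described above can be sketched as a composite key built from normalized name, address, and phone number. The normalization choices here (lowercasing, stripping punctuation and spaces, keeping only phone digits) are assumptions for illustration; a production rule would typically add address standardization and fuzzy matching.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def customer_key(record: dict) -> tuple:
    """Build the composite key that identifies a unique customer."""
    return (
        normalize(record["name"]),
        normalize(record["address"]),
        re.sub(r"\D", "", record["phone"]),  # keep digits only
    )


def deduplicate(records):
    """Keep the first record seen for each composite key."""
    seen = {}
    for record in records:
        seen.setdefault(customer_key(record), record)
    return list(seen.values())


# Illustrative data: the first two rows are the same customer entered
# differently; the third shares a name but is a different customer.
records = [
    {"name": "John Smith", "address": "12 Oak St.", "phone": "(512) 555-0100"},
    {"name": "JOHN SMITH", "address": "12 Oak St", "phone": "512-555-0100"},
    {"name": "John Smith", "address": "98 Elm Ave", "phone": "512-555-0199"},
]
unique_customers = deduplicate(records)
print(len(unique_customers))
```

A rule keyed on name alone would have wrongly merged the third record, while a rule comparing raw strings would have kept all three.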

Existing data quality issues such as this should be addressed immediately, in the source systems, by engaging the respective application owners, process experts, and data stewards. This step will ensure that existing data quality issues are resolved at the source and do not propagate to downstream and target systems. It will also provide a good test of the newly created data quality rules to ensure that these issues do not recur.

Consideration #3: Integrate and Align Key Business Processes for Data Validation

Another key leading practice is to perform validation of the initially converted data by assessing it against key business processes. This is achieved by introducing a framework that aligns overall data quality to key business processes. For example, you can relate process perfection (e.g., for a maintenance notification) to object perfection (required objects such as equipment number, asset type, address, location, and status) and to field perfection (data issues such as a wrong location caused by an invalid latitude/longitude combination, a duplicate equipment number, or an old address).

This framework ensures that data issues are logically correlated to end-to-end business processes and the objects required to successfully complete these processes. Another example is the process for creating and sending out customer billing. To achieve process perfection, all required objects (such as customer name, account number, billing address, service address, premises, and meter number) must be accounted for. For each object, issues such as duplicate customer names and incorrect addresses or premises may be present. The implications of these issues may be bills sent to the wrong customers, with incorrect premises and inaccurate tax calculations based on the wrong jurisdiction assignment. Only when data issues are addressed at the field level do the corresponding objects and business processes align to result in process perfection.
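One way to picture the roll-up from field to object to process is the sketch below. The scoring approach (a process is "perfect" for a record only when every field check on every required object passes) is an assumption made for illustration, as are the object names, field checks, and sample maintenance notifications.

```python
# Field-level checks grouped by the object they belong to (hypothetical).
FIELD_CHECKS = {
    "equipment": {
        "equipment_number": lambda r: bool(r.get("equipment_number")),
        "status": lambda r: r.get("status") in {"ACTIVE", "INACTIVE"},
    },
    "location": {
        "lat_lon": lambda r: -90 <= r.get("lat", 999) <= 90
        and -180 <= r.get("lon", 999) <= 180,
    },
}


def assess(record: dict) -> dict:
    """Return the failed field checks for each object in one record."""
    return {
        obj: [name for name, check in checks.items() if not check(record)]
        for obj, checks in FIELD_CHECKS.items()
    }


def process_perfection(records) -> float:
    """Share of records in which every object passes every field check."""
    if not records:
        return 0.0
    perfect = sum(1 for r in records if not any(assess(r).values()))
    return perfect / len(records)


# Illustrative maintenance notification data.
notifications = [
    {"equipment_number": "EQ-1", "status": "ACTIVE", "lat": 30.27, "lon": -97.74},
    {"equipment_number": "EQ-2", "status": "ACTIVE", "lat": 130.0, "lon": -97.74},
    {"equipment_number": "", "status": "RETIRED", "lat": 29.76, "lon": -95.37},
    {"equipment_number": "EQ-4", "status": "INACTIVE", "lat": 32.78, "lon": -96.80},
]

print(assess(notifications[1]))
print(process_perfection(notifications))
```

Because each failure is recorded against a named field and object, the same output that yields the process-level score also tells data stewards exactly what to fix.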

Consideration #4: Create a Repeatable Process for Ongoing Data Management and Governance

Having a repeatable process for continued data management and governance will set you up for success beyond the initial data conversion phase. To create this repeatable process, I recommend you take the following five actions:

  1. Form a data quality and remediation plan, which includes a prioritization of data remediation activities.
  2. Create a data readiness map for specified business domains.
  3. Identify and assign key data stewards by business process.
  4. Build data impact and lineage reports to help quickly diagnose the root cause of data issues.
  5. Create a data quality scorecard for key business rules to continuously monitor and improve data quality.
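As a small illustration of the scorecard in the last action, the sketch below computes a pass rate per business rule and sorts the worst-performing rules first so they can be prioritized for remediation. The rule names, the 95% threshold, and the status labels are assumptions for this example.

```python
def scorecard(records, rules, threshold=0.95):
    """Return one scorecard row per rule, worst pass rate first."""
    rows = []
    for name, rule in rules.items():
        passed = sum(1 for r in records if rule(r))
        rate = passed / len(records) if records else 0.0
        rows.append(
            {
                "rule": name,
                "pass_rate": round(rate, 3),
                "status": "OK" if rate >= threshold else "REMEDIATE",
            }
        )
    return sorted(rows, key=lambda row: row["pass_rate"])


# Illustrative customer address data and rules.
customers = [
    {"city": "Austin", "zip": "78701"},
    {"city": "Waco", "zip": "76701"},
    {"city": "Dallas", "zip": "7520"},     # truncated zip code
    {"city": "Houston", "zip": "77002"},
]
rules = {
    "city_present": lambda r: bool(r["city"].strip()),
    "zip_five_digits": lambda r: r["zip"].isdigit() and len(r["zip"]) == 5,
}

card = scorecard(customers, rules)
for row in card:
    print(row)
```

Running the same scorecard after every load gives a simple trend line for whether data quality is improving or eroding over time.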

The outputs of these steps will enable you to get a jumpstart on your ongoing data management and governance activities including establishing consistent, repeatable processes and actionable insights. Establishing these actions is critical because, without them, any initial improvements in data quality will erode over time and leave your enterprise data quality in worse condition as more new data sources are added through new initiatives, mergers, and acquisitions.

Leverage the Data Conversion Phase to Address Other Key Requirements and Introduce New Capabilities

With the data conversion phase properly planned for and jumpstarted, customers can leverage this foundation to address other key requirements and introduce new capabilities such as creating a scalable enterprise data repository and gaining a head start on their self-service analytics journey.

By leveraging the learnings from data profiling of legacy sources — including understanding the relationship between data entities, data transformations, business rules, and data quality issues — you can create logical data models, user-friendly semantic layers, and analytics templates to drive self-service reporting and data analytics (see Figure 1). 

Figure 1 Initial data conversion activities can address additional requirements and spawn new capabilities

By employing the new, in-memory capabilities of SAP HANA, enterprise edition and the SAP Agile Data Preparation application, your self-service data and analytics journey can be realized more quickly in an agile fashion. For example, on a recent business transformation program, we leveraged the SAP S/4HANA data planning and conversion phase to get a head start in other key areas. By leveraging SAP HANA, enterprise edition and SAP Agile Data Preparation, we developed a multi-pronged solution to deliver:

  • A scalable data foundation to de-risk and accelerate an upcoming SAP S/4HANA implementation
  • A single platform to profile and cleanse legacy customer and billing data, to stage and transform data, and to integrate, harmonize, and align data with the future SAP S/4HANA data model
  • Capabilities for scalable integration and consumption of future data sources across various applications and third-party systems while building a foundation layer for the future SAP S/4HANA solution
  • Ongoing data management and governance capabilities in terms of data profiling, smart data cleansing and business rules framework, data quality metrics dashboards, and data stewardship processes
  • Logical single-source-of-the-truth customer data models (i.e., customer service, billing, credit, and collections), a federated semantic layer, reporting and analytics templates, and key performance indicators to accelerate self-service customer reporting and analytics
  • Self-service analytics templates including data accessed by users in business-friendly terms via simple drag and drop and seamless building of queries encompassing both SAP and non-SAP data
  • A best-of-breed methodology of an SAP HANA-based data repository consumed by leading reporting and analytics tools
  • A scalable data integration and management framework for integrating other key data sources like meters, assets, work management, GIS, the general ledger, customer sentiment, and social media
  • A foundation for hot, warm, and cold data management and a storage platform for real-time, regularly accessed, historical, and archived data, as well as future integration to near-line storage solutions and Apache Hadoop-based platforms

A future article will provide the detailed steps you can take to enable these capabilities to add significant business value on top of the SAP S/4HANA data planning and conversion phase.


By leveraging the leading SAP S/4HANA data practices discussed in this article, SAP customers can proactively mitigate data risks in their implementations. To get an idea of what might be possible at your company, consider the previously mentioned business transformation program that was able to drastically reduce its SAP S/4HANA program implementation risks and accelerate the implementation timeline. After we delivered fully converted and reusable customer data; pre-built end-to-end data integration, management, and governance processes; and self-service reporting and analytics models — as well as upgraded its infrastructure from an aging mainframe to an SAP HANA-based platform — this client achieved the following key benefits:

  • Substantial cost savings of up to 25% for future related projects by reducing or removing the dependency on data-requirements and data-gathering tasks
  • Significantly reduced future operations and maintenance costs by shortening the plan/analyze phase of future related projects
  • A tenfold improvement in self-service customer reporting and analytics capabilities and in time to value/insight (as compared to the previous data and analytics platform)
  • A 100x improvement in performance (as compared to the previous data and analytics platform)
  • Significantly decreased technical debt (several million in savings) in terms of the cost of legacy applications, databases, infrastructure/hardware decommissions, reductions, and consolidations
  • Ongoing data quality management and governance processes for future sustainability        

Now it’s your turn to take these leading practices back to your company and put them into action on your upcoming business transformation and SAP S/4HANA journey. To learn more about these leading practices and get additional details that can give you a head start as you begin planning for your SAP S/4HANA implementation, please feel free to reach out to me directly via email or via SAPinsider, where I serve as an executive advisor.


Andrew Joo

Andrew Joo is the Enterprise Performance Management (EPM) leader for North America within the SAP Business Analytics Center of Excellence at IBM Global Business Services. He has more than 16 years of deep strategy, financial, cost, management, and technology consulting experience, encompassing leading industry firms (Big 4, system integrators) and a multitude of technologies (SAP, Microsoft, Oracle), industries (private and public sector), processes and methodologies (PMI, Agile, SDLC), and roles (functional and technical). In addition to serving as a thought leader and subject matter expert in the field of financial advisory, EPM, BI, and Information Management disciplines, he has been integral in project delivery, practice development, business development, and industry and community outreach efforts. He has pioneered unique methodologies and go-to-market solutions using an integrated EPM (EPM-BPC) and BI paradigm. The aforementioned have been featured in key industry events and publications, including SAP Sapphire and SAPinsider’s Financials and Reporting & Analytics conferences. He holds an MBA in strategy/marketing from Rice University, an MS in management information systems, and a Bachelor’s in finance. He is the author of 100 Things You Should Know about SAP NetWeaver BW, full of time-saving tips and tricks and step-by-step instruction.