While it is obvious that accurate data is critical to the business, companies often neglect to maintain data accuracy on an ongoing basis.
Bad data accumulates over many years of system operation.
Data cleansing is the process of identifying these incorrect elements and either correcting or removing them.
Companies typically discover dirty data when undertaking significant change or application modernization projects.
Typical dirty data includes:
Incorrect date fields
Outdated addresses and/or incorrect zip codes
Saturated data (e.g. a value pinned at or beyond a measurement limit)
Misspellings, extra spaces, etc.
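For illustration, the error types listed above could be detected with simple record-level checks such as the following Python sketch. The field names, date format, and saturation threshold are hypothetical assumptions, not part of OnTarget or any MOST tooling:

```python
import re
from datetime import datetime

# Assumed measurement limit for the "saturated data" check (hypothetical)
MAX_TEMP = 150.0

def find_issues(record):
    """Return a list of data-quality issues found in one record."""
    issues = []
    # Incorrect date field: value does not parse as YYYY-MM-DD
    try:
        datetime.strptime(record["birth_date"], "%Y-%m-%d")
    except ValueError:
        issues.append("invalid date")
    # Incorrect zip code: not five digits (US-style, for illustration)
    if not re.fullmatch(r"\d{5}", record["zip"]):
        issues.append("invalid zip")
    # Saturated data: reading pinned at or beyond the measurement limit
    if record["temperature"] >= MAX_TEMP:
        issues.append("saturated reading")
    # Extra spaces: leading/trailing or doubled whitespace in a text field
    if record["name"] != " ".join(record["name"].split()):
        issues.append("extra whitespace")
    return issues

record = {"birth_date": "1999-13-01", "zip": "1234",
          "temperature": 150.0, "name": "  John  Smith "}
print(find_issues(record))
# → ['invalid date', 'invalid zip', 'saturated reading', 'extra whitespace']
```

In practice such checks are run across the full data set so that each error type can be counted and grouped into a correction rule.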
Performing data cleansing is crucial prior to any application change. MOST offers data cleansing as an integral part of its standard modernization methodology. Our automated tool set, OnTarget, supports the systematic cleansing of data, improving data quality and reducing time to market.
What We Do
At the heart of the cleansing process is the systematic assessment of data elements using MOST's tools. During this process, the types of data errors are identified and grouped into specific rules; individual abnormal data elements are also flagged. Following this assessment, and with the customer's input, the correction methods are defined.
Once the correction processes are established, the data is extracted and run through them.
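The extract-and-correct step could be sketched as an ordered list of correction rules applied to each extracted record. The rules and field names below are illustrative assumptions, not MOST's actual correction processes:

```python
# A minimal sketch of applying correction rules to extracted records.

def trim_whitespace(record):
    """Collapse doubled spaces and strip leading/trailing whitespace."""
    return {k: " ".join(v.split()) if isinstance(v, str) else v
            for k, v in record.items()}

def fix_zip(record):
    """Pad 4-digit zips that lost a leading zero in a numeric conversion."""
    zip_code = record.get("zip", "")
    if zip_code.isdigit() and len(zip_code) == 4:
        record = dict(record, zip="0" + zip_code)
    return record

# Rules are applied in a fixed, agreed order
CORRECTION_RULES = [trim_whitespace, fix_zip]

def cleanse(records):
    """Run every extracted record through the ordered correction rules."""
    cleansed = []
    for record in records:
        for rule in CORRECTION_RULES:
            record = rule(record)
        cleansed.append(record)
    return cleansed

extracted = [{"name": "  Jane  Doe ", "zip": "2134"}]
print(cleanse(extracted))  # → [{'name': 'Jane Doe', 'zip': '02134'}]
```

Keeping each rule as a separate, named function mirrors the earlier step of grouping error types into specific rules, and makes it easy to report how many records each rule touched.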
At the end of the process, regression testing must be executed to ensure that the data has been changed correctly and that the application can still perform its functions. This regression testing can be executed using MF-Test, which speeds up the process and ensures the completeness of the testing.
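Conceptually, a regression check replays known inputs against an application function that consumes the cleansed data and compares the results with an agreed baseline. The toy harness below illustrates that idea only; it is not MF-Test, and the lookup function and data are hypothetical:

```python
def lookup_city(zip_code, zip_table):
    # Stand-in for an application function that reads the cleansed data
    return zip_table.get(zip_code, "UNKNOWN")

# Cleansed reference data the application now runs against (hypothetical)
zip_table = {"02134": "Boston", "10001": "New York"}

# Baseline: expected results recorded before the data was changed
baseline = {"02134": "Boston", "10001": "New York"}

failures = [z for z, expected in baseline.items()
            if lookup_city(z, zip_table) != expected]
print("regression failures:", failures)  # → regression failures: []
```

An empty failure list indicates the corrected data still produces the expected application behavior; any mismatch points to a correction rule that needs revisiting.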
In summary, the cleansing steps are:
Assessment using MOST's tools
Define process rules
Extract production data and apply rules
Perform regression testing
The following diagram illustrates the general approach to the data cleansing process: