Data Integrity for Sustainment Approach

AGR, LLC's existing management and technical team has a deep history of successfully accomplishing data management and data integrity initiatives for sustainment programs. We have been involved in over 500 enterprise data integrity and conversion efforts, including over 100 for federal customers and prime contractors to the federal government. Our senior principals defined much of the body of knowledge that many COTS software companies and major US corporations use today. Our team has an unsurpassed record of delivering these projects on time, on budget, and with the highest quality. We developed the data quality requirements for enterprise systems that are the standards enacted by successful companies today.

Our Approach:

Phased Data Cleaning and Migration Process Approach

Phase I – Gather and Assess Inventory of any Existing Data Cleanup Activities

This upfront activity allows our team, working with customer personnel, to gather records of any previous data integrity programs the agency has sponsored over the past five years. This includes, but is not limited to, the following:

  • Automated and manual data analysis
  • Practices, Policies and Procedures for data accuracy auditing
  • Quality processes previously undertaken for root cause analysis and corrective actions
  • Organization responsibilities for Data Entry, Oversight and Accountability
  • Data sensitivity for mission accomplishment
  • Personnel qualification requirements for data handling

This inventory will run through a level 1 data value test at the system and operating-department level and be digitally cataloged, including all supporting documents. As part of the past-activities inventory, any external audit results, such as IG or financial audits, should be reviewed and sifted for data integrity problems. This activity should also include any past systems engineering studies and analyses regarding data archiving and records retention. SAP's Solution Manager is used to highlight metrics and to create effective processes for classifying, collecting, cleaning, and migrating data.

A data classification process is undertaken in which data is categorized into the following classification types:

  • Type I Data – Mission-critical data required to support the system of record
  • Type II Data – Operations-critical data required to support operations processes
  • Type III Data – Operational information data supporting operational metrics, planning, and control

This data classification allows prioritization of data cleansing and corrective actions, both systemic and for ongoing functional operations. These data classifications are then divided into static and dynamic data to further support the following:

  • Root Cause Analysis – Distinguishing data errors arising from transactional processing versus master records that cause recurring errors through repetitive processing.
  • Corrective Actions – IT mechanical assist fixes, operational process and operating instruction fixes, education and training, or a combination of the above.
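As an illustrative sketch only (the class names, element names, and triage key below are our own inventions, not a customer schema), the classification types and the static/dynamic split can be combined into a simple cleansing work order:

```python
from enum import Enum

class DataClass(Enum):
    TYPE_I = "mission critical (system of record)"
    TYPE_II = "operations critical (operations processes)"
    TYPE_III = "operational information (metrics, planning, control)"

def triage(element):
    """Order cleansing work: Type I first, then static master data
    (sources of repetitive errors) ahead of dynamic transactional data."""
    return (element["class"].name, 0 if element["static"] else 1)

# Hypothetical data elements for illustration
elements = [
    {"name": "unit_price", "class": DataClass.TYPE_III, "static": False},
    {"name": "nsn",        "class": DataClass.TYPE_I,   "static": True},
    {"name": "order_qty",  "class": DataClass.TYPE_I,   "static": False},
]
worklist = sorted(elements, key=triage)
```

The sort key puts mission-critical master data at the head of the worklist, which mirrors the prioritization described above.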

Our team will use a binomial random sampling technique to establish a statistical baseline for performance and progress measurement. We have found through implementations over the past thirty years that setting this baseline with statistically significant sampling removes operational emotion from discussions of data integrity starting points, reportable numbers, and progress toward objectives. Counting tolerances will be applied to data elements whose acceptable variances must fall within the statistical error frequency.
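A minimal sketch of the baseline sampling arithmetic, using the standard normal-approximation sample-size formula for a binomial proportion (the default 95% confidence and ±5% margin are illustrative assumptions, not program requirements):

```python
import math

def binomial_sample_size(confidence_z=1.96, p=0.5, margin=0.05):
    """Records to sample to estimate an error rate within +/-margin
    at the given confidence level (normal approximation; p=0.5 is
    the conservative worst case)."""
    return math.ceil(confidence_z**2 * p * (1 - p) / margin**2)

def observed_error_rate(sample):
    """Baseline error rate from a random sample of pass(0)/fail(1) checks."""
    return sum(sample) / len(sample)

n = binomial_sample_size()        # records needed at 95% / +/-5%
baseline = observed_error_rate([0, 0, 1, 0])
```

The measured rate then serves as the agreed, emotion-free starting point against which cleansing progress is reported.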

Detailed data elements will be arrayed by classification and sub-referenced by the systems they originate from or feed. Type I data errors will be prioritized as first-fix items, each requiring root cause and corrective action data quality analysis. We will use our Microsoft tools offline after extracting the data details with whatever extraction tools the customer has for the respective storage media. Our team has SQL, Oracle, and flat-file extraction skills to pull the data and create a meta-data catalog for cleansing action.
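A hypothetical flat-file example of the meta-data cataloging step (the field names and the extract itself are invented for illustration; real extracts come from the customer's storage media):

```python
import csv, io

def catalog(extract, key_field):
    """Build a per-column meta-data catalog from a flat-file extract:
    missing-value counts per column, plus a duplicate check on the key."""
    rows = list(csv.DictReader(io.StringIO(extract)))
    meta = {col: {"missing": sum(1 for r in rows if not r[col].strip())}
            for col in rows[0]}
    keys = [r[key_field] for r in rows]
    meta[key_field]["duplicates"] = len(keys) - len(set(keys))
    return meta

# Invented three-row extract with one duplicate key and one blank field
extract = "part_no,descr\nAB-100,bolt\nAB-100,bolt\nAB-200,\n"
meta = catalog(extract, "part_no")
```

The resulting catalog feeds directly into the bad-value, missing-value, and duplicate worklists described above.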

A master schedule will be created with a top-down breakdown that allows all data elements to be analyzed and corrected. This effort will also be used to track progress against required dates, similar to our successful data cleansing effort for USSOCOM SOFSA CLS, where our team cleaned up indicative data for processing in 50% of the scheduled time allotted and at 20% of the budgeted cost. This scheduling technique handled parent-child data relationships three levels deep and dealt with bad data values, missing data values, and duplicate data.

Our approach involves institutionalizing data maintenance for both manual data origination and automated data transfer. We also identify broken links in required interfaces from master to transactional data and establish single points of accountability for data governance. We have utilized SAP tools, where made available by the customer, to yield long-term data integrity for the US Navy using Navy ERP (SAP) and for commands using legacy systems. This effort led to a data management hub that facilitated interfaces by scrubbing and normalizing data between systems, allowing us and the customer to build a long-term data integrity strategy supporting functional and systems operation.

We will utilize a Data Development Profile that allows all required data to be entered prior to systems conversion and built upon in future systems cut-overs. This approach was used in developing the NASA LEOS sustainment system, where it ensured data quality and functional systems utilization. It allowed our team to quantify the data items to be entered and to prioritize data requirements for systems cut-over and ongoing operations. We were accountable for operating the system globally through four years of sustainment and operations. We believe this effort would allow us to support a data development report for the IFMIS system, covering both static and dynamic data, that could be utilized through the second and third phases of the project.

Phase II – Data Cleansing

We utilize an abbreviated Agile methodology that recognizes that the scripts used in automated data cleanup efforts are single-use programs. The scripts we have developed over the years to cleanse data are based on SQL technology and go through a unit and string test cycle to assure performance and accuracy. We use a Boolean-based automated testing tool to externally test the scripts for both performance and accuracy, and we keep an automated log to allow for rapid regression testing.
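A toy illustration of the single-use SQL script pattern with an external Boolean check (the table, data, and cleansing rule are invented for illustration; real scripts are customer- and system-specific):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (part_no TEXT)")
conn.executemany("INSERT INTO parts VALUES (?)",
                 [(" ab-100 ",), ("AB-100",), ("ab-200",)])

# Single-use cleansing script: normalize part numbers in place
conn.execute("UPDATE parts SET part_no = UPPER(TRIM(part_no))")

# External Boolean test: every value must already be trimmed and
# upper-cased, i.e. zero rows should fail the normalization predicate
dirty = conn.execute(
    "SELECT COUNT(*) FROM parts "
    "WHERE part_no != UPPER(TRIM(part_no))").fetchone()[0]
```

The Boolean predicate, not the script itself, is what gets re-run during regression testing, which is why the scripts can safely remain single-use.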

An ‘Is/Was’ file keeps statistics that allow for user intervention and multiple customer quality toll gates. This provides a historic artifact compliant with government audit requirements to assure data validity, and it was a key tool in accomplishing our data cataloging task at SOFSA CLS, validating both the process and the results.

Many of our customers over the years have not had a rigorous data archiving process beyond saving everything online. This approach leads to systems degradation and data development that does not keep up with changing processes and systems releases. We have developed strategies for controlled online and planned offline storage that meet audit retrieval requirements and maintain a cleaner production system. Each approach may differ somewhat depending on the data being archived and the retrieval requirements, including contract audits, mission-critical decisions on past activities, critical pricing analysis, and heuristic modeling of past activities for future decisions.
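The ‘Is/Was’ mechanism can be sketched minimally as follows (the record layout, field names, and cleansing rule are hypothetical illustrations, not a customer format):

```python
import datetime

def cleanse_with_log(records, field, fix, log):
    """Apply a fix to one field, appending an 'Is/Was' entry for every
    changed value so the full change history remains auditable."""
    for rec in records:
        was = rec[field]
        now = fix(was)
        if now != was:
            log.append({"key": rec["id"], "field": field,
                        "was": was, "is": now,
                        "when": datetime.datetime.now().isoformat()})
            rec[field] = now

# Hypothetical records; only the first needs correction
records = [{"id": 1, "cage": " 1abc5"}, {"id": 2, "cage": "0XYZ9"}]
log = []
cleanse_with_log(records, "cage", lambda v: v.strip().upper(), log)
```

Because every change carries its before-value, the log can both drive user-intervention toll gates and reconstruct any record's history for auditors.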

Our team uses audit/update routines in our scripting to allow for short- and long-term data cleansing maintenance. These routines are constructed so that some or all of the updates can be accomplished with or without manual intervention and can require different sign-off levels. Regardless of how these routines are switched, and regardless of the nature of our customers' data (for instance, turbine blades for aircraft engines), we build and maintain an auditable detail history with intelligent reconciliation reports derived directly from the source cleansing ADP data.
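One possible shape for such a routine, assuming a simple numeric threshold as the automatic/manual switch (the threshold, field names, and example updates are illustrative assumptions):

```python
def apply_updates(updates, auto_limit, history):
    """Audit/update routine: changes at or below auto_limit are applied
    automatically; larger ones are queued for manual sign-off. Every
    decision is written to the auditable history either way."""
    queued = []
    for upd in updates:
        action = "auto" if abs(upd["delta"]) <= auto_limit else "manual"
        history.append({**upd, "action": action})
        if action == "manual":
            queued.append(upd)
    return queued

history = []
pending = apply_updates(
    [{"item": "qty_on_hand", "delta": 2},
     {"item": "unit_price", "delta": 150}],
    auto_limit=10, history=history)
```

Raising or lowering `auto_limit` (or adding per-field limits) is how the same routine serves different sign-off levels without losing the audit trail.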

Phase III (Optional) – Data Migration

We develop conversion programs that are modular in nature to support our Multi-Trac pre-production piloting process. Our approach uses automated "all data" qualitative and quantitative reconciliation matching. We also use statistical sampling as a quality-control check on the ETL process to validate the accuracy of programmatic activities.
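A simplified sketch of count-plus-hash reconciliation with a sampled spot check (the row layout, sample size, and source/target data are assumptions for illustration):

```python
import hashlib, random

def row_hash(row):
    """Order-independent fingerprint of one row's field values."""
    return hashlib.sha256("|".join(map(str, row)).encode()).hexdigest()

def reconcile(source, target, sample_size=2, seed=0):
    """'All data' reconciliation: compare row counts (quantitative) and
    per-row hashes (qualitative), then draw a random sample of target
    rows for a field-level quality-control spot check of the ETL load."""
    counts_match = len(source) == len(target)
    hashes_match = (sorted(map(row_hash, source)) ==
                    sorted(map(row_hash, target)))
    random.seed(seed)
    sample = random.sample(target, min(sample_size, len(target)))
    return counts_match, hashes_match, sample

src = [(1, "bolt", 4.50), (2, "washer", 0.10)]
tgt = [(1, "bolt", 4.50), (2, "washer", 0.10)]
counts_ok, hashes_ok, spot = reconcile(src, tgt)
```

The hash comparison catches overlaid or altered rows that a count alone would miss, while the sampled rows are manually field-checked as the statistical quality-control gate.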

We utilize new process documentation to create 'As Is' and 'Should Be' models, incorporating data accuracy metrics as part of the detailed departmental procedures. Our team has used different tools to build these models depending on customer tool requirements and existing process libraries.

Our approach uses pre-production piloting to create a step-up analysis of load timings, producing a detailed conversion schedule for precise cut-over timing. This technique not only assures detailed cut-over schedules but also reduces load inaccuracies and overlaid data.

Our approach to data migration and the required data accuracy organizational change involves a two-step process of upfront education and detailed training. The key to this approach is the ability to communicate the 'whys' to the data owners and users, and then the 'hows' of executing the objectives. As noted above, we use statistical process analysis to take emotion out of discussions of inaccurate data, and we use this same technique to keep performance against required data accuracy metrics constantly in front of both detail personnel and leadership. These metrics are set against industry standards, and their measurement should be used in the ongoing performance evaluation of personnel responsible for data integrity, including promotions and compensation. This approach has been used in both professional and skilled labor positions, in union and non-union shops. In project implementations we have used friendly competitions among functional organizations to push the project forward while measuring and promoting data integrity.

We have taken best practices derived from years of implementations in the private-sector government contractor community and applied them as a baseline to data-dependent enterprise projects with the Air Force, FAA, Navy, NASA, and USSOCOM SOFSA CLS. We have found these best practices to be compliant with federal government regulations, and they have introduced new business processes that improved system utilization and success.