As data warehousing has matured, many organizations have turned their focus to improving the quality of their data. Implementing checks to monitor data quality adds complexity, however, and can be time-consuming, costly, and difficult, particularly for systems without predefined metadata.
The Data Quality Profiler (DQP) was developed to work in parallel with existing Extract, Transform and Load (ETL) platforms to provide an easier way to implement a data quality strategy. DQP allows companies with large-scale, complex data systems to quickly analyze data, find anomalies, and prevent inclusion of suspect data into the warehouse until a predefined action has been taken.
How It Works:
DQP uses a combination of user-defined checks and statistical models to create a continually evolving signature of your data that accounts for normal and recurring anomalies. If the data source does not match the current data signature, an alert is sent out and the potentially harmful data is prevented from being loaded. This can save a considerable amount of post-cleansing and reload time depending upon the size of your data load. DQP can be used throughout the data integration cycle and is relatively simple to deploy.
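As a rough illustration of the idea above, the following minimal sketch keeps a running signature of a single load metric (for example, row count) and holds any load that falls outside it. It is not DQP's implementation: the names `Signature` and `gate_load`, the warm-up of five loads, and the three-standard-deviation tolerance are all assumptions chosen for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Signature:
    """Running mean and variance of one load metric (Welford's algorithm)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the running mean

    def update(self, value: float) -> None:
        # Fold an accepted value into the signature so it keeps evolving.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    def stddev(self) -> float:
        return math.sqrt(self.m2 / (self.count - 1)) if self.count > 1 else 0.0

    def matches(self, value: float, tolerance: float = 3.0) -> bool:
        # Assumed warm-up: accept the first five loads so a signature can form.
        if self.count < 5:
            return True
        spread = self.stddev()
        if spread == 0.0:
            return value == self.mean
        return abs(value - self.mean) <= tolerance * spread


def gate_load(signature: Signature, metric: float) -> str:
    """Return 'load' and evolve the signature, or 'hold' for a suspect load."""
    if signature.matches(metric):
        signature.update(metric)
        return "load"
    # Hypothetical hook: a real system would also send an alert here.
    return "hold"
```

Because only accepted loads update the signature, a suspect load does not distort the baseline that later loads are compared against.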
Features:
Structural Analysis – Compares data source metadata against expected structure
Pattern Analysis – Checks whether the actual format of the data matches its declared format
Statistical Analysis – Compares data source against evolving historical data signature
Value Range Analysis – Reports minimum, average, maximum, domain of values, and distribution (deciles, quintiles, etc.)
Frequency Counts – Looks at variance in the number of inserts, updates, errors, and data sources against historical or predefined values
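Three of the checks listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not DQP's API: the function names `structural_check`, `pattern_check`, and `value_range_profile` are hypothetical, the declared format is assumed to be expressed as a regular expression, and the data is assumed to fit in memory.

```python
import re
from statistics import mean, quantiles


def structural_check(columns, expected_columns):
    """Return (missing, unexpected) column names versus the expected structure."""
    missing = sorted(set(expected_columns) - set(columns))
    unexpected = sorted(set(columns) - set(expected_columns))
    return missing, unexpected


def pattern_check(values, pattern):
    """Return the string values that do not fully match the declared format."""
    regex = re.compile(pattern)
    return [v for v in values if not regex.fullmatch(v)]


def value_range_profile(values):
    """Summarize a numeric column; needs at least two values for deciles."""
    return {
        "min": min(values),
        "avg": mean(values),
        "max": max(values),
        "domain": set(values),                # distinct values observed
        "deciles": quantiles(values, n=10),   # nine cut points
    }
```

For example, `pattern_check(["2024-01-05", "05/01/2024"], r"\d{4}-\d{2}-\d{2}")` flags only the second value, since it does not follow the assumed year-month-day format.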
Benefits:
Helps manage service-level agreements more effectively
Speeds regulatory compliance by ensuring data quality before data enters the system
Saves time by preventing bad data from being loaded, which avoids removing it later and reloading corrected data
Allows time-based patterns (seasonal, day of month, etc.) and growth trends to be compared against anomalies that would otherwise appear to be data quality issues
Supports all major ETL platforms
Provides high-performance, parallel architecture
The key to any successful information quality project is a consistent, unified approach to data improvement within the enterprise. It is critical to implement a defined process, measure it consistently, and ensure that quality is always a top priority. DQP from Kinetic Networks can help you get there. Contact Kinetic Networks at 415-358-5100 or visit its website at www.kineticnetworks.com.