Data Quality can be seen as the golden ticket for using data effectively. It indicates how much of one of your company's most valuable assets, data, can be trusted and used for its intended purpose. Managing this quality, known as Data Quality Management, ensures that your data remains fit for purpose. Poor Data Quality Management does not necessarily imply bad Data Quality. But I can assure you: without any Data Quality Management, you're navigating blind, exposing your business to increased risks along the way. Poor Data Quality Management leads to increased costs, wasted resources and even regulatory problems, an issue that will certainly not solve itself by onboarding even more data.
Of course, to manage something you need to measure it. For Data Quality, this is typically done through Data Quality Dimensions. Which ones? That depends on the needs of your organisation. Most commonly, six dimensions are used: Completeness, Accuracy, Timeliness, Validity, Uniqueness and Consistency. But remember, the sky is the limit.
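To make this a bit more concrete, here is a minimal sketch of how two of these dimensions, Completeness and Uniqueness, could be turned into simple scores. The table, column names and pandas-based approach are purely illustrative assumptions, not a prescribed implementation.

```python
import pandas as pd

# Hypothetical customer extract; values are for illustration only.
customers = pd.DataFrame({
    "customer_id": [1, 2, 2, 4, 5],
    "email": ["a@example.com", None, "b@example.com", "c@example.com", None],
})

# Completeness: share of non-missing values per column.
completeness = customers.notna().mean()

# Uniqueness: share of distinct values in a column that should be a key.
uniqueness = customers["customer_id"].nunique() / len(customers)

print(completeness.round(2).to_dict())            # e.g. {'customer_id': 1.0, 'email': 0.6}
print(f"customer_id uniqueness: {uniqueness:.2f}")  # 0.80 -> duplicates present
```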
How do you structure your massive data inflow around these few dimensions? First, Data Quality initiatives should focus only on your critical data elements. Is big data dead? No! But Data Quality initiatives should be clearly positioned in the organisation, and identifying critical data elements shapes the scope of those initiatives.
Expert note:
How can you identify your Critical Data Elements (CDEs)? Several techniques can help:
Match Data with Critical Business Processes: Identify which data elements are essential to your key business processes and assess the associated risks
Assess User and Team Interactions: Investigate how many users and teams interact with the data on a daily basis
Evaluate Regulatory Risk: Determine the regulatory risks associated with poor data quality for each element
Consider Associated Costs: Identify data elements that have high costs associated with their management and usage
Rank your data based on these criteria, from high to no impact. The elements with the highest impact should be prioritised for careful management.
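One way to make this ranking concrete is a simple weighted scoring sheet. The sketch below is purely illustrative: the candidate elements, the weights and the scores are made-up assumptions, not a fixed method.

```python
# Illustrative scoring of candidate Critical Data Elements (0 = no impact, 3 = high impact).
candidates = {
    "customer_email":   {"process": 3, "usage": 2, "regulatory": 3, "cost": 2},
    "invoice_amount":   {"process": 3, "usage": 3, "regulatory": 2, "cost": 3},
    "marketing_opt_in": {"process": 1, "usage": 1, "regulatory": 3, "cost": 1},
    "internal_note":    {"process": 0, "usage": 1, "regulatory": 0, "cost": 0},
}

# Hypothetical weights reflecting how much each criterion matters to the organisation.
weights = {"process": 0.4, "usage": 0.2, "regulatory": 0.3, "cost": 0.1}

scores = {
    element: sum(weights[criterion] * score for criterion, score in criteria.items())
    for element, criteria in candidates.items()
}

# Highest-impact elements first: these are the CDEs to manage most carefully.
for element, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{element:16s} {score:.1f}")
```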
When you spot data issues, don't resort to an ad hoc, reactive approach. Manually updating records won't cut it and can even be counterproductive: simply correcting records by hand after spotting an error in a report or a Data Quality project does not tackle the root causes of your issues. To take it a step further, you might not even be able to spot all issues in your day-to-day activities.
How do you solve issues effectively? Most organisations need a clear Data Quality Framework to support their journey in a structured and effective way, and a Data Quality initiative needs to be strategically positioned within your organisational landscape. Here at Keyrus, we follow a framework with four phases to tackle your Data Quality journey in a structured way: Analyse, Strategize, Cleanse and Monitor.
Begin with the Analyse phase: understand your data without drowning in code-based details. An enormous amount of data flows into your organisation, yet Data Consumers often lack a thorough understanding of the data and the processes behind it. A profiling exercise can quickly unveil Data Quality issues, all in line with your strategic data goals.
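To give an idea of what such a profiling exercise can look like, here is a minimal sketch using pandas on a hypothetical orders extract; in practice this is often done with a dedicated profiling or Data Quality tool, and the data below is invented for illustration.

```python
import pandas as pd

# Hypothetical orders extract; values are illustrative only.
orders = pd.DataFrame({
    "order_id": [101, 102, 103, 103, 105],
    "amount":   [25.0, -10.0, None, 40.0, 1_000_000.0],
    "country":  ["BE", "be", "NL", None, "Belgium"],
})

# A lightweight profile per column: type, missing share, distinct count, sample values.
profile = pd.DataFrame({
    "dtype":         orders.dtypes.astype(str),
    "missing_pct":   orders.isna().mean().round(2),
    "distinct":      orders.nunique(),
    "sample_values": pd.Series({c: orders[c].dropna().unique()[:3].tolist() for c in orders.columns}),
})

print(profile)
# Even this quick look surfaces likely issues: a duplicated order_id, a negative
# amount, an extreme outlier and inconsistent country codes.
```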
Expert note:
A straightforward technique to assess the quality of your data is through a VIMO analysis. Start by selecting a critical dataset for review and understanding the associated business requirements. Translate these requirements into scores across four categories: Valid, Invalid, Missing, and Outliers.
For example, a Belgian chocolate manufacturer might analyse Customer Lifetime Value. A value of €1000 could be considered valid (as Belgians consume a lot of chocolate), while a value of -€1000 would be invalid, and €10000 would be an outlier (since even Belgians don’t consume that much chocolate).
Applying this analysis even to a subset of your data can quickly reveal how trustworthy it really is.
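Here is a minimal sketch of how such a VIMO classification could be scripted, using the chocolate example above. The bounds are the illustrative business rules from the example (negative values invalid, very high values treated as outliers), not fixed thresholds.

```python
import math

def vimo_category(clv, lower=0, upper=5_000):
    """Classify a Customer Lifetime Value into Valid / Invalid / Missing / Outlier.

    The bounds are illustrative: a negative CLV is invalid, anything above the
    upper bound is treated as an outlier to be investigated.
    """
    if clv is None or (isinstance(clv, float) and math.isnan(clv)):
        return "Missing"
    if clv < lower:
        return "Invalid"
    if clv > upper:
        return "Outlier"
    return "Valid"

values = [1_000, -1_000, None, 10_000, 2_500]
counts = {}
for v in values:
    category = vimo_category(v)
    counts[category] = counts.get(category, 0) + 1

# Shares per category give a quick VIMO score for the dataset.
print({category: f"{n / len(values):.0%}" for category, n in counts.items()})
```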
After analysing, it is time to move into the Strategize phase, where you design scalable and maintainable solutions to your problems. The Strategize phase digs into the root causes of the data problems found and focuses on the preventive and corrective actions to be taken. These are then translated into business rules that make your data fit for purpose.
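To make the idea of business rules tangible, here is a minimal sketch of rules expressed as small, named checks that can later be reused in the Cleanse phase. The rules, column names and data are illustrative assumptions only.

```python
import pandas as pd

# Hypothetical customer data used to illustrate rule evaluation.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "email": ["a@example.com", "not-an-email", None],
    "country": ["BE", "XX", "NL"],
})

# Business rules as named, reusable checks; each returns a boolean Series
# marking the rows that satisfy the rule.
RULES = {
    "email_present":     lambda df: df["email"].notna(),
    "email_has_at_sign": lambda df: df["email"].fillna("").str.contains("@"),
    "country_is_known":  lambda df: df["country"].isin(["BE", "NL", "FR", "DE"]),
}

for name, rule in RULES.items():
    passed = rule(customers)
    print(f"{name:18s} pass rate: {passed.mean():.0%}")
```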
Expert note:
Don't overlook the process! Poor Data Quality is often the result of non-optimised business processes within a company. For example, fields may be left empty because they are not mandatory, or data may be manually copied from one tool to another. These practices can introduce errors and inconsistencies. Ensuring that business processes are well-defined and optimised is crucial for maintaining high-quality data.
The Cleanse phase brings the business rules to life, translating them into reusable transformation rules. These transformation rules are deployed at the cleansing location defined in your data strategy. An effective Data Quality solution reduces the time spent on manual cleansing, freeing up time for strategic tasks.
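As an illustration of what a reusable transformation rule can look like, here is a minimal, hypothetical sketch that standardises country codes and removes duplicate customers. The mapping and data are invented, and where such rules run depends on the cleansing location defined in your data strategy (see the expert note below).

```python
import pandas as pd

# Hypothetical raw customer extract.
raw = pd.DataFrame({
    "customer_id": [1, 1, 2, 3],
    "country": [" be", "BE", "Belgium", "nl"],
})

# Illustrative mapping used to standardise free-text country values.
COUNTRY_MAP = {"BE": "BE", "BELGIUM": "BE", "NL": "NL", "NETHERLANDS": "NL"}

def standardise_country(df: pd.DataFrame) -> pd.DataFrame:
    """Reusable transformation rule: trim, uppercase and map country values."""
    out = df.copy()
    out["country"] = out["country"].str.strip().str.upper().map(COUNTRY_MAP)
    return out

def deduplicate_customers(df: pd.DataFrame) -> pd.DataFrame:
    """Reusable transformation rule: keep one row per customer_id."""
    return df.drop_duplicates(subset="customer_id", keep="first")

cleansed = deduplicate_customers(standardise_country(raw))
print(cleansed)
```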
Expert note:
It's crucial to clearly define where data cleansing should occur to prevent operational or analytical issues. For example, removing duplicate customers might please the marketing team but frustrate the finance team.
Cleansed data can be delivered downstream to users and systems, upstream by feeding it back into the source, or stored separately in a cleansed analytical layer.
Finally, the Monitor phase keeps your Data Quality ship sailing smoothly. At its core is a Data Quality dashboard that makes issues visible so they can be consistently resolved across the organisation.
Expert note:
Don't forget to consider data over time! Monitoring your Data Quality scores can provide valuable insights into the effectiveness of your efforts.
When stakeholders can see and feel the impact of these efforts, it creates a positive feedback loop that encourages further initiatives and improvements across the organisation. Regularly showing results can help build momentum and drive Data Quality projects.
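A minimal sketch of the kind of time series such a dashboard could be built on: periodic Data Quality scores per dimension, so trends (and regressions) become visible. The weeks, scores and simple alert rule below are made-up assumptions for illustration.

```python
import pandas as pd

# Hypothetical weekly Data Quality scores per dimension (0-100), as produced
# by the checks from the earlier phases.
scores = pd.DataFrame({
    "week":         ["2024-W01", "2024-W02", "2024-W03", "2024-W04"],
    "completeness": [82, 85, 90, 93],
    "uniqueness":   [97, 96, 98, 99],
    "validity":     [74, 78, 85, 83],
})

# Week-over-week change shows whether cleansing efforts are paying off.
trend = scores.set_index("week").diff()
print(scores.set_index("week"))
print(trend)

# A simple alert rule for the dashboard: flag any dimension that dropped last week.
latest_drop = trend.iloc[-1] < 0
print("Dimensions needing attention:", list(latest_drop[latest_drop].index))
```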
The cherry on top? A Data Governance layer with clear roles and responsibilities driving your Data Quality Initiatives.
The 80/20 rule is a classic among data consumers: 80% of the time is spent preparing large quantities of data, while only 20% is used to gather insights. With accurate Data Quality Management this ratio can be flipped, drastically reducing the time-to-value of projects, lowering operational costs and increasing the success rate of projects.
The benefits aren’t just operational. Data Quality Management increases the overall trust in data across your organisation, enabling you to take strategic decisions more confidently and unlocking additional analytical capabilities. Say hello to competitive advantages!
Does your organisation already experience the full benefits of high-quality data and accurate Data Quality Management? Do not let poor Data Quality hold your business back. Embark on your Data Quality journey today!
Written by Robbe Caron
Any questions? Feel free to reach out to robbe.caron@keyrus.com