Companies often find it difficult to improve their data quality. They assume that working on quality means implementing new solutions, and they struggle to defend the time and budget such initiatives require. Data stewards are also reluctant to spend time on tasks if they don’t see the benefits.
A more efficient way of improving data quality is to start from what is available and help the people involved build strong habits, so that improvements are made step by step. Data quality requires a combination of people taking action (data stewards and data encoders, for example) and automated processes that support them. Once targets and possible remediations are defined, users can improve data quality through specific tasks such as filling in forms completely and accurately for new data, removing the duplicate of a customer who is recorded twice, or correcting an existing phone number.
How can we encourage data quality habits?
When starting a data quality initiative, the most important thing to do is define where we are and what we want to achieve. Based on this target, we define a set of quality rules, and from them a set of clear remediation actions. If these actions are simple enough, like capitalizing last names in a contact form, they are applied in an automated way. The more complex rules should be included in an employee’s daily job activities.
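As an illustration of a simple automated remediation such as the last-name capitalization mentioned above, here is a minimal Python sketch. The function name `capitalize_last_name` is hypothetical, and the rule is deliberately naive: it does not handle particles or prefixes (for example "van" or "Mc"), which a real rule would need to agree with the business first.

```python
def capitalize_last_name(raw: str) -> str:
    """Normalize spacing and capitalize each part of a last name.

    Naive illustrative rule: every part separated by a space or a
    hyphen gets an initial capital and lower-case remaining letters.
    """
    # Collapse leading, trailing, and repeated whitespace.
    cleaned = " ".join(raw.split())
    words = cleaned.split(" ")
    return " ".join(
        "-".join(piece.capitalize() for piece in word.split("-"))
        for word in words
    )


if __name__ == "__main__":
    print(capitalize_last_name("  MARTIN-dupuis "))
```

Because a rule like this needs no human judgement, it can run automatically when the form is saved, leaving data stewards free for the more complex cases.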
Habits take time and effort to build, and a person needs motivation to apply them. But once set up, they can be repeated at low cost. For example, brushing one’s teeth before going to bed does not come naturally to children. It takes effort from the parents to instil this habit, but once acquired it becomes part of the everyday routine in adulthood.
In order to support a habit, we can cycle through a loop of four steps described in “Atomic Habits” (Clear J., 2018) [1]:
Cue: Make it obvious.
A habit is more likely to be repeated if its occurrence is obvious, and its content is clearly understood. A child who is asked to brush their teeth only on certain days is more likely to skip this activity than another child whose parents remind them to brush their teeth every night.
When implementing data quality remediations or validations, data errors should be displayed prominently so that even less experienced users can spot issues quickly. Validations should also be run before data is finalized, so that records that do not meet the minimal requirements cannot be saved as final.
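A minimal Python sketch of such a blocking validation is shown below. The field names, the phone pattern, and the functions `validate_record` and `can_finalize` are all assumptions made for illustration; the actual minimal requirements depend on the rules each company defines.

```python
import re

# Hypothetical minimal requirements for a contact record.
REQUIRED_FIELDS = ("last_name", "email", "phone")
# Deliberately loose pattern: optional "+", then 6 to 20 digits or spaces.
PHONE_PATTERN = re.compile(r"^\+?[0-9 ]{6,20}$")


def validate_record(record: dict) -> list[str]:
    """Return explicit, human-readable errors for one record."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not str(record.get(field) or "").strip():
            errors.append(f"{field}: value is missing")
    phone = str(record.get("phone") or "").strip()
    if phone and not PHONE_PATTERN.match(phone):
        errors.append("phone: unexpected format")
    return errors


def can_finalize(record: dict) -> bool:
    """A record may be finalized only when it has no validation errors."""
    return not validate_record(record)


if __name__ == "__main__":
    draft = {"last_name": "Dupont", "email": "", "phone": "abc"}
    for message in validate_record(draft):
        print(message)
    print("Can finalize:", can_finalize(draft))
```

Returning one message per problem, rather than a single pass or fail flag, is what makes the issue obvious to the person entering the data.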
Craving: Make it attractive.
Any action is more likely to be accepted if it is attractive or combined with something that must be done. Most toothpastes have special tastes for children, making the brushing more attractive.
Combining a data quality check with something that has to happen anyway, such as a validation step, encourages users to review the data more carefully.
Response: Make it easy.
The more preparation there is prior to carrying out an action, the less likely a person is to perform it.
We should try to keep the cost of a habit as small as possible by automating whatever can be automated and focusing on the most important part of the task. The fewer preparation steps there are to see quality errors or remediate them, the more likely a data steward is to act on them.
Reward: Make it satisfying.
Rewarding a desired behaviour encourages the performer to repeat it in the future. The person carrying out the habit should enjoy it, so that they feel good about doing it and repeating it.
Data quality needs to be measured and presented in a nice way to provide satisfaction to data stewards who spend time on cleaning tasks.
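One simple way to measure quality is the share of records that pass each rule. The Python sketch below assumes rules are expressed as plain functions returning True or False; the function name `quality_scores` and the sample rules are hypothetical and only illustrate the idea.

```python
def quality_scores(records, rules):
    """Return, for each rule, the percentage of records that pass it.

    `rules` maps a rule name to a function taking one record and
    returning True when the record complies. An empty record list
    yields an empty result, since no percentage can be computed.
    """
    total = len(records)
    if total == 0:
        return {}
    scores = {}
    for name, rule in rules.items():
        passed = sum(1 for record in records if rule(record))
        scores[name] = round(100 * passed / total, 1)
    return scores


if __name__ == "__main__":
    contacts = [
        {"last_name": "Dupont", "phone": "+32 2 123 45 67"},
        {"last_name": "", "phone": "+32 2 765 43 21"},
    ]
    rules = {
        "last name filled in": lambda r: bool(r.get("last_name", "").strip()),
        "phone filled in": lambda r: bool(r.get("phone", "").strip()),
    }
    for rule_name, score in quality_scores(contacts, rules).items():
        print(f"{rule_name}: {score}%")
```

Tracking these percentages over time, for instance in an existing reporting tool, lets data stewards see the effect of their cleaning work.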
How can IT tools help build data quality habits?
Practising a habit regularly matters more than the value of the tools used to do it. Buying an expensive electric toothbrush will not by itself make a child brush regularly, but it will improve the result of the brushing and reduce the time it takes.
In the same way, dedicated data quality tools improve the efficiency of data quality initiatives, but they are not the main factor in their success.
Among others, we can identify three ways to support data quality habits when starting such an initiative:
- Internal Quick Wins: Defining rules, preparing and creating visualizations to track them, or displaying the results of validation rules in the front end. These can be built with the back-ends, databases, ETL, and reporting tools that the company already owns, and can be used to present data quality in a pleasant way and to automatically correct the issues that are spotted.
Using internal tools allows the four steps described above to be covered at lower cost. People involved in data entry and cleansing see issues more clearly. Automated processes can help them improve the results, and they immediately see the outcome.
- MDM Tools: Master Data Management tools are focused on data validation, cleansing, and quality improvement. They come with strong data analysis, visualization, and data entry features that help the data stewards to visualize the issues and remedy them in a centralized and very efficient way. Using these tools makes the cleansing habits easier to implement for the data stewards.
- Data Quality Tools: The most advanced way to improve data quality is to use a tool focused on monitoring, rule automation, and issue remediation. Such tools also include advanced quality scoring, and sometimes gamification mechanisms that provide direct rewards to their users. Including such a tool in the company landscape improves the efficiency of people working on data quality improvements.
So, are you interested in building data quality habits within your company to enhance your analysis capabilities?
Any questions? Contact vincent.payrat@keyrus.com
1. Clear J., Atomic Habits: An Easy & Proven Way to Build Good Habits & Break Bad Ones, Random House US, 2018