How much does poor data quality cost?
We live in a world of big data, and much of it is bad data. Every day, some 2.5 quintillion bytes of data are created, and individuals, businesses, organizations, and governments rely on constant access to it. For an individual, that may simply be an email or a Facebook post; for a business, it is critical information it depends on for success.
Unfortunately, the mere existence of data in a company’s databases is no guarantee of business success, as much of the data collected is of poor quality.
What is poor quality data?
Bad data is data that is inaccurate, outdated, incomplete, irrelevant, or duplicated. The proliferation of bad data is a not entirely unexpected result of the sheer volume and complexity of data being generated and the rapid development and adoption of new technologies.
The well-known adage ‘garbage in, garbage out’ gives a hint at the devastating cost of bad data.
The state of data
A 2017 study by Thomas C. Redman of Data Quality Solutions and researchers at Cork University Business School found that the vast majority of data is terrible. The study asked 75 executives to review the last 100 data records their departments had created. Shockingly, almost half of those newly created records contained critical errors.
The researchers concluded that “the vast majority of data is simply unacceptable, and much of it is atrocious.”
Since relatively few organizations have dedicated data quality teams, it’s fair to assume that most enterprise data remains of poor quality.
What is the cost of poor quality data?
Almost a decade ago, Gartner signaled a looming bad data problem: a 2013 survey by the firm revealed that poor data quality costs companies more than $13.3 million a year. That was then; imagine what that figure has grown to since.
In 2016 IBM estimated the cost of poor quality data to be a whopping $3.1 trillion in the US alone.
Poor quality data is costly. Research by Royal Mail Data Services revealed that organizations believe inaccurate customer data costs them, on average, six percent of their annual revenues.
Data quality is so unreliable that data scientists spend 60 percent of their time cleaning and organizing data sets, and, according to MIT Sloan Management Review, other employees waste up to 50 percent of their time on mundane data quality tasks.
For example, inaccurately recorded names and addresses often result in duplicate records. A duplicate customer record costs about $10 to correct, and $100 to fix once it has caused a problem. Depending on the size of the business and the number of inaccurate records, that can add up to a huge cost.
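Much of this duplication comes from trivial variations in how the same name or address is keyed in. As a minimal sketch (the field names and abbreviation table here are illustrative assumptions, not any vendor’s method), duplicates can often be caught simply by normalizing the fields before comparing records:

```python
# Illustrative sketch: detect duplicate customer records by normalizing
# the fields that most often vary between entries (case, whitespace,
# punctuation, common address abbreviations).
import re

# Hypothetical abbreviation table; real tools ship far larger ones.
ABBREVIATIONS = {"st": "street", "rd": "road", "ave": "avenue"}

def normalize(field: str) -> str:
    """Lowercase, strip punctuation, collapse whitespace, expand abbreviations."""
    words = re.sub(r"[^\w\s]", "", field.lower()).split()
    return " ".join(ABBREVIATIONS.get(w, w) for w in words)

def dedupe(records: list[dict]) -> list[dict]:
    """Keep the first record for each normalized (name, address) pair."""
    seen, unique = set(), []
    for rec in records:
        key = (normalize(rec["name"]), normalize(rec["address"]))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

customers = [
    {"name": "Jane Smith",  "address": "12 Main St."},
    {"name": "jane  smith", "address": "12 Main Street"},  # same customer
    {"name": "Bob Jones",   "address": "4 Oak Ave"},
]
print(len(dedupe(customers)))  # 2
```

Catching the near-match at entry time costs almost nothing; by the $10/$100 figures above, letting it into the database does not.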
According to the B2B Marketing Data Report from Dun & Bradstreet, 41 percent of companies cite inconsistent data collected by technologies such as CRMs and marketing automation systems as their biggest challenge.
Even correctly recorded data doesn’t stay reliable, because data is not static. MarketingSherpa estimates that 25 to 30 percent of data becomes inaccurate every year. Any data point may change, from a street name to a position at a company; when that happens, good quality data deteriorates into outdated data.
Bad data consequences
Poor-quality data leads to poor decisions. A decision based on incorrect information can have far-reaching consequences for an organization. The promise of business insights that data scientists are expected to derive from large data sets will not be realized if the data is unreliable.
A report from Dun & Bradstreet found that 22 percent of businesses had inaccurate financial forecasts, while 17 percent of organizations lost money because they offered too much credit to a customer due to a lack of adequate information.
Deteriorating customer relations
Inaccurate information leads to personalized communications that are embarrassingly off target; there is nothing more annoying than receiving an email that is clearly not meant for you. Irrelevant advertising and marketing alienate customers, souring relationships and costing sales.
Poor-quality data also causes inefficiencies in business processes that depend on accurate information. Once inaccurate datasets are discovered, employees may be put to work manually correcting data and removing duplicated records. This is a costly waste of time and resources that could have been spent on work that earns the company money.
Research has found that “poor data quality hurts employee morale, breeds organizational mistrust, and makes it more difficult to align the enterprise.”
Employees who routinely work with poor quality data tend to lose interest in their work. According to Gallup, disengaged employees have 37 percent higher absenteeism, are 18 percent less productive, and show 15 percent lower profitability. In dollar terms, that amounts to a loss of about a third of a disengaged employee’s annual salary.
Poor data quality also makes it difficult to trust company data, which in turn may make employees reluctant to commit to projects based on it. Indeed, only 16 percent of managers fully trust the accuracy of the data on which they base many of their important decisions.
When an organization’s poor data results in compliance issues, it can cost the company millions in fines, to say nothing of the consumer distrust that follows.
Poor data can lead to incorrect deliveries, missed appointments, billing errors, misdirected messages, and more.
You only have to go online to see how frustrating this is for customers. Customer reviews can be very damaging to a company’s reputation, and the effect of negative reviews far outweighs that of positive ones: it takes 40 positive customer experiences to reverse the damage done by a single negative review.
Poor data doesn’t reveal opportunities. For instance, poor-quality data on new prospects will not reveal who to approach and in what manner in order to turn the lead into a paying customer.
According to a report from Dun & Bradstreet, almost 20 percent of businesses have lost a customer through using incomplete or inaccurate information about them. A further 15 percent said they failed to sign a new contract with a customer for the same reason.
Benefits of clean data
The Experian Global Data Management Report highlights the following benefits of investing in quality data:
Improvements in personalized customer communications
Improved employee efficiency
Revenue growth, improved sales conversions
Progress in linking data from different databases
Delivering data projects on time and on budget
Improved marketing campaign efficiencies
The study found that when organizations invest in a data quality solution, they experience benefits across many areas of their businesses.
Data cleaning, also called data cleansing, or data scrubbing, is an important step toward making sound business decisions based on quality data.
Data cleansing solutions
Data cleaning software
Data cleansing is the process of editing or removing data that is invalid, incorrect, incomplete, conflicting, duplicated, or outdated. Scrutinizing millions of records manually is not practical, even for an army of data scientists, and manual data cleansing is also prone to human error, so many organizations have turned to data cleansing tools.
These tools automate and standardize the cleansing process. There are many data cleansing solutions available that help companies assess and improve the quality of their data. A robust tool can connect to all of a company’s data sources so nothing is left out of the process.
Vendors in this market offer either end-to-end data management solutions or solutions focused solely on cleaning data.
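At its core, the automated pass such tools perform is a set of validation rules applied to every record. The following is a minimal sketch under assumed column names and rules (no particular product’s logic): drop rows that are incomplete, malformed, or stale.

```python
# Illustrative cleansing pass: filter out records that are incomplete,
# invalid, or outdated. Field names, the email pattern, and the
# one-year staleness rule are all assumptions for the example.
import re
from datetime import date

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def is_clean(row: dict, today: date = date(2024, 1, 1)) -> bool:
    if not row.get("name") or not row.get("email"):
        return False                                  # incomplete
    if not EMAIL_RE.match(row["email"]):
        return False                                  # invalid format
    if (today - row["last_verified"]).days > 365:
        return False                                  # outdated (unverified > 1 year)
    return True

rows = [
    {"name": "Ada", "email": "ada@example.com", "last_verified": date(2023, 6, 1)},
    {"name": "",    "email": "x@example.com",   "last_verified": date(2023, 6, 1)},  # no name
    {"name": "Bob", "email": "not-an-email",    "last_verified": date(2023, 6, 1)},  # bad email
    {"name": "Eve", "email": "eve@example.com", "last_verified": date(2021, 1, 1)},  # stale
]
clean = [r for r in rows if is_clean(r)]
print(len(clean))  # 1
```

Commercial tools add far more sophisticated rule sets, fuzzy matching, and reporting on what was removed, but the shape of the operation is the same.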
Ensure correct data at the source
What if, instead of having to clean bad data, you could start out with good data? Rather than creating quality data by finding and correcting errors in poor quality data sets, employees responsible for creating data could take care to input the data correctly in the first place.
A good way to avoid errors at the point of entry is to standardize the data entering process. Creating a standard operating procedure (SOP) for entering new data will ensure that only quality data is entered into the database. When entries are standardized, it’s also easier to spot errors and duplicates.
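An SOP like this can also be enforced in software: rather than cleaning bad records later, the entry form rejects them before they reach the database. A minimal sketch (the field names and rules are illustrative assumptions, not a standard):

```python
# Illustrative point-of-entry validation: a record must satisfy every
# rule in the entry standard before it is accepted into the database.
import re

# Hypothetical entry standard; a real SOP would define many more rules.
RULES = {
    "name":    lambda v: bool(v.strip()),
    "email":   lambda v: re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v) is not None,
    "country": lambda v: v in {"US", "GB", "NL", "DE"},   # ISO codes, not free text
}

def validate_entry(record: dict) -> list[str]:
    """Return the fields that violate the entry standard (empty list = accept)."""
    return [f for f, ok in RULES.items() if f not in record or not ok(record[f])]

print(validate_entry({"name": "Jane", "email": "jane@example.com", "country": "NL"}))
print(validate_entry({"name": "Jane", "email": "jane@", "country": "Netherlands"}))
```

Because every accepted record follows the same format, downstream duplicate detection and reporting also become far easier, which is the point of standardizing entries.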
Data entry accuracy can be further improved by using up-to-date data entry software and maintaining high data entry standards.
According to MIT Sloan Management Review, companies that have focused on fixing the sources of poor data have had great success: at least 80 percent of errors can be eliminated, and companies can save up to two-thirds of the cost of poor data.
The amount of data created keeps growing, and not all of it is quality data that organizations, governments, and businesses can rely on to guide important decisions. The enormous cost of poor data quality demands robust data cleansing practices, as well as a rethink of data entry practices.
Poor quality data is much more than a few typos or rows of empty spreadsheet cells. It can lead to executive decisions that have far-reaching unwanted consequences for the enterprise and vast numbers of people.