What Exactly Is Bad Data?
Bad data. It sounds simple: it’s just inaccurate data or data that goes to the wrong place, right? Not quite. Even true data can be bad data. A value may be correct in every way yet duplicated, stored in the wrong field, or simply not what you’re looking for. That is still bad data. Small glitches like these are where huge mistakes arise. In a world that relies so heavily on data, bad data needs to be monitored to keep it from spiraling into financial, operational, and reputational damage.
Where Does It Come From?
Before we can address the effects of bad data, we have to address where it comes from. Bad data can arise from many situations, such as data entry errors or poor organization. Here are some of the common forms it takes:
- Duplicate data: one account that occupies multiple records in the database
- Inaccurate data: information that has not been entered correctly or maintained
- Inappropriate data: data that has been entered in the wrong field
- Irrelevant data: data that does not serve the company’s goals
- Missing elements: empty fields that should contain data
- Non-conforming data: data that does not follow a uniform format or appearance
- Poor data entry: misspellings, typos, transpositions, and variations in spelling, naming, or formatting
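Several of these categories can be checked mechanically. The snippet below is a minimal sketch, not a production rule set: the records, the field names, and the `find_issues` helper are all hypothetical, and the email and date patterns are deliberately loose. It flags duplicates, missing elements, values in the wrong field, and non-conforming dates.

```python
import re
from collections import Counter

# Hypothetical customer records; the field names are illustrative only.
records = [
    {"id": 1, "email": "ann@example.com", "signup": "2023-01-15"},
    {"id": 2, "email": "ann@example.com", "signup": "2023-01-15"},  # duplicate account
    {"id": 3, "email": "555-0199", "signup": "2023-02-01"},         # phone number in the email field
    {"id": 4, "email": "", "signup": "2023-03-09"},                 # missing element
    {"id": 5, "email": "cy@example.com", "signup": "03/21/2023"},   # non-conforming date
]

# Loose patterns, assumed for illustration: "something@something.tld" and YYYY-MM-DD.
EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def find_issues(rows):
    """Return (record id, issue) pairs for a few bad-data categories."""
    issues = []
    # Count how often each non-empty email appears, to spot duplicates.
    email_counts = Counter(r["email"] for r in rows if r["email"])
    for r in rows:
        if not r["email"]:
            issues.append((r["id"], "missing email"))
        elif not EMAIL.match(r["email"]):
            issues.append((r["id"], "wrong-field value in email"))
        elif email_counts[r["email"]] > 1:
            issues.append((r["id"], "duplicate email"))
        if not ISO_DATE.match(r["signup"]):
            issues.append((r["id"], "non-conforming date"))
    return issues


for record_id, issue in find_issues(records):
    print(record_id, issue)
```

Irrelevant and inaccurate data are harder to catch this way, because deciding whether a value is useful or true usually takes business context rather than a pattern match.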
How is Bad Data Dealt With?
The first step is knowing that there is a problem. Most companies don’t recognize a data quality issue because a KPI obscures the underlying data problems. They make the wrong decisions, only to realize later that the underlying data was missing or wrong. That makes situational awareness step 1 in dealing with bad data: you need your data quality exposed to you.
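To see how a KPI can hide a data problem, consider this small sketch. The order values are invented for illustration, with `None` standing in for an amount that was never captured. The headline number looks healthy, while a simple completeness figure shows that half of the records behind it are missing.

```python
from statistics import mean

# Hypothetical order values; None marks a record whose amount was never captured.
order_values = [120.0, 95.0, None, 110.0, None, None, 105.0, None]

# Many aggregations silently skip missing values, so only these feed the KPI.
present = [v for v in order_values if v is not None]

average_order_value = mean(present)              # the KPI a dashboard would show
completeness = len(present) / len(order_values)  # share of records behind that KPI

print(f"Average order value: {average_order_value:.2f}")
print(f"Completeness: {completeness:.0%}")
```

Reporting a completeness figure next to each KPI is one simple way to get that situational awareness.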
Typically, the next step is to treat the symptoms, for example within a data warehouse. However, it is far more beneficial to fix the problem at the source. With visibility into data quality at the raw data level, companies can see not only the symptoms but also the core problems.
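One way to picture fixing the problem at the source is to check records before they are loaded, rather than cleaning them up afterwards. The sketch below assumes a hypothetical `split_at_source` helper and made-up inventory fields. It sets incomplete rows aside together with the reason, so they can be corrected where they originated.

```python
def split_at_source(rows, required_fields):
    """Separate rows that are safe to load from rows to send back for correction."""
    accepted, quarantined = [], []
    for row in rows:
        # Treat None and empty strings as missing; a quantity of 0 is still a value.
        missing = [f for f in required_fields if row.get(f) in (None, "")]
        if missing:
            quarantined.append({"row": row, "missing": missing})
        else:
            accepted.append(row)
    return accepted, quarantined


# Hypothetical inventory feed; the field names are illustrative only.
incoming = [
    {"sku": "A-100", "quantity": 12, "warehouse": "YYZ"},
    {"sku": "A-101", "quantity": None, "warehouse": "YYZ"},
    {"sku": "", "quantity": 4, "warehouse": ""},
]

accepted, quarantined = split_at_source(incoming, ["sku", "quantity", "warehouse"])
print(len(accepted), "accepted;", len(quarantined), "quarantined")
```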
Bad Data Leads to Poor Decision Making
The consequences of bad data depend on the industry and on where the data is going. One of the worst consequences of bad data is poor decision making. Decision making is a hard enough process as it is. Decision makers need to feel confident when making decisions and be able to trust the data they are using. The insights gathered from data are not just important; they’re crucial building blocks for organizations.
An important observation:
“A decision can be no better than the information upon which it’s based” (Chan, 2020).
The consequences include:
- Invalid reports, which then have to be validated and corrected
- Higher consumption of resources and maintenance costs
- Lower productivity
- Reputational damage, including negative publicity on social media and dissatisfied sales and distribution channels
- Lower customer confidence
- Lost time and revenue
Errors in products/services offered:
- Lower customer satisfaction and retention
Did we mention the cost? Bad data can cost a company enormous amounts of time and revenue. According to a 2016 IBM study, businesses in the U.S. lose $3.1 trillion annually due to bad data.
Enterprises across the spectrum have data quality problems with massive business impact. Research by Gartner finds that bad data costs a company $15 million a year on average. Examples include:
- Target: $941 million loss, caused by supply chain data quality issues during its Canada go-live
- ASOS: 62% drop in stock price after a data quality issue caused an automated inventory system failure, at a cost of $30M
- Rent the Runway: 5% delay in customer shipments from a software upgrade bug that corrupted data; systems were down for 2 weeks, and the company issued $200 cash refunds to customers
- Goldman Sachs: £34.1M fine for failing to provide accurate and timely reporting on 220.2M transaction reports
- UBS: £27.6M fine for failing to provide accurate and timely reporting on 135.8M transaction reports
So how do we keep our data good?
Let’s take a closer look at the Qualytics 8. What are the Qualytics 8? They are the 8 fundamental categories for assessing data quality.
Covering all 8 categories of assessment can be extremely difficult. The best way to run useful and successful data checks is to apply a tool based on machine learning and real-time surveillance, like Qualytics. This gives your organization an opportunity to stop bad data in its tracks, before it becomes part of the target data sets. In this new age of modernization, data is your most important and valuable resource. Companies cannot afford to use anything less than the best of the best when it comes to cleaning and monitoring their data. Qualytics is the missing layer of the trusted enterprise data ecosystem.
Qualytics is the complete solution to instill trust and confidence in your enterprise data ecosystem. It seamlessly connects to your databases, warehouses, and source systems, proactively improving data quality through anomaly detection, signaling, and workflow. Check out the other blogs in this series to learn more about how you can start trusting your data. Let’s talk about the 5W1H of your data quality today. Contact us at firstname.lastname@example.org.