Bad Data: Big Problem

What Exactly is Bad Data?

Bad data. It sounds simple: it’s just inaccurate data or data that goes to the wrong place, right? Not quite. Even true data can be bad data. It may be correct in every way but duplicated, stored in the wrong field, or simply not what you’re looking for. Small glitches like these are where huge mistakes can arise. In a world that relies so heavily on data, bad data needs to be monitored to prevent it from spiraling into serious financial, operational, and reputational damage.

Where does it come from?

Bad data can arise from many situations, such as input errors or poor organization. Before we can address the effects of bad data, we have to address where it comes from. Here are a few common sources of bad data:

  • Duplicate data: one account that occupies multiple records in the database
  • Inaccurate data: information that has not been entered correctly or maintained
  • Inappropriate data: data that has been entered in the wrong field
  • Irrelevant data: data that does not serve the company’s goals
  • Missing elements: empty fields that should contain data
  • Non-conforming data: data that does not follow a uniform format or appearance
  • Poor data entry: misspellings, typos, transpositions, and variations in spelling, naming, or formatting
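Several of these categories can be checked mechanically. Here is a minimal sketch in Python that flags duplicate, missing, and non-conforming records. The field names (`name`, `email`, `signup_date`), the YYYY-MM-DD date convention, and the `find_bad_data` helper are all assumptions made up for illustration; your own schema and rules will differ.

```python
import re
from collections import Counter

# Hypothetical schema for illustration: every record should have these fields.
REQUIRED_FIELDS = ("name", "email", "signup_date")
# Assumed convention: dates are stored as YYYY-MM-DD.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def find_bad_data(records):
    """Return a dict mapping issue type to the indexes of affected records."""
    issues = {"duplicate": [], "missing": [], "non_conforming": []}

    # Duplicate data: the same account (keyed by normalized email) appears more than once.
    emails = [(r.get("email") or "").strip().lower() for r in records]
    counts = Counter(e for e in emails if e)

    for i, record in enumerate(records):
        if emails[i] and counts[emails[i]] > 1:
            issues["duplicate"].append(i)
        # Missing elements: a required field is absent or empty.
        if any(not (record.get(f) or "").strip() for f in REQUIRED_FIELDS):
            issues["missing"].append(i)
        # Non-conforming data: the date is present but not in the agreed format.
        signup = (record.get("signup_date") or "").strip()
        if signup and not DATE_PATTERN.fullmatch(signup):
            issues["non_conforming"].append(i)
    return issues


records = [
    {"name": "Ada Lopez", "email": "ada@example.com", "signup_date": "2021-03-04"},
    {"name": "Ada Lopez", "email": "ADA@example.com ", "signup_date": "03/04/2021"},
    {"name": "Sam Roy", "email": "", "signup_date": "2021-05-06"},
]
print(find_bad_data(records))
```

Notice that the second record is perfectly "true" data, yet it is flagged twice: once as a duplicate of the first account and once for its date format. Inaccurate or irrelevant data is much harder to catch with simple rules like these, because the values look valid on the surface.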

How is Bad Data Dealt With?

The first step is knowing that there is a problem. Most companies don’t recognize their data quality issues because a KPI obfuscates the underlying data problems. They make the wrong decisions, only to realize later that the underlying data was missing or wrong. That makes step 1 in dealing with bad data situational awareness: getting your data quality exposed to you.

Typically, the next step is to treat the symptoms, within a data warehouse, for example. However, it is much more beneficial to fix the problem at the source. Exposing data quality at the raw data level means that companies can see not only the symptoms but also the core problems.

Bad Data Leads to Poor Decision Making

The consequences of bad data depend on the industry and on where the data is going. One of the worst consequences of bad data is poor decision making. Decision making is a hard enough process as it is. Decision makers need to feel confident when making these decisions and be able to trust the data they are using. The insights gathered from data are not just important; they’re crucial building blocks for organizations.

An important observation:

“A decision can be no better than the information upon which it’s based” (Chan, 2020).

BUSINESS INEFFICIENCIES:

  • Invalid reports that take time to validate and fix
  • Higher consumption of resources and maintenance costs
  • Lower productivity

MISTRUST:

  • Reputational damage, including negative publicity on social media and dissatisfied sales and distribution channels
  • Reflects adversely on the company and lowers customer confidence

MISSED OPPORTUNITIES:

  • Lost time and revenue

ERRORS IN PRODUCT/SERVICES OFFERED:

  • Lower customer satisfaction and retention

Did we mention the cost? Bad data can cost a company copious amounts of time and revenue. According to a 2016 IBM study, businesses in the U.S. lose $3.1 trillion annually due to bad data.

Enterprises across the spectrum have data quality problems with massive business impact. Research by Gartner finds that the average loss for a company with bad data is $15 million.

  • Target: $941 million loss, caused by supply chain data quality issues in its Canada go-live
  • Asos: 62% drop in stock price after a data quality issue caused an automated inventory system failure, at a cost of $30M
  • Rent the Runway: 5% delay in customer shipments from a software upgrade bug that corrupted data. Systems were down for 2 weeks, and the company issued $200 cash refunds to customers.
  • Goldman Sachs: £34.1M fine for failing to provide accurate and timely reporting related to 220.2M transaction reports
  • UBS: £27.6M fine for failing to provide accurate and timely reporting related to 135.8M transaction reports

So how do we turn our bad data into good data?

Let’s take a closer look at the Qualytics 8. What are the Qualytics 8? They are the 8 fundamental categories for assessing data quality:

  • Completeness: availability of required data attributes
  • Coverage: availability of required data records
  • Conformity: alignment of content with required standards & schemas
  • Consistency: how well the data complies with required formats / definitions
  • Duplication: redundancy of records and/or attributes
  • Timeliness: currency of content representation as well as whether data is available / can be used when needed
  • Volumetrics: volume of data per row / column / client / entity over time
  • Accuracy: relationship of the content with original intent
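To make a few of these categories concrete, here is a rough Python sketch that scores a small data set on completeness, duplication, and timeliness. It is only an illustration under stated assumptions: the function names, the formulas, and the sample fields (`id`, `email`, `updated`) are hypothetical and do not represent how Qualytics computes these measures.

```python
from datetime import date


def completeness(records, required_fields):
    """Share of required attribute values that are populated (1.0 = fully complete)."""
    total = len(records) * len(required_fields)
    if total == 0:
        return 1.0
    filled = sum(
        1 for r in records for f in required_fields if r.get(f) not in (None, "")
    )
    return filled / total


def duplication(records, key_fields):
    """Share of records that are redundant copies of another record."""
    if not records:
        return 0.0
    keys = [tuple(r.get(f) for f in key_fields) for r in records]
    return (len(keys) - len(set(keys))) / len(keys)


def timeliness(records, as_of, max_age_days):
    """Share of records updated within max_age_days of the as_of date."""
    if not records:
        return 1.0
    fresh = sum(1 for r in records if (as_of - r["updated"]).days <= max_age_days)
    return fresh / len(records)


# Made-up sample data: one empty email, one repeated record, one stale record.
rows = [
    {"id": 1, "email": "a@example.com", "updated": date(2024, 1, 10)},
    {"id": 2, "email": "", "updated": date(2024, 1, 9)},
    {"id": 1, "email": "a@example.com", "updated": date(2023, 6, 1)},
    {"id": 3, "email": "c@example.com", "updated": date(2024, 1, 8)},
]
print(completeness(rows, ["id", "email"]))      # 7 of 8 required values are filled
print(duplication(rows, ["id", "email"]))       # 1 of 4 records is a repeat
print(timeliness(rows, date(2024, 1, 10), 30))  # 3 of 4 records are under 30 days old
```

Scores like these are easy to compute once, but they only help if they are tracked continuously against the raw data, which is where hand-written checks start to fall short.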

Assessing data across all 8 of these categories can be extremely difficult. The best way to run useful and successful data checks is to apply a tool based on machine learning and real-time surveillance, like Qualytics. This gives your organization an opportunity to stop bad data in its tracks, before it becomes part of the target data sets. In this new age of modernization, data is your most important and valuable resource. Companies cannot afford not to use the best of the best when it comes to cleaning and monitoring their data. Qualytics is the missing layer of the trusted enterprise data ecosystem.


 

Qualytics is the complete solution to instill trust and confidence in your enterprise data ecosystem. It seamlessly connects to your databases, warehouses, and source systems, proactively improving data quality through anomaly detection, signaling, and workflow. Check out the other blogs in this series to learn more about how you can start trusting your data. Let’s talk about the 5W1H of your data quality today. Contact us at hello@qualytics.co