The Qualytics 8: Coverage

How Better Data Coverage Can Impact Your Business

What is Data Quality and the Qualytics 8? 

We spent a lot of time dealing with data quality issues before taking a stance and building Qualytics. From that experience, we identified and defined the Qualytics 8: the key dimensions that must be addressed for a trustworthy data ecosystem. We believe that achieving comprehensive data quality requires assessing data across these 8 categories.

This is part 5 of a series of blog posts explaining each of the Qualytics 8. Review the other posts in this series here.

Completeness
Required fields are fully populated
Coverage
Availability and uniqueness of expected records
Conformity
Alignment of the content to the required standards, schemas, and formats
Consistency
The value is the same across all datastores within the organization
Precision
Your data has the resolution that is expected: how tightly can you define your data?
Timeliness
Data is available when expected
Volumetrics
Data has the same size and shape across similar cycles
Accuracy
Your data represents the real-world values it is expected to model

What is Data Coverage in Data Quality?

Data coverage means that all the right data is available and included. Having full data coverage doesn’t necessarily mean that the entire data set is fully exhaustive or that every value is accessible, but rather that the data is available for a necessary purpose. 

For example, suppose a mailer was sent out to a list, but for some reason no one in the “VIP” dataset was included, so none of them received the email. What could have caused this list to be withheld? Maybe the VIP dataset has restricted access. It is also possible that this dataset lives within another dataset that was not included in the mailing. This disconnect between the data that was needed and the data that was used is a lack of coverage.
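The mailing-list gap can be caught with a simple comparison between the audience you expected to reach and the list that was actually used. The sketch below is only an illustration: the customer IDs, the dataset names, and the missing_recipients helper are all hypothetical and are not part of any Qualytics API.

```python
def missing_recipients(expected_ids, mailed_ids):
    """Return expected recipient IDs that never made it onto the mailing list."""
    return sorted(set(expected_ids) - set(mailed_ids))


# Hypothetical data: two source datasets that together form the expected audience.
general_customers = ["c001", "c002", "c003"]
vip_customers = ["v001", "v002"]

expected = general_customers + vip_customers
mailed = ["c001", "c002", "c003"]  # the VIP dataset was never joined in

gap = missing_recipients(expected, mailed)
print(gap)  # the VIP IDs that were left out of the mailing
```

A non-empty result is the coverage gap: records that were expected for the purpose at hand but were not available when the action was taken.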

Why is Data Coverage Important for Data Quality?

Better data coverage improves productivity by enabling enterprises to make data-driven decisions. For example, a product manager trying to calculate the year-over-year percent increase in quarterly sales cannot make the correct calculation without the sales records for the same quarter of the previous year. Because of the gap in coverage, the product manager would have to rely on whatever data happens to be available instead. Without all the necessary sales data, the manager may make bad or ineffective decisions about improving a product or driving sales.
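To make the sales example concrete, here is a minimal sketch of a year-over-year calculation that refuses to produce a number when the matching prior-year record is missing, rather than quietly falling back on other data. The sales figures, the (year, quarter) keys, and the yoy_percent_change function are assumptions made up for illustration.

```python
def yoy_percent_change(sales_by_quarter, year, quarter):
    """Year-over-year percent change for one quarter.

    Returns None when either record is missing (or the prior-year value is
    zero), because the figure cannot be computed from the data available.
    """
    current = sales_by_quarter.get((year, quarter))
    previous = sales_by_quarter.get((year - 1, quarter))
    if current is None or previous is None or previous == 0:
        return None
    return (current - previous) * 100.0 / previous


# Hypothetical quarterly sales keyed by (year, quarter).
sales = {
    (2022, 1): 100_000.0,
    (2023, 1): 120_000.0,
    (2023, 2): 90_000.0,  # there is no 2022 Q2 record to compare against
}

q1 = yoy_percent_change(sales, 2023, 1)  # both records exist
q2 = yoy_percent_change(sales, 2023, 2)  # prior-year record is missing
```

Returning None instead of a guess makes the coverage gap visible to whoever consumes the metric, which is usually safer than a plausible-looking but wrong percentage.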

When is Data Coverage Most Important?

An important part of having the right data coverage is knowing your business and its data. While it may be cumbersome (or impossible) to collect every data point in order to have full data coverage, enterprises can instead clearly scope the collection of data required for the use case based on research about the relationships between existing data attributes. This ensures that all data is collected for a purpose, and enables enterprises to define data quality rules that ensure the data is collected with full coverage the first time.

Enterprises should understand the coverage of the data they are looking at to make sure they are not making decisions or taking action based on a subset of the data. 

When prioritizing data coverage, consider:

  • What types of decisions might I make incorrectly if I based them on incomplete data?
  • What decisions need to be made, or actions taken, based on the data?
  • What decisions or actions cannot be made or taken if the data is missing or unavailable?

How to Check Data Coverage and Fix Problems 

Data coverage problems most often come to light when enterprise decisions cannot be made, or actions cannot be taken, because the data needed to support them is not there.
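Coverage can also be checked proactively, before a decision stalls. Since coverage is about both the availability and the uniqueness of expected records, one simple approach is to compare the record keys you expect against the keys actually present. The sketch below assumes you can list the expected keys up front; the coverage_report function and the quarter labels are hypothetical and do not describe how the Qualytics Platform works.

```python
from collections import Counter


def coverage_report(expected_keys, actual_keys):
    """Summarize which expected records are missing or duplicated."""
    expected = set(expected_keys)
    counts = Counter(actual_keys)
    present = expected & set(counts)
    return {
        # Expected records that are not available at all.
        "missing": sorted(expected - present),
        # Records that appear more than once (a uniqueness problem).
        "duplicated": sorted(k for k, n in counts.items() if n > 1),
        # Share of expected records that are present at least once.
        "coverage_ratio": len(present) / len(expected) if expected else 1.0,
    }


report = coverage_report(
    expected_keys=["2023-Q1", "2023-Q2", "2023-Q3", "2023-Q4"],
    actual_keys=["2023-Q1", "2023-Q2", "2023-Q2", "2023-Q4"],
)
print(report)
```

In this made-up example the report flags one missing quarter and one duplicated quarter, which gives a team something specific to fix instead of a vague sense that the data is incomplete.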

As companies and teams work towards instilling a culture of data confidence, creating the business case for data quality improvement is key – and addressing coverage is a great way to uncover business goals that are being impacted by lack of data to make decisions. 

According to Gartner,

“Ironically, one of the primary reasons for unsuccessful business cases for data quality improvement is because they focus on data quality. To be successful, business cases must address the key components necessary to achieve the business goals, such as financial performance, operational performance, legal and regulatory compliance, and customer experience. Linking data quality to these metrics is critical.”

 

Focusing on the business case for data quality improvement and business needs, such as forecasting sales and sending VIP customers mail, is critical to understanding the coverage that is needed and identifying the coverage problems and gaps that need to be addressed.

How Does Qualytics Address Coverage?

Before deciding that there isn’t enough data to use for data analysis, or planning to manually re-populate your data set, consider using data quality software to assess your data store and use machine learning techniques to infer and re-populate missing data points. 

For example, the Qualytics Platform uses proprietary algorithms to infer rich metadata through deep profiling of the historic data in the data store. The Qualytics Platform uses inductive learning, along with unsupervised learning methods, to automatically infer data quality rules. This can help to improve the coverage of data stores based on existing, historical data without the need for time-consuming manual data entry or new algorithm development.

Qualytics is the complete solution to instill trust and confidence in your enterprise data ecosystem. It seamlessly connects to your databases, warehouses, and source systems, proactively improving data quality through anomaly detection, signaling, and workflow. Learn more about the Qualytics 8 factors in our other blogs here – Accuracy, Timeliness, Conformity, Consistency. Let’s talk about your data quality today. Contact us at hello@qualytics.co.