The Qualytics 8 – Duplication

What is Data Quality and the Qualytics 8? 

We spent a lot of time dealing with data quality issues before we decided to take a stance and build Qualytics. Drawing on those experiences, we identified and defined the Qualytics 8: the key dimensions that must be addressed for a trustworthy data ecosystem. We believe that to achieve comprehensive data quality, data should be assessed in 8 main categories.

The Qualytics 8

Qualytics uses these 8 fundamental categories to assess data quality.

Completeness
availability of required data attributes
Coverage
availability of required data records
Conformity
alignment of content with required standards & schemas
Consistency
how well the data complies with required formats / definitions
Duplication
redundancy of records and/or attributes
Timeliness
currency of content representation as well as whether data is available / can be used when needed
Volumetrics
volume data per row / column / client / entity over time
Accuracy
relationship of the content with original intent

Duplication Explained

Duplication is the redundancy of records and/or attributes. There are many sources of data duplication, such as messy data and complex or misconfigured data operations. It can also be as simple as the same data being entered twice into a source system. Another example is having more than one record per entity because the same information can be entered in multiple ways, such as an address written with “Drive” versus “Dr.” A common scenario involving duplicate data is marketers using a CRM tool:

They may find multiple records for the same account, which leads to inaccurate reporting, faulty metrics, and a declining sender reputation.
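
To make the “Drive” versus “Dr.” case concrete, here is a minimal Python sketch of how such variants can be normalized so that records for the same account collide on one key. The abbreviation map, the field names (id, name, address), and the sample records are all hypothetical and chosen only for illustration; this is not a description of how any particular CRM or Qualytics itself matches records.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation map; a real one would be much larger.
SUFFIXES = {"dr": "drive", "st": "street", "ave": "avenue", "rd": "road"}


def normalize_address(address: str) -> str:
    """Lowercase, strip punctuation, and expand common street suffixes."""
    tokens = re.sub(r"[^\w\s]", "", address.lower()).split()
    return " ".join(SUFFIXES.get(tok, tok) for tok in tokens)


def find_duplicates(records):
    """Group record ids whose (name, normalized address) keys collide."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec["name"].strip().lower(), normalize_address(rec["address"]))
        groups[key].append(rec["id"])
    # Only keys shared by more than one record indicate duplication.
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical CRM rows: ids 1 and 2 describe the same account.
records = [
    {"id": 1, "name": "Acme Corp", "address": "12 Oak Drive"},
    {"id": 2, "name": "acme corp", "address": "12 Oak Dr."},
    {"id": 3, "name": "Globex", "address": "7 Elm St"},
]
duplicate_groups = find_duplicates(records)
print(duplicate_groups)
```

Exact matching on a normalized key is the simplest approach; it will miss typos and reordered words, which is where fuzzy matching would come in.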

But what is the Cost of Duplication?

The costs of duplicate data are high and run through every aspect of a business:

  • Wasted Marketing Money
    Example: a direct mail campaign that uses duplicate data may cost double, or more, because multiple pieces are sent to the same person.
  • ETL and Labor Costs 
  • KPI/Reporting/Audit Issues
    Example: basing reporting, such as end-of-year reports, on numbers that aren’t correct.
  • Customer Service Engagement

How does the Qualytics Data Firewall Address Duplication?

To prevent duplication, an organization must address data quality across every aspect of its enterprise data ecosystem. De-duplication is the process of finding duplicate records and merging them so that the best data is kept. The Qualytics Data Firewall automatically identifies the unique fields in your enterprise data and detects and responds to the introduction of duplicate values for those fields.
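
As a rough illustration of the two ideas in this section, the Python sketch below checks incoming records against a field that is expected to be unique, and merges duplicate records into one. It is an assumption-laden sketch, not the Qualytics Data Firewall's implementation: the function names, the choice of email as the unique field, and the rule that "best" means the first non-empty value per field are all hypothetical.

```python
def split_incoming(existing_keys, incoming, key_field):
    """Separate incoming records into accepted rows and duplicates of key_field."""
    seen = set(existing_keys)
    accepted, rejected = [], []
    for rec in incoming:
        value = rec[key_field]
        if value in seen:
            # Value already present for a field expected to be unique.
            rejected.append(rec)
        else:
            seen.add(value)
            accepted.append(rec)
    return accepted, rejected


def merge_records(records):
    """Merge duplicates, keeping the first non-empty value for each field."""
    merged = {}
    for rec in records:
        for field, value in rec.items():
            if merged.get(field) in (None, ""):
                merged[field] = value
    return merged


# Hypothetical data: "email" is assumed to be the unique field.
existing = {"a@example.com"}
incoming = [
    {"email": "a@example.com", "phone": ""},
    {"email": "b@example.com", "phone": "555-0100"},
    {"email": "b@example.com", "phone": "555-0199"},
]
accepted, rejected = split_incoming(existing, incoming, "email")

merged = merge_records([
    {"email": "a@example.com", "phone": ""},
    {"email": "a@example.com", "phone": "555-0101"},
])
```

In practice, the merge rule is a business decision: some teams prefer the most recent value, others the value from the most trusted source system.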

While duplication is a common source of untrustworthy and inaccurate data, it is essential to address each of the Qualytics 8 for a trustworthy data ecosystem. Businesses can then use that quality data to make accurate, timely, data-driven decisions.

Qualytics is the complete solution to instill trust and confidence in your enterprise data ecosystem. It seamlessly connects to your databases, warehouses, and source systems, proactively improving data quality through anomaly detection, signaling and workflow. Learn more about the Qualytics 8 factors in our other blogs here – Accuracy, Duplication. Let’s talk about your data quality today. Contact us at hello@qualytics.co.