
Data quality is a key factor in Business Intelligence success. A common saying in analytics circles is “Garbage in, garbage out”, and it refers to data quality: if your data is poor, the reporting and the decisions made from those reports will be equally poor. Data quality is a common issue in Business Intelligence, and most people can identify and acknowledge it. But what do we mean by data quality? In this article, we take a closer look at some of the characteristics that make up data quality. These characteristics can be the difference between poor and good data quality, and may help you identify where your data needs improving.

The Characteristics of Data Quality

  • Quantity
  • Historical
  • Uniform
  • Categorical
  • Low-Level Granularity
  • Clean
  • Simple
  • Lineage

Quantity

We will almost always hold more data than we need. However, on the rare occasion that a question takes us down an unexpected route, the more data we have available, the better our chance of finding the answers.

Historical

Although a lot of reporting is based on the current day, a lot of insight can be gained from historical information, such as our sales trend over the last year or the percentage growth on the same period last year. The more history we have, the better our chance of understanding today’s performance.
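As a rough illustration of why history matters, the growth figure mentioned above can only be calculated if last year's number is still available. The sketch below uses made-up monthly revenue figures, and the function name `percent_growth` is our own choice for illustration.

```python
def percent_growth(current: float, previous: float) -> float:
    """Percentage growth of `current` over `previous` (the same period last year)."""
    if previous == 0:
        raise ValueError("previous period is zero, so growth is undefined")
    return (current - previous) / previous * 100.0


# Hypothetical monthly revenue keyed by (year, month); the figures are invented.
monthly_revenue = {
    (2016, 1): 100_000.0,
    (2017, 1): 112_000.0,
}

growth = percent_growth(monthly_revenue[(2017, 1)], monthly_revenue[(2016, 1)])
print(f"January growth on the same period last year: {growth:.1f}%")
```

If the 2016 figure had not been kept, there would be nothing to compare January 2017 against.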

Uniform

Data needs to be uniform, especially historical data. As an example, imagine you are analysing revenue by product line. A product was classified under one product line 12 months ago but moved to a new product line 6 months ago. If you compare revenue trends by product line, or the current month against the same month last year, the data sets will not be like for like. We need the ability to show a current view of historical information so that what we report is consistent.
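One way to picture a current view of historical information is to restate every historical sale under the product line the product belongs to today. This is only a minimal sketch: the products, product lines, field names and figures are all hypothetical, and a real warehouse would normally handle this in its dimension tables.

```python
from collections import defaultdict

# Hypothetical classification as of today: product -> current product line.
current_product_line = {"Widget A": "Home", "Widget B": "Garden"}

# Hypothetical sales history. "line_at_sale" is the product line recorded
# at the time of the sale, which may no longer be the current one.
sales = [
    {"month": "2016-02", "product": "Widget A", "line_at_sale": "Garden", "revenue": 500.0},
    {"month": "2017-02", "product": "Widget A", "line_at_sale": "Home", "revenue": 650.0},
    {"month": "2016-02", "product": "Widget B", "line_at_sale": "Garden", "revenue": 300.0},
    {"month": "2017-02", "product": "Widget B", "line_at_sale": "Garden", "revenue": 320.0},
]


def revenue_by_current_line(rows, mapping):
    """Total revenue per (month, product line), using today's classification."""
    totals = defaultdict(float)
    for row in rows:
        line = mapping[row["product"]]  # ignore line_at_sale, use the current line
        totals[(row["month"], line)] += row["revenue"]
    return dict(totals)


restated = revenue_by_current_line(sales, current_product_line)
for (month, line), revenue in sorted(restated.items()):
    print(month, line, revenue)
```

With the restated view, Widget A's February 2016 revenue is reported under Home in both years, so the year-on-year comparison by product line is like for like.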

Categorical

Data falls into two categories: quantitative and qualitative. Quantitative data are measures such as revenue and gross profit. Qualitative data are categorical items, sometimes known as dimensions. The more categorical items we have, such as product name, colour and type, the better our chance of finding out why something is happening.

Low-Level Granularity

Data should be held at the lowest level at which it might need to be analysed. Most of the time this detail won’t be used, as reporting is typically done on aggregated data sets: most reports show numbers at a high level, such as revenue by product type or product line, and we will very rarely need to examine the data at sales order line level. On the rare occasion that we do, at least the detail is available for analysis, and it may lead to discoveries that the aggregated data did not show. Without this transparency, our decision-making capabilities may be affected.
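The relationship between detail and aggregate can be sketched in a few lines. The example below assumes some invented sales order lines and a helper we have called `revenue_by`; the point is simply that the summary can always be rebuilt from the detail, but not the other way round.

```python
from collections import defaultdict

# Hypothetical sales order lines, the lowest level of detail we hold.
order_lines = [
    {"order": "SO-1", "line": 1, "product_type": "Bikes", "revenue": 1200.0},
    {"order": "SO-1", "line": 2, "product_type": "Accessories", "revenue": 40.0},
    {"order": "SO-2", "line": 1, "product_type": "Bikes", "revenue": -1200.0},  # a return
    {"order": "SO-3", "line": 1, "product_type": "Bikes", "revenue": 900.0},
]


def revenue_by(rows, key):
    """Aggregate revenue by any categorical field present on the rows."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["revenue"]
    return dict(totals)


# The high-level view most reports would show.
summary = revenue_by(order_lines, "product_type")

# The drill-down that is only possible because the detail was kept.
bike_lines = [row for row in order_lines if row["product_type"] == "Bikes"]

print(summary)
print(bike_lines)
```

In this made-up data the summary shows a single Bikes total, while the order lines reveal that the total includes a full return, which is the kind of discovery the aggregated figure hides.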

Clean

Clean is what most people think of when referring to data quality. However, this is just one of the characteristics. We must ensure data is accurate and complete. Without this, the decisions we make from our reporting will be flawed.
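A simple completeness and accuracy check might look something like the sketch below. The required fields and the rule that revenue must be numeric are assumptions made for illustration; the right rules depend on your own data.

```python
# Hypothetical rule: every record must have these fields filled in.
REQUIRED_FIELDS = ("customer", "product", "revenue")


def find_problems(row):
    """Return a list of data quality problems found in one record."""
    problems = []
    for field in REQUIRED_FIELDS:
        if row.get(field) in (None, ""):
            problems.append(f"missing {field}")
    revenue = row.get("revenue")
    # Hypothetical accuracy rule: a revenue value, if present, must be numeric.
    if revenue not in (None, "") and not isinstance(revenue, (int, float)):
        problems.append("revenue is not a number")
    return problems


# Invented records: one clean, one incomplete, one inaccurate.
rows = [
    {"customer": "Acme", "product": "Widget A", "revenue": 500.0},
    {"customer": "", "product": "Widget B", "revenue": 300.0},
    {"customer": "Globex", "product": "Widget A", "revenue": "N/A"},
]

report = {}
for index, row in enumerate(rows):
    problems = find_problems(row)
    if problems:
        report[index] = problems

print(report)
```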

Simple

Ensure data is displayed in business terminology. Do not express data as system-generated IDs or codes, as these mean nothing to most report users. Viewing these codes causes confusion and creates unnecessary, avoidable questions about the data.
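In practice this usually means translating codes into names before the data reaches a report. The sketch below assumes a hypothetical lookup of product line codes; the codes, names and field names are invented for illustration.

```python
# Hypothetical lookup from system-generated codes to business names.
PRODUCT_LINE_NAMES = {"PL01": "Home", "PL02": "Garden"}


def to_business_terms(rows, names):
    """Replace product line codes with business names for display."""
    labelled = []
    for row in rows:
        code = row["product_line_code"]
        labelled.append({
            # Flag unmapped codes clearly rather than hiding them.
            "product_line": names.get(code, f"Unknown ({code})"),
            "revenue": row["revenue"],
        })
    return labelled


raw_rows = [
    {"product_line_code": "PL01", "revenue": 500.0},
    {"product_line_code": "PL99", "revenue": 10.0},
]

for row in to_business_terms(raw_rows, PRODUCT_LINE_NAMES):
    print(row["product_line"], row["revenue"])
```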

Lineage

It is often useful, and sometimes essential, to know the source of our information. We need to know where it came from and its entire journey from source to report, including any manipulations and calculations used to display it.
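As a toy illustration of lineage, a reported figure can carry its source and a record of each calculation applied to it. The `TrackedValue` class, the source name and the figures below are all hypothetical; dedicated lineage tooling works at a much larger scale, but the idea is the same.

```python
from dataclasses import dataclass, field


@dataclass
class TrackedValue:
    """A number that remembers where it came from and what was done to it."""
    value: float
    source: str
    steps: list = field(default_factory=list)

    def apply(self, description, func):
        """Return a new TrackedValue with the calculation recorded."""
        return TrackedValue(func(self.value), self.source, self.steps + [description])


# Hypothetical source system and figures.
gross = TrackedValue(1200.0, source="sales_orders table")
net = gross.apply("subtract 200.0 of returns", lambda v: v - 200.0)
reported = net.apply("convert to thousands", lambda v: v / 1000.0)

print(reported.value, "from", reported.source)
for step in reported.steps:
    print(" -", step)
```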

Do you need to improve your data quality? Contact our expert business intelligence consultants who can assist you.

February 8, 2017

Author Bio

Matt Ranson

Matt has grown into the senior consultant role and is currently leading our team at BAE Systems. His enthusiasm and determination to provide our clients with quality solutions have earned him the high esteem of everyone who works with him.
