The Federal Bureau of Investigation (FBI) recently announced at AFCEA Bethesda’s Data Analytics Breakfast that it will seek better analytics as the agency moves to the cloud in 2019. Like most national security organisations and bureaus, the FBI collects and stores masses of data every day, but when it comes to getting value out of that data, a whole new set of challenges arises, including the need to address big data debt.

Speaking at the breakfast, the FBI’s chief data officer, Maria Voreh, said the bureau has so much data that it is “near impossible” for the agency to get value out of it quickly enough.

“The amount of data I’m getting in, and the amount of sense that we can make of it, there’s a huge data gap delta, and that’s our data debt,” Maria Voreh said. “My job is to make sense and reduce that data gap.”

“To be able to share broadly with other partners or to meet a joint U.S. mission, we’ve got to move our data to the cloud,” Voreh said. “These siloed, on-premise systems aren’t going to cut it. Our budgets aren’t getting any better, so buying more computers and putting them into a warehouse somewhere isn’t going to happen. We’ve got to invest in the cloud, we’ve got to move our critical systems so that we can get the data out of them.”

However, the problem of big data debt isn’t confined to national security organisations. It is a challenge for many established organisations and a threat to many startups keen to adopt new technologies and systems. Organisations all over the world, big and small, are investing in digital transformation initiatives, and as a result data debt is becoming an increasingly prevalent challenge.

What Is Big Data Debt?

As big data continues to emerge and new technologies and systems are introduced to support modern data structures, huge volumes of data are produced. When much of that data turns out to be incompatible with existing analytical infrastructure, including data warehouses, ETL and BI systems, the organisation accrues big data debt.

How Do I Measure Big Data Debt?

Startups, in particular, need to account for the unplanned costs that stem from the use of non-relational data management technologies. To help organisations estimate how much big data debt they are accruing, and to convert it into a monetary figure, data analytics startup Dremio has created a useful, free-to-use tool. Access the tool here.

The tool takes the following into account:

  • The amount of source data stored in non-relational systems.
  • The number of source systems.
  • The number of data analysts using the data.
  • The number of data scientists using the data.
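
The article does not describe how Dremio's tool combines these inputs, so the sketch below is only an illustration of the idea: a simple weighted sum over the four factors. The `DebtInputs` class, the `estimate_data_debt` function and every cost rate are hypothetical placeholders, not figures or names taken from Dremio's tool.

```python
from dataclasses import dataclass


@dataclass
class DebtInputs:
    """The four factors listed above (field names are hypothetical)."""
    source_data_tb: float   # source data held in non-relational systems, in TB
    source_systems: int     # number of source systems
    data_analysts: int      # analysts using the data
    data_scientists: int    # data scientists using the data


# Hypothetical annual cost rates, chosen only so the example runs.
# Replace them with figures that reflect your own organisation.
COST_PER_TB = 1_000
COST_PER_SOURCE_SYSTEM = 20_000
COST_PER_ANALYST = 15_000
COST_PER_SCIENTIST = 25_000


def estimate_data_debt(inputs: DebtInputs) -> float:
    """Return an illustrative annual cost estimate as a weighted sum."""
    return (
        inputs.source_data_tb * COST_PER_TB
        + inputs.source_systems * COST_PER_SOURCE_SYSTEM
        + inputs.data_analysts * COST_PER_ANALYST
        + inputs.data_scientists * COST_PER_SCIENTIST
    )


if __name__ == "__main__":
    example = DebtInputs(
        source_data_tb=50, source_systems=4, data_analysts=10, data_scientists=3
    )
    print(f"Estimated annual data debt: {estimate_data_debt(example):,.0f}")
```

A linear model like this is the simplest possible choice; a real estimate would likely need to reflect how costs grow as source systems and users multiply.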

Our Top Tip For Addressing and Preventing Big Data Debt

Preventing a build-up of big data debt in the future is as important as addressing the big data debt already accrued today. Our top tip for addressing and preventing big data debt is to ensure that a sound data governance programme underpins all data-related activities in your organisation. A strong data governance programme will help to specify the teams and processes necessary for onboarding new data sources and will help to avoid invalid data points, duplicated records and missing data, all of which are signs of big data debt.
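
As a minimal sketch of what such a governance check might look like when onboarding a new data source, the snippet below counts the three warning signs named above in a batch of records. The schema (`id`, `email`, `age`), the validity rule for `age` and the `audit` function are all assumptions made for illustration; a real programme would define its own rules for each source.

```python
# Hypothetical schema: every record is expected to carry these fields.
REQUIRED_FIELDS = ("id", "email", "age")


def audit(records):
    """Count records with missing fields, duplicated ids and invalid ages."""
    report = {"missing": 0, "duplicated": 0, "invalid": 0}
    seen_ids = set()

    for record in records:
        # Missing data: a required field is absent, None or an empty string.
        if any(record.get(field) in (None, "") for field in REQUIRED_FIELDS):
            report["missing"] += 1

        # Duplicated records: the same id has already appeared in this batch.
        record_id = record.get("id")
        if record_id is not None:
            if record_id in seen_ids:
                report["duplicated"] += 1
            else:
                seen_ids.add(record_id)

        # Invalid data points: age is present but not a plausible integer.
        age = record.get("age")
        if age is not None and not (isinstance(age, int) and 0 <= age <= 130):
            report["invalid"] += 1

    return report


if __name__ == "__main__":
    sample = [
        {"id": 1, "email": "a@example.com", "age": 34},
        {"id": 1, "email": "a@example.com", "age": 34},  # duplicated id
        {"id": 2, "email": "", "age": 29},               # missing email
        {"id": 3, "email": "c@example.com", "age": -5},  # invalid age
    ]
    print(audit(sample))
```

Running a check like this before a new source is accepted gives the governance team a simple, repeatable measure of whether that source is adding to the debt.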


May 9, 2018