The Possible Risks and Probable Rewards of Dark Data

Gartner Research coined the term “Dark Data” a few years back, defining it as “information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing).” The definition goes further, noting: “Similar to dark matter in physics, dark data often comprises most organizations’ universe of information assets. Thus, organizations often retain dark data for compliance purposes only. Storing and securing data typically incurs more expense (and sometimes greater risk) than value.”

Most dark data can be classified as unstructured data: files and other content created (or updated) by humans (think Word documents, PowerPoint slide decks, etc.). These documents are harder to categorize and to assign classification rules to, because each author has their own point of view and writing style, and each industry and job role has, to some degree, its own nomenclature. Dark data files may contain great insights and untapped ideas, but may also include social security numbers, credit card numbers and other confidential and sensitive content. These files often take up terabytes of space, and no one in your organization may have the slightest idea what they actually contain, only that they need to be stored.
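To make the risk concrete, here is a minimal sketch of how a script might flag social security numbers and credit card numbers in text extracted from a file. It is an illustration only, not how any particular product works: the function names (`luhn_valid`, `find_sensitive`) and both patterns are assumptions made for this example, and the patterns cover only the common US SSN layout and 13- to 16-digit card numbers.

```python
import re

# Hypothetical patterns for illustration; real classification needs far more care.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 16 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(candidate: str) -> bool:
    """Return True if the digits in `candidate` pass the Luhn checksum."""
    digits = [int(c) for c in candidate if c.isdigit()]
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return len(digits) >= 13 and total % 10 == 0


def find_sensitive(text: str) -> dict:
    """Collect SSN-like strings and Luhn-valid card-like strings from text."""
    ssns = SSN_RE.findall(text)
    cards = [m.group() for m in CARD_RE.finditer(text) if luhn_valid(m.group())]
    return {"ssn": ssns, "card": cards}


if __name__ == "__main__":
    sample = "Contact notes: SSN 123-45-6789, card 4111 1111 1111 1111."
    print(find_sensitive(sample))
```

The Luhn check filters out most random digit runs that only look like card numbers. A pattern match is still just a hint that a file deserves a closer look, not proof of what it contains.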

You may ask, “Why should I care?” Sometimes it’s best to leave what’s in the dark in the dark. However, ignorance is not always bliss, especially when it comes to issues like compliance and data governance, where inadvertent oversight can be downright risky or costly for your organization.

According to one story in Becker’s Hospital Review, the cost can be steep: to the tune of $50,000 per HIPAA violation, or up to $1.5 million per calendar year for identical violations. Meanwhile, the Ponemon Institute found that the average per capita cost of a data breach in the US in 2014 (that is, the cost per compromised record) was $201. So your organization’s economic livelihood is definitely one major reason to care about dark data and to seek ways to shed light on what may be hidden in it.
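A back-of-the-envelope calculation shows how quickly those figures add up. The sketch below uses only the numbers quoted above. The function names are hypothetical, and treating the $1.5 million figure as a simple annual ceiling on identical violations is a simplifying assumption for illustration, not legal guidance.

```python
# Figures quoted in the article; everything else here is an illustrative assumption.
PER_RECORD_COST_USD = 201            # Ponemon, US, 2014
HIPAA_PER_VIOLATION_USD = 50_000     # quoted per-violation figure
HIPAA_ANNUAL_CAP_USD = 1_500_000     # quoted yearly figure for identical violations


def breach_cost(records_exposed: int) -> int:
    """Rough breach cost: exposed records times the per-record average."""
    return records_exposed * PER_RECORD_COST_USD


def hipaa_penalty(identical_violations: int) -> int:
    """Rough penalty for identical violations in one calendar year,
    assuming the yearly figure acts as a simple ceiling."""
    return min(identical_violations * HIPAA_PER_VIOLATION_USD, HIPAA_ANNUAL_CAP_USD)


if __name__ == "__main__":
    print(breach_cost(10_000))   # 10,000 exposed records
    print(hipaa_penalty(10))     # 10 identical violations
    print(hipaa_penalty(40))     # 40 identical violations, capped
```

Under these assumptions, a forgotten file share holding 10,000 customer records represents roughly $2 million in breach exposure before any penalties are counted.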

There’s also another side to the dark data equation. Sometimes, what’s hidden in the dark can deliver untold riches for those who find it. Think about explorers of shipwrecks in the murky depths of the ocean or those who seek gold deep within mines. The quest to bring the treasure to the surface is often hard and arduous, but the findings are well worth the effort for those who succeed.

That’s the other side of dark data: the wealth of insight and intelligence that can be gleaned from stored files, including customer records, email correspondence, raw survey data, notes, old versions of relevant documents and the like, which may have been stored and forgotten but could be used to gain a competitive edge.

You have to shed light on ALL your dark data to understand what needs to be made more secure and what needs to be put to better use. It’s hard, but it’s something leading organizations are increasingly investing in, both to protect themselves and to make better use of the intellectual capital and hard work of past and present employees.

Need some help understanding the risks of dark data and some of the steps you can take to create order from chaos? Review our new SlideShare “Learn What’s Lurking in the Dark (Data)” to find out more.

Jeff Boehm

Jeff Boehm was the vice president of marketing at DataGravity for 2 years. Jeff brought more than 20 years of experience with a rare combination of marketing skills, organizational leadership and technical background to DataGravity, having shaped the BI and search markets working for industry pioneers and disrupters.