The key to unstructured data value: metadata and context

Unstructured data has enormous potential value for organizations. But the key to realizing this value is context. Without it, this fast-growing IT resource might end up being nothing more than a mishmash of meaningless data.

A big challenge many organizations face with unstructured data is that it’s being generated at an explosive rate. So while a business or IT user might be searching for the proverbial needle in the haystack—a given piece of information that provides the answer to a problem—more and more hay is being dumped onto the stack.

To build context, start with metadata

Fortunately, context can help users quickly find what they need even as unstructured data builds, so they can make better business decisions or resolve a particular issue. To address unstructured data’s context, IT teams need to use metadata to help identify how a piece of information fits into various data silos.

Metadata makes data discovery possible. Unstructured data often helps users gain new and unexpected insights that they might not initially have been seeking, and this process starts with metadata – a set of attributes or data points that can be used to describe or classify an object. For instance, if you’re looking at a house, you could describe it by architectural style, size, age, color, location or any number of other attributes. In the case of business content, you can assign various attributes, such as type of content, length, styles, intended audience, file format, etc.
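As a rough illustration of the attributes described above, a metadata record for business content could be sketched in Python as follows. The class name, field names and sample files are all hypothetical, chosen only to mirror the attributes listed in the text; this is one possible shape, not a prescribed schema.

```python
from dataclasses import dataclass


@dataclass
class ContentMetadata:
    """Hypothetical metadata record for one piece of business content."""
    name: str
    content_type: str       # e.g. "memo", "contract", "guide"
    length_words: int
    intended_audience: str  # e.g. "sales", "legal"
    file_format: str        # e.g. "pdf", "docx"


def find_by_attributes(records, **wanted):
    """Return the records whose attributes match every keyword given."""
    return [
        r for r in records
        if all(getattr(r, key) == value for key, value in wanted.items())
    ]


# Made-up sample records, for illustration only.
records = [
    ContentMetadata("q3-pricing.docx", "memo", 850, "sales", "docx"),
    ContentMetadata("vendor-agreement.pdf", "contract", 4200, "legal", "pdf"),
    ContentMetadata("onboarding.pdf", "guide", 1900, "sales", "pdf"),
]

# Discovery by attribute rather than by reading every file.
sales_pdfs = find_by_attributes(records, intended_audience="sales", file_format="pdf")
print([r.name for r in sales_pdfs])
```

Once content carries attributes like these, a user can narrow the haystack by audience or format before opening a single document.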

Text analytics is a key discipline that can help users gain new insights from metadata. It encompasses several techniques, such as those listed below, that identify the topics, languages and other bits of information that give unstructured data its essential context:

  • Tracking. This answers such questions as, where did this data come from? How fresh is it? Can we trust its sources?
  • Inspecting. Users need to parse through content to find useful metadata. Some information, such as fields that contain personal information including an individual’s date of birth or Social Security number, can pose a security risk if it goes undiscovered.
  • Accessibility. The value of text-based unstructured data relies on users being able to find and use the data. The ultimate goal is to increase the usability of unstructured data, in part through identifying other users who can also derive value from it.
  • Visualization. Tools that provide visualization of the data can help users view information from entirely new perspectives, so they can gain new insights and quickly identify patterns from large data sets.
  • Consolidation. Although there will always be some unstructured data from websites, social media and other sources, it makes sense to create an on-premises, consolidated store of unstructured data that can be properly managed.
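The inspection step in the list above can be sketched in a few lines of Python. This is a minimal example under stated assumptions: the two regular expressions are simplified, illustrative patterns for a U.S. Social Security number and a labeled date of birth, and the function name and sample text are hypothetical. Real detection of personal information needs far more careful rules than this.

```python
import re

# Simplified, illustrative patterns (assumptions, not production-grade rules).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
DOB_PATTERN = re.compile(
    r"\b(?:date of birth|dob)\b\s*[:\-]?\s*(\d{1,2}/\d{1,2}/\d{4})",
    re.IGNORECASE,
)


def inspect_text(text):
    """Return metadata flags describing sensitive data found in the text."""
    ssns = SSN_PATTERN.findall(text)
    dobs = DOB_PATTERN.findall(text)
    return {
        "ssn_count": len(ssns),
        "dob_count": len(dobs),
        "contains_sensitive_data": bool(ssns or dobs),
    }


# Made-up sample content, for illustration only.
sample = "Applicant: J. Doe. DOB: 04/12/1985. SSN 123-45-6789 on file."
report = inspect_text(sample)
print(report)
```

The resulting flags are themselves metadata: stored alongside a file, they let a team find risky content without rereading the text each time.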

Growing stores of unstructured data can help enterprises get an edge on their competitors. But they can only realize the true value of this resource if they apply context to such data. By analyzing metadata and applying data auditing tactics, companies can gain business insights they didn’t even realize were possible.

Apply unstructured data insights to improve efficiency and reduce rework across your business.

David Siles

David is Chief Technology Officer for DataGravity. Prior to becoming CTO, David served as vice president of worldwide field operations at DataGravity. Previously, he was a member of the senior leadership team at Veeam Software. He also served as CTO and VP of professional services for systems integrator Hipskind TSG. A graduate of DeVry University, he is a frequent speaker at top-tier technology shows and a recognized expert in virtualization.