It’s Your Data, You Need to Know What’s in It

It’s a pretty interesting time to be involved in IT Infrastructure, especially storage, an area I feel passionate about. I anticipate everything will change over the next few years. These changes won’t be driven by catalysts that have historically advanced storage technology, specifically performance and capacity. These metrics will remain important and you will see innovation here, likely led by companies such as Pure Storage and Nutanix, as well as by companies no one has heard of yet. One can expect that NetApp and EMC will continue to keep pace as well.

The next wave of storage advancements will come on the intelligence front. You see this happening elsewhere in IT Infrastructure: Palo Alto Networks, for example, reinvented the role of the firewall, and FireEye has changed how companies defend against cyberattacks. The storage industry is the most conservative when it comes to adopting change. Change happens, but it needs a catalyst to force it. The pain point must be so large that not changing isn’t an option.

The complexity and the risk of not rethinking storage’s role in IT Infrastructure have reached a tipping point. Companies are seeing their unstructured data grow at historic rates that are not sustainable. There is very little knowledge of what is actually being stored. In many midmarket organizations today, there is no way to know where the value in the data lies, to extract that value, or, equally important, to minimize the risks lying in the data. Frankly, interacting with your data is like living the U2 song “I Still Haven’t Found What I’m Looking For.”

There are three major pain points that will force storage solutions to change:

  1. Ignorance is no longer bliss.
    There is very little insight into the value and risks lying in unstructured data, and that ignorance can put companies at a competitive disadvantage. It can expose them to data privacy issues or compliance violations. The first challenge is that organizations lack a 360-degree view of their data, an MRI of the data if you will, that shows what they have and where. The second is to have an easier way to search and discover existing content so you can leverage it and understand trends in the data. The third is to extend the definition of data protection in storage beyond physical protection to include protecting against data privacy violations, such as exposure of sensitive information like personally identifiable information (PII), as well as disclosure of company confidential information. One just has to take a quick glance at the data thefts reported in 2014 to see that companies of all sizes are struggling to protect their information. It’s not just Fortune 100 companies that need to worry about data breaches.
  2. Security is every layer’s responsibility.
    So where in the IT Infrastructure should these capabilities reside? I would say everywhere if we are talking about security. Securing data needs to start at the point of storage, so storage arrays need to be involved. Yes, there are Access Control Lists (ACLs) on the content, but these aren’t sufficient to understand who has access to sensitive data, since ACLs don’t take into account the content of the data, which can be ever-changing. ACLs are also complex and difficult to manage as the content grows and moves around the network.
  3. Storage management must evolve to meet data needs, not storage needs.
    For years people have tried to move storage management into the network. For example, when I started EqualLogic there were a number of companies trying to do storage virtualization at the network layer. None of those companies succeeded because placing an appliance into the data path was expensive and complex in terms of performance, time and cost. The winners in that evolution moved storage virtualization into the storage array, not the network. EqualLogic was one of the first to do this. Some emerging companies, like Tintri and Nimble Storage, have expanded on this innovation.

So what’s next? Storage must become aware of the data it’s storing. It must be able to answer questions about the data across people, content and time. When we started DataGravity we knew this was the next step in the evolution of storage. This new category of storage, “data-aware storage,” will be the required standard going forward for all unstructured data and, in the future, for structured data. The tenets are pretty straightforward.

The tenets of data-aware storage are that data security, information discovery, data visualization and more can be accomplished at the point of storage.

As this category of storage evolves you will see a great push to incorporate more intelligence into the storage array. These capabilities should be as much a part of the storage array as snapshots, thin provisioning and the other standard data services. Why? It’s your data, and you should know what’s in it in order to extract all the value it represents to your business while limiting your exposure.


Paula Long

Paula Long is the CEO and co-founder of DataGravity. She previously co-founded storage provider EqualLogic, which was acquired by Dell for $1.4 billion in 2008. She remained at Dell as vice president of storage until 2010. Prior to EqualLogic, she served in engineering management positions at Allaire Corporation and oversaw the ClusterCATS product line at Bright Tiger Technologies. She is a graduate of Westfield State College.