Data Analytics Trends: Is Visualization a Double-Edged Sword?

It’s been a while in the making, but here’s my fifth and final article in my series about trends in data analytics. The series has been fun to write; the topics are complex but important, and the posts have resonated with organizations struggling to make sense of, and make use of, their analytics.

There is no denying that information visualization is an incredibly powerful way to help us make sense of complex data. The promise of visualization is simple. Humans are great at identifying visual patterns, but nearly all data worth trying to understand are large in quantity and exist as sets of numbers or words – not images. Expecting a business analyst to manually scroll through datasets and draw reliable conclusions is a losing proposition. Graphic representations of data become even more important as we attempt to compare datasets (e.g., How were sales this year versus last year?) or try to link them together (e.g., Of customers who purchased, how many participated in a webinar, downloaded a white paper, or came to our booth at a conference?).
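To make that linking example concrete, here is a minimal sketch in pandas. The "purchases" and "touchpoints" tables and all of their column names are hypothetical stand-ins for real CRM and marketing exports, not any particular product's schema.

```python
# Minimal sketch: linking a purchases table to marketing touchpoints.
# Both tables and their columns are illustrative assumptions.
import pandas as pd

purchases = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "amount": [250, 120, 90, 400],
})
touchpoints = pd.DataFrame({
    "customer_id": [1, 1, 3, 5],
    "touch": ["webinar", "white_paper", "booth", "webinar"],
})

# Of customers who purchased, how many had each kind of marketing touch?
linked = purchases.merge(touchpoints, on="customer_id", how="inner")
print(linked.groupby("touch")["customer_id"].nunique())
```

Even in a toy example like this, note that the analyst has to know which join to use and which table is authoritative – exactly the kind of knowledge a chart alone won't surface.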

A whole industry has been built around the ability to easily visualize data – that’s what nearly half of the big-data hype is about. At first, visualization tools required a Ph.D. to operate: complex, command-line tools produced static timelines and graphics, which were pasted into PowerPoint decks or emails. Today, we have interactive graphics driven by data that is collected, cleaned and linked in real time. A careful watcher of the big-data technology industry will have noticed a marked shift away from batch-based processing toward near-real-time or stream processing, complemented by real-time query engines. In large part, this shift is driven by the growing desire to visualize data and see results as they happen, not after a daily or weekly batch job prepares them.

At the same time, visualization tools have been getting easier to use, and end users are being handed the reins. This is a great thing; business analysts are best positioned to put the derived insights to work. On the other hand, we’re expecting a lot more of business analysts, often without giving them the training and information they need to use these tools effectively.

Data provenance – the history of where data came from, how it was cleaned, manipulated, linked and combined – can also be described as metadata. This metadata must be maintained at every step along the way, and it must be communicated to business analysts before they attempt to visualize any data. For example, if I’m trying to figure out how well our marketing department is generating leads compared to last quarter and last year, I need to make sure I’m looking at the right information – qualified lead volume, excluding invalid entries and system tests – over the right time period. If I get wrong or incomplete data for even one of those time periods, the conclusions I draw could be misleading.
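Here is a hedged sketch of what applying those provenance rules as code, rather than tribal knowledge, might look like. The leads.csv file and every column name (status, email, is_test, created_at) are assumptions for illustration, not a real schema.

```python
import pandas as pd

# Hypothetical lead export; the file and its columns are illustrative only.
leads = pd.read_csv("leads.csv", parse_dates=["created_at"])

# The same provenance rules every analyst must apply before charting:
# qualified leads only, excluding invalid entries and system tests.
qualified = leads[
    (leads["status"] == "qualified")
    & leads["email"].notna()   # drop invalid entries
    & ~leads["is_test"]        # drop system tests (assumes a boolean column)
]

# Compare like-for-like time periods: this quarter vs. a year earlier.
this_q = qualified[qualified["created_at"].between("2014-01-01", "2014-03-31")]
last_y = qualified[qualified["created_at"].between("2013-01-01", "2013-03-31")]
print(len(this_q), "qualified leads this quarter vs.", len(last_y), "a year earlier")
```

If two analysts each apply their own version of these filters by hand, they will get two different charts from the same data – which is precisely the metadata problem.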

In the same vein, data quality over time is another factor that must be understood and controlled for. In the example above, imagine that we didn’t work out all the kinks in the lead tracking system until mid-last year. The data collected before the issues were resolved might still be informative, but it should be taken with a grain of salt, because its quality and completeness are questionable.
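One lightweight way to control for this, sketched below under the same hypothetical leads.csv assumption, is to tag each record with a quality flag keyed to the date the fixes shipped, so the caveat travels with the data into every chart instead of living in one analyst's head.

```python
import pandas as pd

# Hypothetical cutoff: the date the lead-tracking fixes shipped.
TRACKING_FIXED = pd.Timestamp("2013-06-01")

leads = pd.read_csv("leads.csv", parse_dates=["created_at"])  # illustrative export

# Tag each record so the quality caveat follows the data into any visualization.
leads["quality"] = leads["created_at"].ge(TRACKING_FIXED).map(
    {True: "reliable", False: "pre-fix: interpret with caution"}
)
print(leads.groupby("quality").size())
```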

This is the crux of why visualization is a double-edged sword. It provides great power, but demands great respect for – and understanding of – the data being visualized. Many companies that rush to integrate visualization tools overlook these challenges and find their projects aren’t as successful as they could or should be.

Here’s a litmus test if your organization uses visualization tools: ask two different people to produce a visual of the same dataset and draw a conclusion from it. Compare how closely their answers agree, and then decide whether you need to invest more in your metadata infrastructure and user training.

Are you on the cutting edge and seeing highly successful visualization projects internally? Or are your visualization efforts lacking sharp focus?


For more on why companies are democratizing access to data insights for business users, check out “2014: The Year of Getting More Answers From Your Data.”


Steve Kearns

Steve is the Director of Product Management for DataGravity, focused on defining and delivering Data Intelligence. He has spoken at conferences around the world about the power of search and analytics, and has worked with many of the world’s most successful companies and government agencies to implement these technologies.