Big Data vs. Good Data: Don't Be Fooled

I often hear two misconceptions about big data. First, people sometimes think, “I have a large amount of data, so I have big data,” and second, “It’s too expensive to truly leverage all of this data to drive decision making.” Neither is true, and with a little clarification, you can easily discern what truly useful big data looks like, and the difference between big data and good data.

What Makes Big Data Good
When considering the usefulness of data, I like to focus on the 4 Vs: volume, variety, velocity, and veracity. A data set must exhibit the first three components to generally be considered big data, but it’s the last one (veracity) that determines whether it passes my test for being good data as well. Many companies have a large amount of data (volume) with different data points based on the mode of freight (variety). Additionally, the data is created at a rapid speed (velocity). Still, there’s a crucial element missing before that data is good data. If you don’t have the last V (veracity), then you just have a large amount of data that cannot deliver any insight or intelligence.

When Big Data Fails
When you have big data that is not good data, your company may only be using the dataset’s velocity to make wrong decisions faster. I talk to a lot of companies whose freight cost analysis has revealed a large savings opportunity. The data shows that a large percentage of shipments are not using the optimal carrier in the routing guide. Fixing this could save a few million dollars. The company determines that it must upgrade its transportation management system (TMS) to get the functionality required to optimize its freight shipments.

To see the importance of veracity, take this business case, which was created with a high projected ROI. The company spends a million dollars upgrading its TMS. About six months after the upgrade is complete, it looks at the results, expecting to see huge savings on freight. What it finds is that its freight cost hasn’t changed and none of the projected savings materialized. Why? Because the data it used was missing veracity, the last V. The data was incomplete, which made it inaccurate, and it failed to provide useful intelligence. The large projected savings was driven by an outlying factor, such as a particular type of order (a back order or a service order) being expedited to the customer, rather than by an accurate picture of the company’s whole shipping situation.
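To make the pitfall concrete, here is a minimal sketch in Python of how a savings projection can be dominated by one type of order. Everything in it is an assumption for illustration: the Shipment fields, the order type labels, and the dollar figures are made up and do not come from any real company or from any particular TMS. The point is only that a single headline number can hide where the "savings" come from, while a breakdown by order type exposes it.

```python
from dataclasses import dataclass


@dataclass
class Shipment:
    # Hypothetical fields, chosen only for this illustration.
    order_type: str      # e.g. "standard", "back_order", "service_order"
    actual_cost: float   # what was paid to the carrier that was used
    optimal_cost: float  # what the routing guide's optimal carrier would cost


def projected_savings(shipments):
    """Headline number: total gap between actual and optimal cost."""
    return sum(s.actual_cost - s.optimal_cost for s in shipments)


def savings_by_order_type(shipments):
    """Same gap, broken out by order type to show what drives it."""
    totals = {}
    for s in shipments:
        gap = s.actual_cost - s.optimal_cost
        totals[s.order_type] = totals.get(s.order_type, 0.0) + gap
    return totals


# Made-up sample data: the standard shipments are nearly optimal already,
# while the expedited back order and service order account for the gap.
shipments = [
    Shipment("standard", 100.0, 100.0),
    Shipment("standard", 120.0, 115.0),
    Shipment("back_order", 400.0, 150.0),
    Shipment("service_order", 350.0, 140.0),
]

total = projected_savings(shipments)
by_type = savings_by_order_type(shipments)

print(f"Projected savings: {total:.2f}")
for order_type, gap in by_type.items():
    print(f"  {order_type}: {gap:.2f}")
```

In this made-up sample, nearly all of the projected savings comes from the expedited orders, which were shipped that way on purpose. A routing upgrade would not recover that money, which is the kind of gap between projection and result described above.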

Harvesting Intelligence with Accurate Data
If we focus on veracity, and a company has made its large dataset accurate, then that dataset can be harvested for a wealth of intelligence. This is when the fifth V can be introduced, which is value. Good data provided by your logistics consultant or software will show its value by creating a competitive advantage for your company. As we saw in the example above, big data without veracity added no value for the company, as its projected savings failed to materialize.

The role of logistics software or consultants in aggregating data is to automate the process. If you have to cleanse the supplied data further to ensure it is accurate, you may be receiving big data, not good data. You shouldn’t have to analyze or cleanse it further; the transparency of the data should make it very clear what the root cause of the problem is and what the solution is.

Trust is ultimately the most important aspect of your data. Until you can trust your data on a large scale, your company cannot truly leverage data to create a valuable business asset. Until you have good data, don’t be fooled into making decisions you will later regret.

Shannon Vaillancourt is the president and founder of RateLinx, a customized global logistics software and consulting firm with the only software and process to standardize carrier, shipment, track & trace and order data into one dataset. He is a frequent speaker and contributing author regarding the use of data in logistics management. Visit www.ratelinx.com for more information.
