In recent years, data analytics technology has given businesses a window into valuable streams of information from inventory statuses to customer purchasing habits. These data acquisition successes along with the low cost of disk storage have led some companies to stockpile large amounts of data just for the sake of jumping on the big data bandwagon. Has “big data” become just another buzzword to be applied without discretion?
Don’t be one of those companies that brag about the competitive advantage of their data volumes without first ensuring the data is valuable to begin with. According to Experian Data Quality, 75% of businesses are wasting 14% of revenue due to poor data quality. How can you tell if your data is good or bad?
Turning bad into good
Businesses tend to take the “big” part of big data and run blindly with it, collecting anything and everything they can from numerous sources. Instead, they should plan and structure their data collection strategy to weed out the junk so they can focus on the gems.
When data scientists let just any old data into their collection, they end up spending most of their time cleaning, processing and structuring data with laborious manual and semi-automated methods. More time spent cleaning data means less time analyzing it, making big data a big problem for many organizations.
One thing companies can do to ensure data quality is to fix the problem at the source where the data is captured. Here are a few simple techniques that can be applied upstream during data collection:
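One common upstream technique is validating each record at the moment of capture, before it ever lands in storage. The sketch below shows what that might look like; the field names, allowed values, and rules are illustrative assumptions, not requirements from any particular system.

```python
import re
from datetime import datetime

# Hypothetical capture-time rules -- the fields and allowed values
# here are examples, not a prescribed schema.
ALLOWED_REGIONS = {"NA", "EMEA", "APAC"}
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def validate_record(record):
    """Return a list of problems; an empty list means the record passed."""
    problems = []
    if not record.get("customer_id"):
        problems.append("missing customer_id")
    if not EMAIL_RE.match(record.get("email", "")):
        problems.append("invalid email")
    if record.get("region") not in ALLOWED_REGIONS:
        problems.append("region not in allowed set")
    try:
        # Require ISO-style dates so downstream tools parse them uniformly.
        datetime.strptime(record.get("order_date", ""), "%Y-%m-%d")
    except ValueError:
        problems.append("order_date not ISO formatted")
    return problems
```

Rejecting or flagging a record at entry like this is far cheaper than untangling the same bad value after it has propagated through reports and downstream systems.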
Some of these techniques are easier said than done, because many companies can’t update their data capture programs: the applications may be legacy systems, or vendor-hosted and outside the company’s control.
When this occurs there are techniques that can be used downstream to clean the data. Here are a few of those post-data collection techniques:
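Two of the most common downstream cleanups are normalizing values and removing duplicates. The sketch below is a minimal illustration, assuming records arrive as dictionaries and that near-duplicates differ only in whitespace and letter case; real pipelines usually add fuzzier matching.

```python
def normalize(value):
    """Collapse whitespace and case so near-duplicate strings compare equal."""
    return " ".join(value.strip().lower().split())

def deduplicate(records, key_fields):
    """Keep the first record seen for each normalized key."""
    seen = set()
    cleaned = []
    for rec in records:
        key = tuple(normalize(str(rec.get(f, ""))) for f in key_fields)
        if key not in seen:
            seen.add(key)
            cleaned.append(rec)
    return cleaned
```

Keeping the first occurrence is an arbitrary choice here; a production pipeline would typically pick the most complete or most recently updated record instead.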
Preventing ugly data
The processes for managing and improving data are not just technical in nature – they involve a human element that requires its own procedures for how data is used, stored, accessed and changed. This is where data governance comes into play. Data governance involves the tools, business processes, and people who handle data, and it can help to prevent data from getting ugly.
If your enterprise handles large quantities of data, you should consider forming data governance committees within the organization. Ideally, these committees will be composed of leaders from a large swath of departments (not just IT) because data governance requires leadership and investment from across the organization.
Gaining widespread organizational commitment embeds the importance of data quality into your operations and culture. Committees should be charged with the tasks of implementing business processes to measure and track data entry, setting goals for improvement, and holding employees accountable for keeping data standards high.
Data governance brings proactive action to the equation with a system for routinely checking, correcting, and augmenting your company’s data before it becomes an ugly problem. Combined with the technical solutions described above, data governance helps BI managers make predictions and support critical business decisions with confidence, knowing their insights are built on a foundation of clean, accurate data.
When it comes to data, bigger is not necessarily better. Instead of jumping on the big data bandwagon, plan out your overall data strategy so you will end up with clean, usable data with which you can make smart decisions.