You can’t have that data! It’s not perfect yet
By Kimberly Nevala, Director, SAS Best Practices
Having a hard time balancing the insatiable, growing demand for more, bigger data faster with concerns about quality and appropriate usage? For many organizations, the historical inclination is to use data quality as a justification for denying access. We can’t give that to you! The data is not complete. We don’t know if it’s accurate. There are known issues and gaps. You know it’s not perfect…right?
Too often, the most restrictive and exacting quality requirement becomes the standard operating procedure (SOP) for all data. The data we use to calculate publicly reported metrics must be closely managed and held to a rigorous standard. Therefore, shouldn’t all the data we use be held to that same standard? Often our answer is “Yes.” The right answer is “No.”
Consider a cross-country road trip from my home in Minnesota to Seattle, Washington. For most of the journey, having a basic method to confirm you are consistently heading west, more or less, is the only absolute requirement. The closer you get to Washington and a specific destination, the more finesse and detail are required. Degrees matter in the heart of the city. Not so much in the middle of Montana. Trust me. I’ve been there.
The same goes for the quality of data used for the majority of reporting and analytics within an organization. Certainly, there are situations in which “perfect” data is required. But the threshold needs to be based on intended usage. For strategic decision making and many types of analysis, timely, consistent and directionally correct information is more important than 100% confidence. The more operational and tactical the decision, the more important precision becomes.
A Standard Operating Procedure (SOP) for Data Quality
A good SOP might be “the data is good enough until proven otherwise.” Shocking? Perhaps. Does this suggest that data quality is passé? Not at all. In fact, this supposition is based on the premise that you have a systematic method to measure and report on quality. Not as a mechanism to restrict access and usage but as a guidepost for all comers. This is not only a data quality best practice; it is bona fide data governance.
So go ahead. Give ’em the data. It’s likely they are getting it (or worse, making it up) already. If you’ve measured the quality, make the rating public. Don’t know how good or bad the data is? Slap a warning sticker on it. Use data governance to guide and enforce appropriate use and to advocate for data improvement. With a little education, the vast majority of users will surprise you. They’ll make sound judgments about the decisions they base on the data.
The other upside? The user community will naturally take on the role of campaigning for investment in improved data quality. The old adage still stands: good data’s not cheap. Put the business in the hot seat of deciding which improvements are germane and championing them. Again, you might be surprised by what they decide.