7 Ways The U.S. Dept. Of Veterans Affairs Tackled Its Big Data Big Challenges
4:00 PM EST Thu. Jul. 18, 2013
The U.S. Department of Veterans Affairs (VA) had a big problem with big data.
Just ask Dat Tran, deputy assistant secretary for data governance and analysis at the VA, and the keynote speaker at the Massachusetts Institute of Technology's seventh annual Information Quality Symposium, taking place this week in Cambridge, Mass. As the second-biggest federal agency in the country, the VA has a lot of data -- over 11 petabytes of it, to be exact. But the problem, according to Tran, was the quality of that data. There were repetitive data entries, a lack of interoperability between systems and no "authoritative source" when it came to customer records.
But Tran and his team recently set out to change all that, launching a massive data quality initiative across the VA. Here, according to Tran, are the department's top lessons learned along the way.
They say the first step to solving a problem is recognizing the problem exists. That's exactly what Tran did inside the VA.
He opened his keynote by illustrating how siloed and messy the VA's data ecosystem really was. In addition to the sheer volume of the VA's data -- 9.8 million medical enrollment records, 2.06 million home loan records, 49 million records in a veteran benefits master file -- that data was often hosted in disjointed systems that couldn't communicate with one another. On top of that, data was captured and stored in a variety of different ways, often through error-ridden manual processes.
"We don’t have a 360-degree view of our customers, [who are] the veterans, family members, and active duty service members that use VA for benefits or services," Tran said.
Tran said the best way to identify the root cause of repetitive data entries is to step back and map how data is captured and stored within each line-of-business process.
"The minute the client walks through the door, [we asked], 'What information do we capture on that client, what system do we store it in, and what system do we pass it on to?'"
Tran said the biggest mistake organizations can make when conducting a data quality project is to just look at data from a pure "data and systems standpoint." Instead, they should look at the bigger picture, identifying each time that data is touched during day-to-day business processes. This, he said, allows them to see how data flows throughout their organization, and better pinpoint where inaccuracies or repetitive entries may occur.
Tran said one of the biggest lessons learned from the VA's data quality project was that data and information are not one and the same. Data only qualifies as information, he said, if it's accurate and up-to-date. If it's not, why even store it at all?
Tran noted, for instance, that the VA was storing "thousands" of Social Security numbers that began with five zeros. But these numbers, he said, were eventually confirmed as invalid by the Social Security Administration, which said they were never even distributed in the first place.
"Just having the data in the system is not good enough," Tran said. "You need to discover whether or not that data is useful."
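The kind of check that would catch those five-zero numbers can be sketched in a few lines of Python. This is only an illustration, not the VA's actual method: the function name `is_plausible_ssn` and the sample records are hypothetical. It relies on the Social Security Administration's published rule that a number is never issued with an area of 000, 666 or 900-999, a group of 00, or a serial of 0000. Passing the check does not mean a number was actually issued, only that it is not structurally impossible.

```python
import re

# Three-digit area, two-digit group, four-digit serial; hyphens optional.
_SSN_PATTERN = re.compile(r"(\d{3})-?(\d{2})-?(\d{4})")


def is_plausible_ssn(value: str) -> bool:
    """Return False for values that could never have been issued as an SSN.

    Hypothetical helper for illustration only.
    """
    match = _SSN_PATTERN.fullmatch(value.strip())
    if match is None:
        return False
    area, group, serial = match.groups()
    # Areas 000, 666 and 900-999 are never assigned.
    if area in ("000", "666") or area >= "900":
        return False
    # Group 00 and serial 0000 are never assigned.
    if group == "00" or serial == "0000":
        return False
    return True


# Made-up sample records; the first two begin with five zeros.
records = ["000-00-1234", "000004321", "123-45-6789"]
suspect = [r for r in records if not is_plausible_ssn(r)]
print(suspect)
```

Flagging such records for review, rather than silently deleting them, keeps the decision about what to do with them in the hands of the business.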
Tran stressed that data quality is not just an IT problem. It's something that the business should be invested in, as well, and to ensure that investment, IT needs to communicate the value of high-quality data to executives and stakeholders alike.
But, when doing so, don't overdo it with the technical talk. Rather than discussing data quality from a systems or architectural perspective, make it clear to executives that data quality is a business -- not just a technical -- priority.
"I know folks like to talk about architecture," Tran said. "But one of the things I have learned over the years, working in this arena, is that when you rush out there and start talking about architecture, you lose the business folks."
One of the best ways to ensure executives understand the value of, and ultimately buy into, a data quality initiative is to create a role that acts as a liaison between IT and the business. Think of this role, Tran said, as a chief data officer, or somebody with deep technical understanding who also can communicate clearly and effectively with members of the C-suite.
"You just need to have this position established," Tran said. "The CDO, from my perspective, is kind of like a hybrid between technology and the business. You cannot be, in my view … a good chief data officer unless you understand your business needs. You cannot just be a technical person or a data person. You need to understand how that data is used."
When speaking to senior executives about a data governance or quality initiative, avoid using the term "project," Tran urged. The word, he said, suggests a concrete beginning and end. Data governance, instead, should be viewed as a constant, ongoing initiative within an organization.
"You cannot describe [data governance] to management as a project, because a project has a starting point and an end point. If you are truly going to have a data quality or information quality culture, it is a never-ending journey," Tran said. "So don't sell it as a project, because people will expect it to end at some point, and then move on to the next step."
When it comes to business intelligence (BI), Tran has two pieces of advice: Keep it simple, and don't be afraid to think outside the box.
He said BI initiatives tend to go awry when they become too complex, are bogged down with massive data sets, or are trying to answer too many questions at once. Instead, Tran said, try to answer one question at a time.
"What policy question are you trying to answer? What problem are you trying to solve? That is a critical thing that folks need to [ask] when it comes to business intelligence," Tran said.
Secondly, he added, don't be afraid to get creative with BI, and try not to limit yourself to a single BI or analytics tool. You may be surprised what's out there. "There is no one silver-bullet [BI] solution," Tran said.