The longstanding barriers to data sharing between healthcare organizations have been breaking down over the past decade, removing a major obstacle to partnerships between payers and providers. But major challenges remain, as data management only continues to grow in complexity and healthcare organizations try to make sense of a flood of information. On the latest episode of our “From the Trenches” podcast, we speak with Cotiviti chief analytics officer David Costello, who will join Sumant Rao, senior vice president of performance analytics, to deliver a presentation at the 13th Annual RISE Nashville Summit later this month titled, “Drowning in Data: Practical Approaches to Data Management to Power Your Analytics.”
Learn more about our presence at RISE 2019, including new market data Cotiviti will present on the convergence of quality improvement and risk adjustment optimization within health plans.
About the podcast:
From the Trenches is a healthcare podcast from Cotiviti, a leader in healthcare data analytics, exploring the latest trends in healthcare quality and performance analytics, risk adjustment, payment integrity, and payer-provider collaboration. Check out all our episodes in your browser, or subscribe on your smartphone or tablet with Apple Podcasts, TuneIn, Google Play, and Stitcher.
About our guest:
As chief analytics officer, David Costello works closely with product, consulting, data operations, and client service teams to ensure that Cotiviti's proprietary analytics help our clients meet their business objectives and remain relevant in the industry. Before joining Cotiviti, David was chief analytics officer at Press Ganey, where he was responsible for building a set of analytic products that allowed hospitals and provider groups to enhance the patient experience, solidify their reimbursements, and identify areas for process improvement and revenue enhancement. David also previously served as senior vice president of consumer segmentation and engagement strategies at Health Dialog. He holds a B.A. in business administration and sociology from Northern Michigan University and a Ph.D. in sociology from the University of Delaware. He also served in the United States Air Force.
Podcast transcript
What has fundamentally changed about the way healthcare organizations collect and aggregate data over the past decade?
There's been a monumental change in the way people view data from just 10 years ago. Ten years ago, there were questions all the time about getting rid of data. How do I eliminate data? How do I streamline data? Data storage was always a challenge. It was a challenge from an analytic standpoint. It was a challenge from an organizational standpoint. There were a couple of reasons for this.
One was that the cost of maintaining data was quite high. With the advent of cloud technology and cheaper storage, that concern has somewhat gone by the wayside. Second, the analytic tools out there today, especially with the onset of artificial intelligence and machine learning, the two new buzzwords in the industry, have really come to the forefront. People are saying, “give me more data, because more data tends to lead to more insights.” It may or may not be true, but that has certainly been the argument.
This fundamental change, bringing in data and keeping as much of it as you can, brings its own complications. How do we bring it in? How do we manage this data? How do we wrangle the data so that it makes sense and provides deeper insights to our clients? That's a challenge for us, as well as for many organizations.
Today's world differs a lot from the past, when we dealt primarily with structured data, almost exclusively within the organization's own premises. For example, at Cotiviti, we would be expected to look at claims only, and try to glean insights from claims data, which is very structured. We'd put it into a structured format, and we'd provide results back to clients. That is not enough today.
Unstructured data, both within the organization and outside it, needs to be brought in to enhance the overall data that organizations like us house. The challenge is managing both the unstructured and the structured data while still meeting the demand of, “I need my answer not tomorrow, but now.” That's the challenge that organizations face.
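To make that concrete, here is a minimal sketch (our illustration, not something described in the podcast) of joining structured claims records with unstructured clinical notes on a shared member ID. The file names, columns, and keyword-based feature are all hypothetical:

```python
import pandas as pd

# Structured data: claims with a fixed schema (hypothetical file and columns).
claims = pd.read_csv("claims.csv")  # member_id, claim_id, cpt_code, paid_amount

# Unstructured data: free-text clinical notes keyed by the same member ID.
notes = pd.read_json("clinical_notes.json", lines=True)  # member_id, note_text

# A crude signal pulled from the unstructured text: does the note mention a
# follow-up visit? A real pipeline would use NLP here, not a keyword match.
notes["mentions_followup"] = notes["note_text"].str.contains(
    "follow-up", case=False, na=False
)

# Join the two worlds on the common identifier so downstream analytics can
# see claims activity and note-derived signals side by side.
combined = claims.merge(
    notes[["member_id", "mentions_followup"]], on="member_id", how="left"
)
print(combined.head())
```

The point of the sketch is the join key: once structured and unstructured sources share a common identifier, they can feed the same analytics without being forced into one rigid schema.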
What type of architecture can help healthcare organizations best manage their data?
One of the critical aspects of the design is how data is brought in. The idea is to have a single ingestion engine, but in most organizations today you have a place where data sits, and then data is pulled from it and copied elsewhere. That duplicates data across the organization. This duplication of information poses serious security risks, causes significant cost overruns due to storage, and leaves copies out of sync with one another.
The data lake is all part of this strategy of bringing data into one location and having it reside in one place, so you have a single source of truth. I'm not talking about a big data warehouse. This is a place where data is exchanged and shared with other parts of the solution. Solutions come out and access that information, then come back and provide information back to our clients. This is the way organizations are looking to design their data architecture.
I think we'll no longer hear, “can I build this big data repository out there?” That's the old technology, the old way of thinking. The data lake is the new strategy: bring things in and establish common IDs to support the applications that sit on top of it. A single source of truth is much faster in terms of processing time, and there's clarity about where the data lives and who owns it.
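As a rough illustration of that single-ingestion, common-ID pattern (our sketch under assumed paths and field names, not Cotiviti's actual architecture), the snippet below lands each incoming file in a raw zone exactly once and records it in a lightweight catalog:

```python
import hashlib
import json
import shutil
from pathlib import Path

LAKE_RAW = Path("lake/raw")           # single landing zone: data lives here once
CATALOG = Path("lake/catalog.jsonl")  # lightweight catalog: what's here, who owns it

def ingest(source_file: str, owner: str, id_field: str = "member_id") -> Path:
    """Copy a file into the raw zone once and register it in the catalog."""
    src = Path(source_file)
    digest = hashlib.sha256(src.read_bytes()).hexdigest()[:12]
    # Content-addressed naming means re-ingesting the same file never
    # creates a second copy, so there is one source of truth.
    dest = LAKE_RAW / f"{digest}_{src.name}"
    if not dest.exists():
        LAKE_RAW.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
    entry = {"path": str(dest), "owner": owner, "id_field": id_field}
    with CATALOG.open("a") as f:
        f.write(json.dumps(entry) + "\n")
    return dest

# Applications query the lake and catalog rather than keeping private copies:
# ingest("claims_2019_02.csv", owner="claims-ops")
```

The catalog entry names the common ID field, which is what lets the applications sitting on top of the lake join sources instead of duplicating them.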
What role do data scientists play in performing data management well?
I believe that data scientists play a really critical role in determining the business questions that need to be addressed and answered. That to me will always be the role. However, I do see the role of data scientists changing and morphing into what I would call a hybrid between a statistician and a machine-learning expert.
In the past, the data scientist was a traditionally trained statistician who had a really good understanding of the underlying mechanisms of how analytic models work and how statistical properties are maintained. Machine-learning and AI experts came out of the computer science and engineering worlds, but are very well attuned to how technology works.
This blending of the data scientist and the machine-learning expert is a hybrid that is going to work, because the data scientist ensures that proper methodologies are maintained, while the machine-learning expert matters because the window between theory and practice is shrinking. That marriage between the two is critically important. I believe these two roles will morph together into something that is different, something that is better, and something that all organizations are going to be looking to hire. If you had gone out looking for data science programs 10 years ago, there was no such thing. Today, almost all of the top schools have a data science program, and those programs teach the fundamentals of statistics as well as the fundamentals of big data and machine learning.
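As one small illustration of that hybrid skill set (our example, run on synthetic data, not the speaker's), the sketch below uses standard machine-learning tooling to fit and cross-validate a model, then checks a property a classically trained statistician would insist on: that the predicted probabilities are calibrated:

```python
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for a real analytic dataset.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000)

# The machine-learning habit: estimate out-of-sample performance honestly.
print("CV accuracy:", cross_val_score(model, X_train, y_train, cv=5).mean())

# The statistician's habit: verify the model's probabilities mean what they
# say, rather than trusting a single headline accuracy number.
model.fit(X_train, y_train)
prob_true, prob_pred = calibration_curve(
    y_test, model.predict_proba(X_test)[:, 1], n_bins=10
)
print("Mean calibration gap:", np.mean(np.abs(prob_true - prob_pred)))
```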
Where do healthcare organizations tend to go wrong the most frequently in designing their data management frameworks?
I'm going to use three “toos” here. Too structured, too large, and too dependent on on-premise. Let's address them in order.
Too structured—too often organizations, especially large organizations, are not able to move quickly because of the legacy systems that they are encumbered with. They have just very, very structured data, and that's how it works. That's how it always has worked, and how it always will work. When we talked previously about bringing in different types of reference data, or unstructured data, they don't know how to handle it because of the huge system that they have in place today. What they tend to do is say, “how can I take that unstructured data and make it structured? How do I take a world that is not organized and make it organized?”
In reality, that disorganization in the world out there is, in and of itself, of interest, especially to data scientists. The world is not always structured, and it's not always clean. Data scientists learn from the way unclean and unstructured data behave. That messiness of the world itself is really where the insight comes from.
Too large—we think the bigger, the better. Big is good; more is better. But things that work fine by themselves also need to be able to be brought together in a relational way, which is why the data lake, rather than the traditional strategy of one big data mart, is the new way of thinking. It's about relationships, and trying to understand them, rather than building one large organization and one large data collection repository.
And then, they're too dependent on on-premise infrastructure. The rest of the world has taken full advantage of cloud-based technology, and healthcare will need to go there as well. We do understand the importance of protected health information (PHI) and the role of healthcare organizations in safeguarding it. That being said, the de-coupling of that information from the core data, the core information that needs to be understood and transformed, means the latter can all be done in the cloud. We need to start looking at that technology, because that's how we're going to meet the requirement of the marketplace, which is speed to answer.
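A minimal sketch of that de-coupling idea (our illustration; the column names are hypothetical, and salted hashing is pseudonymization, not a full HIPAA de-identification procedure): strip direct identifiers locally, keep the linkage table on premises, and send only the tokenized frame to the cloud:

```python
import hashlib
import pandas as pd

PHI_COLUMNS = ["name", "ssn", "address"]  # direct identifiers (hypothetical schema)

def decouple_phi(df: pd.DataFrame, secret_salt: str):
    """Split records into a cloud-safe frame and an on-premises linkage table."""
    # Replace the member ID with a salted hash so cloud-side analytics can
    # still join records without ever seeing the real identifier.
    tokens = df["member_id"].astype(str).map(
        lambda v: hashlib.sha256((secret_salt + v).encode()).hexdigest()
    )
    safe = df.drop(columns=PHI_COLUMNS + ["member_id"]).copy()
    safe["member_token"] = tokens
    # The linkage table (token -> real ID plus PHI) never leaves the premises.
    linkage = df[["member_id"] + PHI_COLUMNS].copy()
    linkage["member_token"] = tokens
    return safe, linkage

# The cloud gets safe_df; link_df stays behind the firewall:
# safe_df, link_df = decouple_phi(claims_df, secret_salt="keep-this-on-prem")
```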
Podcast music credit: "Inhaling Freedom" by Nazar Rybak, via HookSounds.