Thursday, June 12, 2014

Healthcare Analytics: Landscape & Directions

Analytics in healthcare has been called everything from the only path to reduced costs & improved outcomes to a distraction that will cost substantial money & energy. As always (well, almost always), the actual fact of the matter lies somewhere in between these extreme views. In this post, I’ll look at how analytics are currently being used in healthcare & how they could be used, consider the advantages & impediments, & make some predictions – I am supposed to be a futurist after all.

First, a definition – What do I mean by analytics? I mean the systematic analysis of data focused on answering a specific question or set of questions. Analytics is not report generation, nor is it an IT function, a software package or a technical methodology. It is a way of thinking about an organization’s goals that makes use of highly focused analysis of data, not just “big data”, but relevant data. Any analysis that is done must be aligned with an organization’s goals & strategies. Otherwise it may not produce results that lead to actions relevant to the organization.

An analytic adoption model for healthcare organizations, developed by a group of industry experts[1], has been proposed to allow the evaluation of data warehousing & analytics efforts. It consists of the following levels:

     0. Fragmented Point Solutions
     1. Integrated Enterprise Data Warehouse
     2. Standardized Vocabularies & Patient Registries
     3. Automated Internal Reporting
     4. Automated External Reporting
     5. Clinical Effectiveness & Accountable Care
     6. Per Case Payment & the Triple Aim
     7. Per Capita Payment & Predictive Analysis
     8. Per Unit of Health Payment & Prescriptive Analysis


This framework emphasizes the functions associated with successful data warehousing as well as the relationship of payment with various dimensions of providing healthcare: per case, per capita & per unit of health. Is this an interesting way of looking at the evolution of analytics? Looking at part of the current landscape will help answer that.

I say “part of” because so much work is currently being done that it’s difficult to get the picture as a whole. What we can get is some dimensions or areas of focus & some examples in each area (each area has many, many more examples)[2]:
  • Point-of-Care Recommendations – generally a large amount of clinical data is analyzed to find the best match to an individual patient’s characterization, so that best practice diagnosis & treatment recommendations can be delivered; the approach is empirical, not rule-based
    • Mayo Clinic – 5M clinical records (approximately 15-25PB) are analyzed to provide best practice treatment for individual patients in real time
    • Beth Israel Deaconess Medical Center – 2M+ clinical records are analyzed to provide best practice treatment for individual patients in real time
    • Kaiser Permanente – 9M clinical records over 10 years (approximately 45PB) are analyzed to provide best practice treatment for individual patients in real time; a natural language query system is in use
    • Partners Healthcare – combined clinical, operational & financial data; best practice recommendations are made at time-of-encounter
  • Outcome & Population Characterization
    • Intermountain Healthcare (partnered with Deloitte) – 90M patient records; two analytic applications developed: Outcome Miner, which derives factors that contribute to outcome at the individual level, & Population Miner, which derives relationship(s) between treatments & outcomes at the population level
    • McKinsey – “next 5% analysis”: analysis of 30M commercial claims to determine “micro-segments” of the patient population that will allow the identification of the top 6% of patients with regard to cost & the assignment of care managers
  • Predictive Modeling
    • Express Scripts – 1.5B prescriptions/year; predictive modeling is used to determine which patients are most likely not to use a prescription as indicated & to suggest proactive interventions
  • Research
    • Mt. Sinai Medical Center (partnered with Ayasdi) – used a unique analytic method developed by Ayasdi (topological analysis) to evaluate the entire E. coli genome (including 1M DNA variants) to determine the bacteria’s response to different antibiotics
  • Operations Optimization
    • Oregon Health & Science University – PAR (periodic automatic replenishment) levels for 4000 infusion pumps established; pump utilization & inventory tracked & optimized

Point-of-Care Recommendations are far & away the most numerous applications, while Research & Operations Optimization appear to be the least numerous. Point-of-Care systems seem to fall at Level 5 (Clinical Effectiveness & Accountable Care) on the adoption model. Very few efforts seem, at this time, to fall in the levels above Level 5, which implies that work on price & cost has not yet been emphasized.

So, if point-of-care, analysis of outcomes & predictive modeling of various kinds are the current areas of analytic focus, what areas might be interesting & productive that are not being emphasized today? I have been surveying a variety of healthcare organizations (informally) by talking to people at conferences, meetings etc., & this is what I’ve been hearing.
  • Can various forms of trend analysis be combined with geo-locational data to provide insight into very local conditions, for instance: Are specific diagnoses concentrated locally & if so are they associated with specific clinical characterizations?
  • Can trend analysis of changes in population served be combined with larger-scale demographic data, for instance: Are large-scale demographic trends driving trends in numbers of patients, ethnic grouping of patients etc.? Are larger-scale demographic trends going to influence whether healthcare organizations & specific locations should be invested in, for instance: In an area where the overall population is decreasing rapidly, should clinics, health centers or hospitals remain open? What factors, other than demographic trends, should influence these decisions?
  • Is there a relationship between cost of care & cost of outcome on a per patient, per provider &/or per location basis?
  • Can data on service utilization & demographics be used to model service utilization trends for planning purposes?
  • Are there bottlenecks in clinical & operational workflows that affect quality of care & outcome (it’s not clear that the data for this analysis is generally available, although several business process modeling methodologies could be used to address the issue)?
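To make the service utilization question in the list above a little more concrete, here is a minimal sketch of the simplest kind of trend model: a straight line fitted by least squares to monthly visit counts & projected a few months ahead. The visit counts & the function names (fit_linear_trend, project) are made up for illustration; a real planning model would at least need to account for seasonality & for the demographic data mentioned above.

```python
from statistics import mean


def fit_linear_trend(counts):
    """Least-squares line through the points (0, counts[0]), (1, counts[1]), ..."""
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two periods to fit a trend")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(counts)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, counts))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def project(counts, periods_ahead):
    """Extend the fitted line past the last observed period."""
    slope, intercept = fit_linear_trend(counts)
    n = len(counts)
    return [intercept + slope * (n + k) for k in range(periods_ahead)]


# Hypothetical monthly visit counts for one clinic (illustrative numbers only).
monthly_visits = [410, 395, 380, 372, 350, 341]
slope, _ = fit_linear_trend(monthly_visits)
print("change in visits per month:", round(slope, 1))
print("projected next 3 months:", [round(v) for v in project(monthly_visits, 3)])
```

A negative slope here would only say that visits are falling at this one location; deciding whether a clinic should stay open needs the other factors raised in the list as well.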

These are just a few of the many areas of inquiry & specific inquiries people have talked to me about. Only the largest healthcare organizations I’ve spoken with (Partners Healthcare, Vanderbilt Medical Center etc.) have focused on point-of-care recommendations as an analytic goal. This is, in part, because only these organizations have access to the ultra-large clinical data sets that are optimal for this type of analysis. Even where data aggregates, such as data warehouses, have been developed by medium & small sized organizations, the organizations have tended to focus on specific operational & financial questions as sustainability is their biggest issue. This generalization, like all such generalizations, must be taken for what it is worth – a generalization from not very much data.

What, then, are the impediments to the use of analytics as I have been describing it? There are the obvious ones of lack of resources, lack of expertise, lack of experienced personnel etc., but for small-to-medium size healthcare organizations, the largest impediment I have seen is the lack of an approach to providing appropriate data for analysis. Many organizations are using conventional data warehouse & extract techniques that, at a minimum, require semantic & syntactic normalization as well as transformation & standardization of the data in order to produce meaningful results. I have seen, reviewed & participated in a number of projects over the last year in which results were not usable because data were aggregated without this work being done (in many cases by an outside contractor hired by the healthcare organization). I have also participated in a project where the definition of core elements like “encounter”, “outcome” & “provider” could not be determined from the data & could not be agreed upon by the project participants[3]. As we say, “Garbage in, Garbage out.”[4]
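As a small illustration of what “semantic & syntactic normalization” means in practice, here is a minimal sketch that maps two source systems’ records onto one shared form before they are aggregated. Everything in it is an assumption made for the example: the field names, the local sex codes (“1”, “2”, “M”, “F”) & the function normalize_record are hypothetical, not taken from any particular system or standard.

```python
# Hypothetical mapping from site-specific sex codes to one shared vocabulary.
SEX_CODES = {
    "m": "male", "male": "male", "1": "male",
    "f": "female", "female": "female", "2": "female",
}

KG_PER_LB = 0.45359237  # exact by definition of the international pound


def normalize_record(raw):
    """Return a record with sex & weight in the shared form.

    Unknown codes or units raise an error instead of being passed through,
    so that bad values are caught before aggregation rather than after.
    """
    sex_key = str(raw["sex"]).strip().lower()
    if sex_key not in SEX_CODES:
        raise ValueError(f"unmapped sex code: {raw['sex']!r}")

    unit = str(raw["weight_unit"]).strip().lower()
    if unit == "kg":
        weight_kg = float(raw["weight"])
    elif unit in ("lb", "lbs"):
        weight_kg = float(raw["weight"]) * KG_PER_LB
    else:
        raise ValueError(f"unknown weight unit: {raw['weight_unit']!r}")

    return {
        "patient_id": raw["patient_id"],
        "sex": SEX_CODES[sex_key],
        "weight_kg": round(weight_kg, 1),
    }


# Two made-up records, as they might arrive from two different source systems.
site_a = {"patient_id": "A1", "sex": "M", "weight": "176", "weight_unit": "lbs"}
site_b = {"patient_id": "B7", "sex": 2, "weight": 61.5, "weight_unit": "kg"}
for record in (site_a, site_b):
    print(normalize_record(record))
```

The point of the sketch is the order of operations: the mapping is agreed on & applied first, & only then are the records combined. Agreeing on what an “encounter” or an “outcome” is remains a people problem that no code solves.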

Even those organizations that are using more contemporary analytic methods, such as Hadoop-based analytic stacks, often have trouble with use of data for lack of experience & expertise. This will improve as more organizations move to these methods & more people are trained in their use.

The second major impediment that I have seen while working with healthcare organizations is a lack of understanding of what analytics is & can do – this is especially true in reference to the difference between analytics & reporting. All healthcare organizations do reporting from their clinical & practice management data: What quality measures scored highest in the practice? Lowest? What departments had the highest numbers of patients? etc. Analytics is different in that we are trying to explore less obvious, & in some cases non-obvious & unintuitive, relationships in the data, often in very large data sets. Many of the people I work with on this have asked me if “analytics” can improve their reporting of quality measures. The answer is yes, if you are looking for underlying factors affecting performance, but not if you are simply trying to get better results on quality performance from your data. This is mainly an experience & education issue.
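The difference between reporting & analytics can be shown on a toy scale. In the sketch below, the “report” just reads off which department scored highest, while the “analysis” asks whether some other factor moves together with the score. The departments, scores, visit lengths & function names are all invented for the example, & a correlation over four made-up rows proves nothing by itself; it only shows the kind of question being asked.

```python
from math import sqrt
from statistics import mean

# Hypothetical per-department figures (illustrative numbers, not real data):
# (department, quality score in %, average visit length in minutes)
DEPARTMENTS = [
    ("Cardiology", 82.0, 22.0),
    ("Pediatrics", 91.0, 28.0),
    ("Orthopedics", 74.0, 15.0),
    ("Family Medicine", 88.0, 25.0),
]


def top_department(rows):
    """Reporting: which department has the highest quality score?"""
    return max(rows, key=lambda row: row[1])[0]


def pearson(xs, ys):
    """Analytics (in miniature): Pearson correlation between two factors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long series with at least two values")
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


scores = [row[1] for row in DEPARTMENTS]
visit_minutes = [row[2] for row in DEPARTMENTS]

print("Report: highest-scoring department is", top_department(DEPARTMENTS))
print("Analysis: correlation of visit length with score =",
      round(pearson(visit_minutes, scores), 2))
```

A strong correlation would be a lead to investigate (is visit length an underlying factor, or a side effect of something else?), not a finding.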

Of course, these issues are a result of many of the obvious impediments: lack of resources, expertise & so on…

So where are we on healthcare analytics? It seems clear (to me at least) that the use of analytic techniques to explore clinical & operational data will become more & more important in the next several years, to the extent that a healthcare organization that is not developing this expertise & using it to try to optimize clinical & operational efforts will fall behind in the effort to meet the Triple Aim of improving the experience of care, improving the health of populations & lowering the cost of care per capita. Organizations can use pattern matching in ultra-large data sets to provide improved diagnosis & treatment planning at point-of-care, which is of core importance to the patient, but they will also have to begin to explore the relationships between cost & provision of care for both individuals & populations in order to begin to lower such costs, & begin to do predictive modeling & trend analysis in order to be able to optimize utilization of scarce resources & to sustain their operations. The inclusion of methodologies from other disciplines will be essential: business process modeling for workflow optimization, other modeling & simulation techniques for optimization of efforts such as CPOE (inventory & ordering models), pharmacy utilization & other operations optimization, & even analysis of social media data (to extract information of use in clinical & operational workflows).

We are just at the beginning of the analytics effort & “letting a thousand flowers bloom[5]” is the best way to move toward a consolidation & consensus of what works. Five years from now, healthcare organizations will routinely be analyzing data for the continuous improvement of clinical & operational efforts, & actual meaningful use will include such analysis as a core part of what healthcare organizations do. If it doesn’t, we’ll still be stuck in our current morass of huge amounts of data, but little insight or wisdom about how to provide care & control costs.

Next up: More on social media in healthcare (I may be getting too obsessed with this…)





[2] The list was created from personal communication & web search (2-10 June 2014)
[3] It is interesting (?) that the only outcome they could unanimously agree on was the death of the patient.
[4] Apparently first used in a syndicated newspaper story about early computerization efforts at the Internal Revenue Service in April 1963, although my favorite (earlier) example comes from Charles Babbage, who in 1864 wrote, “On two occasions I have been asked, ‘Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?’ ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.” Babbage, C. 1864. Passages from the Life of a Philosopher. Longman & Co. London. P.67.
[5] Most people, though, think that Mao Zedong initiated the Hundred Flowers Campaign in 1956 so that dissidents would express themselves & could then be identified & dealt with.
