Monday, December 8, 2014

Healthcare Analytics: Concepts & Assumptions



“Data is a precious thing and will last longer than the systems themselves.” – Tim Berners-Lee, inventor of the World Wide Web.







In the past several years, we’ve heard an immense amount about data, big data, data analytics & every possible topic related to data. By one estimate, 90% of all currently available data has been generated in the past two years.[1] We also know that every major business publication has run articles on data (Business Week, May 2013; Harvard Business Review, December 2013; Forbes, February 2014, to name just a few), & that business consultancies such as Accenture, Deloitte & Gartner each have a practice or advisory in this area.

Closer to home, many large healthcare organizations are developing analytic systems that use very large amounts of data to provide diagnostic, treatment planning & operational guidance. Examples include the point-of-care recommendation systems currently used by Kaiser Permanente & the Mayo Clinic, among others, which provide near-real-time diagnosis & treatment planning guidance to providers at a patient’s bedside. Dr. Watson (IBM) is another well-known example.[2] These systems use millions of patient records, often recorded over long periods of time, as well as thousands (or more) of journal articles & physicians’ notes to provide their analysis & recommendations. Not many healthcare organizations have this amount of patient data available, so what are the implications of analytics for most hospitals, clinics & practices, & how can they take advantage of analytics to make better clinical & operational decisions?

First, let’s define what we mean by data & analytics. Data, in this sense, is a set of qualitative or quantitative values. Put simply, pieces of data are individual pieces of information.[3] They may be numeric (quantitative), words or sets of words (qualitative), or even hybrids such as addresses (77 Massachusetts Avenue, E40-248). Analytics, in general, is the discovery & communication of meaningful patterns in data.[4] Contemporary analytics has taken on a more specific meaning, especially in contrast to statistical analysis of data (the application of statistical hypothesis testing methods to data). Analytics today is a set of methods for data organization & analysis that are applied when data have some of the following characteristics:
  • Volume: management of multiple petabytes of data
  • Velocity: management of data values that are changing rapidly (e.g. NASA’s launch sensor net of >1M sensors of various types sampled 3x/second)
  • Variety: many different types of data in different formats & from different sources

In healthcare, data variety is most often the issue. This type of data is very difficult to organize & analyze in a conventional sense.

What are the differences between analytics & conventional analysis? They can be summarized as follows:
  • Contemporary analytics is the empirical characterization of data & information. An example would be: A physician at Kaiser is using their point-of-care recommendation system to confirm a diagnosis & develop an optimal treatment plan. The physician enters patient parameters while doing a bedside examination. The system evaluates 4 PB of patient data against the parameters entered for that specific patient, & it finds 9,372 cases similar enough to use for comparison. That is not a statistical prediction of similarity, but an exact empirical characterization. In the same sense, if the system classifies the treatment plans of those 9,372 cases according to outcome, that is not a statistical prediction of outcome, but an exact characterization of the outcomes present in the data. This changes how we think about results: we are looking at exact characterizations, not predictions with associated probabilities. This is true even of smaller sets of data.
  • Contemporary analytics does not require extensive data transformation & normalization. Analytic systems such as Hadoop-based analytic stacks aggregate data in many different forms (alphanumeric, text, image, other media) & from many different sources (EHR, financial systems, practice management, public health systems, other private & public data sources), & perform analysis across all of these types (e.g. cost/service/location/provider or number of patient interactions vs. macro-demographic & population trends). It does require an understanding of the normalized definitions of common terms (encounters, providers etc.), especially if cross-organizational comparisons are to be made.
  • In general, hypotheses & informational relationships are informed by the analysis, not by a priori assumptions. This means that empirical characterization is carried out by performing inquiries developed by consensus of the healthcare organization’s staff (or their designees; all parts of the organization should be represented) & aligned with strategy. Hypotheses are then formed (& relationships defined) based on the empirical results, & analysis may continue.
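
To make the idea of empirical characterization concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the records, field names, similarity rule (same diagnosis, age within a window) & function names are all hypothetical, & it says nothing about how Kaiser's or any other real system works. The point is that the result is a count of what is actually in the data, not a prediction.

```python
from collections import Counter, defaultdict

# Hypothetical patient records; field names and values are illustrative only.
RECORDS = [
    {"age": 64, "diagnosis": "T2D", "treatment": "A", "outcome": "improved"},
    {"age": 61, "diagnosis": "T2D", "treatment": "A", "outcome": "unchanged"},
    {"age": 67, "diagnosis": "T2D", "treatment": "B", "outcome": "improved"},
    {"age": 35, "diagnosis": "T2D", "treatment": "B", "outcome": "improved"},
    {"age": 63, "diagnosis": "CHF", "treatment": "A", "outcome": "worsened"},
]


def similar_cases(records, diagnosis, age, age_window=5):
    """Return every record with the same diagnosis and an age within age_window years."""
    return [
        r for r in records
        if r["diagnosis"] == diagnosis and abs(r["age"] - age) <= age_window
    ]


def outcomes_by_treatment(cases):
    """Count the outcomes actually recorded for each treatment among the cases."""
    counts = defaultdict(Counter)
    for case in cases:
        counts[case["treatment"]][case["outcome"]] += 1
    return {treatment: dict(c) for treatment, c in counts.items()}


# Parameters a provider might enter at the point of care (assumed for this sketch).
matches = similar_cases(RECORDS, diagnosis="T2D", age=63)
summary = outcomes_by_treatment(matches)
print(len(matches), summary)
```

Both functions only filter & count. Nothing is estimated, so the summary describes exactly the cases found, which is the distinction drawn in the first bullet above.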

OK – so we know something about data & analytics, but what does this actually mean for healthcare organizations? As a technologist, I have to say that as interesting as the technology of analytics is, it’s not the point. The point is a way of thinking about data & analysis. I use the phrase “data as an asset” as shorthand for this way of thinking. Thinking of data as an asset means that you (& your team) look at data in a larger context than just the clinical &/or operational data that you have. You think about data in relation to the strategy of your organization & in relation to the kinds of strategic decisions that are required to keep your organization healthy. Thinking of data just as facts is no longer enough to create the largest amount of value from that data; you must think of data strategically. This means having an awareness of data, your own as well as external data… data from city, county, state & federal programs… data from other organizations… as much relevant data as you can discover & access.

Once you start thinking about data as an asset, there are some things you can do to utilize data strategically.
  1. First is to review (or develop) your organization’s strategy & identify what decisions are embedded in it. 
  2. Next is to identify what data you have access to that is relevant to those decisions. This may, in fact, not be entirely straightforward. You may want to include data whose relevance is not immediately apparent. Remember, one of the characteristics of analytics is that the relationships in the data are defined empirically by inquiry, not a priori.
  3. Third is to convene heterogeneous groups of stakeholders to develop areas of inquiry to be addressed by analysis. These can be quite general (e.g. the relationship of the provision of specific enabling services to outcome or cost), but they must be related to the organization’s strategy & to the decisions that need to be made to carry out that strategy.
  4. Fourth, detailed analytic queries are developed to address the areas of inquiry & carried out. 
  5. Finally, results are interpreted & presented in support of data-driven decision-making. Queries can also be redesigned, modified or enhanced at this point & rerun.

Recent conversations with CIOs & other healthcare executives at conferences & other meetings have focused on several areas of inquiry that are strategic to the continued growth & success of these organizations. These areas have included:
  • Classifying patients according to risk & cost: This requires defining a set of classes (such as healthy patients, patients with chronic conditions, patients with multiple chronic conditions, patients with chronic conditions & behavioral health issues, etc.) & then analyzing the patient population with respect to these classes. Further analysis is often done to determine the cost of care for each patient & each class. This allows the top 1%, top 5%, bottom 5% etc. of patients to be identified with respect to cost & may lead to interventions once causes & similarities in these classes are also analyzed.
  • Determining the cost of providing specific clinical & non-clinical services (where data is available): This can be done along various axes such as per location, per time period, per provider; all of which may provide insight into costs & with additional analysis into the relationship of services to outcomes.
  • Analyzing population trends using both internal clinical & demographic data & publicly available data (such as state-provided population trend data per location, time period etc.): This can provide insight into encounter trends as well as revenue trends.
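
As a rough illustration of the first two areas, the Python sketch below assigns patients to classes, totals cost per class & picks out the highest-cost share of patients. Everything in it is an assumption made for the example: the patient records, the class rules, the field names & the function names are hypothetical, & a real classification would be defined by the organization's clinical & operational staff.

```python
import math
from collections import defaultdict

# Hypothetical patients: count of chronic conditions, behavioral health flag, annual cost.
PATIENTS = [
    {"id": "p1", "chronic": 0, "behavioral": False, "cost": 400.0},
    {"id": "p2", "chronic": 1, "behavioral": False, "cost": 2500.0},
    {"id": "p3", "chronic": 3, "behavioral": False, "cost": 18000.0},
    {"id": "p4", "chronic": 2, "behavioral": True, "cost": 42000.0},
    {"id": "p5", "chronic": 0, "behavioral": False, "cost": 150.0},
]


def risk_class(patient):
    """Assign a patient to one of four illustrative classes."""
    if patient["chronic"] == 0:
        return "healthy"
    if patient["behavioral"]:
        return "chronic + behavioral health"
    if patient["chronic"] == 1:
        return "single chronic condition"
    return "multiple chronic conditions"


def total_cost_by(patients, key_fn):
    """Sum cost per group, where key_fn maps a patient to its group label."""
    totals = defaultdict(float)
    for p in patients:
        totals[key_fn(p)] += p["cost"]
    return dict(totals)


def top_share_by_cost(patients, share):
    """Return the highest-cost patients making up the given share (at least one)."""
    n = max(1, math.ceil(len(patients) * share))
    return sorted(patients, key=lambda p: p["cost"], reverse=True)[:n]


cost_per_class = total_cost_by(PATIENTS, risk_class)
top_20_percent = [p["id"] for p in top_share_by_cost(PATIENTS, 0.2)]
print(cost_per_class, top_20_percent)
```

Because `total_cost_by` takes the grouping as a function, the same helper could group by a location, time period or provider field instead, if the records carried one.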

Many other areas of inquiry are possible, but need to be aligned with the organization’s strategy in order to be productive & to enable data-driven decision-making.

As I mentioned above, the technology of contemporary analytics is also interesting, & it will be covered in my next post.

[Please Note: A version of this post appears as my column for Technology in Focus on the RCHN Community Health Foundation website (www.rchnfoundation.org)]


[1] http://www.sciencedaily.com/releases/2013/05/130522085217.htm
[2] http://www-03.ibm.com/innovation/ca/en/watson/watson_in_healthcare.shtml
[3] http://en.wikipedia.org/wiki/Data
[4] http://en.wikipedia.org/wiki/Analytics
