“Data is a precious thing and will last
longer than the systems themselves.” – Tim Berners-Lee, inventor of the World Wide Web.
In the past several years, we’ve heard an
immense amount about data, big data, data analytics & every possible topic
related to data. By one widely cited estimate, 90% of all currently available
data has been generated in the past two years.[1] Nearly
every business publication has run articles on data (Business
Week, May 2013; Harvard Business Review, December 2013; Forbes, February 2014, to
name just a few), & major consulting & advisory firms such as Accenture,
Deloitte & Gartner have a practice or advisory in this area.
Closer to home, many large healthcare
organizations are developing analytic systems utilizing very large amounts of
data to provide diagnostic, treatment planning & operational guidance.
Examples would be the point-of-care recommendation systems currently used by
Kaiser Permanente & the Mayo Clinic, among others, that provide
near-real-time diagnosis & treatment planning guidance to providers at a
patient’s bedside. Dr. Watson (IBM) is another well-known example.[2] These
systems use millions of patient records, often recorded over long periods of
time, as well as thousands (or more) of journal articles & physicians’ notes
to provide their analysis & recommendations. Not many healthcare
organizations have this amount of patient data available, so what are the
implications of analytics for most hospitals, clinics & practices, &
how can they take advantage of analytics to make better clinical &
operational decisions?
First, let’s define what we mean
by data & analytics. Data, in this sense, is a set of qualitative or
quantitative values. Simply restated, pieces of data are individual pieces of
information[3].
They may be numeric (quantitative), or words or sets of words (qualitative) or
even hybrids such as addresses (77 Massachusetts Avenue, E40-248). Analytics,
in general, is the discovery & communication of meaningful patterns in data[4].
Contemporary analytics has taken on a more specific meaning, especially in
contrast to statistical analysis of data (the application of statistical
hypothesis testing methods to data). Analytics today is a set of methods for
data organization & analysis that are applied when data have some or all of
the following characteristics:
- Volume: management of multiple petabytes of data
- Velocity: management of data values that are changing rapidly (e.g. NASA’s launch sensor net of >1M sensors of various types sampled 3x/second)
- Variety: many different types of data in different formats & from different sources
In healthcare, data variety is
most often the issue. Data that varies this much in type, format & source is
very difficult to organize & analyze in a conventional sense.
What are the differences between
analytics & conventional analysis? They can be summarized as follows:
- Contemporary analytics is the empirical characterization of data & information. An example would be: A physician at Kaiser is using its point-of-care recommendation system to confirm a diagnosis & develop an optimal treatment plan. The physician enters patient parameters while doing a bedside examination. The system evaluates 4 PB of patient data against those parameters & finds 9,372 cases similar enough to use for comparison with the patient. That is not a statistical prediction of similarity, but an exact empirical characterization. In the same sense, if the system classifies the treatment plans of those 9,372 cases according to outcome, that is not a statistical prediction of outcome, but an exact characterization of the outcomes present in the data. This changes how we think about results: we are looking at exact characterizations, not predictions with associated probabilities. The same is true of even smaller sets of data.
- Contemporary analytics does not require extensive data transformation & normalization. Analytic systems such as Hadoop-based analytic stacks aggregate data in many different forms (alphanumeric, text, image, other media) & from many different sources (EHR, financial systems, practice management, public health systems, other private & public data sources), & perform analysis across all of these types (e.g. cost/service/location/provider or number of patient interactions vs. macro-demographic & population trends). What this kind of analysis does require is an understanding of the normalized definitions of common terms (encounters, providers etc.), especially if cross-organizational comparisons are to be made.
- In general, hypotheses & informational relationships are informed by the analysis, not by a priori assumptions. This means that empirical characterization is carried out by performing inquiry that is developed by consensus of the healthcare organization’s staff (or their designees; all parts of the organization should be represented) & aligned with strategy. Hypotheses are then formed (& relationships defined) based on the empirical results, & analysis may continue.
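To make the idea of empirical characterization concrete, here is a minimal sketch in Python. It is not how Kaiser's or any other real system works; the records, the field names & the similarity rule (same diagnosis, same smoking status, age within five years) are all assumptions invented for illustration. The point is only that the result is an exact count of what is in the data, not a prediction with a probability attached.

```python
from collections import Counter, defaultdict

# Hypothetical, tiny stand-in for a large store of historical patient records.
# Every field name and value here is invented for illustration.
RECORDS = [
    {"age": 64, "diagnosis": "type 2 diabetes", "smoker": False, "treatment": "plan A", "outcome": "improved"},
    {"age": 61, "diagnosis": "type 2 diabetes", "smoker": False, "treatment": "plan A", "outcome": "improved"},
    {"age": 66, "diagnosis": "type 2 diabetes", "smoker": False, "treatment": "plan B", "outcome": "unchanged"},
    {"age": 63, "diagnosis": "type 2 diabetes", "smoker": False, "treatment": "plan A", "outcome": "unchanged"},
    {"age": 45, "diagnosis": "type 2 diabetes", "smoker": False, "treatment": "plan B", "outcome": "improved"},
    {"age": 62, "diagnosis": "type 2 diabetes", "smoker": True, "treatment": "plan A", "outcome": "worsened"},
    {"age": 65, "diagnosis": "hypertension", "smoker": False, "treatment": "plan C", "outcome": "improved"},
]

# Parameters a provider might enter at the point of care (also hypothetical).
PATIENT = {"age": 63, "diagnosis": "type 2 diabetes", "smoker": False}


def is_similar(record, patient, age_window=5):
    """Assumed similarity rule: same diagnosis, same smoking status, age within a window."""
    return (
        record["diagnosis"] == patient["diagnosis"]
        and record["smoker"] == patient["smoker"]
        and abs(record["age"] - patient["age"]) <= age_window
    )


def characterize(records, patient):
    """Return the number of similar cases and the exact outcome counts per treatment."""
    similar = [r for r in records if is_similar(r, patient)]
    outcomes = defaultdict(Counter)
    for r in similar:
        outcomes[r["treatment"]][r["outcome"]] += 1
    return len(similar), {t: dict(c) for t, c in outcomes.items()}


count, by_treatment = characterize(RECORDS, PATIENT)
print(f"{count} similar cases found")
for treatment, counts in sorted(by_treatment.items()):
    print(treatment, counts)
```

In a real system the similarity rule would be defined & reviewed by clinicians, & the counting would run over far more data, but the shape of the answer is the same: a count of matching cases & a tally of their outcomes.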
OK – so we know something about
data & analytics, but what does this actually mean for healthcare
organizations? As a technologist, I have to say that as interesting as the
technology of analytics is, it’s not the point. The point is a way of thinking
about data & analysis. I use the phrase “data as an asset” as shorthand for
this way of thinking. Thinking of data as an asset means that you (& your
team) look at data in a larger context than just the clinical &/or
operational data that you have. You think about data in relation to the
strategy of your organization & in relation to the kinds of strategic
decisions that are required to keep your organization healthy. Thinking of data
just as facts is no longer enough to create the largest amount of value from
that data; you must think of data strategically. This means having an awareness
of data, your own as well as external data… data from city, county, state &
federal programs… data from other organizations… as much relevant data as you
can discover & access.
Once you start thinking about
data as an asset, there are some things you can do to utilize data
strategically.
- First is to review (or develop) your organization’s strategy & identify what decisions are embedded in it.
- Next is to identify what data you have access to that is relevant to those decisions. This may, in fact, not be entirely straightforward. You may include data that is not immediately apparent as relevant. Remember, one of the characteristics of analytics is that the relationships in the data are defined empirically by inquiry, not a priori.
- Third is to convene heterogeneous groups of stakeholders to develop areas of inquiry to be addressed by analysis. These can be quite general (e.g. the relationship of the provision of specific enabling services to outcome or cost), but they must be related to the organization’s strategy & to the decisions that need to be made to carry out that strategy.
- Fourth is to develop detailed analytic queries that address the areas of inquiry & to carry them out.
- Finally, interpret the results & present them in support of data-driven decision-making. Queries can also be redesigned, modified or enhanced at this point & rerun.
Recent conversations with CIOs
& other healthcare executives at conferences & other meetings have
focused on several areas of inquiry that are strategic to the continued growth
& success of these organizations. These areas have included:
- Classifying patients according to risk & cost: This requires defining a set of classes (such as healthy patients, patients with chronic conditions, patients with multiple chronic conditions, patients with chronic conditions & behavioral health issues, etc.) & then analyzing the patient population with respect to these classes. It often involves further analysis to determine the cost of care for each patient & each class. This allows the top 1%, top 5%, bottom 5% etc. of patients to be identified with respect to cost, & may lead to interventions once causes & similarities within these classes are also analyzed.
- Determining the cost of providing specific clinical & non-clinical services (where data is available): This can be done along various axes such as per location, per time period or per provider, all of which may provide insight into costs &, with additional analysis, into the relationship of services to outcomes.
- Analyzing population trends utilizing internal clinical & demographic data as well as publicly available data (such as state-provided population trend data per location, time period etc.): This can provide insight into encounter trends as well as revenue trends.
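The first two areas of inquiry above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a recommended implementation: the patient & encounter records, the field names, the class definitions in `risk_class` & the helper names are all hypothetical, & a real analysis would use your organization's own agreed definitions of classes, encounters & costs.

```python
import math
from collections import defaultdict

# Hypothetical sample data; field names and values are invented for illustration.
PATIENTS = [
    {"id": "p01", "chronic_conditions": 0, "behavioral_health": False, "annual_cost": 400.0},
    {"id": "p02", "chronic_conditions": 0, "behavioral_health": False, "annual_cost": 250.0},
    {"id": "p03", "chronic_conditions": 1, "behavioral_health": False, "annual_cost": 3200.0},
    {"id": "p04", "chronic_conditions": 1, "behavioral_health": False, "annual_cost": 2800.0},
    {"id": "p05", "chronic_conditions": 2, "behavioral_health": False, "annual_cost": 9500.0},
    {"id": "p06", "chronic_conditions": 3, "behavioral_health": False, "annual_cost": 14000.0},
    {"id": "p07", "chronic_conditions": 1, "behavioral_health": True, "annual_cost": 12500.0},
    {"id": "p08", "chronic_conditions": 2, "behavioral_health": True, "annual_cost": 41000.0},
    {"id": "p09", "chronic_conditions": 0, "behavioral_health": False, "annual_cost": 300.0},
    {"id": "p10", "chronic_conditions": 1, "behavioral_health": False, "annual_cost": 3000.0},
]

ENCOUNTERS = [
    {"service": "primary care visit", "location": "Clinic North", "cost": 120.0},
    {"service": "primary care visit", "location": "Clinic North", "cost": 140.0},
    {"service": "primary care visit", "location": "Clinic South", "cost": 100.0},
    {"service": "transportation", "location": "Clinic South", "cost": 30.0},
    {"service": "transportation", "location": "Clinic South", "cost": 50.0},
]


def risk_class(patient):
    """Assign one of four illustrative classes; the rules are assumptions."""
    if patient["chronic_conditions"] == 0:
        return "healthy"
    if patient["behavioral_health"]:
        return "chronic + behavioral health"
    if patient["chronic_conditions"] == 1:
        return "single chronic condition"
    return "multiple chronic conditions"


def cost_by_class(patients):
    """Count patients and total their annual cost within each class."""
    summary = defaultdict(lambda: {"patients": 0, "total_cost": 0.0})
    for p in patients:
        row = summary[risk_class(p)]
        row["patients"] += 1
        row["total_cost"] += p["annual_cost"]
    return dict(summary)


def top_percent_by_cost(patients, percent):
    """Return the highest-cost `percent` of patients (always at least one)."""
    ranked = sorted(patients, key=lambda p: p["annual_cost"], reverse=True)
    k = max(1, math.ceil(len(ranked) * percent / 100))
    return ranked[:k]


def mean_cost_by(rows, *axes):
    """Average the `cost` field for each combination of the given axes."""
    totals = defaultdict(lambda: [0.0, 0])
    for r in rows:
        key = tuple(r[a] for a in axes)
        totals[key][0] += r["cost"]
        totals[key][1] += 1
    return {key: total / n for key, (total, n) in totals.items()}


for name, row in sorted(cost_by_class(PATIENTS).items()):
    print(name, row["patients"], row["total_cost"] / row["patients"])
print([p["id"] for p in top_percent_by_cost(PATIENTS, 20)])
print(mean_cost_by(ENCOUNTERS, "service", "location"))
```

The same `mean_cost_by` helper can be called with other axes (for example a provider or time-period field, if your data has one), which is the "per location, per time period, per provider" idea from the list above.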
Many other areas of inquiry are
possible, but they need to be aligned with the organization’s strategy in order to
be productive & to enable data-driven decision-making.
As I mentioned above, the
technology of contemporary analytics is also interesting, & it will be
covered in my next post.
[Please Note: A version of this post appears as my column for Technology in Focus on the RCHN Community Health Foundation website (www.rchnfoundation.org)]
[1] http://www.sciencedaily.com/releases/2013/05/130522085217.htm
[2] http://www-03.ibm.com/innovation/ca/en/watson/watson_in_healthcare.shtml
[3] http://en.wikipedia.org/wiki/Data
[4] http://en.wikipedia.org/wiki/Analytics