I’ve written several posts now about the importance of
analytics in healthcare & how healthcare organizations, especially those
with constrained resources, can move towards the use of analytics without a
large amount of expense (except in effort). I’ve also written a little about
what the analysis of available &/or large-scale data sets might consist of
& how it can be used to provide leverage for such organizations. Now I’d
like to write about the evolution of analytics & how search has also
evolved in parallel, & what the implications of this might be for
healthcare.
Some time ago, I wrote an essay titled “If Search is the Answer”[1].
In it, I proposed that search was not only an important functional
capability in our current & near-future work lives, but that it actually
was the principle around which our work was organized. Now it appears that our use
of constantly connected devices is resulting in our work lives & the rest of our
lives increasingly merging, & that search has become an important, if not the important,
organizing principle in general for us. Search is much more than typing some
keywords into Google or Bing, etc. It really spans a range of capabilities that
includes not only naïve searches, but also semantic searches of all kinds. The
endpoint of the search range, at this time, is analytic query, that is, the
posing of questions that require quantitative or semantic analysis, or both, of
a body of information. This body of information has grown so that we might be
talking about gigabytes (10^9 bytes) to petabytes (10^15 bytes)
of things such as healthcare records, financial models, academic publications
etc.
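
To make that distinction concrete, here’s a minimal sketch in Python (the records, fields & figures are invented for illustration) of the difference between a naïve keyword search & an analytic query over the same small body of information:

    from collections import defaultdict

    # A toy body of information; the fields & values are invented for illustration.
    publications = [
        {"title": "Analytics in community health", "year": 2011, "citations": 14},
        {"title": "Search as an organizing principle", "year": 2012, "citations": 9},
        {"title": "Semantic graphs at scale", "year": 2012, "citations": 21},
    ]

    # Naive keyword search: which records mention a term?
    hits = [p for p in publications if "search" in p["title"].lower()]

    # Analytic query: a question that requires computation over the whole
    # collection, e.g. the average citations per publication year.
    totals, counts = defaultdict(int), defaultdict(int)
    for p in publications:
        totals[p["year"]] += p["citations"]
        counts[p["year"]] += 1
    average_citations = {y: totals[y] / counts[y] for y in totals}

The keyword search just matches a term; the analytic query computes something over the whole collection, which is exactly the kind of question that gets more interesting (& harder) as the collection grows from gigabytes to petabytes.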
Let’s look at two different examples of search evolution –
the first is Facebook Graph Search. Facebook has always provided search for
people based on names, profiles, etc. Graph Search is different in two ways:
first, it utilizes a semantic engine that accepts natural-language queries &
evaluates them using both the exact meanings of the words used &
interpretations of those meanings; second, it uses the structure of the
semantic graph built by the underlying Facebook engine, so that it understands
not only the content of each user profile but also its relationships to other
users’ profiles. It returns results from both within
Facebook & from the web, based on results from Bing (Microsoft) & now
also from Russian search engine Yandex (http://www.yandex.com/). Of course, it
only has a semantic graph (today) from Facebook content. Sample queries could
be such requests as “find the pictures of all of my friends who visited San
Francisco this year” or “find people who liked the movie Fruitvale Station & live in Oakland”. Semantic search is not new;
the concepts were first developed by Allan Collins & M. Ross Quillian (both
then at BBN Technologies) & enhanced by many people mainly working in
advanced database query. What’s different about Facebook Graph Search is the
reach that it has; Facebook has 1.2B monthly users.
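
A crude way to picture what a graph-structured query does, leaving aside the natural-language front end, is a sketch like the following; the profiles, relationships & queries are all invented, & the real semantic graph is vastly larger & richer:

    # A tiny, invented social graph: profiles with attributes plus one
    # relationship type (friendship edges). Not Facebook's actual data model.
    profiles = {
        "alice": {"lives_in": "Oakland", "likes": {"Fruitvale Station"}},
        "bob": {"lives_in": "San Francisco", "likes": {"Fruitvale Station"}},
        "carol": {"lives_in": "Oakland", "likes": set()},
    }
    friends_of = {"me": {"alice", "bob"}}

    # "Find people who liked the movie Fruitvale Station & live in Oakland":
    # a pure attribute filter over profile content.
    liked_and_local = [
        name for name, p in profiles.items()
        if "Fruitvale Station" in p["likes"] and p["lives_in"] == "Oakland"
    ]

    # "Find my friends who live in San Francisco": the query walks the
    # friendship edges first, then filters on profile attributes; the
    # relationship structure is part of the query, not just the content.
    sf_friends = [
        name for name in friends_of["me"]
        if profiles[name]["lives_in"] == "San Francisco"
    ]

The hard part of Graph Search, of course, is turning a natural-language request into that kind of traversal automatically; the sketch only shows the shape of the query once it has been interpreted.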
The second example is IBM Watson. Watson is a cognitive
system that is a good deal more than what was exhibited on Jeopardy. Watson is
a reasoning system that performs not only semantic analysis of natural
language, but also hypothesis generation for answering questions, evaluation of
potential responses & synthesis of a “best” response. It uses large amounts
of information & is designed to be able to evaluate petabyte level
information sources in order to generate hypotheses & potential solutions. It ranks these solutions for presentation,
& it remembers the hypotheses it previously generated & how successful
they were for specific queries. It uses this information to optimize how it
answers similar queries, thereby “learning” from experience. One relevant
example query might be “Find all the patients with similar medical profiles
& diagnoses & rank the success of the treatment they received from most
to least successful”.
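
Watson’s internals are far more elaborate than anything that fits in a blog post, but the generate/score/rank/learn loop described above can be sketched in a few lines; the candidate answers, evidence scores & feedback below are all invented:

    # Toy generate/score/rank/learn loop; a sketch of the idea, not Watson's
    # actual pipeline. Each hypothesis is a candidate answer with evidence
    # scores from a couple of (invented) scorers.
    weights = {"text_match": 1.0, "source_reliability": 1.0}

    def rank(hypotheses):
        # Combine the evidence scores into a single confidence per hypothesis
        # & order the candidates from best to worst.
        return sorted(
            hypotheses,
            key=lambda h: sum(weights[k] * h["evidence"][k] for k in weights),
            reverse=True,
        )

    def learn(hypothesis, was_correct, rate=0.1):
        # Remember how a hypothesis fared & nudge the scorer weights so that
        # evidence which predicted correctly counts for more next time.
        for k, score in hypothesis["evidence"].items():
            weights[k] += rate * score if was_correct else -rate * score

    hypotheses = [
        {"answer": "treatment A", "evidence": {"text_match": 0.8, "source_reliability": 0.6}},
        {"answer": "treatment B", "evidence": {"text_match": 0.5, "source_reliability": 0.9}},
    ]
    best = rank(hypotheses)[0]      # the "best" response, presented first
    learn(best, was_correct=True)   # feedback used to optimize similar queries later

Watson evaluates many more dimensions of evidence than this & learns over far more feedback, but the shape of the loop is roughly the same: generate candidates, score them against evidence, rank them, & remember what worked.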
OK – so what about healthcare? Search will continue to evolve
toward more & more connected search; that is, search organized around some
structure, such as the relationships in a social network or the relationships
in a collection of medical records. Whether that connection is defined by parameterized graphs
(as in Facebook Graph Search) or by semantic query interpretation with
hypothesis generation & experienced-based learning (as in IBM Watson),
near-future search will provide a way of using our own concepts & needs to
organize & generate knowledge from large bodies of information. Healthcare
analytics can be thought of as a kind of search. I have recently been involved
with a project that sought to determine the cost per medical encounter
classified by service category (medical, dental, behavioral, enabling,
ancillary, etc.) at a number of Community Health Centers. This analysis could
be expressed as an analytic query; in fact most analyses could be expressed as
analytic queries & could be posed to systems such as Watson, ParAccel
Analytics Platform or any of the Hadoop-based analytic packages. The accuracy
& validity of the answers would depend on a number of factors including (at
least): the quantity of the information available, the quality of the
information available, the ability to express the query appropriately in the
system, the ability of the system to interpret the query appropriately &
the ability of the system to present the results in an understandable way. If
we specialize the query we specified earlier to “find all the patients with the
diagnosis of non-Hodgkin’s lymphoma expressed in the skull, characterize their
symptoms for similarity & rank the success of their treatment from most to
least successful”, we’ll understand that the results might be different if we
had 750,000 patient records (a Health Center Controlled Network) to analyze
than if we had 9,000,000 patient records (Kaiser Southern California). What if
we could analyze even larger numbers of records? How good could our results be?
Let’s remember that quantity does not always result in quality & that the results
that we get are only as good as the questions we ask. For specific clinical
queries, though, we can get very good results, good enough that we can find
treatments that would not be obvious or even identifiable by other means except
serendipitously. Good enough, also, that we can determine that we’re asking the
wrong questions. This type of diagnosis & treatment planning is in the
future (the relatively near-future) for most clinicians, but somewhat less
ambitious queries can be done today in administrative & financial as well
as clinical areas.
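
As a rough illustration, the cost-per-encounter analysis described above can be expressed as an analytic query in a few lines of pandas; the column names & figures below are invented, & a real Health Center extract would need far more attention to coding & data quality:

    import pandas as pd

    # Invented encounter-level data; a real extract would come from the
    # health centers' practice management & EHR systems.
    encounters = pd.DataFrame({
        "health_center": ["HC-1", "HC-1", "HC-2", "HC-2", "HC-2"],
        "service_category": ["medical", "dental", "medical", "behavioral", "enabling"],
        "total_cost": [180.0, 95.0, 210.0, 150.0, 60.0],
    })

    # "Cost per encounter classified by service category": count the encounters
    # & average their cost within each center & category.
    cost_per_encounter = (
        encounters
        .groupby(["health_center", "service_category"])["total_cost"]
        .agg(encounters="count", mean_cost="mean")
        .reset_index()
    )
    print(cost_per_encounter)

Expressing the analysis this way also makes the dependence on data quality obvious: if the service categories are coded inconsistently across centers, the query faithfully reports a misleading answer.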
The evolution of search in terms of the types of systems
that can be queried is leading to an evolution in how we use administrative,
financial & clinical information in healthcare. As search is increasingly
organized around concepts that reflect relationships in the real world, it will
become possible to ask questions that provide answers to some of the most complex
issues we face such as improving clinical diagnosis & treatment. In
parallel, as the tools we use for search become more powerful, but with easier
to use “query interfaces”, asking these questions, & productively applying
the results will become easier & easier. Search, & the attendant
concept of discovery, is increasingly becoming the organizing principle for
much of our work in healthcare.
[1] The title is a homage to Danny Bobrow’s 1985 paper “If Prolog is the
Answer, What’s the Question?” (IEEE Trans. Softw. Eng. 11(11)) – perhaps the
most insightful paper on the logic of AI languages ever published, with the
possible exception of Doug Lenat’s paper on why AM worked (Lenat, D.B. &
Brown, J.S. 1984. Why AM and EURISKO appear to work. Artificial Intelligence
23(3):269-294). My essay is at http://posttechnical.com/?page_id=58