"If your experiment needs statistics, you ought to have done a better experiment..."
Lord Rutherford
Introduction
I have been working with the RCHN
Community Health Foundation (rchnfoundation.org) for almost 10 years. The
RCHN Community Health Foundation (RCHN CHF) is a not-for-profit operating
foundation whose mission is to support and benefit the work of community health
centers (CHCs) nationally. I am their Director of Technology Research &, in this role, I spearhead the organization’s continued evaluation, assessment & dissemination of findings related to health information technology. I also do research[1] on emerging technologies, primarily in ultra large-scale software systems, machine learning & the application of machine learning to existing software systems, & I work with a number of technology start-ups, many of them in healthcare information technology.
In October of 2013, I made a proposal to the Foundation to start a project that would deploy a contemporary analytic software capability into community health centers that volunteered for the project & would work with their IT & executive staffs so that the capability could be productively used as part of how the health center made strategic decisions[2].
I wrote at the time: “Everyone agrees that ‘analytics’ are/will be important for health centers as they evolve to new organizational (participants in HIEs, ACOs, HCCNs, etc.) & sustainability (service providers, data providers) models. What this means & how to do it are hotly discussed topics, however, with no apparent tactic or strategy that seems feasible. There is no big bang in this effort. This capability will not spring forth complete & productive if health centers make the correct invocation or even spend a large amount of money. This memo specifies a program that would pilot an actual path for health centers (& other healthcare organizations with limited resources) to follow to begin to productively use analytics & to evolve a more & more effective capability in this area.”
I also wrote that: “Complex analytics, multi-layered analytics and highly designed data warehouses are not necessary, and moreover, not appropriate if the questions that are asked aren’t relevant or don’t require them and the underlying data isn’t complete and reliable.”
That was just over two years ago.
What happened with the project & what is going on with it now? What lessons
have been learned? What lessons did we already know but needed to have
reinforced by painful experience? Here is a project update.
Project Initiation
By the early summer of 2014, two
CHCs had volunteered to begin the project – one urban & one (very) rural.
One more urban CHC & a Primary Care Association were added by late winter
2015. At that point, the project encompassed 33 health centers with over 400
sites & 1.3M patients/year for 2-3 full data years (depending on CHC).
These participants will be referred to as the PCA, Urban CHC1, Urban CHC2 & Rural CHC1.
Each engagement began with a face-to-face meeting with the CHC’s IT & executive staffs. The goals of the project were discussed, as was how to think about analytics as an enabler of strategic decision-making. Questions ranged from: “Isn’t this just UDS[3], so why do we need it?” to “We already have more technology than we can use, so why do we need it?” to “Is this big data & if so, why do we need it?” In each case, the CEO &/or COO of the health center was interested in & committed to the project, so a good deal of time was spent (10 hours in a month for one CHC) working through how this was different from UDS, how analytics could be used to support decision-making, what the technology was & why it was needed (instead of what they already had) & many other topics/issues that came up. During this time, I was also meeting with the IT staff to work out
deployment of the new technology & to arrange for data to be available. Deployment, data acquisition & testing were very dependent on both the amount of resources a CHC had to devote to them & the level of capability of those resource(s). At a minimum, the skills required to do the deployment, data acquisition & testing included: system management skills (program & application installation), database capability (schema awareness, SQL programming), network & connectivity troubleshooting, & testing skills.
Figure 1. Simplified Analytic System Components
OK – so what technology did we deploy & what inquiry did the project start with?
The figure above shows the components that were deployed. They are all part of the Cloudera Express open-source Hadoop distribution. Cloudera is a company that was started by people who came from Yahoo, where Hadoop was originally developed. What’s Hadoop & why did we choose it for the P2A project? Good questions… “Apache Hadoop is an open-source software framework written in Java for distributed storage and distributed processing of very large data sets on computer clusters built from commodity hardware.”[4] It was developed primarily by Doug Cutting (& Mike Cafarella), who was working at Yahoo at the time. He named it after his son’s toy elephant. The framework consists of: 1) a massively parallel distributed file system (Hadoop Distributed File System, HDFS), 2) MapReduce (currently YARN/MR2), a programming model & implementation for processing & creating very large data sets using parallel distributed algorithms, 3) ZooKeeper, a centralized service providing configuration, naming & synchronization services for distributed applications used with Hadoop systems, & 4) a large set of ecosystem applications, the most important of which for P2A are: 1) HBase, a non-relational data storage system designed for very large data sets (billions of rows × millions of columns) & optimized for use with HDFS & other Hadoop applications, 2) Impala, a query engine from Cloudera[5] that allows SQL queries to be run against HDFS & HBase data, & 3) Hue, a web-based interface for working with Hadoop. All of these components are available under an Apache open-source license as part of the Cloudera Express distribution. The final piece in the figure is Tableau, an independent product (not open source or part of the Cloudera distribution) that provides visualization for large data sets. Visualization is often helpful when explaining complex analyses.
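To make the data path concrete, here is a minimal sketch of how a clinical extract stored in HBase can be exposed to SQL queries. The table & column names (encounters, d:icd9, etc.) are hypothetical; the mapping uses the standard Hive HBaseStorageHandler, & Impala (which shares the Hive metastore) can then query the mapped table.

```sql
-- Hypothetical mapping of an HBase table of encounters (run the CREATE in
-- Hive; Impala can then query the table through the shared metastore).
CREATE EXTERNAL TABLE encounters (
  encounter_id STRING,   -- HBase row key
  patient_id   STRING,
  enc_date     STRING,   -- e.g. '2014-03-27'
  icd9         STRING
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  'hbase.columns.mapping' = ':key,d:patient_id,d:enc_date,d:icd9'
)
TBLPROPERTIES ('hbase.table.name' = 'encounters');

-- From Impala: a quick sanity check of encounter counts by data year.
SELECT substr(enc_date, 1, 4) AS data_year, COUNT(*) AS encounters
FROM encounters
GROUP BY substr(enc_date, 1, 4)
ORDER BY data_year;
```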
The next figure shows
a screen shot of this system running at one of the project sites.
Figure 2. Deployment Dashboard at Urban CHC1
Deployment took between 2 weeks & 3 months depending on the resources & personnel that each CHC had available. It was most effective at the PCA, which did a centralized deployment for 30+ CHCs, had an IT staff experienced with provisioning Linux distributions & also had a staff member with Hadoop deployment experience. This group was able to go from the download of the Cloudera Express distribution to full deployment & provisioning of HDFS/HBase from their data warehouse in about 2 weeks.
Deployment was least effective at Rural CHC1, which had a minimal IT staff, accessed its clinical data only through an intermediate BI tool & did not (initially) have permission to access the database underlying its EHR. Deployment & data provisioning here took multiple months & were complicated by the design of the database underlying the EHR (which had >1,000 tables, most of which were redundant or unused).
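With a schema like that, a useful first step is simply triaging the catalog before any extraction work. Here is a minimal sketch, assuming an Oracle back end (as at Rural CHC1) & that table statistics have been gathered; the schema name is hypothetical:

```sql
-- Flag tables whose statistics show them to be empty (candidates to skip).
-- Assumes DBMS_STATS has been run so that NUM_ROWS is populated.
SELECT owner, table_name, num_rows
FROM   all_tables
WHERE  owner = 'EHR_SCHEMA'   -- hypothetical schema name
  AND  (num_rows = 0 OR num_rows IS NULL)
ORDER BY table_name;
```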
Initial Inquiry
An initial inquiry, or set of queries, is done at each site after deployment is complete. This exercise acts as a final test of the analytic system & is also meant to allow the normalization of data, including data definitions, between the CHC’s normal systems & the analytic stack. The exercise, called “level-up”, consists of a number of queries performed through the CHC’s regular systems (EHR, SQL, BI tools) & compared with the same queries performed on the analytic stack against the data in the HDFS/HBase information store. The following queries are performed[6]:
- # patients/year
- # patients/diagnosis/year (hypertension, diabetes, obesity, heart disease, behavioral)
- rank order of prevalent comorbidities
- (cost/patient/year)[7]
- (cost/comorbidity/year)
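As an illustration, here is roughly what the first two queries look like on the P2A side, run through Impala against the mapped encounter data. The table & column names are the hypothetical ones from the earlier sketch; the diagnosis code list would be the UDS code set for the condition in question.

```sql
-- # patients/year: distinct patients with at least one encounter in the year
SELECT substr(enc_date, 1, 4) AS data_year,
       COUNT(DISTINCT patient_id) AS patients
FROM encounters
GROUP BY substr(enc_date, 1, 4);

-- # patients/diagnosis/year, e.g. diabetes via the UDS ICD-9 code list
SELECT substr(enc_date, 1, 4) AS data_year,
       COUNT(DISTINCT patient_id) AS diabetes_patients
FROM encounters
WHERE icd9 IN ('250.00', '250.01', '250.02' /* ...full UDS list... */)
GROUP BY substr(enc_date, 1, 4);
```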
This exercise has been carried out at all sites except for Rural CHC1.
Figure 3. P2A vs. PCA Level-Up Computing Paths
Results for the level-up exercise have been… instructive. Again, there was a large difference in the time that it took to run the queries among the sites. The PCA had this done within 2 weeks of completing deployment. Urban CHC1 required about a month, but Urban CHC2 required 5-6 months. There were several reasons for these discrepancies, generally falling into 3 categories:
- Differences in definitions used - The UDS definitions were strictly used for the P2A (analytic stack) queries, but each CHC deviated from the UDS definitions, often in major ways. Here are some examples:
- Each CHC had different, & in some cases quite idiosyncratic, definitions in use for visits. In most cases these included visits that would not have qualified under current & recent instructions (p. 17, op. cit.)
- Patients were defined in many different ways including one CHC that excluded dental patients from their counts (including those reported to HRSA) if they had dental encounters, but not "medical" encounters (i.e. did not have a primary care provider assigned)
- Although HRSA provides lists of ICD-9 codes to be reported for each diagnosis (condition), each CHC reported a different collection of codes as representing a particular condition. In addition, many CMOs reported that their providers did not generally report certain conditions (see below for obesity)
- In at least one case, the CHC was reporting the number of patients by diagnosis if they: a) had a medical encounter during the data year, & b) had ever been diagnosed with the condition regardless of when. This is in contradiction to the instructions for Table 6A, Column B, Lines 1-20d (p. 76 op. cit.)
- The PCA & Urban CHC1 had already done substantial normalization associated with the design & deployment of a data warehouse. Each of these organizations had both a data dictionary deployed & a written document with standard definitions. Even when the normalized definitions differed from the UDS definitions, they could be identified & modified much more easily than at those centers that had not done this work & where the database schema or SQL code had to be examined to determine what definitions were used
- Issues with EHR structure & function – There were 4 different EHRs in use as well as a variety of practice management, cost accounting & financial systems. EHRs included: NextGen, SuccessEHS, eClinicalWorks & GE Centricity
- Each EHR dealt with ICD-9 differently, but each had some anomalies with queries based on the codes. Generally, the following gave different results upon query: 250, 250.0, 250.00 & 250*.
- In addition, 250* generally did not return the same results as an enumeration of all the 5-digit codes (250.01, 250.02, 250.03, 250.10, …, 250.93); see the sketch after this list.
- It was also found that navigation is complicated enough that it was very rare for more than one diagnosis to be recorded per encounter. This may also have to do with how encounters are reimbursed.
- Lack of alignment between clinical (EHR) & financial (cost) data – While it is easy to associate a specific encounter with billing data (potential revenue), as this data is carried in the practice management & eBilling systems, it is quite difficult to associate actual costs with clinical encounter data. Cost accounting systems are separate from the PM & EHR systems used for clinical & billing data, & they are organized quite differently. Rather than keying on encounter, patient etc., they are usually organized around location, time period &/or payer. In no case did we find a direct key linking encounter data to actual cost data. This makes it difficult to analyze any clinical data in association with anything but billing (revenue) data, while what is actually required is analysis against cost data.
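The code-matching anomaly above is easy to reproduce in SQL, where the EHRs’ 250* wildcard corresponds to LIKE '250%'. A minimal sketch against the hypothetical encounters table; which of these predicates agree depends entirely on whether each system stores codes as '250', '250.0' or '250.00':

```sql
-- These three predicates can return different patient counts for "diabetes"
-- depending on how the ICD-9 codes are actually stored.
SELECT COUNT(DISTINCT patient_id) FROM encounters WHERE icd9 = '250';
SELECT COUNT(DISTINCT patient_id) FROM encounters WHERE icd9 LIKE '250%';
SELECT COUNT(DISTINCT patient_id) FROM encounters
WHERE icd9 IN ('250.00', '250.01', '250.02', '250.03' /* ...through 250.93... */);
```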
Preliminary clinical results from the Level-Up exercise
Despite the difficulties described above, we have results for patients/condition/year for all but one of our participating CHCs. Results presented here are preliminary & should not be taken as anything other than an indication of trends. The project deployed all software within the security perimeter of the participating CHCs; the Foundation had BAAs with all relevant parties; all results are HIPAA de-identified; & nothing but results ever left the security perimeter of the CHC. Results are presented for the Primary Care Association only, as they are the most complete & are indicative of trends in the other result sets. The PCA’s data set represents 30 CHCs with close to 300 sites.
Results (Table 1) are presented for three data years: 2012, 2013 & 2014. The PCA results were generated from SQL queries made directly to the PCA’s data warehouse. The P2A results were generated by mpSQL queries made through Impala to the HBase representation of the data imported into HDFS. Results represent the percentages of each condition in the covered population. Total patients for the three data years are 1,194,310 in the PCA data & 1,137,642 in the P2A data (P2A figure = 95.3% of PCA figure). Percentages are compared with the CDC FastStats figures for the U.S. population as a whole (U.S. %).
Table 1. PCA Data for Level-Up, 2012-2014
Results for diabetes & hypertension are low compared to the CDC figures, but are generally within 75%-77% of the U.S. population figures. Results for obesity & heart disease are, however, very low, with obesity ranging from 14%-34% & heart disease about 14% of the CDC figures. These percentages are consistent across all CHC sites including the non-PCA CHCs.
As previously stated, results for
the other CHCs in the study (3 CHCs, 45-50 sites, ~75,000 patients per year)
were in line with the PCA results. All of them showed lower than expected
figures for all conditions, but especially for obesity & heart disease.
In all cases, comorbidity results were not usable because of the low condition percentages. No calculations of cost vs. condition or comorbidity were made because, in all cases: 1) it was not clear how to associate the actual cost data with clinical results, & 2) comorbidity percentages were so low, generally <1%, that such calculations would not have been meaningful, even if possible. A sketch of the comorbidity query itself follows.
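For reference, here is roughly how comorbidity prevalence can be ranked on the P2A side: a self-join of per-patient condition flags, counting patients carrying both conditions in the same data year. The patient_conditions view is hypothetical & would be derived from the UDS code lists.

```sql
-- Rank comorbidity pairs by the number of patients diagnosed with both
-- conditions in the same data year. patient_conditions(patient_id,
-- data_year, condition) is a hypothetical view built from UDS code lists.
SELECT a.data_year,
       a.condition AS condition_1,
       b.condition AS condition_2,
       COUNT(DISTINCT a.patient_id) AS patients
FROM patient_conditions a
JOIN patient_conditions b
  ON  a.patient_id = b.patient_id
  AND a.data_year  = b.data_year
  AND a.condition  < b.condition   -- count each unordered pair once
GROUP BY a.data_year, a.condition, b.condition
ORDER BY a.data_year, patients DESC;
```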
Discussion of Preliminary Results
The Path2Analytics project is still in progress, but several results already stand out. The first is the large range in the ability of community health centers to deploy, provision & utilize Hadoop-based analytics. Next is the quality of the data available for use in analysis. Additionally, results of an initial analysis did not meet expectations for population health conditions, especially with respect to obesity & heart disease. Finally, the lack of alignment with cost data makes analysis of such calculated parameters as cost/diagnosed condition/patient difficult, while the lack of credible comorbidity data made it infeasible to calculate cost/comorbidity/patient, even if cost data had been available. Each of these issues will be discussed in more detail.
Condition Percentages
The real question is why the percentages of chronic disease are so low in these populations. We do not generally consider patient populations of community health centers to be as healthy as or healthier than the general U.S. population, yet in all cases condition percentages are below expectation, & they are especially low for obesity & heart disease. I have explored this anomaly with respect to obesity in discussions with many of the Chief Medical Officers & other staff at participating CHCs. Many of them estimated their population at ~40% obese overall. A recent paper in the Journal of the American Medical Association, Internal Medicine estimated that 40% of men & 30% of women were overweight & that 35% of men & 37% of women were obese[8]. The estimate by CMOs of 40% would be in line with this study, but again the expectation might be that percentages for health center populations would be above this figure. CMOs & others thought there might be two explanations for the low percentages of obesity occurring as a diagnosis in their EHRs. The first is that most CMOs knew that their providers did not often diagnose obesity & that, when they did, they did not use the full range of ICD-9 codes. There are three specific codes for obesity (278.00, unspecified obesity; 278.01, morbid obesity, BMI≥40; & 278.02, overweight, BMI 25-29.9). These are hardly ever used. The UDS guidelines specify the use of the 22 V-codes for obesity that give a highly specific breakdown of BMI measurements. Apparently these codes are used even less frequently than the 278 codes… but why? Most people thought that there was a sociocultural bias against making this diagnosis & that in some demographics being overweight or obese was considered attractive, or at least not unacceptable.[9] A recent paper in PLoS ONE[10] compared ICD-9 data reported in the Nationwide Inpatient Sample (NIS, AHRQ), that is, hospital discharge data, to interview data reported in the Behavioral Risk Factor Surveillance System (BRFSS, CDC) for 2011 & found that in the NIS data the percentage of obesity reported was ~9%, while in the BRFSS interview data it was 27%. This is a significant difference between the recorded data & the data gathered by direct interview.
The underdiagnosis of heart disease is harder to explain. The overall U.S. percentage is ~11%, but the CHC percentages were generally under 5%. Obesity might be subjective (although BMI values are supposed to be used), but heart disease is a diagnosable occurrence. You either have a myocardial infarction or systolic heart failure, or you do not. Most CMOs thought their populations were in the range of 20%-30% for some form of heart disease. Possible causes of underdiagnosis are still under investigation.
Strategy & Normalization
CHCs differ greatly in their capabilities around both the deployment & maintenance of information technology & their ability to do strategic analysis, regardless of its form (quantitative, qualitative, scenario-based, etc.). In two cases, the Primary Care Association & Urban CHC1, considerable work had already been done on strategy development, including, in the case of Urban CHC1, hiring a Chief Strategy Officer (now COO) &, in both cases, the development of a strategic analysis & development process. The other two CHCs had no real strategy development process, but appeared to react to both external & internal events on an ad hoc basis. Most health centers, in my experience, fall somewhere between these two extremes.
In addition, both the PCA & Urban CHC1 had put substantial effort into understanding the database underlying the EHR system in use & into creating unambiguous & standardized definitions for terms such as visits, patients, etc. Even though, in some cases, these terms differed from the definitions specified in the UDS guidelines, identification & correction of these differences (at least for the purpose of this analysis) was not difficult because in both cases a data dictionary had been deployed &, in the case of Urban CHC1, a written data dictionary was also available.
In the case of Urban CHC2, a long
time, 5-6 months, was needed to fully uncover all of the definitional
differences between the health center analysis (SQL through intermediate BI
tool to EHR reporting extract derived from EHR database) & the P2A analysis
(mpSQL through Impala on HBase provisioned from data in HDFS imported from the
EHR database). Partly this was due to starting from scratch on determining what
the CHC’s definitions were & needing ultimately to examine (& debug)
their SQL coding, & partly it was due to multiple definitions being used
for the same term.
Rural CHC1 has still not completed the level-up exercise (after 14 months). There were a number of reasons for this: they do their queries for reporting through a dedicated BI tool that does not allow examination of the underlying generated SQL; the EHR in use has an extremely complicated schema design, with over 1,000 tables, the majority of which are redundant or not used; they did not initially have a license that allowed them to examine or query the EHR database directly; nor did they have any staff who could actually write SQL or other queries. This is not typical of rural health centers, but in this particular instance the impediments have proved difficult to overcome to date.
Data Quality
Data quality has also been a concern in the project. The PCA & Urban CHC1 had each gone through a process to design & deploy a data warehouse for their clinical (EHR) data. In both cases, there were minor issues with data quality, mainly with missing data, but these issues could be resolved or adapted to so that analysis could be carried out. Urban CHC2 had potentially five years of EHR data (2010-2014). The center had done an EHR migration in 2011 that had been facilitated by their new vendor. As it turned out, data years 2010-2012 were not usable due to corruption resulting from the conversion. The CHC had not looked at this data until the P2A project started, & so they were surprised that so much of their EHR data was not usable. This center started doing nightly extracts of the EHR data in 2013, & even though they did not have a standard data model or normalization procedures in place for the extract, the 2013-2014 data was much better in terms of missing or unusable data. Rural CHC1 had read-only access to the underlying database (Oracle) of their EHR. They had substantial problems with missing data (mostly, according to anecdotal evidence, data just not entered at the time of the patient visit) & with unusable data, mainly data outside of normal (or in some cases even conceivable) ranges. An example would be many BMIs in the 400-500 range & several above 10,000. Their CMO told me that she knows they had several patients with BMIs in the 45-50 range, but clearly not in the 400-500 range. BMI is only one example of out-of-range & therefore unusable data. These problems of missing & unusable data were not limited to this rural health center but were found in every health center’s data to a certain extent.
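A simple range check catches most of this class of problem. Here is a minimal sketch against a hypothetical vitals extract; the BMI bounds are generous sanity limits rather than clinical thresholds, since BMI is weight in kg divided by height in m² & values much above ~100 are not physiologically plausible:

```sql
-- Count missing & out-of-range BMI values per data year.
-- vitals(patient_id, enc_date, bmi) is a hypothetical extract table.
SELECT substr(enc_date, 1, 4) AS data_year,
       COUNT(*) AS records,
       SUM(CASE WHEN bmi IS NULL THEN 1 ELSE 0 END) AS missing_bmi,
       SUM(CASE WHEN bmi < 10 OR bmi > 100 THEN 1 ELSE 0 END) AS out_of_range_bmi
FROM vitals
GROUP BY substr(enc_date, 1, 4)
ORDER BY data_year;
```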
Clinical Results
Of course, the biggest data anomaly is the very low percentages of specific conditions in all of the centers’ clinical data. Hypertension was in the 20% range where nationwide figures are closer to 30%[11]. Diabetes was generally in the 6%-7% range with nationwide figures in the 9%-10% range. More problematically, heart disease was generally <5% while the nationwide figure is more like 11%-12%, & obesity was overall <10% while nationwide figures are more like 35%. Many of the CMOs informally interviewed about these results were not surprised that the percentage of diagnoses in the EHR data is low. Most of them admitted that the providers at their CHC do a poor job of coding many diagnoses, without having an explanation for that apparent fact. It is also the case that these low percentages affect the accuracy & credibility of comorbidity percentages. I’ve already discussed the potential sociocultural & organizational biases that might be operating to affect the diagnosis of obesity. Similar factors may also be at work in the case of other diagnoses, but many of these are strictly empirical (as, in fact, is obesity when measured as BMI). You are either showing signs of cardiac distress at a specific visit or you are not. The apparent inconsistent reporting of diagnoses during encounters, if true, is a real issue. If, as the recent PLoS ONE paper (op. cit.) points out, data in the Nationwide Inpatient Sample (NIS) on obesity (as well as overweight, tobacco use & alcohol abuse) are substantially different from data taken as part of a face-to-face interview (BRFSS), we have a real problem with the use of reported data such as EHR data for any type of population analysis. We have to assume that, since providers are seeing patients in person, they are treating what they see & not just what they enter in the EHR.
This provides a segue to another possible cause for the low diagnosis percentages, especially with respect to comorbidities. In order for comorbidities to be identified, multiple diagnoses have to be entered for a patient, ideally for the same encounter. A patient who comes in with a severe foot infection might also have diabetes & potentially hypertension indicated as diagnoses. It is quite possible that the structure & function of current EHRs make this difficult. At the HIMSS 2015 conference, I proposed a use case to test this to 5 EHR vendors whose products are in use at CHCs. In each case, I sat with a company representative who was an expert in their EHR’s use & went through the use case, which was designed to see if multiple diagnoses would be recorded for a specific type of visit[12]. In no case were multiple diagnoses recorded, & in all cases the EHR’s user interface & fixed workflow made it very difficult to do so. To be fair, in two of the five cases the EHR provided a mechanism to at least see other diagnoses for the specific patient (problem list navigation), & in one case suggestions were made for additional diagnoses based on the diagnosis codes entered for the visit, but each representative said that in their experience these mechanisms were rarely used in practice. It is also true that the reimbursement model (payment for care for primary diagnosis) does not create an environment where multiple diagnoses, & therefore the ability to identify comorbidities, are emphasized (or rewarded).
Analysis of Combined Financial & Clinical Results
Financial & cost accounting systems were evaluated at Urban CHC1 & Rural CHC1. In both cases, the financial systems were entirely separate from the EHR & other clinical information systems. This included different user-level applications & different underlying databases. Also, in both cases, the data in the underlying database for the financial system was structured quite differently from the data in the database underlying the EHR system. Data in the EHR system was keyed on encounter (visit) date/time & patient identification. Data in the financial system was keyed on the location where cost was accrued, cost date, payer type & other values not found or used as keys in the clinical systems. This made it difficult or impossible to calculate actual cost per patient per encounter per diagnosis. Patient identification is not usually present in the financial data. It is relatively straightforward to analyze revenue (or at least billed amount) per patient per encounter, as the billing data is available in the practice management system, which is keyed to the EHR data. It was not feasible at either CHC to assign actual cost from aggregated location costs (salaries, overhead etc.) to a specific patient visit, & so it is not possible to do analyses such as cost/patient/encounter/diagnosis or yearly patient cost/diagnosis, etc. These types of analyses, however, are essential if we are to understand & modify clinical & operational processes in order to both reduce overall costs & improve outcomes. CHCs are required to report annual cost/patient as part of their UDS reporting to HRSA, but this figure is calculated simply by dividing a “total cost per year” figure by the total number of patients. There is no granularity in this number with respect to individual patients, whose actual costs can vary between hundreds of dollars per year & hundreds of thousands of dollars per year.
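The key mismatch is easiest to see as schema sketches. Both tables below are hypothetical simplifications of what we actually found; the point is that the cost table carries no patient or encounter key, so there is nothing to join on:

```sql
-- Clinical side: keyed on encounter & patient (hypothetical simplification).
CREATE TABLE encounters (
  encounter_id STRING,
  patient_id   STRING,
  enc_date     STRING,
  icd9         STRING
);

-- Cost-accounting side: keyed on location, period & payer. With no
-- patient_id or encounter_id, cost/patient/encounter/diagnosis cannot
-- be computed directly (hypothetical simplification).
CREATE TABLE costs (
  location_id  STRING,
  period_start STRING,
  period_end   STRING,
  payer_type   STRING,
  cost_amount  DECIMAL(12,2)
);
```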
Summary & Conclusions
Analytics have begun playing a
larger & larger role in healthcare in the last 10-15 years. Healthcare
organizations are using analytics for everything from facilitating &
clarifying strategic planning to optimizing operational processes to providing
data-based diagnosis & treatment planning to lowering the cost of care
without compromising care quality. Many healthcare organizations have very
large amounts of data in the petabyte range (Kaiser Permanente, Partners
Healthcare, Geisinger Healthcare, Cleveland Clinic, etc.), but most have more
modest amounts, in the gigabytes to low terabytes range. Regardless of the amount
of data available for analysis, analytics are beginning to provide real
leverage to healthcare organizations.
Healthcare safety-net
organizations, such as community health centers (CHCs), often do not have the
resources to explore new technological directions, especially if they require
not only new hardware & software & analytic skills but also a new
emphasis & understanding on the use of data. The Path2Analytics Project is
deploying contemporary (Hadoop-based) analytics into CHCs & working with
their staffs to make the necessary technological, organizational & cultural
changes in order to leverage this new capability strategically. Now going into
its third year, the P2A project is working with about 35 CHCs providing
healthcare to about 1.3M people. Deployment capability varies widely with the
Primary Care Association (PCA) & Urban CHC1 both deploying the analytic
software & connecting to their data source quickly (1-2 weeks). Urban CHC2
& Rural CHC1 both took substantially longer (8-12 weeks) & required
substantial assistance, especially to connect to their data sources. Similarly, the PCA & Urban CHC1 both carried out the initial “level-up” exercise quickly, while Urban CHC2 took 5-6 months & Rural CHC1 has still not completed it after 14 months.
Preliminary results, as presented & discussed above, were surprising. All population percentage measurements of patients/diagnosis/year were lower than expected, & figures for obesity & heart disease were very much below nationwide figures as presented by the CDC. This is surprising as we generally think of the population served by CHCs as less healthy than the general U.S. population; in fact, their enhanced CMS reimbursement is based on this assumption & on the fact that they provide a large amount of enabling services to their patients (non-clinical services that enable the delivery of healthcare, including at least: case management, benefit counseling, eligibility assistance, language interpretation, transportation & education services). Some possible technological, organizational & sociocultural causes of this underdiagnosis or data capture error have been discussed, but there are several larger issues, including: 1) are these conditions being treated if they are not being diagnosed? 2) how good are the data for use in population health efforts & meta-analysis for clinical & operational process improvement? &, 3) can appropriate policy be developed based on our current understanding of individual & population health if the data in EHRs is not actually representative of the health status of the population? These are serious questions that this study is neither designed nor prepared to answer… but an attempt must be made to answer them.
The P2A project is continuing. We
hope to include several new CHCs, especially rural CHCs, although it should be
noted that the PCA involved includes many rural centers. A deeper examination
of the issues raised by the preliminary results is also planned, & we
expect that the health centers furthest along will begin including non-clinical
data in the Hadoop analytic stack & use it to perform analysis to address
specific strategic issues. Further reports will be made as the project
progresses.
Appendix
Use Case for Multiple Diagnoses in EHRs:
1. Encounters:
1.1. Patient who has not been seen before walks into CHC with severe foot infection (ICD-9: 730.97, unspecified infection of ankle & foot bone)
1.1.1. Patient is treated & appointment made for follow-up in three days
1.2. Patient returns to CHC for follow-up & reports severe headache (ICD-9: 339.10, tension type headache, unspecified)
1.2.1. Foot infection inspected, disinfected & re-bandaged
1.2.2. Headache discussed with patient, analgesic suggested
2. Results:
2.1. Two encounters are recorded for the patient:
2.1.1. Encounter 1 – diagnosis 730.97
2.1.2. Encounter 2 – diagnoses 730.97, 339.10
2.2. No other diagnoses recorded; no comorbidities (diabetes, hypertension, obesity) explored
[3] Uniform Data System yearly reporting
required by HRSA (Health Resources & Services Administration, HHS)
[4] https://en.wikipedia.org/wiki/Apache_Hadoop
[5] http://www.cloudera.com/content/www/en-us.html
[6] UDS definitions are used for
all terms including: visits, patients & conditions,
http://www.bphcdata.net/docs/uds_rep_instr.pdf
[7] actual cost (expenditure),
not billed cost (revenue)
[8] Yang, L. & G.A. Colditz. 2015. Prevalence of Overweight & Obesity in the U.S., 2007-2012. JAMA Intern. Med. Published online 22 June 2015.
[9] Please note that in ICD-10,
now in use, there are 6 E66 codes for obesity & 6 O99 codes for obesity
related to pregnancy
[10] Al Kazzi ES, Lau B, Li T, Schneider EB, Makary MA, Hutfless S (2015) Differences in the Prevalence of Obesity, Smoking and Alcohol in the United States Nationwide Inpatient Sample and the Behavioral Risk Factor Surveillance System. PLoS ONE 10(11): e0140165. doi:10.1371/journal.pone.0140165
[11] Nationwide percentages from CDC FastStats
[12] See appendix for description of use case