
Temporal Phenome Analysis of a Large Clinical Cohort Predicts Hospital-Acquired Complications

Jeremy Warner, MD, MS (1,2); Amin Zollanvari (3); Peijin Zhang (4); Graham Snyder (5); Gil Alterovitz (5)
(1) Vanderbilt University, Nashville, TN; (2) Beth Israel Deaconess Medical Center, Boston, MA; (3) Texas A&M University, College Station, TX; (4) MIT PRIMES; (5) Harvard Medical School

Abstract

Electronic medical records (EMRs) enable large-scale phenome-based analysis, which can reveal important biologic characteristics of large populations. In pathologic disease states, the character of the phenome is expected to change over time; analysis of this temporal evolution may yield new insights into disease processes [Figure 1]. We developed a novel method for visualizing the disease phenome over time in a comprehensive inpatient EMR database of more than 20,000 adult patients [Table 1]. Analysis of the resulting temporal phenome map [Figure 2] revealed several serious hospital-acquired complications, including Clostridium difficile infection (HA-CDI) and venous thromboembolism (HA-VTE). Further phenotypic definition of these complications [Table 2, Figure 3] allowed the development of Bayesian classifiers that could predict them in advance, using commonly available demographics, laboratory results, and medication ordering information [Figure 4]. The trained Bayesian network classifiers could predict either complication 24 hours in advance with good performance (AUC > 0.80). Translating these findings into clinical care focused on the detection and prevention of such complications could alleviate considerable morbidity and mortality and also yield significant cost savings, on the order of $2.6-$4.4 billion annually in the United States alone.

Figure 1. Example schema of a learning healthcare system. This example demonstrates the ideal flow of a learning healthcare system environment, which begins with data analysis and visualization. Based on interpretation of this data, potential problems are recognized and hypotheses are generated. These lead to development of interventions to mitigate or improve the identified problems, which are then implemented and evaluated in an iterative fashion.

Figure 3. Outcomes for HA-CDI and HA-VTE cases as compared to randomly selected controls. A) Hospitalization duration is significantly longer for HA-CDI cases; B) 30-day post-discharge mortality is slightly worse for HA-CDI cases compared to controls. C) Hospitalization duration is significantly longer for HA-VTE cases; D) 30-day post-discharge mortality is significantly worse for HA-VTE cases compared to controls.

Table 1. Baseline demographics of the MIMIC-II version 6 dataset and specific subgroups.

| Characteristics | Total Adult Hospitalizations (n=28,061) | HA-CDI Cases (n=362) | HA-CDI Controls (n=362) | HA-VTE Cases (n=580) | HA-VTE Controls (n=580) |
|---|---|---|---|---|---|
| Age, median (IQR), y | 65 (51-77) | 68 (22-99) | 65 (21-95) | 65 (17-102) | 63 (16-103) |
| Men, No. (%) | 15,781 (56) | 199 (55) | 201 (56) | 330 (57) | 332 (57) |
| Race/Ethnicity, No. (%) | | | | | |
| Asian | 586 (2) | 5 (1) | 13 (4) | 19 (3) | 11 (2) |
| Black | 2,362 (8) | 27 (7) | 27 (7) | 57 (10) | 45 (8) |
| Hispanic | 825 (3) | 7 (2) | 10 (3) | 19 (3) | 12 (2) |
| White | 19,704 (70) | 282 (78) | 257 (71) | 422 (73) | 442 (76) |
| Other/unknown | 4,584 (16) | 41 (11) | 55 (15) | 63 (11) | 70 (12) |
| Elixhauser comorbidity index, median (IQR) | 2 (1-4) | 2 (1-3) | 2 (1-3) | 2 (1-4) | 2 (1-3.5) |
| Length of stay, median (IQR), d | 7 (4-14) | 20 (13-33) | 9 (6-14) | 17 (11-27) | 8 (3-13) |

Figure 4. Bayesian network classifiers. A) The HA-CDI classifier comprises 19 laboratory measurements and 20 medications (including the aggregated high-risk antibacterials category). B) The HA-VTE classifier comprises 20 laboratory measurements and 26 medications. ALT: alanine transaminase; BUN: blood urea nitrogen; HGB: hemoglobin; IH: inhaled; MAX: maximum value measured over data collection period; MCHC: mean corpuscular hemoglobin concentration; MIN: minimum value measured over data collection period; MULTI: more than two bioequivalent routes; OU: ocular; PLT: platelets; PTT: partial thromboplastin time; RBC: red blood cell count; RDW: red cell distribution width; SP GRAV: specific gravity; TP: topical; WBC: white blood cell count.
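The poster lists the classifier inputs but not the network structure or the training procedure. As a rough illustration only, the sketch below swaps in a Bernoulli naive Bayes model (the simplest Bayesian network, in which every feature depends only on the class) and scores it with a rank-based AUC. The function names `fit_bernoulli_nb`, `predict_proba`, and `auc`, the two binary features, and the toy data are all hypothetical; none of them come from the poster or from MIMIC-II.

```python
import math

def fit_bernoulli_nb(X, y, alpha=1.0):
    """Fit a Bernoulli naive Bayes model with Laplace smoothing.

    X: list of binary feature vectors (hypothetical exposure or lab flags).
    y: list of 0/1 labels (1 = complication occurred).
    """
    n_features = len(X[0])
    model = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        prior = len(rows) / len(X)
        # P(feature = 1 | class), smoothed so no estimate is exactly 0 or 1
        probs = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[label] = (prior, probs)
    return model

def predict_proba(model, x):
    """Posterior probability of class 1 for one binary feature vector."""
    log_post = {}
    for label, (prior, probs) in model.items():
        lp = math.log(prior)
        for xj, pj in zip(x, probs):
            lp += math.log(pj if xj else 1.0 - pj)
        log_post[label] = lp
    # Normalize in log space for numerical stability
    m = max(log_post.values())
    z = sum(math.exp(v - m) for v in log_post.values())
    return math.exp(log_post[1] - m) / z

def auc(scores, labels):
    """AUC as the chance that a random case outscores a random control (ties count 1/2)."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Synthetic toy data: columns are two made-up binary risk flags.
X = [[1, 1], [1, 0], [1, 1], [0, 1], [0, 0], [0, 1], [1, 0], [0, 0]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

model = fit_bernoulli_nb(X, y)
scores = [predict_proba(model, x) for x in X]
print("in-sample AUC:", auc(scores, y))
```

The AUC printed here is computed on the training data and is therefore optimistic; a realistic evaluation would score held-out hospitalizations, using only data available before the prediction horizon.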

Conclusions

Temporal patterns of risk of disease, as defined by ICD-9-CM codes, can be quantified and visualized. Bayesian network classifiers can predict serious hospital-acquired complications with good accuracy. This approach could enable learning healthcare systems.

Figure 2. Temporal phenome-wide association of ICD-9-CM codes that are more likely (increased risk) in the lengthier hospitalization subgroup, as a function of time. Odds ratios of significant codes are shown, with the upper 95% CI shown as a light blue halo. Each chapter of the ICD-9-CM coding schema is shown in a separate color, with V- and E- codes shown in purple and gray, on the right. Median duration of hospitalization and IQRs, for the entire MIMIC-II database, are shown as horizontal lines.

Table 2. Exposures and outcomes of HA-CDI and HA-VTE cases as compared to matched controls.

| Exposure or Outcome | Cases | Controls | P-Value | Hazard or Odds Ratio (95% CI) |
|---|---|---|---|---|
| HA-CDI | (n=362) | (n=362) | | |
| Low-risk antibacterial exposure, No. (%) [a,b] | 249 (69) | 168 (46) | <.001 | 2.54 (1.86-3.49) [c] |
| High-risk antibacterial exposure, No. (%) [a,b] | 240 (66) | 124 (34) | <.001 | 3.77 (2.74-5.20) [c] |
| PPI or H2-blocker exposure, No. (%) [a,b] | 297 (82) | 265 (73) | .006 | 1.67 (1.16-2.43) [c] |
| Length of stay, median (IQR), d | 20 (13-33) | 9 (6-14) | <.001 | 0.34 (0.30-0.41) [d] |
| 30-day post-discharge mortality, No. (%) | 80 (22) | 42 (12) | .05 | 1.48 (1.00-2.17) [d] |
| HA-VTE | (n=580) | (n=580) | | |
| Pharmacologic VTE prophylaxis, No. (%) [b,e] | 270 (47) | 169 (29) | <.001 | 2.12 (1.65-2.72) [c] |
| Length of stay, median (IQR), d | 17 (11-27) | 8 (3-13) | <.001 | 0.41 (0.36-0.46) [d] |
| 30-day post-discharge mortality, No. (%) | 134 (23) | 86 (15) | .04 | 1.35 (1.02-1.77) [d] |

[a] Medication POE data collected up to 24 hours before diagnosis for cases, and for the first 48 hours of healthcare exposure, for controls.
[b] Aggregate medication categories are defined in Table S6.
[c] Odds ratio.
[d] Hazard ratio.
[e] Medication POE data collected up to 24 hours before diagnosis for cases, and for the first 24 hours of healthcare exposure, for controls.
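The odds ratios in Table 2 can be approximately reproduced from the reported counts. The poster does not state how its estimates and intervals were computed, so the sketch below assumes a plain unadjusted 2x2 odds ratio with a Wald (log-scale) 95% confidence interval; the function name `odds_ratio_wald_ci` is hypothetical. For low-risk antibacterial exposure in HA-CDI (249 of 362 cases, 168 of 362 controls) it gives an odds ratio of about 2.54 with an interval of roughly 1.88 to 3.45, close to but not identical with the reported 2.54 (1.86-3.49), which suggests the authors used a different interval method.

```python
import math

def odds_ratio_wald_ci(exposed_cases, n_cases, exposed_controls, n_controls, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval (an assumed method)."""
    a = exposed_cases                   # exposed cases
    b = n_cases - exposed_cases         # unexposed cases
    c = exposed_controls                # exposed controls
    d = n_controls - exposed_controls   # unexposed controls
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Counts taken from Table 2: low-risk antibacterial exposure, HA-CDI
or_cdi, lo_cdi, hi_cdi = odds_ratio_wald_ci(249, 362, 168, 362)
print(f"HA-CDI low-risk antibacterial: OR = {or_cdi:.2f} ({lo_cdi:.2f}-{hi_cdi:.2f})")

# Counts taken from Table 2: pharmacologic VTE prophylaxis, HA-VTE
or_vte, lo_vte, hi_vte = odds_ratio_wald_ci(270, 580, 169, 580)
print(f"HA-VTE pharmacologic prophylaxis: OR = {or_vte:.2f} ({lo_vte:.2f}-{hi_vte:.2f})")
```

The same calculation, repeated per ICD-9-CM code and per time window, is one plausible way to obtain the kind of odds ratios plotted in Figure 2, although the poster does not describe that computation in detail.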
