Running head: THE IMPACT OF MACHINE LEARNING ON ELECTRONIC HEALTH RECORDS 1

The Impact of Machine Learning on Electronic Health Records

Michael W. Walker

University of San Diego


Abstract

The explosion of data in the healthcare industry gives us a unique opportunity to transform
it. Healthcare data is growing at 48% annually, one of the fastest rates of any industry. That
growth brings challenges: data arrives from many different sources, approximately 80% of all
healthcare data is unstructured, and the risk of data breaches and theft has risen sharply.

Enter machine learning and cognitive computing, subsets of the larger discipline of artificial
intelligence. Applications built on these technologies learn from their data inputs and
outputs, essentially learning from experience and offering solutions based on that
experience.

This is an exciting time in the industry, and it’s time to embrace these technologies.

The Impact of Machine Learning on Electronic Health Records

Since the enactment of the HITECH Act in 2009, which helped promote the adoption of
electronic health records (EHRs), the amount of medical data generated has increased rapidly.
A report from EMC and the research firm IDC (Dec. 2014) anticipated an overall increase in
health data of 48 percent annually. The report pegs the volume of healthcare data at 153
exabytes in 2013. At the projected growth rate, that figure will swell to 2,314 exabytes by
2020 ("Health Data Volumes Skyrocket," 2015).
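
The growth figures above can be checked with a short compound-growth calculation. The sketch
below assumes simple annual compounding over the seven years from 2013 to 2020; the function
names are illustrative and do not come from the report.

```python
def project_volume(start_eb: float, annual_growth: float, years: int) -> float:
    """Compound a starting volume (in exabytes) at a fixed annual growth rate."""
    return start_eb * (1 + annual_growth) ** years


def implied_growth(start_eb: float, end_eb: float, years: int) -> float:
    """Annual growth rate implied by a start volume, an end volume, and a span in years."""
    return (end_eb / start_eb) ** (1 / years) - 1


# Figures quoted in the text: 153 EB in 2013, 48% per year, 2,314 EB by 2020.
projected_2020 = project_volume(153, 0.48, 7)
rate_from_report = implied_growth(153, 2314, 7)

print(f"Projected 2020 volume at a flat 48%: {projected_2020:,.0f} EB")
print(f"Growth rate implied by 2,314 EB: {rate_from_report:.1%}")
```

Under these assumptions a flat 48 percent yields roughly 2,380 exabytes, and the cited 2,314
exabytes corresponds to about 47.4 percent a year, which suggests the 48 percent figure is
rounded.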

While making all this data available to patients and healthcare providers is a positive
step, it also opens up other opportunities to use the data. What if we could use this data to
identify patients who are at high risk for heart disease or certain cancers, and tailor a
treatment plan to increase their chances of living longer and healthier? What if we could
create models that predict hypoglycemia events in patients with Type 2 diabetes? Or what if we
could use EHR data to predict where flu outbreaks will occur? The possibilities of what we
could do with all of this data are astounding.

But there are problems with generating all this information and making it available.
According to IBM, 80% of health data is invisible to current systems because it’s unstructured
(https://www.ibm.com/watson/health/). Unstructured data in an EHR can include doctors’
notes and radiology images. These are all valuable pieces of information for predicting
outcomes, and they are currently underutilized.

In addition, security plays a more prominent role now with the adoption of EHRs. IDC’s
Health Insights group predicted that 1 in 3 healthcare recipients would be the victim of a
medical data breach in 2016. Other surveys found that in the last two years, 89% of healthcare
organizations reported at least one data breach, with 79% reporting two or more breaches.
The most commonly compromised data are medical records, followed by billing and insurance
records. The average cost of a healthcare data breach is about $2.2 million (“Will a Duo of AI
and Machine Learning,” 2016).

How can we use all this powerful data being generated in EHRs to transform health care
and improve decision-making, while at the same time protecting the vast amount of personal
patient data from online data thieves?

The answer to this problem is machine learning. It’s similar to data mining in that both
look for patterns in data. However, while data mining extracts patterns for people to digest,
machine learning uses the same data to detect patterns and then automatically adjusts the
actions of the computer programs. In addition, it can learn and process information from
unstructured data, which addresses the problem that 80% of health data is unstructured.
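
To make "automatically adjusts the actions of the computer programs" concrete, here is a
minimal sketch of a perceptron, one of the simplest learning algorithms, trained on free-text
notes. The notes, labels, and function names are invented for illustration; this is not how
Watson or any vendor product works, only a toy showing a program changing its own weights
each time it makes a mistake.

```python
from collections import defaultdict

# Made-up free-text notes. Label 1 = flag for cardiac follow-up, 0 = routine.
TRAINING_NOTES = [
    ("chest pain shortness of breath", 1),
    ("routine checkup no complaints", 0),
    ("chest tightness and pain", 1),
    ("annual physical no issues", 0),
]


def predict(weights, bias, note):
    """Score a note by summing the learned weight of each word, then threshold at zero."""
    score = bias + sum(weights[token] for token in note.lower().split())
    return 1 if score > 0 else 0


def train(notes, epochs=10):
    """Perceptron rule: after every wrong prediction, nudge the weights toward the label."""
    weights = defaultdict(int)
    bias = 0
    for _ in range(epochs):
        for note, label in notes:
            error = label - predict(weights, bias, note)
            if error:
                for token in note.lower().split():
                    weights[token] += error
                bias += error
    return weights, bias


weights, bias = train(TRAINING_NOTES)
print(predict(weights, bias, "patient reports chest pain"))
print(predict(weights, bias, "no complaints today"))
```

Nobody writes a rule saying that "chest" matters; the program arrives at that weight from the
examples. Real systems use far larger datasets and far richer models, but the feedback loop is
the same in spirit.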

The importance of implementing machine learning in the EHR environment can’t be
stressed enough. The industry is accumulating vast amounts of data through patient records,
wearable devices, images, etc. But what is the use of all this data if we can only read and
act on a small amount of it? Through machine learning, we can harness the power of
computing and have computers learn the trends in all this data and provide instant feedback
in the form of treatment plan options and diagnoses that make use of the trends derived
from the stored data. The return on this investment will be in lives saved
around the world, including in developing nations with few doctors that are in dire need of
technology like this to help with patient care.

There are challenges involved in implementing machine learning in the healthcare
industry. Historically, a limitation of machine learning has been its lack of accessibility.
Because machine learning is so technical, only experts were able to decipher its output
or the programs developed for it. All healthcare providers saw were recommendations, without
knowing how each recommendation was derived. In addition, there was concern among
doctors and other decision-makers in organizations that computers would eventually take
away their jobs.

Vendors supplying these recommendations are also working to provide details on how
they arrived at each recommendation. When a doctor sees a recommendation from the
system on a treatment plan, not only will they get several treatment options, but they will also
know how the system derived these plans. Was it something in the patient’s EHR, their
lifestyle, or historical data used to predict future outcomes?

Alleviating doctors’ concern about their jobs going away really just takes discussion
and examples. As an example, I’ve talked to
radiologists at Kaiser about the future of machine learning in radiology and how IBM is
teaching its Watson supercomputer to read images and compare them with a patient’s EHR for
an accurate diagnosis and treatment plan. The first response is always a concern that their job
is going to disappear soon, and I can understand their concern. But then I mention that
radiologists have to review close to a thousand images a day and compare them with patient
data to come up with an accurate diagnosis. Wouldn’t it be better if that workload could be
shared with a system that can compare an image with its database of over a million images,
plus compare it with a patient’s EHR to accurately return a diagnosis? Radiologists just don’t
have adequate time to dig into a patient’s family history or past visit details when they’re
reviewing up to a thousand images a day.

Even after hearing this the radiologist is still skeptical, but is starting to see the benefits
and to realize that their workload could be reduced without their job going away. The last
point I bring up concerns developing countries. In those places you have millions of people,
but few doctors. In countries like India there just aren’t enough radiologists to view and
diagnose all the images. People wait months for a visit to the oncologist and to have their
x-rays diagnosed. I’m sure in many places they don’t get seen at all and suffer for it. Now
imagine if a supercomputer had the intelligence to compare their x-rays against a database of
millions of
images, both good and bad, and could accurately determine if there was an issue that needed
to be treated. That point usually hits home, and the radiologist sees the vast benefits of these
machine learning capabilities.

With these challenges being mitigated, I want to focus on a few organizations at the
forefront of implementing machine learning solutions. First is Mayo Clinic, which has several
projects in various stages of implementation and also invests in startups in the machine
learning area.

Mayo Clinic’s Center for Individualized Medicine is collaborating with Tempus to tailor
personalized treatment plans for patients. These treatment plans will be derived from machine
learning and analytics. Tempus will focus on molecular sequencing and analysis of close to
1,000 patients who have bladder cancer, breast cancer, lung cancer, or melanoma. The company’s
machine learning tools will be used to generate data from genomic results, and Mayo Clinic’s
research teams will use this data along with a patient’s EHR to better individualize treatment
plans for patients. This personalization of treatment plans will help reduce the number of
ineffective treatments given to patients as well as improve survival rates.

Mayo Clinic is also collaborating with AliveCor, a company that manufactures
electrocardiogram devices for smartphones. The device attaches to the back of a tablet or
smartphone. If a consumer wants to take an ECG, they merely need to open an app on their
phone or tablet and place their fingers on the device’s electrodes. There is also a feature
that allows the consumer to verbally state other symptoms they’re experiencing. The app will
then analyze the ECG and report back if there are any abnormalities. An additional feature
allows the data to be sent automatically to the patient’s doctor, if the patient chooses, and
stored in the patient’s EHR for future reference.

AliveCor currently has approximately 10 million ECG recordings. They will use this data
with machine learning to uncover signals of oncoming atrial fibrillation. From this
collaboration, Mayo Clinic has also learned that an ECG can be helpful for patients who
experience kidney failure, as changes in potassium levels can be very dangerous and will be
detected by this device.

The next company at the forefront of machine learning is IBM. The company has turned
its Watson supercomputer loose on the healthcare industry, where it can process 500 GB per
second, the equivalent of 1 million books! Over the last few years IBM has invested billions of
dollars in purchasing companies to help deliver its solutions, including Merge Healthcare,
which will allow Watson to “see” images and compare its billions of data points against them.

IBM has teamed with Medtronic in the development of a smartphone application that
will accurately predict the onset of low blood sugar in diabetics. It will predict this event up
to 3 hours in advance. Medtronic is providing the app and will use Watson’s pattern
recognition analytics for its predictive capabilities. Early tests, in which they took
approximately 600 cases and ran cognitive analytics against the data from Medtronic’s insulin
pumps and
glucose monitors, proved that they could predict hypoglycemia an hour in advance. This
gives the patient ample time to prevent a dangerous health episode. The next step in
Medtronic’s development is wearable devices that can track a patient’s activity, store
this information in the EHR, and send text messages when hypoglycemia is likely to happen in
the next hour.
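
As a rough illustration of predicting a low-glucose event ahead of time, the sketch below fits
a straight line to recent glucose monitor readings and estimates how long until the trend
crosses a threshold. This is not Medtronic’s or Watson’s method, which the sources do not
describe; the function name, the five-minute reading interval, and the 70 mg/dL cutoff are
assumptions chosen for the example.

```python
from typing import Optional, Sequence


def minutes_until_low(
    readings: Sequence[float],
    interval_min: float = 5.0,   # assumed spacing between readings, in minutes
    threshold: float = 70.0,     # assumed hypoglycemia cutoff, in mg/dL
) -> Optional[float]:
    """Estimate minutes until glucose crosses the threshold, using a least-squares trend line.

    Returns None when there are too few readings or the trend is flat or rising.
    """
    n = len(readings)
    if n < 2:
        return None
    times = [i * interval_min for i in range(n)]
    mean_t = sum(times) / n
    mean_g = sum(readings) / n
    slope = sum((t - mean_t) * (g - mean_g) for t, g in zip(times, readings)) / sum(
        (t - mean_t) ** 2 for t in times
    )
    if slope >= 0:
        return None
    # Value of the fitted line at the most recent reading.
    current_fit = mean_g + slope * (times[-1] - mean_t)
    if current_fit <= threshold:
        return 0.0
    return (threshold - current_fit) / slope


# Hypothetical readings falling 5 mg/dL every 5 minutes.
print(minutes_until_low([130, 125, 120, 115, 110]))
```

A production system would presumably weigh many more signals, such as insulin doses, meals,
and activity, which is where pattern recognition over hundreds of patient cases comes in. The
sketch only shows the basic shape of the problem: turn a stream of readings into an early
warning.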

IBM is also collaborating with Apple and the American Sleep Apnea Association to create
the SleepHealth application, which uses Apple’s ResearchKit to collect data that is analyzed
using the machine learning and cognitive capabilities of Watson. Data will be collected from
iPhones and Apple Watches and used within the Watson Health Cloud to uncover trends and
patterns. This data can then be fed back to users through their EHR to help with sleep-related
issues that they can then discuss with their doctor.

According to the Centers for Disease Control, lack of sleep affects 25% of Americans,
10% suffer from chronic insomnia, and 25 million suffer from sleep apnea
(http://fortune.com/2016/03/02/ibm-watson-apple-researchkit/). All these conditions increase
the risk of more serious health issues. The collaborators hope that the data collected, along
with Watson’s powerful machine learning and cognitive capabilities, will help develop
personalized recommendations and solutions for sleep disorders.

Last, IBM has partnered with the Memorial Sloan Kettering Cancer Center in New York
on a Watson for Oncology application that runs on a tablet. The goal is to allow any doctor who
has licensed the app to be able to access the expertise of the MSK oncologists to provide top-
notch care to their patients. They will accomplish this by training the Watson supercomputer
to think like the oncologists. Watson, using machine learning, will ingest the data and use it to
provide detailed treatment plans for doctors using the app.

Apps like these are vital because new cancer studies and drugs are released frequently,
and it’s impossible for doctors to keep up with all of them. Using the Watson for Oncology app
will allow them to have access to this current data, be able to cross-reference this data with the
data stored in a patient’s EHR to see if it’s a fit for the patient, and understand how the new
drugs work and should be applied.

The last organization, Geisinger Health System, has been at the forefront of implementing
EHRs. They first started using EHRs back in 1996, but have since had difficulty aggregating all
the data they’ve accumulated. One problem is that their older analytic systems can’t handle
newer data types. In addition, physicians wanted to include data from departments
like radiology and cardiology with the patient EHRs. They also decided to plan ahead and
prepare for data from wearable devices and other devices within the hospital. Recently they
partnered with Hortonworks to implement a whole new infrastructure, built on Apache Hadoop,
that could analyze both new and legacy EHR data together, whether structured or unstructured
like doctors’ notes. Now, using features from Hortonworks, physicians are able to search through
millions of patient EHRs in seconds to find relevant information to make the correct decisions
for their patients.

As an example, with this new infrastructure they are able to analyze clinical and
diagnostic imaging reports in EHRs, which include unstructured data. Sometimes this analysis
reveals anomalies or flags data unrelated to the issue at hand, either of which could
otherwise lead to mistakes. They report they’ve already saved lives with this analysis.

With all this new data being made available and wearable devices transmitting data to
EHRs, the issue of security has become pressing. EHRs fetch a high price on the black market
because patients’ credit card information, Social Security numbers, and addresses are listed
there. So while traditional cybersecurity methods like firewalls and patching have been
implemented and have been effective to some degree, we’re starting to see machine learning
used to help secure our EHR data.

Top security professionals spend a good portion of their day learning about new
malware and reading blogs on new technologies and other data sources to stay on top of new
threats. There are vast amounts of security-related data released every day, yet we’re human
and can only ingest so much. According to IBM research, security operations centers are
overwhelmed with an estimated 200,000 pieces of security event data per day
(https://securityintelligence.com/bringing-the-power-of-watson-and-cognitive-into-the-security-operations-center/?cm_mc_uid=19783515732914865278871&cm_mc_sid_50200000=1486980340).
Much of this data is unstructured and can’t be read by traditional systems. This is where
supercomputers and machine learning enter the picture. These powerful systems can ingest
this new unstructured data quickly and use cognitive learning to understand how to use this
information, detect threats, and take action.

One such technology combines machine learning with artificial intelligence to place a
“digital fingerprint” on EHRs and other important data within a hospital. This “digital
fingerprint” can detect abnormal user behavior on the EHR and can also detect and flag
employees who continually violate the rules. Basically, the technology uses machine learning
to learn patient and employee patterns with EHRs and then flags behavioral abnormalities
as they occur. These actions can then be traced to a user, who can be questioned on intent
and potentially lose their access.
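
One simple way to picture learning a user’s pattern and flagging abnormalities is a
per-user baseline with a z-score cutoff, sketched below. The product described above is not
documented in the sources, so this is only an assumed stand-in for the idea; the user names,
access counts, and cutoff of 3 are hypothetical.

```python
from statistics import mean, stdev


def flag_unusual_access(history, today, z_cutoff=3.0):
    """Flag users whose record-access count today sits far above their own baseline.

    history: dict mapping user -> list of past daily access counts
    today:   dict mapping user -> today's access count
    Returns a dict mapping each flagged user -> z-score.
    """
    flagged = {}
    for user, counts in history.items():
        if len(counts) < 2:
            continue  # not enough history to learn a baseline
        baseline = mean(counts)
        # Floor the spread at 1 so very steady users don't trigger on tiny changes.
        spread = max(stdev(counts), 1.0)
        z = (today.get(user, 0) - baseline) / spread
        if z > z_cutoff:
            flagged[user] = z
    return flagged


# Hypothetical daily counts of patient records opened by two employees.
history = {"nurse_a": [20, 22, 19, 21, 18], "clerk_b": [5, 6, 4, 5, 5]}
today = {"nurse_a": 23, "clerk_b": 60}
print(flag_unusual_access(history, today))
```

Here only clerk_b is flagged: 60 records in a day is far outside that user’s usual handful,
while nurse_a’s 23 is within normal variation. Commercial tools presumably model much more,
such as time of day, location, and which patients are viewed, but the principle of comparing
each user to their own learned norm is the same.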

Several large companies are developing technology with machine learning and cognitive
capabilities to help with cybersecurity. IBM is deploying Watson for Cybersecurity to help
analyze the security reports and threat data that security experts handle every day. In this
case, Watson will be able to analyze the reports at unprecedented speeds and take action quickly.

In conclusion, the potential for machine learning in the healthcare industry is immense.
We can already see several proven solutions that greatly improve the quality of care and
the information available in the hands of our doctors. As more and more organizations and
companies move to EHRs, the amount and quality of data available to learn from will increase
exponentially.

When I was watching a 60 Minutes episode on the Watson supercomputer, they said it
was able to consume the equivalent of a million books per second! This means that as an
industry we can handle this exponential increase in data. Too many times we hear about
industries where all this data is available, but there’s nothing that can mine through it to
find trends. We need machine learning in the healthcare industry to take the data, ingest it,
and help us make good decisions for our patients.

To leave you with a final example, during the same episode of 60 Minutes, it was
mentioned that 8,000 new research papers are published every day. Since doctors can’t read
8,000 papers a day, they were realizing that many treatment and therapy plans were based on
data from 12 to 24 months ago. A Molecular Tumor Board did an analysis of 1,000 patients
and made recommendations based on that analysis. In 99 percent of those cases, Watson’s
recommendation matched the human one. What was more encouraging is that in 30
percent of those patients, Watson found something new that had just come out and that the
doctors weren’t aware of yet. They said that when they used Watson, they had access to those
studies and drug treatments immediately because Watson found them and ingested the data
on the spot. The machine learning capabilities of Watson then gave them treatment plans
based on those new studies and the historical data Watson had already learned.

We’ve also seen that with this exponential increase in data, and the move to mobile
devices to access it, the risk of data breaches has also increased sharply. If the
average security operations center is dealing with 200,000 pieces of security event data per
day, we as humans can’t keep up. We need the power of machine learning to help us sift
through these security events and determine which are serious and which do not
require any action. Otherwise we will always be fighting an uphill battle to protect our
data.

If you’re interested in learning more about this exciting new area in healthcare, I would
recommend looking up information on machine learning in general, and also on machine learning
in healthcare. There are also some fantastic articles on how the Watson supercomputer is
helping in the healthcare industry. Last, get involved in discussions on the topic. You don’t
have to be an expert programmer or scientist. Anyone can give suggestions on how we can use
the power of machine learning to put our EHR data to use.

Finally, it’s a thrill to be getting in on the ground floor of this
advancement in healthcare. The possibilities are immense, and I hope you’re as excited as I am
to see what’s next.

References

CBS Interactive. (2017, February 13). Artificial intelligence positioned to be a game-changer.
Retrieved February 10, 2017, from
http://www.cbsnews.com/news/60-minutes-artificial-intelligence-charlie-rose-robot-sophia/

The article is a transcript of a 60 Minutes episode that discussed how IBM is using the
Watson supercomputer in the health care industry. It gives facts on how fast Watson
can learn medical books and new studies that are released. It also features doctors at
the University of North Carolina at Chapel Hill, who compared their diagnoses of cancer
patients with Watson’s and found that Watson identified treatment
plans the doctors hadn’t heard of before.

Corbin, J. (2017, February 13). Bringing the power of Watson and cognitive computing to the
security operations center (SOC). Retrieved February 13, 2017, from Cognitive,
https://securityintelligence.com/bringing-the-power-of-watson-and-cognitive-into-the-security-operations-center/?cm_mc_uid=19783515732914865278871&cm_mc_sid_50200000=1486980340

The author introduces Watson for Cyber Security and explains how machine learning is
being used for the security of medical data. The author cites several statistics on the
number of new papers and blogs that come out yearly and cannot be read by security
experts quickly enough. The author also cites the number of security events per day and
how machine learning can help sort through the events to uncover the most urgent.

From digital to cognitive: A new partnership between humanity and technology. Retrieved
January 29, 2017, from
https://www-01.ibm.com/common/ssi/cgi-bin/ssialias?htmlfid=HH912347USEN

The article discusses cognitive computing at IBM. It gives several facts about the
percentage of medical data that is unstructured. It also gives examples of what
unstructured data is.

Hortonworks. (2016, October 27). Geisinger. Retrieved February 11, 2017, from
http://hortonworks.com/customers/geisinger/

The article describes the history of Geisinger Health System and how the company
integrates all the medical data from EHRs, clinical systems, patient surveys, etc. It then
explains the future plans of the company to introduce big data and machine learning to
put all its medical data to better use.

Idrus, A. A. (2016, October 25). Mayo Clinic taps AliveCor’s machine learning to broaden ECG
analysis. Retrieved February 10, 2017, from
http://www.fiercebiotech.com/medical-devices/mayo-clinic-taps-alivecor-s-machine-learning-to-broaden-ecg-analysis

The author focuses on the collaboration between Mayo Clinic and AliveCor, which will
combine AliveCor’s smartphone electrocardiogram device and machine learning
algorithms to find new physiological indicators of heart health. The patient would be
able to get instant information through the device by taking an ECG and getting
feedback on whether they are experiencing atrial fibrillation.

IT, H. H. (2015, July 24). Health data volumes skyrocket, legacy data archives on the rise.
Retrieved January 29, 2017, from
http://www.healthdataarchiver.com/health-data-volumes-skyrocket-legacy-data-archives-rise-hie/

The article reports on how patient data is increasing exponentially and gives hard data
on what experts predict the total patient data will be in the future. It also gives
information on the value of legacy data and HIEs. Finally, it talks about the importance
of a solid legacy medical data storage plan.

Lorenzetti, L. (2016, March 02). IBM debuts Apple ResearchKit study on Watson Health Cloud.
Retrieved February 10, 2017, from
http://fortune.com/2016/03/02/ibm-watson-apple-researchkit/

The author announces the IBM partnership with the American Sleep Apnea Association
and discusses the SleepHealth application. The application will take advantage of
Apple’s ResearchKit to collect data that will be stored, sorted, and analyzed in the
Watson Health Cloud. The author then describes how this massive amount of data can
be used to learn trends and compare with medical literature. The app will also allow
patients to pull information and tips on better sleep.

Lorenzetti, L. From cancer to consumer tech: A look inside IBM’s Watson Health strategy.
Retrieved February 12, 2017, from
http://fortune.com/ibm-watson-health-business-strategy/

The author explains how IBM is partnering with companies within the medical,
pharmaceutical, and hospital fields to enable better care by surfacing insights from the
massive amounts of personal and academic health data that’s being generated every
day. The author also explains how Watson for Oncology works and is beneficial to
doctors and patients.

McCarthy, J. (2017, January 11). Mayo Clinic teams up with Groupon founder’s machine
learning startup Tempus to personalize cancer treatment. Retrieved February 12, 2017, from
http://www.healthcareitnews.com/news/mayo-clinic-teams-groupon-founders-machine-learning-upstart-tempus-develop-molecular-sequencing

The author talks about Mayo Clinic teaming up with machine learning startup Tempus to
personalize cancer treatments for patients using analytics. Tempus’ bioinformatics
analytics and machine learning tools will generate data that Mayo Clinic's research
teams can use to provide better care for their patients.

Medtronic, IBM Watson reveal prototype of diabetes app to predict low blood sugar. (2016,
January 7). Retrieved February 12, 2017, from
http://www.fiercebiotech.com/medical-devices/medtronic-ibm-watson-reveal-prototype-diabetes-app-to-predict-low-blood-sugar

The article explains a partnership between Medtronic and IBM on a smartphone
app that someday will predict the onset of dangerously low blood sugar in diabetics up
to three hours in advance using pattern recognition from the Watson supercomputer. It
gives evidence about how Medtronic, using 600 anonymous patient cases, was able to
predict hypoglycemia up to three hours in advance of onset.

Varughese, S. (2016, September 19). Will a duo of AI and machine learning catch data thieves
lurking in hospital EHR corridors? Retrieved January 28, 2017, from
http://www.emrandhipaa.com/guest/2016/09/19/will-a-duo-of-ai-and-machine-learning-catch-data-thieves-lurking-in-hospital-ehr-corridors/

The author, Santosh Varughese, a guest blogger and president of Cognetyx, reported on
the growing nightmare of medical data theft. He cites several sources on the frequency
and cost of data breaches, as well as examples of each. He finishes his blog with
examples of how the industry is fighting back to prevent these data breaches.

Vespi, C. (2016, November 29). IBM researchers bring AI to Radiology at RSNA 2016 - IBM Blog
research. Retrieved February 11, 2017, from Cognitive Computing,
https://www.ibm.com/blogs/research/2016/11/ai-radiology/

The author of the article describes how IBM researchers are using the Watson
supercomputer to analyze large amounts of imaging and text in electronic health
records, and then reason and learn from the imaging and text data in real time. It
explains how this will help radiologists, who have to examine thousands of records
every day. It also states why this is so important to the main researcher, who almost lost
her father due to a misdiagnosis.

Zieger, A. (2017, January 5). The value of pairing machine learning with EMRs. Retrieved
February 11, 2017, from
http://www.emrandehr.com/2017/01/05/the-value-of-pairing-machine-learning-with-emrs/

The author, a veteran healthcare consultant, explains in the article the benefits of
machine learning technologies with EMRs as well as some of the current downsides of
the technology in the healthcare industry. The article also mentions that smaller,
independent organizations might not see a direct impact anytime soon.