
FOR IMMEDIATE RELEASE

John Snow Labs, Turi and Alderley.ai Partnership Showcases Turnkey
Productivity of Data Science on a Massive Healthcare Dataset
Now publicly available: A tangible example of saving substantial time & money by
combining clean, current & enriched datasets with a highly optimized, scalable
data science toolset.
Lewes, Delaware, July 10, 2016: Data scientists are widely reported to spend 50%-80% of their
time preparing data for analysis. John Snow Labs is a data operations company that addresses
this challenge, focusing on the healthcare domain. It serves software and data science
companies that are looking to outsource data operations to a trusted specialist. John Snow
Labs makes them faster by providing high-quality, clean, updated and ready-to-use datasets, so
that they can focus on what they do best.
Domain experts curate & enrich datasets for specific data science challenges, and deliver
turnkey data that is high quality, always up to date, and inter-operable across different sources. It
is consistently documented, so that users easily know what each field and data element means.
And most fundamentally, it is just there, already formatted, optimized and loaded into the client's
analytics platform of choice.
A tangible example of saving time & money
The interactive, editable notebook detailing the step-by-step analysis is publicly available here.
Mark Pinches, a data scientist with years of expertise in pharma and the founder of Alderley.ai,
built the IPython Notebook that was used to perform the analysis. It uses the US healthcare
providers dataset from John Snow Labs to answer queries using slicing, joins, aggregations and
visualizations. The dataset is tabular and has almost 5 million rows and over 300 columns. This
analysis makes heavy use of GraphLab Create's SFrame library, provided by Turi (formerly
Dato) - a John Snow Labs data science partner, taking advantage of its optimizations &
scalability for large out-of-memory datasets.
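To give a rough sense of the three kinds of query mentioned above (slicing, joins and aggregations), here is a minimal sketch in plain Python. It is not taken from the notebook and does not use SFrame; the rows, the column names (`provider_id`, `state`, `specialty_code`) and the helper functions are all hypothetical, chosen only so the example runs without any external library.

```python
from collections import defaultdict

# Hypothetical, made-up rows for illustration only. The real providers
# dataset has almost 5 million rows and over 300 columns.
providers = [
    {"provider_id": 1, "state": "DE", "specialty_code": "A"},
    {"provider_id": 2, "state": "DE", "specialty_code": "B"},
    {"provider_id": 3, "state": "NY", "specialty_code": "A"},
    {"provider_id": 4, "state": "NY", "specialty_code": "A"},
]

# Hypothetical lookup table mapping codes to readable names.
specialties = {"A": "Cardiology", "B": "Oncology"}


def slice_rows(rows, column, value):
    """Slicing: keep only the rows where `column` equals `value`."""
    return [row for row in rows if row[column] == value]


def join_lookup(rows, lookup, key, new_column):
    """Join: attach lookup[row[key]] to each row (inner join on `key`)."""
    return [
        dict(row, **{new_column: lookup[row[key]]})
        for row in rows
        if row[key] in lookup
    ]


def count_by(rows, column):
    """Aggregation: count the rows for each distinct value of `column`."""
    counts = defaultdict(int)
    for row in rows:
        counts[row[column]] += 1
    return dict(counts)


ny_providers = slice_rows(providers, "state", "NY")
joined = join_lookup(providers, specialties, "specialty_code", "specialty")
per_specialty = count_by(joined, "specialty")
```

On a dataset of the size described here, in-memory lists like these would not be practical, which is the gap a scalable, out-of-memory data structure is meant to fill.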
Mark Pinches works in the UK and Europe, using a combination of onsite work and remote
access. His chosen tools are QlikView for data wrangling and rapid application delivery, and
Python for scalable machine learning, including GraphLab and scikit-learn, with seaborn and
matplotlib for visualization. Mark has extensive experience in data modelling, analysis and
visualization, applied statistics and machine learning, with significant study in drug
development and toxicology. He has used everything from deep learning techniques to historical
research to solve problems.
GraphLab Create is an extensible machine learning framework that enables developers and
data scientists to easily build and deploy intelligent applications and services at scale. It includes
distributed data structures and rich libraries for data transformation and manipulation; scalable,
task-oriented machine learning toolkits for creating, evaluating, and improving machine learning
models; and data and model visualization for all aspects of development. GraphLab Create is built
on top of state-of-the-art technology in scalable data structures, powerful machine learning
methods, intuitive visualization, and flexible deployment options. It is written in C++ for the best
possible performance, with a Python interface for easy accessibility. The API, including
auto-tuning for complex machine learning models, is designed to be easy to use for beginners, yet
flexible enough for expert data scientists.
John Snow Labs is a data operations company that accelerates data science, analytics and
software teams by providing turnkey data for analysis. The company's team of medical and data
specialists provides data across 15 broad categories, and takes away the ongoing pain that many
analytics projects experience in finding, cleaning, enriching, updating and publishing reference data.
"High productivity data science rests on three pillars: Having clean, current & rich data so that
one doesn't spend 80% of their time preparing data for analysis; having fast, scalable &
extensive machine learning libraries and tools; and having a domain expert at the helm to put
them together. John Snow Labs is proud to partner with Turi and Alderley.ai to provide
best-of-class solutions for each of these three pillars," said the founding team.
For further information, visit: www.JohnSnowLabs.com
Please follow John Snow Labs:
Twitter: twitter.com/johnsnowlabs
LinkedIn: www.linkedin.com/company/johnsnowlabs
Facebook: www.facebook.com/JohnSnowLabsInc
Google+: plus.google.com/u/0/+Johnsnowlabs/posts
Media Contact:
John Snow Labs
Attn: Ida Lucente
16192 Coastal Highway
Lewes, DE 19958
+1 (302) 786-5227
ida@JohnSnowLabs.com

Description: John Snow Labs is a US-based healthcare IT company providing
services for big data analytics, turnkey data analytics and data operations in
the healthcare industry.
Keywords: Healthcare Data Centers, Data Ops, Data Philanthropy, Data Operations,
Turnkey Data, Clinical Data Download, Health Data Cleansing, Health Data
Wrangling, Find Healthcare Data, Buy Healthcare Data
