What is Data Science? Why Data Science?

Data science is an interdisciplinary field concerned with the processes and systems used to extract
knowledge or insights from data in various forms, either structured or unstructured.[1][2] It is a
continuation of data analysis fields such as statistics, data mining, and predictive analytics,[3] and is
similar to Knowledge Discovery in Databases (KDD).

Why Data Science?


The field of data science is evolving into one of the fastest-growing and most in-demand fields in
the world. Organizations across industries are looking to make sense of the data they can now
collect through new technologies, whether to predict the next hot product or to determine the
risk of an infectious disease outbreak.

Demand and Opportunity


According to The New York Times, data science promises to revolutionize industries from
business to government, health care to academia. As data accumulates, organizations are hiring
individuals with the expertise to find meaning in the numbers and drive positive business
decisions based on what they learn. It is estimated that by 2018, 4 million to 5 million jobs in the
United States will require data analysis skills, and a recent study from the McKinsey Global
Institute found that a shortage of the analytical and managerial talent necessary to make the most
of Big Data is a significant and pressing challenge (for the U.S.). Based on the number of job
openings, median base salary and career opportunities, Glassdoor has ranked data scientist as the
Best Job in America for 2016.

What Do Data Scientists Do?


Data scientists use their data and analytical ability to find and interpret rich data sources; manage large
amounts of data despite hardware, software, and bandwidth constraints; merge data sources; ensure
consistency of datasets; create visualizations to aid in understanding data; build mathematical models
using the data; and present and communicate the data insights and findings. They are often expected to
produce answers in days rather than months, to work by exploratory analysis and rapid iteration, and to
present results with dashboards (displays of current values) rather than the papers and reports that
statisticians normally produce.[6]
In simple terms, a data scientist's job is to analyze data for actionable insights.

Specific tasks include:

Identifying the data-analytics problems that offer the greatest opportunities to the organization
Determining the correct data sets and variables
Collecting large sets of structured and unstructured data from disparate sources
Cleaning and validating the data to ensure accuracy, completeness, and uniformity
Devising and applying models and algorithms to mine the stores of big data
Analyzing the data to identify patterns and trends
Interpreting the data to discover solutions and opportunities
Communicating findings to stakeholders using visualization and other means
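
A few of the tasks above (cleaning, validating, and analyzing data) can be illustrated with a minimal Python sketch. The records, the field names region and sales, and the helper functions clean and average_sales_by_region are all hypothetical, invented here for illustration; a real project would more likely use a dedicated data-analysis library and far larger data sets.

```python
from statistics import mean

# Hypothetical raw records, as they might arrive from disparate sources.
raw_records = [
    {"region": "north", "sales": "120"},
    {"region": "North ", "sales": "80"},   # inconsistent spelling of the region
    {"region": "south", "sales": ""},      # missing value
    {"region": "south", "sales": "200"},
    {"region": "south", "sales": "n/a"},   # invalid value
]


def clean(records):
    """Normalize region names and drop rows whose sales value is not numeric."""
    cleaned = []
    for row in records:
        region = row["region"].strip().lower()
        try:
            sales = float(row["sales"])
        except ValueError:
            # Skip rows that fail validation instead of guessing a value.
            continue
        cleaned.append({"region": region, "sales": sales})
    return cleaned


def average_sales_by_region(records):
    """Group cleaned rows by region and compute the mean sales per region."""
    grouped = {}
    for row in records:
        grouped.setdefault(row["region"], []).append(row["sales"])
    return {region: mean(values) for region, values in grouped.items()}


cleaned = clean(raw_records)
summary = average_sales_by_region(cleaned)
print(summary)
```

Dropping invalid rows is only one possible validation policy; depending on the problem, an analyst might instead flag them for review or fill in missing values.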

In the book Doing Data Science, the authors describe the data scientist's duties this way:
More generally, a data scientist is someone who knows how to extract meaning from and
interpret data, which requires both tools and methods from statistics and machine learning, as
well as being human. She spends a lot of time in the process of collecting, cleaning, and munging
data, because data is never clean. This process requires persistence, statistics, and software
engineering skills, skills that are also necessary for understanding biases in the data, and for
debugging logging output from code.
Once she gets the data into shape, a crucial part is exploratory data analysis, which combines
visualization and data sense. She'll find patterns, build models, and algorithms, some with the
intention of understanding product usage and the overall health of the product, and others to
serve as prototypes that ultimately get baked back into the product. She may design experiments,
and she is a critical part of data-driven decision making. She'll communicate with team
members, engineers, and leadership in clear language and with data visualizations so that even if
her colleagues are not immersed in the data themselves, they will understand the implications.
Source: O'Neil, C., and Schutt, R. Doing Data Science. First edition.

Would you make a good data scientist?

To find out, ask yourself: Do you . . .

hold a degree in mathematics, statistics, computer science, management information systems, or marketing?
have substantial work experience in any of these areas?
have an interest in data collection and analysis?
enjoy individualized work and problem solving?
communicate well both verbally and visually?
want to broaden your skills and take on new challenges?

If you answered yes to any of these questions, you may find a lot to like in the field of data
science.
Data scientists require a knowledge of math or statistics. A natural curiosity is also important, as
is creative and critical thinking. What can you do with all the data? What undiscovered
opportunities lie hidden within? You must have a knack for connecting the dots and a desire to
search out the answers to questions that have not yet been asked if you are to realize the data's
full potential.
Data scientists are also highly educated. According to industry resource KDnuggets, 88 percent
of data scientists have at least a master's degree and 46 percent have PhDs.
You also need some background in computer programming so you can devise the models and
algorithms necessary to mine the stores of big data. Python and R are two of the premier
programming environments for data science.
You must be something of an entrepreneur. A head for business strategy is important. Although
you may work with other data specialists or even with an interdisciplinary team of professionals,
you will not be successful if you cannot devise your own methods and build your own
infrastructures to slice and dice the data that will lead you to your new discoveries and new
visions for the future.
You must also be able to communicate complex ideas to your nontechnical stakeholders in a way
they can easily understand. Data-science software tools can help you visualize your findings, but
you will also need the verbal communication skills to tell the story clearly.
