A Brief History of Statistics, Data Analysis, and Probability

Gaby De Jesus

Franciscan University of Steubenville

EDU 331

It is in the nature of things to evolve and develop as time goes on. Starting as merely a

survey or census of the population in various ancient cultures, the branch of mathematics

consisting of statistics, data analysis, and probability has become one of the most widely studied

and used branches. From the gambler to the psychologist, the baseball fanatic to the U.S.

government, many professions and individuals collect data and use it to

draw conclusions about past events and make predictions about future ones. Statistics, data

analysis, and probability make up such an important field of mathematics because they allow us

to evaluate and make predictions about matters relevant to our everyday lives.

Before data can be analyzed, it must first be collected. As such, this collection has

become a pivotal element in the field of statistics. John Graunt, an English statistician living

during the mid-1600s, was a pioneer in the gathering of information. At that time in Europe,

many individuals were concerned with the quantitative study of disease, population, and wealth,

especially because of the long-lasting devastation brought on by the plague. Graunt was one of

these individuals, and he was particularly interested in the information regarding mortality in the

city of London, England. In 1662 he published a collection of numbers pertaining to

mortality in London obtained from records of the plague. The table included causes of death, and

it recorded the number of individuals who passed away from the various causes each year from

1646 to 1660 (Porter, 2016). Graunt created the table, called the “London Life Table,” with the intention

of predicting mortality rates in the future (“The Beginning,” n.d.). Not only did Graunt’s

collection of data on life and death form the foundation for the statistical study of human

populations, but it also led to the idea of predicting outcomes based on past data.

Closely related to predicting outcomes, probability focuses on the likelihood of certain events

occurring over others. Probability is also known as the mathematics of chance, and the first ideas resembling

probability were actually formed from the analysis of games involving gambling. Blaise Pascal

was a French mathematician of the mid-17th century who developed some of the main principles

of probability after aiding a gambler in figuring out the outcome of an interrupted game of

chance. The problem that the gambler proposed to Pascal was this: If player A and player B

have each wagered 32 pistoles in a game to three, how much does each player receive if the game is

interrupted when player A has two points and player B has one point? Pascal approached the

question through the idea of expectation. Specifically, he supposed that one more round was played. If

player A won it, he would win the game and take all 64 pistoles. If player B won it, the score would be

tied and each player would be entitled to his own 32 pistoles back. Player A was therefore certain of 32

pistoles, and the remaining 32 could be treated as the stake in a fair game, which would result in player A

receiving 32 plus 16 and player B receiving 16 (Porter, 2016). From this, Pascal was able to generalize

this process of predicting outcomes by applying his famous triangle. If the numbers in one

row of the triangle are taken to represent the total number of outcomes, the row can be divided

into the number of outcomes in which player A would win and the number of outcomes in which

player B would win (Mastin, 2010). This forms the basis for how probabilities are calculated

today – dividing the number of outcomes in which the desired event occurs by the total number of outcomes.
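
The division Pascal arrived at can be checked with a short calculation. The Python sketch below is only an illustration and assumes that every round is a fair game; the function name split_stakes and its parameters are hypothetical and do not come from Pascal or from the sources cited here. It counts, from one row of Pascal's triangle, the outcomes in which player A wins and divides by the total number of outcomes.

```python
from fractions import Fraction
from math import comb


def split_stakes(a_needs, b_needs, pot):
    """Divide `pot` when A still needs `a_needs` points and B needs `b_needs`.

    Hypothetical helper for illustration; assumes each round is a fair game.
    """
    # At most this many further rounds are needed to settle the game.
    rounds = a_needs + b_needs - 1
    # The entries of row `rounds` of Pascal's triangle sum to 2 ** rounds.
    total_outcomes = 2 ** rounds
    # A wins whenever B wins fewer than `b_needs` of those rounds.
    a_outcomes = sum(comb(rounds, k) for k in range(b_needs))
    share_a = Fraction(a_outcomes, total_outcomes) * pot
    return share_a, pot - share_a


# The gambler's problem: A needs 1 more point, B needs 2, and the pot is 64.
share_a, share_b = split_stakes(1, 2, 64)
print(share_a, share_b)  # prints: 48 16
```

Here the relevant row of the triangle is 1, 2, 1. Three of the four outcomes favor player A, so he receives three quarters of the 64 pistoles, which matches the 32 plus 16 described above.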

Beyond probability, data analysis constitutes a large portion of what statisticians study.

One of the most important aspects of this branch is the interpretation of distributions of data.

Some of the biggest contributions to interpreting distributions were made by a 19th-century

British mathematician named Karl Pearson. Pearson was primarily interested in data collection

with regard to population and Darwin’s theory of natural selection. Because of the nature of the

data he collected, he worked with many different kinds of distributions, not just normal

distributions. Pearson defined four parameters for interpreting distributions. They are mean,

standard deviation or spread, skewness, and kurtosis, which describes the peaked or flat shape of

the distribution. These four parameters can be applied to any distribution regardless of its shape.

His work with data forming non-normal distributions led Pearson to become a pioneer in the realm of

goodness-of-fit testing and correlation. His most important contribution to statistics was his

development of the chi-square goodness-of-fit test. This test allows statisticians to use methods

independent of normal distributions to interpret findings of data (Magnello, n.d.). Without such a

test, the knowledge about our world that we gain through data analysis would be

much more limited.
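
Pearson's four parameters can be illustrated with a short calculation. The Python sketch below assumes the common moment-based definitions, with skewness as the average cubed standardized deviation and kurtosis as the average fourth-power standardized deviation, both using the population standard deviation. The function name four_parameters is hypothetical, and these formulas are one standard convention rather than something taken from the sources cited here.

```python
from statistics import fmean, pstdev


def four_parameters(data):
    """Return (mean, standard deviation, skewness, kurtosis) of `data`.

    Hypothetical helper; uses population (divide-by-n) moment definitions.
    """
    n = len(data)
    mean = fmean(data)
    sd = pstdev(data)  # population standard deviation
    # Skewness: average cubed deviation, scaled by sd cubed.
    skewness = sum((x - mean) ** 3 for x in data) / (n * sd ** 3)
    # Kurtosis: average fourth-power deviation, scaled by sd to the fourth.
    kurtosis = sum((x - mean) ** 4 for x in data) / (n * sd ** 4)
    return mean, sd, skewness, kurtosis


# A symmetric data set has skewness of about zero.
print(four_parameters([1, 2, 3, 4, 5]))
# A data set with one large value is skewed to the right (positive skewness).
print(four_parameters([1, 1, 1, 1, 10]))
```

Because the calculation only uses deviations from the mean, it can be applied to a distribution of any shape, which is the point made above.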

In addition to Pearson, William Gosset made a significant contribution to statistics and

data analysis with his discovery of another type of distribution and the development of the

significance test that accompanies it. Gosset was an Englishman living in the early 20th century

who started working at the Guinness brewery in Dublin, Ireland, after studying mathematics and

the natural sciences in college. As the business at the company grew, Gosset was given the task

of discovering how Guinness could continue to increase production while maintaining quality. He

set out to accomplish the task by testing the quality of the hops – the main ingredient of the beer

– across many batches. However, the number of batches was limited, so Gosset wanted to find

out the extent to which a small sample accurately represents the population. In other words, he

was curious about the error distribution when an inference is made from small samples. In

general, Gosset discovered that the larger the sample size, the more representative it is of the

entire population and the less error there is. From his work at the factory, Gosset came up with

the t-distribution and t-test, which are used to estimate the error of an estimate depending on the

sample size (Kopf, 2015). This test led to the concept of statistical significance, and it is very

widely used by statisticians in various fields today.
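
The relationship between sample size and error can be sketched in a few lines of Python. The example assumes the usual one-sample form of the t statistic: the sample mean minus a hypothesized mean, divided by the estimated standard error. The function names and the sample values are hypothetical illustrations, not Gosset's data.

```python
from math import sqrt
from statistics import mean, stdev


def standard_error(sample):
    """Estimated error of the sample mean; it shrinks as the sample grows."""
    return stdev(sample) / sqrt(len(sample))


def t_statistic(sample, hypothesized_mean):
    """One-sample t statistic (hypothetical helper for illustration)."""
    return (mean(sample) - hypothesized_mean) / standard_error(sample)


small_sample = [1, 2, 3, 4, 5]   # made-up measurements
larger_sample = small_sample * 4  # same values, four times as many

print(standard_error(small_sample))
print(standard_error(larger_sample))  # smaller than the value above
print(t_statistic(small_sample, 2))
```

The t statistic is then compared against the t-distribution for the given sample size to judge whether the difference from the hypothesized mean is statistically significant.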



Such historical information on the development of statistics, data analysis, and

probability has multiple practical applications in the classroom. Firstly, in any class in which

probability is taught, I would go through the derivation of the formula for

probability from Pascal’s triangle. In many math classrooms of today, students are merely given

formulas and taught how to use them but are never shown where they come from. It is extremely

important for students to know why certain formulas work because it will strengthen their

understanding. Telling my students the story of how Pascal developed the formula for probability

by helping a gambler solve a problem will not only add an extra element of fun to the class, but it

will also give them the opportunity to make the connection as to why probability is represented

by desired outcomes over total outcomes. Secondly, teaching my students the history behind

statistics, data analysis, and probability will allow them to see how important mathematics is in

solving real-world problems. Each of the mathematicians above was faced with a problem

directly involving one aspect of life or another, and mathematics was necessary for solving

each of those problems. Acknowledging this will hopefully encourage students to want to apply what

they learn in the mathematics classroom to their daily lives because of the way it can make

problem solving easier.

Statistics, data analysis, and probability are such important areas of mathematics because

of the way they allow us to solve real-life problems and draw conclusions about occurrences in

the world around us. For Graunt it was mortality rates, for Pascal it was a game of chance, for

Pearson it was population and natural selection, and for Gosset it was production of high-quality

beer. Whatever it may be, it is clear that without statistics, data analysis, and probability as they

are today, we would not be able to extract as much meaning out of the events that occur in our

lives.

References

Kopf, D. (2015, December 11). The Guinness Brewer Who Revolutionized Statistics. Retrieved

November 10, 2017, from https://priceonomics.com/the-guinness-brewer-who-revolutionized-statistics/

Magnello, M. (n.d.). Karl Pearson and the Origins of Modern Statistics: An Elastician becomes a

Statistician. Retrieved November 10, 2017, from

http://www.rutherfordjournal.org/article010107.html

Mastin, L. (2010). 17th Century Mathematics – Pascal. Retrieved November 11,

2017, from http://www.storyofmathematics.com/17th_pascal.html

Porter, T. (2016, August 17). Probability and statistics. Retrieved November 10, 2017, from

https://www.britannica.com/topic/probability/The-rise-of-statistics

The Beginning. (n.d.). Retrieved November 10, 2017, from

http://www.math.utep.edu/Faculty/mleung/probabilityandstatistics/beg.html
