PASKOFF | 1

Ben Paskoff
Dr. Cody Steele
STAT 200
6/16/2015
Spurious Correlations or Post Hoc Logic

Throughout history, humans have used statistics in a myriad of ways to interpret information
collected from the world we live in. Using means, medians, modes, population sampling, graphing,
and other techniques, statistics have been used by both small companies and large nations to
understand such things as human behavior, environmental changes, population trends, and even the
likelihood of an asteroid destroying all life on earth. However, statistics is not always as
straightforward as graphing a large set of numbers and reporting the shape of the graph. In fact,
many notable blunders throughout the history of government, media, and businesses have stemmed
from misinterpreting statistical data. According to Darrell Huff, author of the book How to Lie
with Statistics, our need to find a reason behind the numbers has produced a plethora of what he calls
post hoc logic. Post hoc logic is the reasoning by which you conclude that if B follows
A, then A has caused B. While this logic may occasionally be accurate, when two
variables increase at the same time, the pattern is often the result of a third, lurking variable and does not reflect
a direct causal relationship between A and B.
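The effect of a third variable can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the data are randomly generated, not taken from any study cited in this paper, and the names `pearson` and `residuals`, the noise levels, and the seed are all hypothetical choices. A hidden variable Z drives both A and B, while neither A nor B affects the other.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def residuals(ys, zs):
    """What is left of ys after removing its least-squares linear fit on zs."""
    n = len(ys)
    mean_y = sum(ys) / n
    mean_z = sum(zs) / n
    slope = (sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys))
             / sum((z - mean_z) ** 2 for z in zs))
    return [(y - mean_y) - slope * (z - mean_z) for y, z in zip(ys, zs)]

rng = random.Random(200)  # fixed seed so the run is repeatable (arbitrary choice)

# Assumption: Z is a made-up lurking variable. A and B each equal Z plus
# their own independent noise, so A never influences B and B never influences A.
z = [rng.gauss(0, 1) for _ in range(5000)]
a = [zi + rng.gauss(0, 0.5) for zi in z]
b = [zi + rng.gauss(0, 0.5) for zi in z]

raw_r = pearson(a, b)                                  # correlation ignoring Z
partial_r = pearson(residuals(a, z), residuals(b, z))  # correlation with Z held fixed

print(f"correlation of A and B:              {raw_r:.2f}")
print(f"correlation after controlling for Z: {partial_r:.2f}")
```

In a run of this sketch, A and B should appear strongly correlated, yet the correlation should fall to roughly zero once Z is accounted for. That is the pattern post hoc logic overlooks: the association is real, but it says nothing about A causing B.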
To demonstrate post hoc logic in action, Darrell Huff cites a survey that attempted to
link smoking to lower grades in school. The study (presumably done prior
to 1954, when the book was published) found that students with lower grades tended to
use more tobacco than their higher-achieving peers. Immediately, propagandists took this
study and drew the conclusion that smoking decreases brain function. However, the
correlation by itself is not evidence for this claim. As the book points out, students may be more
likely to smoke as a result of receiving low grades, or, more probably, the type of person who takes
his studies less seriously is also more likely to smoke tobacco. Either of these
possibilities could have produced the observed correlation, and both are far more likely than the claim that
tobacco inhibits brain function.
Even today such studies still exist. Below is data from the 2009
National Youth Risk Behavior Survey (YRBS) on tobacco use and students' grades.

Figure 1: Percentage of high school students who engaged in tobacco use, by type of grades
earned (mostly As, Bs, Cs, or Ds/Fs), United States, Youth Risk Behavior Survey, 2009

However, unlike the survey done in the 1950s, this one comes with a disclaimer:
"Data presented below from the 2009 National Youth Risk Behavior Survey (YRBS) show a
negative association between tobacco use and academic achievement after controlling for
sex, race/ethnicity, and grade level. This means that students with higher grades are less
likely to engage in tobacco use behaviors than their classmates with lower grades, and
students who do not engage in tobacco use behaviors receive higher grades than their
classmates who do engage in tobacco use behaviors. These associations do not prove
causation. Further research is needed to determine whether low grades lead to tobacco use,
tobacco use leads to low grades, or some other factors lead to both of these problems."
While this survey may be conscientious about the limitations of its data, many of the graphs
and statistical information we are inundated with in our news media, job searches, economy,
medicine, and other applications draw upon spurious correlations in an attempt to make their point.
It is for this reason that the interpreter of these graphs, be it the common man reading the New York
Times, or a top analyst for a large corporation, must examine each graph and statistical claim to
determine if there are other variables influencing the study.
As a biochemistry and molecular biology major interested in the field of genetics, I find one
correlation especially troubling: the purported link between genetically modified
foods and autism. According to Gary Goldstein, MD, president and CEO of the Kennedy Krieger
Institute in Baltimore, determining the actual cause of autism is far more difficult than determining
the cause of cancer because you cannot biopsy or X-ray autism (Doheny, 2008). Nevertheless, this
hasn't deterred many organizations from presenting the rising U.S. autism rate as the direct
result of various potentially harmful factors. A frequently named culprit is
genetically modified organisms, or GMOs. According to supporters of this hypothesis,
autism in the U.S. began increasing around the same time that genetically modified organisms were
introduced to the U.S. Thus, many graphs, such as Figure 2 below, have been created to depict this
shocking correlation.

Figure 2 (Talty, 2013)


Upon looking at this graph, you can't help but be outraged at the prospect of GMOs
infiltrating your local produce and destroying lives through increased autism rates. Yet, despite this
correlation, no proof exists that GMOs are indeed the cause. In fact, there are many correlations we can
draw from rising GMO production. According to data collected by the Framingham Study, a long-term
health study of individuals from Framingham, MA, the incidence of dementia from 1996 (the
year GMOs were first introduced) to 2007 decreased drastically when compared
to dementia rates from 1978, as shown in the graph below (Perrone, 2014).

Figure 3 (Perrone, 2014)


When the dementia data are plotted together with the GMO data, in the same way as the graph from
HealthyFamily.org, the result appears to offer compelling evidence for the notion that GMOs reduce
the risk of dementia.

Figure 4 (Perrone, 2014)


Does this graph prove that GMOs reduce the risk of dementia over time? While pro-GMO
activists might attempt to use this correlation to help their case, it is just another example of using
perfectly valid statistics to distort the facts by implying a connection between variables that cannot
be shown to be related to one another.
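Any two quantities that simply trend over the same period will correlate, whether or not they have anything to do with each other. The sketch below illustrates this with invented numbers: the two series are hypothetical, randomly generated trends and are not the GMO, autism, or dementia data discussed above, and the slopes, noise levels, and seed are arbitrary assumptions. Comparing period-to-period changes instead of raw levels (first differencing) is one common way to take the shared trend out of the comparison.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def differences(xs):
    """Change from each period to the next (first differences)."""
    return [later - earlier for earlier, later in zip(xs, xs[1:])]

rng = random.Random(1996)  # fixed seed so the run is repeatable (arbitrary choice)
periods = range(200)

# Assumption: two made-up series generated independently of each other.
# One drifts upward and one drifts downward, each with its own random noise.
rising = [0.5 * t + rng.gauss(0, 5) for t in periods]
falling = [100.0 - 0.4 * t + rng.gauss(0, 5) for t in periods]

level_r = pearson(rising, falling)                             # raw levels
change_r = pearson(differences(rising), differences(falling))  # trend removed

print(f"correlation of the raw series:         {level_r:.2f}")
print(f"correlation of period-to-period change: {change_r:.2f}")
```

In a run of this sketch, the raw series should show a strong negative correlation even though they were generated independently, while the correlation between their changes should be much weaker. The strong result comes entirely from the fact that both series move steadily over time.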
How do we determine when statistics are telling the truth? According to Tyler Vigen, a
Harvard Law student and creator of the website Spurious Correlations, "Statistical data can show
correlations, and then it's up to us, rational thinkers, to determine if there is an actual connection
between the data or if it's merely a coincidence." While it is vital that we not ignore statistical data,
we must always keep in mind its limitations and not be deceived by post hoc logic. No matter how
compelling the data appear, a correlation by itself can never exclude the
possibility that some unmeasured variable has a more direct impact on the response variable.

Works Cited
1. 2009 National Youth Risk Behavior Survey. "Tobacco Use And Academic Achievement." 2009.
Web. http://www.cdc.gov/HealthyYouth/health_and_academics/pdf/tobacco_use.pdf
2. Doheny, Kathleen. "Autism: Cases on the Rise; Reason for Increase a Mystery." WebMD. 28
Mar. 2008. Web. http://www.webmd.com/brain/autism/searching-for-answers/autism-rise
3. Huff, Darrell, and Irving Geis. How To Lie With Statistics. New York: Norton, 1954. Print.
4. Perrone, Joseph. "GMOS And Autism: Lies, Damned Lies, And Statistics." The Center for
Accountability in Science. 2014. Web. https://www.accountablescience.com/gmos-autism-lies-damned-lies-statistics/
5. Perry, Susan. "Root Canals Cause Cancer! (And Other Spurious Correlations)." MinnPost.
2014. Web. http://www.minnpost.com/second-opinion/2014/05/root-canals-cause-cancer-and-other-spurious-correlations
6. Talty, Caryn. "Do GMO Foods Cause Autism? Read About The GMO Crops Autism Connection."
The Liberty Beacon, 2013. Web. http://www.thelibertybeacon.com/2013/10/27/do-gmo-foods-cause-autism-read-about-the-gmo-crops-autism-connection/
