Mr. Mun
Period 1
I researched how the number of AP classes a student takes correlates with the number of hours
they spend on their phone in a day. I conducted my research using a Google Survey with two
questions: “How many AP classes are you taking?” and “How many hours do you spend on your
phone per day?” I put the link in my Instagram bio and told my followers about it on my
Instagram Story. Because not many people were taking the survey, I became desperate;
therefore, I kindly asked each of my friends via Snapchat to take it. After collecting a total of
sixty data points from my lovely Instagram followers and Snapchat friends, my graph told me
four things: negative correlation, linear form, weak strength, and one significant outlier at the
data point (1, 19), where 1 is the number of AP classes taken (x) and 19 is the hours spent on
the phone (y). The correlation coefficient is -0.24497. The standard deviation of the residuals is
3.44526. The coefficient of determination is 0.06001.
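The three statistics above all come from a least-squares fit of y on x. As a rough sketch of how they could be computed from the paired survey answers, here is a small Python function. The function name `regression_summary` and the toy lists are assumptions made for illustration, and the toy values are placeholders, not the sixty survey responses. The residual standard deviation here divides by n - 2; some calculators divide by n or n - 1 instead, so the last digits can differ.

```python
import math

def regression_summary(xs, ys):
    """Least-squares fit of y on x: slope, intercept, r, r^2, residual SD."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-products
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    # Residual = observed y minus the y predicted by the line
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    residual_sd = math.sqrt(sse / (n - 2))  # assumption: n - 2 in the denominator
    return {
        "slope": slope,
        "intercept": intercept,
        "r": r,
        "r_squared": r ** 2,
        "residual_sd": residual_sd,
    }

# Placeholder values for illustration only; these are NOT the survey responses.
toy_ap_classes = [0, 1, 2, 3, 4]
toy_phone_hours = [6, 6, 5, 4, 4]
summary = regression_summary(toy_ap_classes, toy_phone_hours)
print(summary)
```

Feeding the real sixty (x, y) pairs into the same function should give values close to the ones reported here, up to the choice of denominator noted above.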
Based on my results, I can conclude that, while correlation does not imply causation, the
graph showed a negative slope. In other words, for the most part, the more AP classes a student
took, the fewer hours they spent on their phone. The correlation coefficient, which can range
from -1 to +1, was -0.24497; therefore, we can say that there was a weak negative correlation,
as it is a negative number closer to 0 than to -1 (which would indicate a strong negative
correlation). The standard deviation of the residuals, or the standard error of the estimate,
which measures how far the observed values of the dependent variable (hours spent on phone)
typically fall from the regression line, was 3.44526; therefore, we can say that the regression
line shown in the scatter plot is not very accurate at predicting hours spent on the phone, as
this value is not close to the standard deviation, 1.3707. Lastly, the coefficient of determination
was 0.06001; therefore, we can say that about 6% of the variance in the response variable can
be explained by the explanatory variable. And no, we cannot use a linear model; according to the
residual plot, there is an evident pattern in the data points: a negative linear pattern. We can
use a linear model if and only if the data points are scattered
Sample size: 60
Mean x (x̄): 2.1333333333333
Mean y (ȳ): 5.0666666666667
Intercept (a): 6.4779582366589
Slope (b): -0.66154292343387
Regression line equation: ŷ = 6.4779582366589 - 0.66154292343387x
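
The summary values above can be checked against each other without the raw data. The sketch below uses only numbers reported in this write-up; the variable and function names are assumptions chosen for illustration. It checks that the intercept equals ȳ minus the slope times x̄ (a least-squares line always passes through the point of means), that the coefficient of determination is the square of the correlation coefficient, and how far the outlier (1, 19) sits above the line.

```python
# Values copied from the summary output in this report.
mean_x = 2.1333333333333
mean_y = 5.0666666666667
intercept = 6.4779582366589
slope = -0.66154292343387
r = -0.24497
residual_sd = 3.44526

def predict(ap_classes):
    """Predicted phone hours per day for a given number of AP classes."""
    return intercept + slope * ap_classes

# A least-squares line passes through (mean_x, mean_y),
# so the intercept can be recovered from the means and the slope.
implied_intercept = mean_y - slope * mean_x

# For simple linear regression, r^2 is the square of r.
r_squared = r ** 2

# Residual of the outlier (1, 19): observed minus predicted hours.
outlier_residual = 19 - predict(1)
outlier_in_sd_units = outlier_residual / residual_sd

print("implied intercept:", implied_intercept)
print("r squared:", r_squared)
print("outlier residual:", outlier_residual)
print("outlier residual in residual-SD units:", outlier_in_sd_units)
```

With these numbers, the outlier lies about 13.2 hours above the line, or roughly 3.8 residual standard deviations, which is consistent with calling it a significant outlier.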