BUENO
IV-SSC
PEARSON'S R
Pearson's Correlation Coefficient
Named after Karl Pearson (born Carl Pearson)
The full name is the Pearson Product-Moment Correlation, or PPMC.
CORRELATION
The closer the value of r gets to zero, the greater the variation of the data points around the line of best fit.
High correlation: 0.5 to 1.0 or -0.5 to -1.0
Medium correlation: 0.3 to 0.5 or -0.3 to -0.5
Low correlation: 0.1 to 0.3 or -0.1 to -0.3
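The cutoffs above can be sketched as a small Python function. This is only an illustration: the name `correlation_strength` is hypothetical, the label "negligible" for values closer to zero than 0.1 is an assumption (the notes do not name that range), and a value sitting exactly on a boundary is assumed to fall into the stronger category.

```python
def correlation_strength(r: float) -> str:
    """Label r using the high / medium / low cutoffs from the notes."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)  # the sign gives direction, the size gives strength
    if size >= 0.5:
        return "high"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "low"
    # Assumption: the notes give no label for |r| below 0.1.
    return "negligible"


print(correlation_strength(-0.4))
```

Because only the absolute value is tested, a negative r gets the same strength label as a positive r of the same size.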
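\[
r = \frac{N\sum xy - (\sum x)(\sum y)}{\sqrt{\left[N\sum x^2 - (\sum x)^2\right]\left[N\sum y^2 - (\sum y)^2\right]}}
\]
WHERE:
r = Pearson r correlation coefficient
N = number of values in each data set
Σxy = sum of the products of paired scores
Σx = sum of x scores
Σy = sum of y scores
Σx² = sum of squared x scores
Σy² = sum of squared y scores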
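A minimal Python sketch of the computational formula built from the sums defined above, r = [NΣxy - (Σx)(Σy)] / sqrt([NΣx² - (Σx)²][NΣy² - (Σy)²]). The function name `pearson_r` and the tiny sample lists are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r computed from raw sums of paired scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two data sets of equal length, N >= 2")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))  # products of paired scores
    sum_x2 = sum(x * x for x in xs)              # squared x scores
    sum_y2 = sum(y * y for y in ys)              # squared y scores
    numerator = n * sum_xy - sum_x * sum_y
    spread = (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    if spread == 0:
        raise ValueError("r is undefined when one data set is constant")
    return numerator / math.sqrt(spread)


# Illustrative data: y is exactly twice x, so the points lie on a line.
print(pearson_r([1, 2, 3], [2, 4, 6]))
```

With perfectly linear data such as this, the numerator and the square root have the same size, so r is 1 (or -1 when y falls as x rises).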
                One-Tailed Probabilities
        0.05    0.025   0.005   0.0005
                Two-Tailed Probabilities
N       0.1     0.05    0.01    0.001
4       0.900   0.950   0.990   0.999
5       0.805   0.878   0.959   0.991
6       0.729   0.811   0.917   0.974
7       0.669   0.754   0.875   0.951
8       0.621   0.707   0.834   0.925
9       0.582   0.666   0.798   0.898
10      0.549   0.632   0.765   0.872
11      0.521   0.602   0.735   0.847
12      0.497   0.576   0.708   0.823
13      0.476   0.553   0.684   0.801
14      0.458   0.532   0.661   0.780
15      0.441   0.514   0.641   0.760
16      0.426   0.497   0.623   0.742
17      0.412   0.482   0.606   0.725
18      0.400   0.468   0.590   0.708
19      0.389   0.456   0.575   0.693
20      0.378   0.444   0.561   0.679
21      0.369   0.433   0.549   0.665
22      0.360   0.423   0.537   0.652
23      0.352   0.413   0.526   0.640
24      0.344   0.404   0.515   0.629
25      0.337   0.396   0.505   0.618
26      0.330   0.388   0.496   0.607
27      0.323   0.381   0.487   0.597
28      0.317   0.374   0.479   0.588
29      0.311   0.367   0.471   0.579
30      0.306   0.361   0.463   0.570
35      0.283   0.334   0.430   0.532
40      0.264   0.312   0.403   0.501
45      0.248   0.294   0.380   0.474
50      0.235   0.279   0.361   0.451
60      0.214   0.254   0.330   0.414
70      0.198   0.235   0.306   0.385
80      0.185   0.220   0.286   0.361
90      0.174   0.207   0.270   0.341
100     0.165   0.197   0.256   0.324
200     0.117   0.139   0.182   0.231
300     0.095   0.113   0.149   0.189
400     0.082   0.098   0.129   0.164
500     0.074   0.088   0.115   0.147
1000    0.052   0.062   0.081   0.104
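A table like this is normally read by finding the row for the sample size N and comparing the size of r with the entry in the chosen probability column. A rough Python sketch of that lookup follows. It copies only a handful of the two-tailed 0.05 entries from rows labelled 10, 20, 30, 50 and 100; the names `CRITICAL_R_05` and `is_significant` are hypothetical, and treating an r exactly equal to the table entry as significant is an assumption.

```python
# Two-tailed 0.05 critical values copied from a few rows of the table above.
CRITICAL_R_05 = {10: 0.632, 20: 0.444, 30: 0.361, 50: 0.279, 100: 0.197}


def is_significant(r, n, table=CRITICAL_R_05):
    """Return True when |r| reaches the stored critical value for sample size n."""
    if n not in table:
        raise KeyError(f"no critical value stored for N={n}")
    # Assumption: equality with the critical value counts as significant.
    return abs(r) >= table[n]


print(is_significant(0.70, 10))  # compared with 0.632
print(is_significant(0.40, 20))  # compared with 0.444
```

The same r can be significant for a large sample and not for a small one, because the critical values shrink as N grows.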
EXAMPLE PROBLEM:
The following is a table of homework averages and final course grades for selected
students from a Finite Math class.
HW      CG
96      94
100     98
78      83
60      65
92      85
Find the least-squares line of best fit for predicting the final average in the course as a
function of the homework grade. What final average would this model
predict for a person who got a 90 for their homework average? Compute Pearson's r.
What does Pearson's r tell us about these distributions?
First, we need to find the means of the homework grades and the course averages. Let
us call the homework grades x and the course averages y.
        x       y
        96      94
        100     98
        78      83
        60      65
        92      85
Total   426     425
Mean    85.2    85
One way to check that we got the means right is to confirm that the deviations from each mean add up to 0.
Next we need columns for the squares of the deviations of the x's, the squares of the
deviations of the y's, and the products of these deviations.
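The deviation columns described above can be worked out with a few lines of Python. This is a sketch using the homework and course grades from the example; the variable names (`dev_x`, `sum_sq_x`, and so on) are chosen here for illustration.

```python
hw = [96, 100, 78, 60, 92]   # homework averages (x)
cg = [94, 98, 83, 65, 85]    # final course grades (y)

mean_x = sum(hw) / len(hw)   # 426 / 5
mean_y = sum(cg) / len(cg)   # 425 / 5

# Deviation of each score from its mean; each column should sum to about 0.
dev_x = [x - mean_x for x in hw]
dev_y = [y - mean_y for y in cg]

# The three column totals needed for the regression line and for r.
sum_sq_x = sum(dx * dx for dx in dev_x)
sum_sq_y = sum(dy * dy for dy in dev_y)
sum_prod = sum(dx * dy for dx, dy in zip(dev_x, dev_y))

print(round(sum(dev_x), 6), round(sum(dev_y), 6))
print(round(sum_sq_x, 2), round(sum_sq_y, 2), round(sum_prod, 2))
```

Floating-point arithmetic may leave the deviation sums a hair away from zero, which is why they are rounded before printing.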
We now have all the information we need to be able to complete the problem.
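The equation of the regression line is of the form
y = mx + b
where
\[
m = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{808}{1068.8} \approx 0.755988
\]
and
\[
b = \bar{y} - m\bar{x} = 85 - 0.755988(85.2) \approx 20.58982
\]
For a homework average of 90, the model predicts
y = 0.755988(90) + 20.58982
= 88.6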
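As a check on the slope, intercept and prediction, here is a short Python sketch of the least-squares fit. It assumes the usual deviation formulas, slope = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and intercept = ȳ - slope·x̄; the function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope m and intercept b for y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_prod = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_sq_x = sum((x - mean_x) ** 2 for x in xs)
    m = sum_prod / sum_sq_x
    b = mean_y - m * mean_x  # the line passes through (mean_x, mean_y)
    return m, b


hw = [96, 100, 78, 60, 92]   # homework averages
cg = [94, 98, 83, 65, 85]    # final course grades

m, b = fit_line(hw, cg)
predicted = m * 90 + b       # model's final average for a homework average of 90
print(round(m, 6), round(b, 5), round(predicted, 1))
```

The rounded results should agree with the values used in the text: a slope near 0.755988, an intercept near 20.58982, and a predicted final average of about 88.6.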
[Figure: scatter plot of the data with the graph of the regression line added.]
The course average is pretty close to the homework average. The constant term of
20.5898 tells us that a person who did no homework could expect to get about 20 or
21 as a final course average. That would not be enough to pass.
We also have enough information to compute Pearson's r.
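One way to carry out this last step is the deviation form r = Σ(x - x̄)(y - ȳ) / sqrt(Σ(x - x̄)² · Σ(y - ȳ)²), which reuses the same column totals as the regression line. The Python sketch below assumes that form; the name `pearson_r_from_deviations` is made up for illustration.

```python
import math


def pearson_r_from_deviations(xs, ys):
    """Pearson's r from deviations about the two means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_prod = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_sq_x = sum((x - mean_x) ** 2 for x in xs)
    sum_sq_y = sum((y - mean_y) ** 2 for y in ys)
    return sum_prod / math.sqrt(sum_sq_x * sum_sq_y)


hw = [96, 100, 78, 60, 92]   # homework averages
cg = [94, 98, 83, 65, 85]    # final course grades

r = pearson_r_from_deviations(hw, cg)
print(round(r, 3))
```

For these five students r comes out at roughly 0.97, which falls in the high-correlation range given earlier: students with higher homework averages tended to earn higher final course grades.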