Pearson

APPLE MARIE M.
BUENO
IV-SSC
PEARSON'S R
Pearson's Correlation Coefficient
Carl/Karl Pearson
The full name is the Pearson Product-Moment Correlation, or PPMC.
CORRELATION
Technique for investigating the relationship between two quantitative,
continuous variables.
Example: Age and blood pressure
Measure of the strength of the association between the two variables.
Pearson's r is always between -1 and +1
Pearson's r is symmetric.
Correlation between x and y is the same as the correlation between y
and x.
Also referred to as the "bivariate correlation coefficient".
The correlation coefficient assumes that the relationship is linear.
Its square, r², is the proportion of variation in one variable that can be explained by the other.
When calculated from a population, Pearson's coefficient is denoted
with the Greek letter 'rho' (ρ).
When calculated from a sample, it is denoted with 'r'.
It assumes that the variables follow a normal distribution.
A value of 0 indicates that there is no association between the two
variables.
A value greater than 0 indicates a positive association; that is, as the
value of one variable increases, so does the value of the other variable.
A value less than 0 indicates a negative association; that is, as the
value of one variable increases, the value of the other variable
decreases.
The closer the value of r gets to zero, the greater the variation of the data points around the line
of best fit.
High correlation: .5 to 1.0 or -0.5 to -1.0
Medium correlation: .3 to .5 or -0.3 to -0.5
Low correlation: .1 to .3 or -0.1 to -0.3
The variables must be approximately normally distributed

The variables must be either interval or ratio measurements
If r =
+.70 or higher    Very strong positive relationship
+.40 to +.69      Strong positive relationship
+.30 to +.39      Moderate positive relationship
+.20 to +.29      Weak positive relationship
+.01 to +.19      No or negligible relationship
-.01 to -.19      No or negligible relationship
-.20 to -.29      Weak negative relationship
-.30 to -.39      Moderate negative relationship
-.40 to -.69      Strong negative relationship
-.70 or lower     Very strong negative relationship
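The table above can be turned into a small lookup. The following is a minimal sketch, not part of the original notes: the function name `describe_r` is hypothetical, and it assumes that values falling in the gaps between the listed ranges (for example .195) are assigned by simple thresholds on the absolute value of r.

```python
def describe_r(r: float) -> str:
    """Return the verbal label from the interpretation table for a correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    # Thresholds follow the table; the handling of gaps between ranges is an assumption.
    if size >= 0.70:
        strength = "Very strong"
    elif size >= 0.40:
        strength = "Strong"
    elif size >= 0.30:
        strength = "Moderate"
    elif size >= 0.20:
        strength = "Weak"
    else:
        return "No or negligible relationship"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} relationship"


print(describe_r(0.45))   # Strong positive relationship
print(describe_r(-0.25))  # Weak negative relationship
```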
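The computational formula for Pearson's r is:

$$ r = \frac{N\sum xy - (\sum x)(\sum y)}{\sqrt{\left[N\sum x^2 - (\sum x)^2\right]\left[N\sum y^2 - (\sum y)^2\right]}} $$

WHERE:
r = Pearson r correlation coefficient
N = number of values in each data set
Σxy = sum of the products of paired scores
Σx = sum of x scores
Σy = sum of y scores
Σx² = sum of squared x scores
Σy² = sum of squared y scores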
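The quantities listed above are the ingredients of the usual computational (raw-score) form of Pearson's formula, r = [NΣxy − (Σx)(Σy)] / √([NΣx² − (Σx)²][NΣy² − (Σy)²]). The sketch below assumes that form; the function name `pearson_r` and the sample data are illustrative only and do not come from the notes.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r from raw sums, assuming the standard computational formula."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    sum_x = sum(x)                                # sum of x scores
    sum_y = sum(y)                                # sum of y scores
    sum_xy = sum(a * b for a, b in zip(x, y))     # sum of products of paired scores
    sum_x2 = sum(a * a for a in x)                # sum of squared x scores
    sum_y2 = sum(b * b for b in y)                # sum of squared y scores
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    # A constant x or y gives a zero denominator; r is undefined in that case.
    return numerator / denominator


# Illustrative data: y is exactly twice x, so the relationship is perfectly linear.
x = [1, 2, 3, 4]
y = [2, 4, 6, 8]
print(pearson_r(x, y))  # 1.0
print(pearson_r(y, x))  # 1.0 (r is symmetric)
```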

Table of critical values for the Pearson correlation. N (not df) is in column 1.

        One-Tailed Probabilities
        0.05    0.025   0.005   0.0005
        Two-Tailed Probabilities
N       0.1     0.05    0.01    0.001
4       0.900   0.950   0.990   0.999
5       0.805   0.878   0.959   0.991
6       0.729   0.811   0.917   0.974
7       0.669   0.754   0.875   0.951
8       0.621   0.707   0.834   0.925
9       0.582   0.666   0.798   0.898
10      0.549   0.632   0.765   0.872
11      0.521   0.602   0.735   0.847
12      0.497   0.576   0.708   0.823
13      0.476   0.553   0.684   0.801
14      0.458   0.532   0.661   0.780
15      0.441   0.514   0.641   0.760
16      0.426   0.497   0.623   0.742
17      0.412   0.482   0.606   0.725
18      0.400   0.468   0.590   0.708
19      0.389   0.456   0.575   0.693
20      0.378   0.444   0.561   0.679
21      0.369   0.433   0.549   0.665
22      0.360   0.423   0.537   0.652
23      0.352   0.413   0.526   0.640
24      0.344   0.404   0.515   0.629
25      0.337   0.396   0.505   0.618
26      0.330   0.388   0.496   0.607
27      0.323   0.381   0.487   0.597
28      0.317   0.374   0.479   0.588
29      0.311   0.367   0.471   0.579
30      0.306   0.361   0.463   0.570
35      0.283   0.334   0.430   0.532
40      0.264   0.312   0.403   0.501
45      0.248   0.294   0.380   0.474
50      0.235   0.279   0.361   0.451
60      0.214   0.254   0.330   0.414
70      0.198   0.235   0.306   0.385
80      0.185   0.220   0.286   0.361
90      0.174   0.207   0.270   0.341
100     0.165   0.197   0.256   0.324
200     0.117   0.139   0.182   0.231
300     0.095   0.113   0.149   0.189
400     0.082   0.098   0.129   0.164
500     0.074   0.088   0.115   0.147
1000    0.052   0.062   0.081   0.104
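One common way to use such a table is to compare the absolute value of a sample r with the critical value for the sample size N. The sketch below is an illustration under that assumption: it copies only a few two-tailed 0.05 entries from the table, the names `CRITICAL_R_05` and `is_significant` are hypothetical, and it treats an |r| equal to the critical value as significant.

```python
# A few two-tailed 0.05 critical values copied from the table (key is N, not df).
CRITICAL_R_05 = {10: 0.632, 20: 0.444, 30: 0.361, 50: 0.279, 100: 0.197}


def is_significant(r: float, n: int, table=CRITICAL_R_05) -> bool:
    """Return True if |r| reaches the tabled critical value for sample size n.

    Assumed decision rule: |r| >= critical value means significant.
    Only the sample sizes present in `table` are supported.
    """
    if n not in table:
        raise KeyError(f"no critical value stored for N = {n}")
    return abs(r) >= table[n]


print(is_significant(0.50, 20))   # True  (0.50 >= 0.444)
print(is_significant(0.50, 10))   # False (0.50 <  0.632)
print(is_significant(-0.30, 50))  # True  (0.30 >= 0.279)
```

The same r can be significant for a large sample and not for a small one, which is why the critical values shrink as N grows.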
EXAMPLE PROBLEM:
The following is a table of homework averages and final course grades for selected
students from a Finite Math class.
HW      CG
96      94
100     98
78      83
60      65
92      85
Find the least-squares line of best fit for predicting the final average in the course as a
function of the homework grade. What final average would this model predict for a
person whose homework average was 90? Compute Pearson's r.
What does Pearson's r tell us about these distributions?
First, we need to find the means of the homework grades and the course averages. Let
us call the homework grades x and the course averages y.
        x       y
        96      94
        100     98
        78      83
        60      65
        92      85
Total   426     425
Mean    85.2    85
Now we can compute the deviations of the x's and y's from their means.
One way to check that we got the means right is to confirm that the deviations add up to 0.
Next we need columns for the squares of the deviations of the x's, the squares of the
deviations of the y's, and the products of these deviations.
We now have all the information we need to be able to complete the problem.
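The deviation columns described above can be reproduced with a short script. This is a sketch using the homework (x) and course grade (y) data from the example; the variable names are chosen here for illustration.

```python
hw = [96, 100, 78, 60, 92]   # homework averages (x)
cg = [94, 98, 83, 65, 85]    # course grades (y)

mean_x = sum(hw) / len(hw)   # 85.2
mean_y = sum(cg) / len(cg)   # 85.0

# Deviations of each score from its mean.
dev_x = [x - mean_x for x in hw]
dev_y = [y - mean_y for y in cg]

# Sums of squared deviations and of the products of deviations.
sxx = sum(dx * dx for dx in dev_x)
syy = sum(dy * dy for dy in dev_y)
sxy = sum(dx * dy for dx, dy in zip(dev_x, dev_y))

print("x-dev   y-dev   x-dev^2   y-dev^2   product")
for dx, dy in zip(dev_x, dev_y):
    print(f"{dx:6.1f}  {dy:6.1f}  {dx * dx:8.2f}  {dy * dy:8.2f}  {dx * dy:8.2f}")

# Check on the means: the deviations should add up to (approximately) zero.
print(f"sum of x deviations: {sum(dev_x):.6f}")
print(f"sum of y deviations: {sum(dev_y):.6f}")
print(f"sums: {sxx:.1f}  {syy:.1f}  {sxy:.1f}")  # about 1068.8, 654.0 and 808.0
```

Floating-point arithmetic can leave a tiny nonzero remainder in the deviation sums, so the check is that they are zero to within rounding.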
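The equation of the regression line is of the form
y = mx + b
where

$$ m = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{808}{1068.8} \approx 0.755988 $$

and

$$ b = \bar{y} - m\bar{x} = 85 - 0.755988(85.2) \approx 20.58982 $$

So the regression equation is approximately

y = .755988 x + 20.58982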
To predict the final average for a person who got a 90 average on their homework,
substitute 90 for x in the regression equation:

y = .755988(90) + 20.58982
= 88.6
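The fit and the prediction can be checked numerically. The sketch below assumes the usual least-squares formulas, slope = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and intercept = ȳ − slope · x̄; the helper name `fit_line` is illustrative.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = m*x + b (standard formulas assumed)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


hw = [96, 100, 78, 60, 92]   # homework averages (x)
cg = [94, 98, 83, 65, 85]    # course grades (y)

m, b = fit_line(hw, cg)
predicted = m * 90 + b       # predicted course average for a homework average of 90

print(f"y = {m:.6f} x + {b:.5f}")   # y = 0.755988 x + 20.58982
print(f"prediction at x = 90: {predicted:.1f}")  # 88.6
```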
If we plot the data and add the graph of the regression line, it looks like this:
[Figure: scatter plot of homework average against course average, with the regression line]
The course average is pretty close to the homework average. The constant term of
20.5898 tells us that a person who did no homework could expect to get about 20 or
21 as a final course average. That would not be enough to pass.
We also have enough information to compute Pearson's r.
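A minimal sketch of that computation follows, assuming the deviation form of the formula, r = Σ(x − x̄)(y − ȳ) / √(Σ(x − x̄)² · Σ(y − ȳ)²), which is algebraically equivalent to the raw-score form. The variable names are illustrative.

```python
from math import sqrt

hw = [96, 100, 78, 60, 92]   # homework averages (x)
cg = [94, 98, 83, 65, 85]    # course grades (y)

mean_x = sum(hw) / len(hw)
mean_y = sum(cg) / len(cg)

# Sums built from the deviation columns of the worked example.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hw, cg))  # about 808
sxx = sum((x - mean_x) ** 2 for x in hw)                        # about 1068.8
syy = sum((y - mean_y) ** 2 for y in cg)                        # about 654

r = sxy / sqrt(sxx * syy)
print(round(r, 3))       # 0.966
print(round(r * r, 3))   # 0.934
```

An r of about 0.97 falls in the "very strong positive relationship" range of the interpretation table: students with higher homework averages tended to have higher course grades. The square, about 0.93, suggests that roughly 93% of the variation in course grades is accounted for by the linear relationship with homework averages in this small sample.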
