Professional Documents
Culture Documents
The following data were collected to study the relationship between the sale price, y
and the total appraised value, x, of a residential property located in an upscale
neighborhood.
Property
1
2
3
4
5
x
2
3
4
5
6
20
x2
4
9
16
25
36
90
y
2
5
7
10
11
35
y2
4
25
49
100
121
299
x2
xy
4
15
28
50
66
163
y2
xy
n xy ( x)( y )
2
n( x ) ( x ) 2 n( y 2 ) ( y ) 2
o 1 x
n xy ( x)( y )
1
n x 2 ( x ) 2
0 y 1 x
14
12
SalePric
10
8
6
Y = -2.2 + 2.3X
R-Squared = 0.980
4
2
0
2
App val
Sale Price, y
$100,000
2
5
7
10
11
2.4
4.7
7
9.3
11.6
= -2.2 + 2.3x
(y -
-.4
.3
0
.7
-.6
(y -
)2
y
.16
.09
0
.49
.36
(y - y )2 = 1.1
*******************************************************************
The method of least squares chooses the prediction line y = B o + B 1x
that
2
minimizes the sum of the squared errors of prediction (y - y ) for all sample
points.
*******************************************************************
When talking about regression equations, the following are terms used for x and y
x: predictor variable, explanatory variable, or independent variable
y: response variable or dependent variable
Extrapolation is the use of the least-squares line for prediction outside the range of
values of the explanatory variable x that you used to obtain the line. Extrapolation
should not be done!
Measuring the Contribution of x in Predicting y
We can consider how much the errors of prediction of y were reduced by using the
information provided by x.
r2 (Coefficient of Determination) =
2
2
(y - y) - (y - y)
2
(y - y)
The coefficient of determination, r2, represents the proportion of the total sample
variation in y (measured by the sum of squares of deviations of the sample y values
about their mean y ) that is explained by (or attributed to) the linear relationship
between x and y.
Appraisal
Value, x
$100,000
2
3
4
5
6
Sale Price,
y $100,000
2
5
7
10
11
r (Coefficient of Determination) =
y y
2.4
4.7
7
9.3
11.6
-.4
.3
0
.7
-.6
( y y )2
.16
.09
0
.49
.36
1.1
2
2
(y - y) - (y - y)
2
(y - y)
( y y )2
25
4
0
9
16
54
54 1.1
.98
54
Interpretation: 98% of the total sample variation in y is explained by the straightline relationship between y and x, with the total sample variation in y being
measured by the sum of squares of deviations of the sample y values about their
mean y .