
Chapter 3.

Estimation

Ordinary least squares (OLS) method

The two-variable population regression function is given by $Y_i = \beta_1 + \beta_2 X_i + u_i$, but we do not observe it, so we estimate it from the sample regression function $Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + \hat{u}_i$. This function can also be written as $Y_i = \hat{Y}_i + \hat{u}_i$.

We can rewrite the sample regression function as $\hat{u}_i = Y_i - \hat{Y}_i = Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i$. In other words, the residuals are the differences between the actual and the estimated $Y_i$ values.

With $n$ observations, we might want to choose $\hat{\beta}_1$ and $\hat{\beta}_2$ such that the sum of the residuals, $\sum_{i=1}^{n} \hat{u}_i = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)$, is minimized. This turns out not to be a very good rule because some residuals are negative and some are positive (so they would cancel each other out), and all residuals carry the same weight (importance) even though some are small and some are large.

Least squares criterion: minimize

$\sum \hat{u}_i^2 = \sum (Y_i - \hat{Y}_i)^2 = \sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i)^2$

with respect to $\hat{\beta}_1$ and $\hat{\beta}_2$.

Partial differentiation yields

$\dfrac{\partial \left( \sum \hat{u}_i^2 \right)}{\partial \hat{\beta}_1} = -2 \sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i)$

$\dfrac{\partial \left( \sum \hat{u}_i^2 \right)}{\partial \hat{\beta}_2} = -2 \sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i) X_i$
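
As a quick check, here is a minimal sketch that verifies these two partial derivatives by symbolic differentiation; the data values are hypothetical and chosen only for illustration.

```python
# Minimal sketch: verify the partial derivatives of the residual sum of squares
# with sympy, using a small hypothetical data set.
import sympy as sp

b1, b2 = sp.symbols('b1 b2')            # candidate values of beta1-hat and beta2-hat
X = [1, 2, 3, 4]                        # hypothetical X values
Y = [2.1, 3.9, 6.2, 7.8]                # hypothetical Y values

ssr = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))   # sum of squared residuals

d_b1 = sp.diff(ssr, b1)
d_b2 = sp.diff(ssr, b2)

# Each derivative should match the expression given above
assert sp.simplify(d_b1 + 2 * sum(y - b1 - b2 * x for x, y in zip(X, Y))) == 0
assert sp.simplify(d_b2 + 2 * sum((y - b1 - b2 * x) * x for x, y in zip(X, Y))) == 0

# Setting both derivatives to zero and solving gives the OLS estimates
print(sp.solve([d_b1, d_b2], (b1, b2)))
```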

Setting these two derivatives equal to zero and rearranging terms yields the normal equations

$\sum Y_i = n \hat{\beta}_1 + \hat{\beta}_2 \sum X_i$

$\sum Y_i X_i = \hat{\beta}_1 \sum X_i + \hat{\beta}_2 \sum X_i^2$

These equations can be solved simultaneously to obtain

$\hat{\beta}_2 = \dfrac{\sum x_i y_i}{\sum x_i^2}$ and $\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X}$,

where $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$ (these are deviations from the mean values of X and Y).
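
As an illustration, here is a minimal Python sketch of these deviation-form formulas; the data values are hypothetical.

```python
# Minimal sketch of the deviation-form OLS formulas (hypothetical data).
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

x = X - X.mean()                               # x_i = X_i - Xbar
y = Y - Y.mean()                               # y_i = Y_i - Ybar

beta2_hat = (x * y).sum() / (x ** 2).sum()     # slope: sum(x_i y_i) / sum(x_i^2)
beta1_hat = Y.mean() - beta2_hat * X.mean()    # intercept: Ybar - beta2_hat * Xbar

print(beta1_hat, beta2_hat)
print(np.polyfit(X, Y, 1))                     # same least-squares line: [slope, intercept]
```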

Numerical properties of least-squares estimators:


I. OLS estimators are easy to compute because they are expressed in terms of
observable quantities
II. They are point estimators
III. Sample regression line can be easily obtained after OLS because:
1. It passes through the sample means of X and Y (i.e., Ȳ = β̂1 + β̂2 X̄)
2. The mean value of the estimated Y is equal to the mean value of the actual Y
3. The mean value of the residuals ûi is zero (see proof on page 64)
4. The residuals ûi are uncorrelated with the predicted Ŷi (see proof on page 65)
5. The residuals ûi are uncorrelated with X i
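
These properties are easy to verify numerically; below is a minimal sketch with hypothetical data.

```python
# Minimal numerical check of properties 1-5 above (hypothetical data).
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

x, y = X - X.mean(), Y - Y.mean()
beta2 = (x * y).sum() / (x ** 2).sum()
beta1 = Y.mean() - beta2 * X.mean()

Y_hat = beta1 + beta2 * X                               # fitted values
u_hat = Y - Y_hat                                       # residuals

print(np.isclose(Y.mean(), beta1 + beta2 * X.mean()))   # 1. line passes through (Xbar, Ybar)
print(np.isclose(Y_hat.mean(), Y.mean()))               # 2. mean of fitted Y equals mean of Y
print(np.isclose(u_hat.mean(), 0.0))                    # 3. residuals average to zero
print(np.isclose((u_hat * Y_hat).sum(), 0.0))           # 4. residuals uncorrelated with fitted Y
print(np.isclose((u_hat * X).sum(), 0.0))               # 5. residuals uncorrelated with X
```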

Classical Linear Regression Model (CLRM) Assumptions:

A1: Linear regression model [the regression model is linear in the parameters]

A2: X values are fixed in repeated sampling [X is nonstochastic]

A3: Zero mean value of disturbance ui

A4: Homoscedasticity [Var(ui | Xi) = σ²]

A5: No autocorrelation [correlation between any ui and u j ( i ≠ j ) is zero]

A6: Zero covariance between ui and X i [ E (ui X i ) = 0 ]

A7: The number of observations n must be greater than the number of parameters k.

A8: Variability in X values [ Var ( X ) > 0 ]

A9: The regression model is correctly specified [no specification bias]

A10: There is no perfect multicollinearity [there are no perfect linear relationships among
explanatory variables]

Variance and standard errors of least-squares estimates


$\operatorname{Var}(\hat{\beta}_2) = \dfrac{\sigma^2}{\sum x_i^2}$ and $\operatorname{se}(\hat{\beta}_2) = \dfrac{\sigma}{\sqrt{\sum x_i^2}}$

$\operatorname{Var}(\hat{\beta}_1) = \dfrac{\sum X_i^2}{n \sum x_i^2} \, \sigma^2$ and $\operatorname{se}(\hat{\beta}_1) = \sqrt{\dfrac{\sum X_i^2}{n \sum x_i^2}} \, \sigma$

How do we estimate the variance of $u_i$, $\sigma^2$? We use the OLS estimator of $\sigma^2$,

$\hat{\sigma}^2 = \dfrac{\sum \hat{u}_i^2}{n - 2}$,

where $\sum \hat{u}_i^2$ is the residual sum of squares (RSS) and $n - 2$ is the number of degrees of freedom (df).

$\hat{\sigma} = \sqrt{\dfrac{\sum \hat{u}_i^2}{n - 2}}$ is the standard error of estimate, or standard error of the regression. This is the standard deviation of the Y values around the estimated regression line, and it is often used as a measure of “goodness of fit”.
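
A minimal sketch (with hypothetical data) of how the estimate of σ and the standard errors follow from these formulas:

```python
# Minimal sketch: sigma-hat and standard errors of the OLS estimates (hypothetical data).
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])
n = len(Y)

x = X - X.mean()
beta2 = (x * (Y - Y.mean())).sum() / (x ** 2).sum()
beta1 = Y.mean() - beta2 * X.mean()
u_hat = Y - (beta1 + beta2 * X)                     # residuals

sigma2_hat = (u_hat ** 2).sum() / (n - 2)           # RSS / (n - 2)
se_beta2 = np.sqrt(sigma2_hat / (x ** 2).sum())
se_beta1 = np.sqrt(sigma2_hat * (X ** 2).sum() / (n * (x ** 2).sum()))

print(np.sqrt(sigma2_hat))                          # standard error of the regression
print(se_beta1, se_beta2)                           # standard errors of the estimates
```
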
Properties of Least-squares Estimators

An estimator is said to be the best linear unbiased estimator (BLUE) if:


1. It is linear
2. It is unbiased [the expected value is equal to the true value]
3. It is efficient [efficiency means that the estimator has the minimum variance within the
class of all linear unbiased estimators]

Gauss-Markov Theorem

Given the assumptions of the CLRM, the least-squares estimators, in the class of unbiased linear
estimators, have minimum variance, that is, they are BLUE.

The properties presented above are finite (small) sample properties. We will discuss large sample
properties later on.

Coefficient of Determination (R-squared)

r² (two-variable case) or R² (multiple regression) tells us how well the sample regression line fits the data.

$r^2 = \dfrac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum (Y_i - \bar{Y})^2} = \dfrac{ESS}{TSS}$ or, alternatively, $r^2 = 1 - \dfrac{\sum \hat{u}_i^2}{\sum (Y_i - \bar{Y})^2} = 1 - \dfrac{RSS}{TSS}$,

where ESS = explained sum of squares; RSS = residual sum of squares; and TSS = total sum of squares.

r² is called the Coefficient of Determination, and it “measures the percentage of the total variation in Y explained by the regression model”. r² is a non-negative number that lies between zero and one: zero means no fit, and an r² of one means a perfect fit.

The sample correlation coefficient can be estimated as $r = \pm \sqrt{r^2} = \dfrac{\sum x_i y_i}{\sqrt{(\sum x_i^2)(\sum y_i^2)}}$.
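
A minimal sketch (hypothetical data) that computes r² both ways and the correlation coefficient:

```python
# Minimal sketch: r-squared via ESS/TSS and via 1 - RSS/TSS, plus r (hypothetical data).
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

x, y = X - X.mean(), Y - Y.mean()
beta2 = (x * y).sum() / (x ** 2).sum()
beta1 = Y.mean() - beta2 * X.mean()
Y_hat = beta1 + beta2 * X
u_hat = Y - Y_hat

TSS = ((Y - Y.mean()) ** 2).sum()        # total sum of squares
ESS = ((Y_hat - Y.mean()) ** 2).sum()    # explained sum of squares
RSS = (u_hat ** 2).sum()                 # residual sum of squares

print(ESS / TSS, 1 - RSS / TSS)          # the two expressions for r-squared agree
r = (x * y).sum() / np.sqrt((x ** 2).sum() * (y ** 2).sum())
print(r, np.corrcoef(X, Y)[0, 1])        # matches numpy's correlation coefficient
```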

Properties of r:
1. It can be positive or negative [depends on the sign of the numerator]
2. −1 ≤ r ≤ 1
3. It is symmetrical [i.e., you get the same value whether you calculate it between X and Y,
or between Y and X]
4. It is independent of the origin and scale
5. If X and Y are independent, then the correlation coefficient is zero [but zero correlation does not necessarily imply independence]
6. It is a measure of linear association only
7. It does not imply that there is any cause-and-effect relationship
Monte Carlo Experiment (See Example on Page 92)

A Monte Carlo experiment is essentially a computer simulation that is useful for checking the sampling properties of estimators. Assuming you know the true values of the parameters, you would choose the sample size, fix the values of the independent variables at given levels, and draw random values of the disturbance term to obtain values of the dependent variable. You can do this since you know the X’s, the betas, and u. The generated values of Y are then used with the values of X to get the parameter estimates (the estimated betas).

You would repeat this experiment 100 or 1,000 times, which generates 100 or 1,000 sets of parameter estimates. If the average values of these estimates are close to the true values, the Monte Carlo experiment suggests that your estimator is unbiased. In general, Monte Carlo experiments are useful when we want to know the statistical properties of different ways of estimating population parameters.
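
A minimal sketch of such an experiment in Python, with hypothetical “true” parameter values chosen only for illustration:

```python
# Minimal Monte Carlo sketch: fix the X's and the true betas, draw fresh disturbances
# each replication, re-estimate by OLS, and average the estimates (hypothetical values).
import numpy as np

rng = np.random.default_rng(0)
beta1_true, beta2_true, sigma = 1.0, 0.5, 2.0      # assumed "true" parameter values
X = np.arange(1.0, 21.0)                           # X values fixed in repeated sampling
n_reps = 1000

estimates = np.empty((n_reps, 2))
for rep in range(n_reps):
    u = rng.normal(0.0, sigma, size=X.size)        # draw the disturbances
    Y = beta1_true + beta2_true * X + u            # generate Y from the known PRF
    x, y = X - X.mean(), Y - Y.mean()
    b2 = (x * y).sum() / (x ** 2).sum()            # OLS slope
    b1 = Y.mean() - b2 * X.mean()                  # OLS intercept
    estimates[rep] = (b1, b2)

# Averages close to (1.0, 0.5) illustrate the unbiasedness of the OLS estimators
print(estimates.mean(axis=0))
```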
