
Lecture 4

Econ 488

Ordinary Least Squares (OLS)


Y_i = β_0 + β_1 X_1i + β_2 X_2i + … + β_K X_Ki + ε_i

Objective of OLS: minimize the sum of
squared residuals:

min Σ_{i=1}^{n} e_i²

where

e_i = Y_i − Ŷ_i

Remember that OLS is not the only possible
estimator of the βs.
But OLS is the best estimator under certain
assumptions.
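To make the objective concrete, here is a minimal NumPy sketch (all values invented for illustration) that fits a one-regressor model by minimizing the sum of squared residuals:

```python
import numpy as np

# Simulated data: Y_i = 2 + 3*X_i + eps_i (illustrative values)
rng = np.random.default_rng(0)
n = 100
X = rng.uniform(0, 10, n)
eps = rng.normal(0, 1, n)
Y = 2 + 3 * X + eps

# Design matrix with a constant column for the intercept
A = np.column_stack([np.ones(n), X])

# OLS solves min_b sum(e_i^2); np.linalg.lstsq does exactly that
beta_hat, _, _, _ = np.linalg.lstsq(A, Y, rcond=None)

residuals = Y - A @ beta_hat      # e_i = Y_i - Yhat_i
print(beta_hat)                   # approximately [2, 3]
print((residuals ** 2).sum())     # the minimized sum of squared residuals
```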

Classical Assumptions
1. Regression is linear in parameters
2. Error term has zero population mean
3. Error term is not correlated with Xs
4. No serial correlation
5. No heteroskedasticity
6. No perfect multicollinearity
and we usually add:
7. Error term is normally distributed

Assumption 1: Linearity
The regression model:
A) is linear
It can be written as
Y_i = β_0 + β_1 X_1i + β_2 X_2i + … + β_K X_Ki + ε_i
This doesn't mean that the theory must be linear.
For example, suppose we believe that CEO salary is
related to the firm's sales and the CEO's tenure.
We might believe the model is:
log(salary_i) = β_0 + β_1 log(sales_i) + β_2 tenure_i + β_3 tenure_i² + ε_i
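A quick sketch of the point (data invented for illustration): the salary model is nonlinear in sales and tenure, but because it is linear in the βs, we can build the transformed regressors and run OLS as usual:

```python
import numpy as np

# Hypothetical CEO data (values invented for illustration)
rng = np.random.default_rng(1)
n = 200
sales = rng.lognormal(mean=8, sigma=1, size=n)
tenure = rng.uniform(0, 30, size=n)
eps = rng.normal(0, 0.2, size=n)
log_salary = 4 + 0.3 * np.log(sales) + 0.05 * tenure - 0.001 * tenure**2 + eps

# Nonlinear in the variables, but linear in the betas,
# so OLS applies once we build the transformed regressors.
A = np.column_stack([np.ones(n), np.log(sales), tenure, tenure**2])
beta_hat, _, _, _ = np.linalg.lstsq(A, log_salary, rcond=None)
print(beta_hat)  # approximately [4, 0.3, 0.05, -0.001]
```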

Assumption 1: Linearity
The regression model:
B) is correctly specified
The model must have the right variables
No omitted variables
The model must have the correct functional form
This is all untestable; we need to rely on economic
theory.

Assumption 1: Linearity
The regression model:
C) must have an additive error term
The model must include "+ ε_i"

Assumption 2: E(ε_i) = 0
Error term has a zero population mean:
E(ε_i) = 0
Each observation has a random error with
a mean of zero
What if E(ε_i) ≠ 0?
This is actually fixed by adding a constant
(AKA intercept) term

Assumption 2: E(ε_i) = 0
Example: Suppose instead the mean of ε_i
was −4.
Then we know E(ε_i + 4) = 0
We can add 4 to the error term and
subtract 4 from the constant term:
Y_i = β_0 + β_1 X_i + ε_i
Y_i = (β_0 − 4) + β_1 X_i + (ε_i + 4)

Assumption 2: E(ε_i) = 0
Y_i = β_0 + β_1 X_i + ε_i
Y_i = (β_0 − 4) + β_1 X_i + (ε_i + 4)
We can rewrite:
Y_i = β_0* + β_1 X_i + ε_i*
where β_0* = β_0 − 4 and ε_i* = ε_i + 4
Now E(ε_i*) = 0, so we are OK.
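A small simulated check (illustrative numbers): if the error has mean −4, the fitted intercept simply absorbs it and the slope is unaffected:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
X = rng.uniform(0, 10, n)
eps = rng.normal(-4, 1, n)    # error mean is -4, not 0
Y = 5 + 2 * X + eps

A = np.column_stack([np.ones(n), X])
beta_hat, _, _, _ = np.linalg.lstsq(A, Y, rcond=None)
print(beta_hat)  # roughly [1, 2]: the intercept absorbs the -4, the slope is fine
```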

Assumption 3: Exogeneity
Important!!
All explanatory variables are uncorrelated
with the error term
E(ε_i | X_1i, X_2i, …, X_Ki) = 0
Explanatory variables are determined
outside of the model (They are
exogenous)

Assumption 3: Exogeneity
What happens if assumption 3 is violated?
Suppose we have the model,
Y_i = β_0 + β_1 X_i + ε_i
Suppose X_i and ε_i are positively correlated.
When X_i is large, ε_i tends to be large as
well.

Assumption 3: Exogeneity
[Figure: the true regression line ("True Line") plotted on blank axes]

Assumption 3: Exogeneity
[Figure: data points added around the true line; because X and ε are positively correlated, observations with large X tend to lie above the line]

Assumption 3: Exogeneity
[Figure: the OLS estimated line is steeper than the true line; the positive correlation between X and ε tilts the estimate upward]

Assumption 3: Exogeneity
Why would X and ε be correlated?
Suppose you are trying to study the
relationship between the price of a
hamburger and the quantity sold across a
wide variety of Ventura County
restaurants.

Assumption 3: Exogeneity
We estimate the relationship using the
following model:
sales_i = β_0 + β_1 price_i + ε_i
What's the problem?

Assumption 3: Exogeneity
What's the problem?
What else determines sales of hamburgers?
How would you decide between buying a
burger at McDonald's ($0.89) or a burger at TGI
Friday's ($9.99)?
Quality differs.
In sales_i = β_0 + β_1 price_i + ε_i, quality isn't an X
variable even though it should be.
It becomes part of ε_i.

Assumption 3: Exogeneity
What's the problem?
But price and quality are highly positively
correlated.
Therefore X and ε are also positively correlated.
This means that the estimate of β_1 will be too
high.
This is called Omitted Variable Bias (more in
Chapter 6).
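A simulation sketch of the burger story (the data-generating process below is invented for illustration): quality raises both price and sales, so omitting it biases the price coefficient upward:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5_000

# Illustrative DGP: quality drives both price and sales
quality = rng.normal(0, 1, n)
price = 5 + 2 * quality + rng.normal(0, 0.5, n)   # price rises with quality
sales = 100 - 4 * price + 6 * quality + rng.normal(0, 1, n)

# Short regression: quality is omitted, so it hides in the error term
A_short = np.column_stack([np.ones(n), price])
b_short, _, _, _ = np.linalg.lstsq(A_short, sales, rcond=None)

# Long regression: quality included, exogeneity restored
A_long = np.column_stack([np.ones(n), price, quality])
b_long, _, _, _ = np.linalg.lstsq(A_long, sales, rcond=None)

print(b_short[1])  # biased upward: noticeably above the true -4
print(b_long[1])   # close to the true -4
```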

Assumption 4: No Serial Correlation


Serial Correlation: The error terms across
observations are correlated with each
other
i.e., ε_1 is correlated with ε_2, etc.
This is most important in time series
If errors are serially correlated, an
increase in the error term in one time
period affects the error term in the next.

Assumption 4: No Serial Correlation


The assumption that there is no serial
correlation can be unrealistic in time series
Think of data from the stock market.

Assumption 4: No Serial Correlation


[Figure: Real S&P 500 Stock Price Index, 1870 to 2020]
Stock data is serially correlated!
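A minimal sketch of serially correlated errors, assuming an AR(1) process with persistence ρ = 0.9 (an illustrative value):

```python
import numpy as np

rng = np.random.default_rng(4)
T = 500
rho = 0.9                # persistence of the error (assumed value)

# AR(1) errors: eps_t = rho * eps_{t-1} + u_t
eps = np.zeros(T)
u = rng.normal(0, 1, T)
for t in range(1, T):
    eps[t] = rho * eps[t - 1] + u[t]

# The correlation between eps_t and eps_{t-1} is far from zero
print(np.corrcoef(eps[1:], eps[:-1])[0, 1])  # roughly 0.9
```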

Assumption 5: Homoskedasticity
Homoskedasticity: The error has a
constant variance
This is what we want, as opposed to
Heteroskedasticity: The variance of the
error depends on the values of the Xs.

Assumption 5: Homoskedasticity

[Figure: Homoskedasticity: the error has constant variance]

Assumption 5: Homoskedasticity

[Figure: Heteroskedasticity: the spread of the error depends on X]

Assumption 5: Homoskedasticity

[Figure: another form of heteroskedasticity]
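A small sketch contrasting the two cases (parameter values invented): homoskedastic errors have the same spread everywhere, while heteroskedastic errors fan out as X grows:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1_000
X = rng.uniform(1, 10, n)

eps_homo = rng.normal(0, 2, n)        # constant variance: Var(eps_i) = 4 everywhere
eps_hetero = rng.normal(0, 0.5 * X)   # spread grows with X: Var(eps_i) = (0.5*X_i)^2

# Compare the error spread for small vs. large X
small, large = X < 5, X >= 5
print(eps_homo[small].std(), eps_homo[large].std())      # about equal
print(eps_hetero[small].std(), eps_hetero[large].std())  # second is much larger
```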

Assumption 6: No Perfect Multicollinearity


Two variables are perfectly collinear if one
can be determined perfectly from the other
(i.e., if you know the value of X, you can
always find the value of Z).
Example: If we regress income on age,
and include both age in months and age in
years.
But age in years = age in months/12
e.g. if we know someone is 246 months old, we
also know that they are 20.5 years old.

Assumption 6: No Perfect Multicollinearity


What's wrong with this?
income_i = β_0 + β_1 agemonths_i +
β_2 ageyears_i + ε_i
What is β_1?
It is the change in income associated with
a one-unit increase in age in months,
holding age in years constant.
But if you hold age in years constant, age in
months doesn't change!

Assumption 6: No Perfect Multicollinearity


β_1 = Δincome/Δagemonths,
holding Δageyears = 0
If Δageyears = 0, then Δagemonths = 0
So β_1 = Δincome/0
It is undefined!

Assumption 6: No Perfect Multicollinearity


When an independent variable is a perfect
linear combination of the other independent
variables, it is called Perfect
Multicollinearity.
Example: Total Cholesterol, HDL and LDL
Total Cholesterol = LDL + HDL
Can't include all three as independent
variables in a regression.
Solution: Drop one of the variables.
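A quick numerical illustration (hypothetical data): with both age variables included, the design matrix loses a rank, so OLS has no unique solution:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 50
age_months = rng.integers(240, 600, n).astype(float)
age_years = age_months / 12.0          # exact linear function of age_months
income = 20_000 + 100 * age_months + rng.normal(0, 500, n)

A = np.column_stack([np.ones(n), age_months, age_years])

# The three columns carry only two pieces of information, so the
# design matrix is rank-deficient (up to floating-point rounding)
# and no unique beta_hat exists.
print(np.linalg.matrix_rank(A))                   # 2, not 3
print(np.linalg.lstsq(A, income, rcond=None)[2])  # lstsq also reports rank 2
```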

Assumption 7: Normally Distributed Error


This is not required for OLS, but it
is important for hypothesis testing.
More on this assumption next time.

Putting it all together


Last class, we talked about how to compare
estimators. We want:
1. β̂ is unbiased:
E(β̂) = β
On average, the estimator is equal to the population
value.

2. β̂ is efficient:
The variance of the estimator is as small as possible.

Putting it all together

Gauss-Markov Theorem
Given OLS assumptions 1 through 6, the
OLS estimator of β_k is the minimum
variance estimator from the set of all linear
unbiased estimators of β_k, for k = 0, 1, 2, …, K
OLS is BLUE:
the Best Linear Unbiased Estimator

Gauss-Markov Theorem
What happens if we add assumption 7?
Given assumptions 1 through 7, OLS is
the best unbiased estimator
Even among the non-linear estimators
OLS is "BUE"?

Gauss-Markov Theorem
With Assumptions 1-7, OLS is:
1. Unbiased: E(β̂) = β
2. Minimum variance: the sampling distribution
is as tight as possible
3. Consistent: as n → ∞, the estimators
converge to the true parameters
As n increases, the variance gets smaller, so each estimate
approaches the true value of β.
4. Normally distributed: you can apply
statistical tests to them.
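A Monte Carlo sketch of unbiasedness and consistency (all settings illustrative): the mean of the slope estimates stays near the true value, while their spread shrinks as n grows:

```python
import numpy as np

rng = np.random.default_rng(7)

def ols_slope(n):
    """One draw of the OLS slope estimate from a sample of size n."""
    X = rng.uniform(0, 10, n)
    Y = 1 + 2 * X + rng.normal(0, 3, n)   # true slope is 2
    A = np.column_stack([np.ones(n), X])
    return np.linalg.lstsq(A, Y, rcond=None)[0][1]

for n in (25, 100, 1_000):
    draws = np.array([ols_slope(n) for _ in range(2_000)])
    # Mean stays near 2 (unbiased); spread shrinks as n grows (consistent)
    print(n, draws.mean().round(3), draws.std().round(3))
```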
