
Lecture 4

Econ 488

Ordinary Least Squares (OLS)


Y_i = β_0 + β_1 X_1i + β_2 X_2i + … + β_K X_Ki + ε_i

Objective of OLS: minimize the sum of
squared residuals:

min Σ_{i=1}^{n} e_i²

where

e_i = Y_i − Ŷ_i

Remember that OLS is not the only possible
estimator of the βs.
But OLS is the best estimator under certain
assumptions.
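To make the objective concrete, here is a minimal NumPy sketch (all values invented for illustration) that fits a one-regressor model by minimizing the sum of squared residuals:

```python
import numpy as np

# Simulated data: Y_i = 2 + 3*X_i + eps_i (illustrative values)
rng = np.random.default_rng(0)
n = 100
X = rng.uniform(0, 10, n)
eps = rng.normal(0, 1, n)
Y = 2 + 3 * X + eps

# Design matrix with a constant column for the intercept
A = np.column_stack([np.ones(n), X])

# OLS solves min_b sum(e_i^2); np.linalg.lstsq does exactly that
beta_hat, _, _, _ = np.linalg.lstsq(A, Y, rcond=None)

residuals = Y - A @ beta_hat      # e_i = Y_i - Yhat_i
print(beta_hat)                   # approximately [2, 3]
print((residuals ** 2).sum())     # the minimized sum of squared residuals
```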

Classical Assumptions
1. Regression is linear in parameters
2. Error term has zero population mean
3. Error term is not correlated with Xs
4. No serial correlation
5. No heteroskedasticity
6. No perfect multicollinearity
and we usually add:
7. Error term is normally distributed

Assumption 1: Linearity
The regression model:
A) is linear
It can be written as
Y_i = β_0 + β_1 X_1i + β_2 X_2i + … + β_K X_Ki + ε_i
This doesn't mean that the theory must be linear.
For example, suppose we believe that CEO salary is
related to the firm's sales and the CEO's tenure.
We might believe the model is:
log(salary_i) = β_0 + β_1 log(sales_i) + β_2 tenure_i + β_3 tenure_i² + ε_i
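A quick sketch of the point (data invented for illustration): the salary model is nonlinear in sales and tenure, but because it is linear in the βs, we can build the transformed regressors and run OLS as usual:

```python
import numpy as np

# Hypothetical CEO data (values invented for illustration)
rng = np.random.default_rng(1)
n = 200
sales = rng.lognormal(mean=8, sigma=1, size=n)
tenure = rng.uniform(0, 30, size=n)
eps = rng.normal(0, 0.2, size=n)
log_salary = 4 + 0.3 * np.log(sales) + 0.05 * tenure - 0.001 * tenure**2 + eps

# Nonlinear in the variables, but linear in the betas,
# so OLS applies once we build the transformed regressors.
A = np.column_stack([np.ones(n), np.log(sales), tenure, tenure**2])
beta_hat, _, _, _ = np.linalg.lstsq(A, log_salary, rcond=None)
print(beta_hat)  # approximately [4, 0.3, 0.05, -0.001]
```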

Assumption 1: Linearity
The regression model:
B) is correctly specified
The model must have the right variables
No omitted variables
The model must have the correct functional form
This is all untestable; we need to rely on economic
theory.

Assumption 1: Linearity
The regression model:
C) must have an additive error term
The model must include "+ ε_i"

Assumption 2: E(ε_i) = 0
Error term has a zero population mean:
E(ε_i) = 0
Each observation has a random error with
a mean of zero
What if E(ε_i) ≠ 0?
This is actually fixed by adding a constant
(AKA intercept) term

Assumption 2: E(ε_i) = 0
Example: Suppose instead the mean of ε_i
was −4.
Then we know E(ε_i + 4) = 0
We can add 4 to the error term and
subtract 4 from the constant term:
Y_i = β_0 + β_1 X_i + ε_i
Y_i = (β_0 − 4) + β_1 X_i + (ε_i + 4)

Assumption 2: E(ε_i) = 0
Y_i = β_0 + β_1 X_i + ε_i
Y_i = (β_0 − 4) + β_1 X_i + (ε_i + 4)
We can rewrite:
Y_i = β_0* + β_1 X_i + ε_i*
where β_0* = β_0 − 4 and ε_i* = ε_i + 4
Now E(ε_i*) = 0, so we are OK.
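A small simulated check (illustrative numbers): if the error has mean −4, the fitted intercept simply absorbs it and the slope is unaffected:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
X = rng.uniform(0, 10, n)
eps = rng.normal(-4, 1, n)    # error mean is -4, not 0
Y = 5 + 2 * X + eps

A = np.column_stack([np.ones(n), X])
beta_hat, _, _, _ = np.linalg.lstsq(A, Y, rcond=None)
print(beta_hat)  # roughly [1, 2]: the intercept absorbs the -4, the slope is fine
```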

Assumption 3: Exogeneity
Important!!
All explanatory variables are uncorrelated
with the error term
E(ε_i | X_1i, X_2i, …, X_Ki) = 0
Explanatory variables are determined
outside of the model (They are
exogenous)

Assumption 3: Exogeneity
What happens if assumption 3 is violated?
Suppose we have the model,
Y_i = β_0 + β_1 X_i + ε_i
Suppose X_i and ε_i are positively correlated.
When X_i is large, ε_i tends to be large as
well.

Assumption 3: Exogeneity
[Figure: the true regression line ("True Line") plotted on blank axes]

Assumption 3: Exogeneity
[Figure: data points added around the true line; because X and ε are positively correlated, observations with large X tend to lie above the line]

Assumption 3: Exogeneity
[Figure: the OLS estimated line is steeper than the true line; the positive correlation between X and ε tilts the estimate upward]

Assumption 3: Exogeneity
Why would X and ε be correlated?
Suppose you are trying to study the
relationship between the price of a
hamburger and the quantity sold across a
wide variety of Ventura County
restaurants.

Assumption 3: Exogeneity
We estimate the relationship using the
following model:
sales_i = β_0 + β_1 price_i + ε_i
What's the problem?

Assumption 3: Exogeneity
What's the problem?
What else determines sales of hamburgers?
How would you decide between buying a
burger at McDonald's ($0.89) or a burger at TGI
Friday's ($9.99)?
Quality differs.
In sales_i = β_0 + β_1 price_i + ε_i, quality isn't an X
variable even though it should be.
It becomes part of ε_i.

Assumption 3: Exogeneity
What's the problem?
But price and quality are highly positively
correlated.
Therefore X and ε are also positively correlated.
This means that the estimate of β_1 will be too
high.
This is called Omitted Variable Bias (more in
Chapter 6).
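A simulation sketch of the burger story (the data-generating process below is invented for illustration): quality raises both price and sales, so omitting it biases the price coefficient upward:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5_000

# Illustrative DGP: quality drives both price and sales
quality = rng.normal(0, 1, n)
price = 5 + 2 * quality + rng.normal(0, 0.5, n)   # price rises with quality
sales = 100 - 4 * price + 6 * quality + rng.normal(0, 1, n)

# Short regression: quality is omitted, so it hides in the error term
A_short = np.column_stack([np.ones(n), price])
b_short, _, _, _ = np.linalg.lstsq(A_short, sales, rcond=None)

# Long regression: quality included, exogeneity restored
A_long = np.column_stack([np.ones(n), price, quality])
b_long, _, _, _ = np.linalg.lstsq(A_long, sales, rcond=None)

print(b_short[1])  # biased upward: noticeably above the true -4
print(b_long[1])   # close to the true -4
```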

Assumption 4: No Serial Correlation


Serial Correlation: The error terms across
observations are correlated with each
other
i.e., ε_1 is correlated with ε_2, etc.
This is most important in time series
If errors are serially correlated, an
increase in the error term in one time
period affects the error term in the next.

Assumption 4: No Serial Correlation


The assumption that there is no serial
correlation can be unrealistic in time series
Think of data from the stock market.

Assumption 4: No Serial Correlation


[Figure: Real S&P 500 Stock Price Index, 1870 to 2020]
Stock data is serially correlated!
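A minimal sketch of serially correlated errors, assuming an AR(1) process with persistence ρ = 0.9 (an illustrative value):

```python
import numpy as np

rng = np.random.default_rng(4)
T = 500
rho = 0.9                # persistence of the error (assumed value)

# AR(1) errors: eps_t = rho * eps_{t-1} + u_t
eps = np.zeros(T)
u = rng.normal(0, 1, T)
for t in range(1, T):
    eps[t] = rho * eps[t - 1] + u[t]

# The correlation between eps_t and eps_{t-1} is far from zero
print(np.corrcoef(eps[1:], eps[:-1])[0, 1])  # roughly 0.9
```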

Assumption 5: Homoskedasticity
Homoskedasticity: The error has a
constant variance
This is what we want, as opposed to
Heteroskedasticity: The variance of the
error depends on the values of the Xs.

Assumption 5: Homoskedasticity

[Figure: Homoskedasticity: the error has constant variance]

Assumption 5: Homoskedasticity

[Figure: Heteroskedasticity: the spread of the error depends on X]

Assumption 5: Homoskedasticity

[Figure: another form of heteroskedasticity]
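A small sketch contrasting the two cases (parameter values invented): homoskedastic errors have the same spread everywhere, while heteroskedastic errors fan out as X grows:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1_000
X = rng.uniform(1, 10, n)

eps_homo = rng.normal(0, 2, n)        # constant variance: Var(eps_i) = 4 everywhere
eps_hetero = rng.normal(0, 0.5 * X)   # spread grows with X: Var(eps_i) = (0.5*X_i)^2

# Compare the error spread for small vs. large X
small, large = X < 5, X >= 5
print(eps_homo[small].std(), eps_homo[large].std())      # about equal
print(eps_hetero[small].std(), eps_hetero[large].std())  # second is much larger
```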

Assumption 6: No Perfect Multicollinearity


Two variables are perfectly collinear if one
can be determined perfectly from the other
(i.e., if you know the value of X, you can
always find the value of Z).
Example: If we regress income on age,
and include both age in months and age in
years.
But age in years = age in months/12
e.g. if we know someone is 246 months old, we
also know that they are 20.5 years old.

Assumption 6: No Perfect Multicollinearity


What's wrong with this?
income_i = β_0 + β_1 agemonths_i +
β_2 ageyears_i + ε_i
What is β_1?
It is the change in income associated with
a one-unit increase in age in months,
holding age in years constant.
But if you hold age in years constant, age in
months doesn't change!

Assumption 6: No Perfect Multicollinearity


β_1 = Δincome/Δagemonths,
holding Δageyears = 0
If Δageyears = 0, then Δagemonths = 0
So β_1 = Δincome/0
It is undefined!

Assumption 6: No Perfect Multicollinearity


When an independent variable is a perfect
linear combination of the other independent
variables, it is called Perfect
Multicollinearity.
Example: Total Cholesterol, HDL and LDL
Total Cholesterol = LDL + HDL
Can't include all three as independent
variables in a regression.
Solution: Drop one of the variables.
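A quick numerical illustration (hypothetical data): with both age variables included, the design matrix loses a rank, so OLS has no unique solution:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 50
age_months = rng.integers(240, 600, n).astype(float)
age_years = age_months / 12.0          # exact linear function of age_months
income = 20_000 + 100 * age_months + rng.normal(0, 500, n)

A = np.column_stack([np.ones(n), age_months, age_years])

# The three columns carry only two pieces of information, so the
# design matrix is rank-deficient (up to floating-point rounding)
# and no unique beta_hat exists.
print(np.linalg.matrix_rank(A))                   # 2, not 3
print(np.linalg.lstsq(A, income, rcond=None)[2])  # lstsq also reports rank 2
```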

Assumption 7: Normally Distributed Error


This is not required for OLS, but it
is important for hypothesis testing.
More on this assumption next time.

Putting it all together


Last class, we talked about how to compare
estimators. We want:
1. β̂ is unbiased:
E(β̂) = β
On average, the estimator is equal to the population
value.

2. β̂ is efficient:
The variance of the estimator is as small as possible.

Putting it all together

Gauss-Markov Theorem
Given OLS assumptions 1 through 6, the
OLS estimator of β_k is the minimum
variance estimator from the set of all linear
unbiased estimators of β_k, for k = 0, 1, 2, …, K
OLS is BLUE:
the Best Linear Unbiased Estimator

Gauss-Markov Theorem
What happens if we add assumption 7?
Given assumptions 1 through 7, OLS is
the best unbiased estimator
Even among the non-linear estimators
OLS is "BUE"?

Gauss-Markov Theorem
With Assumptions 1-7, OLS is:
1. Unbiased: E(β̂) = β
2. Minimum variance: the sampling distribution
is as tight as possible
3. Consistent: as n → ∞, the estimators
converge to the true parameters
As n increases, the variance gets smaller, so each estimate
approaches the true value of β.
4. Normally distributed: you can apply
statistical tests to them.
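A Monte Carlo sketch of unbiasedness and consistency (all settings illustrative): the mean of the slope estimates stays near the true value, while their spread shrinks as n grows:

```python
import numpy as np

rng = np.random.default_rng(7)

def ols_slope(n):
    """One draw of the OLS slope estimate from a sample of size n."""
    X = rng.uniform(0, 10, n)
    Y = 1 + 2 * X + rng.normal(0, 3, n)   # true slope is 2
    A = np.column_stack([np.ones(n), X])
    return np.linalg.lstsq(A, Y, rcond=None)[0][1]

for n in (25, 100, 1_000):
    draws = np.array([ols_slope(n) for _ in range(2_000)])
    # Mean stays near 2 (unbiased); spread shrinks as n grows (consistent)
    print(n, draws.mean().round(3), draws.std().round(3))
```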
