

Ch.2 The simple regression model


The Simple Regression Model

y = β₀ + β₁x + u

1. Definition of the simple regression model
2. Deriving the OLS estimates
3. Mechanics of OLS
4. Units of measurement & functional form
5. Expected values & variances of OLSE
6. Regression through the origin

2.1 Definition of the model


Equation (2.1), y = β₀ + β₁x + u, defines the simple regression model.
In the model, we typically refer to
  y as the dependent variable,
  x as the independent variable,
  β₀ and β₁ as the parameters, and
  u as the error term.

The Concept of the Error Term

u represents factors other than x that affect y.
If the other factors in u are held fixed, so that Δu = 0, then Δy = β₁Δx.
Ex. 2.1: yield = β₀ + β₁·fertilizer + u   (2.3)
  Here u includes land quality, rainfall, etc.
Ex. 2.2: wage = β₀ + β₁·educ + u   (2.4)
  Here u includes experience, ability, tenure, etc.

A Simple Assumption for u

The average value of u, the error term, in the population is 0. That is, E(u) = 0.
This is not a restrictive assumption, since we can always use β₀ to normalize E(u) to 0.
To draw ceteris paribus conclusions about how x affects y, we have to hold all other factors (in u) fixed.
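As a concrete illustration of the pieces of the model, namely the parameters β₀ and β₁ and the unobserved error u with E(u) = 0, here is a minimal Python sketch that generates data from a wage equation like (2.4). NumPy is assumed, and all parameter values are made up for illustration.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical population parameters (illustration only)
    beta0, beta1 = 1.5, 0.8        # intercept and slope
    n = 500                        # sample size

    educ = rng.uniform(8, 20, size=n)            # x: years of education
    u = rng.normal(loc=0.0, scale=2.0, size=n)   # error term, E(u) = 0
    wage = beta0 + beta1 * educ + u              # y generated by the model (2.4)

    print(u.mean())   # close to 0 in a large sample, mirroring E(u) = 0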

Zero Conditional Mean

We need to make a crucial assumption about how u and x are related.
We want it to be the case that knowing something about x does not give us any information about u, so that they are completely unrelated. That is,
E(u|x) = E(u) = 0.   (2.5 & 2.6)

[Figure 2.1: E(y|x) as a linear function of x, where for any x the distribution of y is centered about E(y|x) = β₀ + β₁x.]

2.2 Deriving the OLSE

The zero conditional mean assumption, E(u|x) = E(u) = 0 (2.5 & 2.6), implies
E(y|x) = β₀ + β₁x,   (2.8)
the population regression function (PRF).
The basic idea of regression is to estimate the population parameters from a sample.
Let {(xᵢ, yᵢ): i = 1, …, n} denote a random sample of size n from the population.
For each observation in this sample, it will be the case that
yᵢ = β₀ + β₁xᵢ + uᵢ.   (2.9)

[Figure 2.2: Population regression line, sample data points, and the associated error terms u₁, …, u₄.]

Deriving OLSE using MM

To derive the OLS estimates, we need to realize that our main assumption,
E(u|x) = E(u) = 0, also implies that
Cov(x, u) = E(xu) = 0,
because Cov(X, Y) = E(XY) − E(X)E(Y).   (B.27)
Now we have two restrictions with which to estimate the βs:
E(u) = 0    (2.10)
E(xu) = 0   (2.11)

Cont.

Since u = y − β₀ − β₁x, we can rewrite the restrictions as
E(y − β₀ − β₁x) = 0        (2.12)
E[x(y − β₀ − β₁x)] = 0     (2.13)
These are called moment restrictions.
The approach to estimation imposes the population moment restrictions on the sample moments: a sample estimator of E(X), the mean of a population distribution, is simply the arithmetic mean of the sample.
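To make the method-of-moments step concrete, the following Python sketch (NumPy assumed, made-up data) imposes the sample analogues of (2.12) and (2.13) and solves the resulting two linear equations for the estimates of β₀ and β₁.

    import numpy as np

    def mm_estimates(x, y):
        """Solve the sample analogues of E(y - b0 - b1*x) = 0 and
        E[x*(y - b0 - b1*x)] = 0 for (b0, b1)."""
        # Written as a 2x2 linear system A @ [b0, b1] = c:
        #   b0           + b1*mean(x)   = mean(y)
        #   b0*mean(x)   + b1*mean(x^2) = mean(x*y)
        A = np.array([[1.0, x.mean()],
                      [x.mean(), (x * x).mean()]])
        c = np.array([y.mean(), (x * y).mean()])
        return np.linalg.solve(A, c)

    # Made-up data for illustration
    rng = np.random.default_rng(1)
    x = rng.uniform(0, 10, 200)
    y = 2.0 + 0.5 * x + rng.normal(0, 1, 200)
    b0_hat, b1_hat = mm_estimates(x, y)
    print(b0_hat, b1_hat)   # close to the true values 2.0 and 0.5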

More Derivation of OLS

We want to choose values of the parameters that ensure that the sample versions of our moment restrictions hold.
The sample versions are as follows:
(1/n) Σᵢ (yᵢ − β̂₀ − β̂₁xᵢ) = 0      (2.14)
(1/n) Σᵢ xᵢ(yᵢ − β̂₀ − β̂₁xᵢ) = 0    (2.15)

Cont.

Given the definition of a sample mean and properties of summation, we can rewrite the first condition as follows:
ȳ = β̂₀ + β̂₁x̄   (2.16),  or  β̂₀ = ȳ − β̂₁x̄.   (2.17)
So the OLS estimated slope is
β̂₁ = Σᵢ (xᵢ − x̄)(yᵢ − ȳ) / Σᵢ (xᵢ − x̄)².   (2.19)

Summary of OLS slope estimate

The slope estimate is the sample covariance between x and y divided by the sample variance of x.
If x and y are positively (negatively) correlated, the slope will be positive (negative).
x needs to vary in our sample.

Alternate approach to derivation

Given the intuitive idea of fitting a line, we can set up a formal minimization problem: choose β̂₀ and β̂₁ to minimize the sum of squared residuals,
Σᵢ (yᵢ − β̂₀ − β̂₁xᵢ)².   (2.22)
The first order conditions are the same as (2.14) & (2.15), apart from the factor 1/n.

More OLS

Intuitively, OLS fits a line through the sample points such that the sum of squared residuals is as small as possible, hence the term "least squares."
The residual, ûᵢ, is an estimate of the error term, uᵢ, and is the difference between the sample point and the fitted line (the sample regression function).
See (2.18) & Figure (2.3).

[Figure 2.3: Sample regression line, sample data points, and the associated estimated error terms (residuals) û₁, …, û₄.]
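A minimal Python sketch (NumPy assumed, made-up data) of the closed-form estimates (2.17) and (2.19); it also checks that the sample moment conditions (2.14) and (2.15) hold at the estimates.

    import numpy as np

    def ols_simple(x, y):
        """OLS intercept and slope for y = b0 + b1*x + u, via (2.19) and (2.17)."""
        b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
        b0 = y.mean() - b1 * x.mean()
        return b0, b1

    rng = np.random.default_rng(2)
    x = rng.normal(5, 2, 300)
    y = 1.0 + 0.7 * x + rng.normal(0, 1, 300)

    b0, b1 = ols_simple(x, y)
    resid = y - b0 - b1 * x
    print(b0, b1)
    print(resid.mean())          # ~0, sample analogue of (2.14)
    print(np.mean(x * resid))    # ~0, sample analogue of (2.15)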

2.3 Properties of OLS

Algebraic Properties of OLS

1. The sum of the OLS residuals is zero: Σᵢ ûᵢ = 0, and thus the sample average of the OLS residuals is zero as well: (1/n) Σᵢ ûᵢ = 0.   (2.30)
2. The sample covariance between the regressors and the OLS residuals is zero: Σᵢ xᵢûᵢ = 0.   (2.31)
3. The OLS regression line always goes through the mean of the sample: ȳ = β̂₀ + β̂₁x̄.

Cont.

We can think of each observation as being made up of an explained part and an unexplained part,
yᵢ = ŷᵢ + ûᵢ.   (2.32)
Then we define the following:
Σᵢ (yᵢ − ȳ)² = SST   (2.33)
Σᵢ (ŷᵢ − ȳ)² = SSE   (2.34)
Σᵢ ûᵢ² = SSR         (2.35)
Then, SST = SSE + SSR.   (2.36)

Goodness-of-Fit

It is useful to think about how well the sample regression line fits the sample data.
From (2.36),
R² = SSE/SST = 1 − SSR/SST.   (2.38)
R² indicates the fraction of the sample variation in yᵢ that is explained by the model.

2.4 Measurement Units & Functional Form

If we use the model y* = β₀* + β₁*x* + u* instead of y = β₀ + β₁x + u, where y* = c·y and x* = d·x, we get
β₀* = c·β₀  and  β₁* = (c/d)·β₁.
Similarly, in the model where y* = ln y and x* = ln x, the slope β₁* measures the elasticity of y with respect to x: the percentage change in y associated with a 1% change in x.

2.5 Means & Variance of OLSE

Now we view the β̂ᵢ as estimators for the parameters βᵢ that appear in the population, which means studying the properties of the distributions of the β̂ᵢ over different random samples from the population.

Unbiasedness of OLS
Unbiased estimator: an estimator whose expected value (the mean of its sampling distribution) equals the population value, regardless of what that population value is.
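Before turning to unbiasedness, here is a short Python sketch (NumPy assumed, made-up data) of the SST = SSE + SSR decomposition and R² in (2.33)–(2.38), together with a numerical check of the rescaling result β₀* = cβ₀ and β₁* = (c/d)β₁.

    import numpy as np

    def ols_simple(x, y):
        b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
        return y.mean() - b1 * x.mean(), b1

    rng = np.random.default_rng(3)
    x = rng.uniform(0, 10, 200)
    y = 3.0 + 1.2 * x + rng.normal(0, 2, 200)

    b0, b1 = ols_simple(x, y)
    yhat = b0 + b1 * x
    uhat = y - yhat

    SST = np.sum((y - y.mean()) ** 2)
    SSE = np.sum((yhat - y.mean()) ** 2)
    SSR = np.sum(uhat ** 2)
    R2 = 1 - SSR / SST                # equals SSE/SST, equation (2.38)
    print(np.isclose(SST, SSE + SSR), R2)

    # Rescaling: y* = c*y, x* = d*x  =>  b0* = c*b0, b1* = (c/d)*b1
    c, d = 100.0, 0.5
    b0s, b1s = ols_simple(d * x, c * y)
    print(np.isclose(b0s, c * b0), np.isclose(b1s, (c / d) * b1))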

Unbiasedness of OLS

Assumptions for unbiasedness:
1. Linear in parameters: y = β₀ + β₁x + u.
2. Random sampling: {(xᵢ, yᵢ): i = 1, 2, …, n}, so that yᵢ = β₀ + β₁xᵢ + uᵢ.
3. Sample variation in the xᵢ: Σᵢ (xᵢ − x̄)² > 0.
4. Zero conditional mean: E(u|x) = 0.
Under these assumptions, β̂₁ = Σᵢ (xᵢ − x̄)yᵢ / Σᵢ (xᵢ − x̄)² satisfies E(β̂₁) = β₁.

Cont.

In order to think about unbiasedness, we need to rewrite our estimator in terms of the population parameter:
β̂₁ = β₁ + Σᵢ (xᵢ − x̄)uᵢ / Σᵢ (xᵢ − x̄)².   (2.49), (2.52)
Taking the expectation conditional on the xᵢ,
E(β̂₁) = β₁ + Σᵢ (xᵢ − x̄)E(uᵢ|x) / Σᵢ (xᵢ − x̄)² = β₁.   (2.53)
* We can also get E(β̂₀) = β₀ in the same way.

Unbiasedness Summary

The OLS estimates of β₁ and β₀ are unbiased.
The proof of unbiasedness depends on our four assumptions; if any assumption fails, then OLS is not necessarily unbiased.
Remember, unbiasedness is a description of the estimator; in a given sample our estimate may be near or far from the true parameter.

Variances of the OLS Estimators

Now we know that the sampling distribution of our estimator is centered around the true parameter.
We want to think about how spread out this distribution is.
It is much easier to think about this variance under an additional assumption, so assume
5. Var(u|x) = σ²   (homoskedasticity).

[Figure: the homoskedastic case, in which f(y|x) has the same spread about E(y|x) = β₀ + β₁x at every value of x.]

Variance of OLSE

σ² is also the unconditional variance, called the error variance, since
Var(u|x) = E(u²|x) − [E(u|x)]²,
and E(u|x) = 0, so σ² = E(u²|x) = E(u²) = Var(u).
σ, the square root of the error variance, is called the standard deviation of the error.
Then we can say
E(y|x) = β₀ + β₁x  and  Var(y|x) = σ².
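A small Monte Carlo sketch in Python (NumPy assumed, made-up parameter values) illustrating both points: the sampling distribution of β̂₁ is centered at the true β₁ (unbiasedness), and its spread shrinks when there is more variation in x.

    import numpy as np

    rng = np.random.default_rng(4)
    beta0, beta1, sigma, n = 1.0, 0.5, 1.0, 50   # hypothetical population values

    def draw_slope(x_scale):
        """Draw one random sample from the population and return the OLS slope."""
        x = rng.normal(0, x_scale, n)        # larger x_scale -> more variation in x
        u = rng.normal(0, sigma, n)          # homoskedastic errors with E(u|x) = 0
        y = beta0 + beta1 * x + u
        return np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

    for x_scale in (1.0, 3.0):
        slopes = np.array([draw_slope(x_scale) for _ in range(5000)])
        # Mean is close to beta1 (unbiasedness); variance falls as x varies more
        print(x_scale, slopes.mean(), slopes.var())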

Heteroskedastic Case

[Figure: the heteroskedastic case, in which the spread of f(y|x) about E(y|x) = β₀ + β₁x changes with x.]

Variance of OLSE (cont.)

Var(β̂₁) = σ² / Σᵢ (xᵢ − x̄)²   (2.57)
The larger the error variance, σ², the larger the variance of the slope estimate.
The larger the variability in the xᵢ, the smaller the variance of the slope estimate.
As a result, a larger sample size should decrease the variance of the slope estimate.

Estimating the Error Variance

We don't know what the error variance, σ², is, because we don't observe the errors, uᵢ.
What we observe are only the residuals, ûᵢ, not the errors.
So we can use the residuals to form an estimate of the error variance.

Cont. Error Variance

ûᵢ = yᵢ − β̂₀ − β̂₁xᵢ = (β₀ + β₁xᵢ + uᵢ) − β̂₀ − β̂₁xᵢ = uᵢ − (β̂₀ − β₀) − (β̂₁ − β₁)xᵢ
Then, an unbiased estimator of σ² is
σ̂² = (1/(n − 2)) Σᵢ ûᵢ².   (2.61)
σ̂ is called the standard error of the regression.
Recall that sd(β̂₁) = √Var(β̂₁) from (2.57). If we substitute σ̂ for σ, then we have the standard error of β̂₁,
se(β̂₁) = σ̂ / √(Σᵢ (xᵢ − x̄)²).

2.6 Regression through the Origin

Now, consider the model without an intercept:
ỹ = β̃₁x.   (2.63)
Solving the FOC of the minimization problem, the OLS estimated slope is
β̃₁ = Σᵢ xᵢyᵢ / Σᵢ xᵢ².   (2.66)
* Recall that an intercept can always normalize E(u) to 0 in the model with β₀.
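A final Python sketch (NumPy assumed, made-up data) that computes the error-variance estimate (2.61), the standard error of β̂₁, and the through-the-origin slope (2.66).

    import numpy as np

    def ols_with_se(x, y):
        """Simple-regression OLS with the error-variance estimate (2.61) and se(b1)."""
        b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
        b0 = y.mean() - b1 * x.mean()
        uhat = y - b0 - b1 * x
        sigma2_hat = np.sum(uhat ** 2) / (x.size - 2)            # unbiased, eq. (2.61)
        se_b1 = np.sqrt(sigma2_hat / np.sum((x - x.mean()) ** 2))
        return b0, b1, sigma2_hat, se_b1

    def slope_through_origin(x, y):
        """Slope of the no-intercept model y = b1*x, equation (2.66)."""
        return np.sum(x * y) / np.sum(x ** 2)

    rng = np.random.default_rng(5)
    x = rng.uniform(1, 10, 100)
    y = 2.0 + 0.4 * x + rng.normal(0, 1, 100)
    print(ols_with_se(x, y))
    print(slope_through_origin(x, y))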
