

Ch.2 The simple regression model


The Simple Regression Model

y = β₀ + β₁x + u

1. Definition of the simple regression model
2. Deriving the OLS estimates
3. Mechanics of OLS
4. Units of measurement & functional form
5. Expected values & variances of OLSE
6. Regression through the origin

2.1 Definition of the model


Equation (2.1), y = β₀ + β₁x + u, defines the simple regression model.
In the model, we typically refer to
  y as the dependent variable,
  x as the independent variable,
  β₀ and β₁ as the parameters, and
  u as the error term.

The Concept of the Error Term

u represents factors other than x that affect y.
If the other factors in u are held fixed, so that Δu = 0, then Δy = β₁Δx.
Ex. 2.1: yield = β₀ + β₁·fertilizer + u   (2.3)
  Here u includes land quality, rainfall, etc.
Ex. 2.2: wage = β₀ + β₁·educ + u   (2.4)
  Here u includes experience, ability, tenure, etc.

A Simple Assumption for u

The average value of u, the error term, in the population is 0. That is, E(u) = 0.
This is not a restrictive assumption, since we can always use β₀ to normalize E(u) to 0.
To draw ceteris paribus conclusions about how x affects y, we have to hold all other factors (in u) fixed.
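As a concrete illustration of the pieces of the model, namely the parameters β₀ and β₁ and the unobserved error u with E(u) = 0, here is a minimal Python sketch that generates data from a wage equation like (2.4). NumPy is assumed, and all parameter values are made up for illustration.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical population parameters (illustration only)
    beta0, beta1 = 1.5, 0.8        # intercept and slope
    n = 500                        # sample size

    educ = rng.uniform(8, 20, size=n)            # x: years of education
    u = rng.normal(loc=0.0, scale=2.0, size=n)   # error term, E(u) = 0
    wage = beta0 + beta1 * educ + u              # y generated by the model (2.4)

    print(u.mean())   # close to 0 in a large sample, mirroring E(u) = 0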

Zero Conditional Mean

We need to make a crucial assumption about how u and x are related.
We want it to be the case that knowing something about x does not give us any information about u, so that they are completely unrelated. That is,
E(u|x) = E(u) = 0.   (2.5 & 2.6)

[Figure 2.1: E(y|x) as a linear function of x, where for any x the distribution of y is centered about E(y|x) = β₀ + β₁x.]

2.2 Deriving the OLSE

The zero conditional mean assumption, E(u|x) = E(u) = 0 (2.5 & 2.6), implies
E(y|x) = β₀ + β₁x,   (2.8)
the population regression function (PRF).
The basic idea of regression is to estimate the population parameters from a sample.
Let {(xᵢ, yᵢ): i = 1, …, n} denote a random sample of size n from the population.
For each observation in this sample, it will be the case that
yᵢ = β₀ + β₁xᵢ + uᵢ.   (2.9)

[Figure 2.2: Population regression line, sample data points, and the associated error terms u₁, …, u₄.]

Deriving OLSE using MM

To derive the OLS estimates, we need to realize that our main assumption,
E(u|x) = E(u) = 0, also implies that
Cov(x, u) = E(xu) = 0,
because Cov(X, Y) = E(XY) − E(X)E(Y).   (B.27)
Now we have two restrictions with which to estimate the βs:
E(u) = 0    (2.10)
E(xu) = 0   (2.11)

Cont.

Since u = y − β₀ − β₁x, we can rewrite the restrictions as
E(y − β₀ − β₁x) = 0        (2.12)
E[x(y − β₀ − β₁x)] = 0     (2.13)
These are called moment restrictions.
The approach to estimation imposes the population moment restrictions on the sample moments: a sample estimator of E(X), the mean of a population distribution, is simply the arithmetic mean of the sample.
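To make the method-of-moments step concrete, the following Python sketch (NumPy assumed, made-up data) imposes the sample analogues of (2.12) and (2.13) and solves the resulting two linear equations for the estimates of β₀ and β₁.

    import numpy as np

    def mm_estimates(x, y):
        """Solve the sample analogues of E(y - b0 - b1*x) = 0 and
        E[x*(y - b0 - b1*x)] = 0 for (b0, b1)."""
        # Written as a 2x2 linear system A @ [b0, b1] = c:
        #   b0           + b1*mean(x)   = mean(y)
        #   b0*mean(x)   + b1*mean(x^2) = mean(x*y)
        A = np.array([[1.0, x.mean()],
                      [x.mean(), (x * x).mean()]])
        c = np.array([y.mean(), (x * y).mean()])
        return np.linalg.solve(A, c)

    # Made-up data for illustration
    rng = np.random.default_rng(1)
    x = rng.uniform(0, 10, 200)
    y = 2.0 + 0.5 * x + rng.normal(0, 1, 200)
    b0_hat, b1_hat = mm_estimates(x, y)
    print(b0_hat, b1_hat)   # close to the true values 2.0 and 0.5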

More Derivation of OLS

We want to choose values of the parameters that ensure that the sample versions of our moment restrictions hold.
The sample versions are as follows:
(1/n) Σᵢ (yᵢ − β̂₀ − β̂₁xᵢ) = 0      (2.14)
(1/n) Σᵢ xᵢ(yᵢ − β̂₀ − β̂₁xᵢ) = 0    (2.15)

Cont.

Given the definition of a sample mean and properties of summation, we can rewrite the first condition as follows:
ȳ = β̂₀ + β̂₁x̄   (2.16),  or  β̂₀ = ȳ − β̂₁x̄.   (2.17)
So the OLS estimated slope is
β̂₁ = Σᵢ (xᵢ − x̄)(yᵢ − ȳ) / Σᵢ (xᵢ − x̄)².   (2.19)

Summary of OLS slope estimate

The slope estimate is the sample covariance between x and y divided by the sample variance of x.
If x and y are positively (negatively) correlated, the slope will be positive (negative).
x needs to vary in our sample.

Alternate approach to derivation

Given the intuitive idea of fitting a line, we can set up a formal minimization problem: choose β̂₀ and β̂₁ to minimize the sum of squared residuals,
Σᵢ (yᵢ − β̂₀ − β̂₁xᵢ)².   (2.22)
The first order conditions are the same as (2.14) & (2.15), apart from the factor 1/n.

More OLS

Intuitively, OLS fits a line through the sample points such that the sum of squared residuals is as small as possible, hence the term "least squares."
The residual, ûᵢ, is an estimate of the error term, uᵢ, and is the difference between the sample point and the fitted line (the sample regression function).
See (2.18) & Figure (2.3).

[Figure 2.3: Sample regression line, sample data points, and the associated estimated error terms (residuals) û₁, …, û₄.]
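A minimal Python sketch (NumPy assumed, made-up data) of the closed-form estimates (2.17) and (2.19); it also checks that the sample moment conditions (2.14) and (2.15) hold at the estimates.

    import numpy as np

    def ols_simple(x, y):
        """OLS intercept and slope for y = b0 + b1*x + u, via (2.19) and (2.17)."""
        b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
        b0 = y.mean() - b1 * x.mean()
        return b0, b1

    rng = np.random.default_rng(2)
    x = rng.normal(5, 2, 300)
    y = 1.0 + 0.7 * x + rng.normal(0, 1, 300)

    b0, b1 = ols_simple(x, y)
    resid = y - b0 - b1 * x
    print(b0, b1)
    print(resid.mean())          # ~0, sample analogue of (2.14)
    print(np.mean(x * resid))    # ~0, sample analogue of (2.15)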

2.3 Properties of OLS

Algebraic Properties of OLS

1. The sum of the OLS residuals is zero: Σᵢ ûᵢ = 0, and thus the sample average of the OLS residuals is zero as well: (1/n) Σᵢ ûᵢ = 0.   (2.30)
2. The sample covariance between the regressors and the OLS residuals is zero: Σᵢ xᵢûᵢ = 0.   (2.31)
3. The OLS regression line always goes through the mean of the sample: ȳ = β̂₀ + β̂₁x̄.

Cont.

We can think of each observation as being made up of an explained part and an unexplained part,
yᵢ = ŷᵢ + ûᵢ.   (2.32)
Then we define the following:
Σᵢ (yᵢ − ȳ)² = SST   (2.33)
Σᵢ (ŷᵢ − ȳ)² = SSE   (2.34)
Σᵢ ûᵢ² = SSR         (2.35)
Then, SST = SSE + SSR.   (2.36)

Goodness-of-Fit

It is useful to think about how well the sample regression line fits the sample data.
From (2.36),
R² = SSE/SST = 1 − SSR/SST.   (2.38)
R² indicates the fraction of the sample variation in yᵢ that is explained by the model.

2.4 Measurement Units & Functional Form

If we use the model y* = β₀* + β₁*x* + u* instead of y = β₀ + β₁x + u, where y* = c·y and x* = d·x, we get
β₀* = c·β₀  and  β₁* = (c/d)·β₁.
Similarly, in the model where y* = ln y and x* = ln x, the slope β₁* measures the elasticity of y with respect to x: the percentage change in y associated with a 1% change in x.

2.5 Means & Variance of OLSE

Now we view the β̂ᵢ as estimators for the parameters βᵢ that appear in the population, which means studying the properties of the distributions of the β̂ᵢ over different random samples from the population.

Unbiasedness of OLS
Unbiased estimator: an estimator whose expected value (the mean of its sampling distribution) equals the population value, regardless of what that population value is.
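Before turning to unbiasedness, here is a short Python sketch (NumPy assumed, made-up data) of the SST = SSE + SSR decomposition and R² in (2.33)–(2.38), together with a numerical check of the rescaling result β₀* = cβ₀ and β₁* = (c/d)β₁.

    import numpy as np

    def ols_simple(x, y):
        b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
        return y.mean() - b1 * x.mean(), b1

    rng = np.random.default_rng(3)
    x = rng.uniform(0, 10, 200)
    y = 3.0 + 1.2 * x + rng.normal(0, 2, 200)

    b0, b1 = ols_simple(x, y)
    yhat = b0 + b1 * x
    uhat = y - yhat

    SST = np.sum((y - y.mean()) ** 2)
    SSE = np.sum((yhat - y.mean()) ** 2)
    SSR = np.sum(uhat ** 2)
    R2 = 1 - SSR / SST                # equals SSE/SST, equation (2.38)
    print(np.isclose(SST, SSE + SSR), R2)

    # Rescaling: y* = c*y, x* = d*x  =>  b0* = c*b0, b1* = (c/d)*b1
    c, d = 100.0, 0.5
    b0s, b1s = ols_simple(d * x, c * y)
    print(np.isclose(b0s, c * b0), np.isclose(b1s, (c / d) * b1))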

Unbiasedness of OLS

Assumptions for unbiasedness:
1. Linear in parameters: y = β₀ + β₁x + u.
2. Random sampling: {(xᵢ, yᵢ): i = 1, 2, …, n}, so that yᵢ = β₀ + β₁xᵢ + uᵢ.
3. Sample variation in the xᵢ: Σᵢ (xᵢ − x̄)² > 0.
4. Zero conditional mean: E(u|x) = 0.
Under these assumptions, β̂₁ = Σᵢ (xᵢ − x̄)yᵢ / Σᵢ (xᵢ − x̄)² satisfies E(β̂₁) = β₁.

Cont.

In order to think about unbiasedness, we need to rewrite our estimator in terms of the population parameter:
β̂₁ = β₁ + Σᵢ (xᵢ − x̄)uᵢ / Σᵢ (xᵢ − x̄)².   (2.49), (2.52)
Taking the expectation conditional on the xᵢ,
E(β̂₁) = β₁ + Σᵢ (xᵢ − x̄)E(uᵢ|x) / Σᵢ (xᵢ − x̄)² = β₁.   (2.53)
* We can also get E(β̂₀) = β₀ in the same way.

Unbiasedness Summary

The OLS estimates of β₁ and β₀ are unbiased.
The proof of unbiasedness depends on our four assumptions; if any assumption fails, then OLS is not necessarily unbiased.
Remember, unbiasedness is a description of the estimator; in a given sample our estimate may be near or far from the true parameter.

Variances of the OLS Estimators

Now we know that the sampling distribution of our estimator is centered around the true parameter.
We want to think about how spread out this distribution is.
It is much easier to think about this variance under an additional assumption, so assume
5. Var(u|x) = σ²   (homoskedasticity).

[Figure: the homoskedastic case, in which f(y|x) has the same spread about E(y|x) = β₀ + β₁x at every value of x.]

Variance of OLSE

σ² is also the unconditional variance, called the error variance, since
Var(u|x) = E(u²|x) − [E(u|x)]²,
and E(u|x) = 0, so σ² = E(u²|x) = E(u²) = Var(u).
σ, the square root of the error variance, is called the standard deviation of the error.
Then we can say
E(y|x) = β₀ + β₁x  and  Var(y|x) = σ².
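A small Monte Carlo sketch in Python (NumPy assumed, made-up parameter values) illustrating both points: the sampling distribution of β̂₁ is centered at the true β₁ (unbiasedness), and its spread shrinks when there is more variation in x.

    import numpy as np

    rng = np.random.default_rng(4)
    beta0, beta1, sigma, n = 1.0, 0.5, 1.0, 50   # hypothetical population values

    def draw_slope(x_scale):
        """Draw one random sample from the population and return the OLS slope."""
        x = rng.normal(0, x_scale, n)        # larger x_scale -> more variation in x
        u = rng.normal(0, sigma, n)          # homoskedastic errors with E(u|x) = 0
        y = beta0 + beta1 * x + u
        return np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

    for x_scale in (1.0, 3.0):
        slopes = np.array([draw_slope(x_scale) for _ in range(5000)])
        # Mean is close to beta1 (unbiasedness); variance falls as x varies more
        print(x_scale, slopes.mean(), slopes.var())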

Heteroskedastic Case

[Figure: the heteroskedastic case, in which the spread of f(y|x) about E(y|x) = β₀ + β₁x changes with x.]

Variance of OLSE (cont.)

Var(β̂₁) = σ² / Σᵢ (xᵢ − x̄)²   (2.57)
The larger the error variance, σ², the larger the variance of the slope estimate.
The larger the variability in the xᵢ, the smaller the variance of the slope estimate.
As a result, a larger sample size should decrease the variance of the slope estimate.

Estimating the Error Variance

We don't know what the error variance, σ², is, because we don't observe the errors, uᵢ.
What we observe are only the residuals, ûᵢ, not the errors.
So we can use the residuals to form an estimate of the error variance.

Cont. Error Variance

ûᵢ = yᵢ − β̂₀ − β̂₁xᵢ = (β₀ + β₁xᵢ + uᵢ) − β̂₀ − β̂₁xᵢ = uᵢ − (β̂₀ − β₀) − (β̂₁ − β₁)xᵢ
Then, an unbiased estimator of σ² is
σ̂² = (1/(n − 2)) Σᵢ ûᵢ².   (2.61)
σ̂ is called the standard error of the regression.
Recall that sd(β̂₁) = √Var(β̂₁) from (2.57). If we substitute σ̂ for σ, then we have the standard error of β̂₁,
se(β̂₁) = σ̂ / √(Σᵢ (xᵢ − x̄)²).

2.6 Regression through the Origin

Now, consider the model without an intercept:
ỹ = β̃₁x.   (2.63)
Solving the FOC of the minimization problem, the OLS estimated slope is
β̃₁ = Σᵢ xᵢyᵢ / Σᵢ xᵢ².   (2.66)
* Recall that an intercept can always normalize E(u) to 0 in the model with β₀.
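A final Python sketch (NumPy assumed, made-up data) that computes the error-variance estimate (2.61), the standard error of β̂₁, and the through-the-origin slope (2.66).

    import numpy as np

    def ols_with_se(x, y):
        """Simple-regression OLS with the error-variance estimate (2.61) and se(b1)."""
        b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
        b0 = y.mean() - b1 * x.mean()
        uhat = y - b0 - b1 * x
        sigma2_hat = np.sum(uhat ** 2) / (x.size - 2)            # unbiased, eq. (2.61)
        se_b1 = np.sqrt(sigma2_hat / np.sum((x - x.mean()) ** 2))
        return b0, b1, sigma2_hat, se_b1

    def slope_through_origin(x, y):
        """Slope of the no-intercept model y = b1*x, equation (2.66)."""
        return np.sum(x * y) / np.sum(x ** 2)

    rng = np.random.default_rng(5)
    x = rng.uniform(1, 10, 100)
    y = 2.0 + 0.4 * x + rng.normal(0, 1, 100)
    print(ols_with_se(x, y))
    print(slope_through_origin(x, y))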
