
Introduction to Econometrics

Email - awap@cam.ac.uk
Office - Room 46
Office hours - Tuesday 10 or by appointment
No office hour on 28th October or 4th November
Key readings
J. Wooldridge, Introductory Econometrics (5th edn)
J. Stock & M. Watson, Introduction to Econometrics (3rd edn)
S. Ross, Introductory Statistics (3rd edn)
C. Dougherty, Introduction to Econometrics (4th edn)
G. Koop, Analysis of Economic Data (4th edn)

Classes
Classes are in Weeks 2, 4, 6 and 8; they start 20 Oct
Two classes: Monday 9 and Tuesday 9
To even out attendance, please:
If your name begins with A-K, attend on Monday
If your name begins with L-Z, attend on Tuesday
Problem sheets will be handed out in lectures and
are available on the Web; answers are available after
the class

Introduction to econometrics
The application of statistical techniques to
economic data
Computer- rather than calculator-based
Reflected in the nature of the exercises and the final exam
Not just a course in econometric theory
Questions are mainly problems, not formal proofs
A take-home exam tests practical skills and counts for
40% of the total mark

Two main sections to the course

The random component is a sample of
independent observations (usually the case in
experimental sciences)
statistical basics in Lectures 1-3
multiple regression in Lectures 4-10
Additional concepts and techniques are needed
when data are time series
covered in Lectures 11-16

Statistical experiments
An observation on a random variable is the outcome of a
statistical experiment
We cannot predict the exact outcome of an individual
experiment, but we can predict the frequency of different outcomes
More usually we infer the underlying experimental mechanism
from a sample of outcomes
Here the experiment is observation of characteristics for a randomly
selected individual

The experimenter (in Economics)
does not control the characteristics
cannot measure all characteristics

Sampling distributions and Monte Carlo experiments
Generate artificial data samples from a known probability
distribution
Each experiment (replication) is a sample of N
primitive experiments
We can establish the sampling distribution of statistics
experimentally; this is useful if the sampling
distribution cannot be derived theoretically
Hence it is particularly useful for non-normal parent
distributions or statistics which are non-linear
functions of the data
The technique can be used for other statistics - e.g.
rejection probabilities
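As a sketch of the technique (an exponential parent distribution is chosen here purely for illustration), we can draw R replications of N observations each and check the Monte Carlo sampling distribution of the mean against theory:

```python
import numpy as np

rng = np.random.default_rng(0)
N, R = 100, 10_000   # observations per replication, number of replications

# Non-normal parent distribution: exponential with mean 1 and variance 1
samples = rng.exponential(scale=1.0, size=(R, N))
means = samples.mean(axis=1)          # one sample mean per replication

# Theory: E(mean) = 1 and SD(mean) = sigma / sqrt(N) = 0.1;
# the Monte Carlo estimates should be close to these values
print(means.mean())   # close to 1.0
print(means.std())    # close to 0.1
```

Even though the parent is skewed, the distribution of the sample mean is already close to normal at N = 100, which is what the figures below illustrate.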


Figure: Sampling distribution of the mean, 100 observations
(experimental vs theoretical density; x-axis from -0.4 to 0.4)

Figure: Sampling distribution of the mean, 1000 observations
(experimental vs theoretical density; x-axis from -0.4 to 0.4)

The Wage data

Data on 3296 working-age individuals in the US - covers
hourly wage
gender
years of schooling
years of work experience
We are particularly interested in
the mean wage
the male wage premium
We can treat the data as if it were the population, then
look at randomly chosen 4% samples
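The sampling exercise can be sketched as follows. Since the actual wage file is not reproduced here, the code uses a synthetic stand-in with a male premium built in; the logic (treat the data set as the population, draw repeated 4% samples) is what matters:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for the wage data (illustrative, not the real file):
# 3296 individuals, skewed wages, with a male premium built in
n_pop = 3296
male = rng.random(n_pop) < 0.5
wage = rng.lognormal(mean=1.5, sigma=0.5, size=n_pop) + 1.0 * male

pop_premium = wage[male].mean() - wage[~male].mean()

# Treat the data set as the population and draw repeated 4% samples
n_sample = int(0.04 * n_pop)
premiums = []
for _ in range(1000):
    idx = rng.choice(n_pop, size=n_sample, replace=False)
    w, m = wage[idx], male[idx]
    premiums.append(w[m].mean() - w[~m].mean())

# The Monte Carlo mean of the premium should sit near the population value
print(pop_premium, np.mean(premiums))
```

Individual 4% samples scatter widely around the population value, but their average across replications is close to it, as in the table below.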

Figure: Sampling distribution of mean wage
(histogram over 4% samples; bins from 4.75 to 7.56)

Figure: Sampling distribution of male wage premium
(histogram over 4% samples; bins from -0.82 to 4.25)

Estimating the premium

                   Population   Sample    MC mean
Mean wage          5.8164       5.9792    5.8182
Mean female wage   5.1469       4.9379    5.1436
Male premium       1.2777       1.9360    1.2889

Confidence Intervals for Male Premium

Figure: Upper bound, lower bound and point estimate of the male premium
for 12 samples (labelled Jan-Dec); 'Population' value = 1.2777
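A short simulation shows what the figure illustrates: roughly 95% of interval estimates cover the population value. Only the 'population' premium of 1.2777 is taken from the slides; the standard deviation and sample size below are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
N, R = 100, 5_000
mu, sigma = 1.2777, 2.0   # mu from the slides; sigma is illustrative

covered = 0
for _ in range(R):
    x = rng.normal(mu, sigma, size=N)
    half = 1.96 * x.std(ddof=1) / np.sqrt(N)   # approximate 95% interval
    covered += x.mean() - half <= mu <= x.mean() + half

print(covered / R)   # close to 0.95
```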

Figure: Drink consumption
(frequencies; values from 0 to 30)

Alternative Estimators
Samples of 30 drinkers - true value = 15

Estimator           Mean       Variance
Conventional mean   15.00774   1.056068
Median              15.00354   0.862354
Robust mean         15.00069   0.690245
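The pattern in the table (the median out-performing the conventional mean) is typical of heavy-tailed data. A rough sketch with an illustrative contaminated-normal parent, not the actual drink data:

```python
import numpy as np

rng = np.random.default_rng(3)
N, R = 30, 20_000   # 30 drinkers per sample, 20,000 replications

# Illustrative heavy-tailed parent centred on 15: mostly N(15, 2^2),
# with 10% contamination from a much wider N(15, 12^2) component
contaminated = rng.random((R, N)) < 0.1
x = np.where(contaminated,
             rng.normal(15, 12, size=(R, N)),
             rng.normal(15, 2, size=(R, N)))

means = x.mean(axis=1)
medians = np.median(x, axis=1)

# Both estimators are centred on 15, but the median is less sensitive
# to the outlier component, so its sampling variance is smaller here
print(means.mean(), medians.mean())
print(means.var(), medians.var())
```

For a normal parent without contamination the ranking reverses: the mean has the smaller variance, which is the efficiency result proved later in the lecture.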

Figure: Sampling densities of alternative estimators for the population mean
(mean, median and robust mean; x-axis from 10.0 to 20.0)

Estimators and Statistics

Any function of the sample data alone is a statistic
Two special types of statistic are covered in Lectures 2-3
Estimators
Test statistics
The aim is to present criteria for choosing particular
statistics, based on properties of the sampling distribution
The lecture considers simple cases (mean, t-ratio) but
the principles are general

Criteria for ranking estimators

To compare estimators we look at performance over the
average of all possible samples: the expected
performance, averaging over all values of x (with
appropriate probability weights)
Unbiasedness
To make inferences about the population parameter,
an estimator should be correct on average,
averaging over all possible samples

Unbiasedness of the sample mean

We can show the sample mean is an unbiased
estimator of the population mean
The sample mean (which is also an OLS estimator) is
$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$, so
$E(\bar{x}) = \frac{1}{N}\sum_{i=1}^{N} E(x_i) = \frac{1}{N} \cdot N\mu = \mu$
Any weighted average $\sum_i w_i x_i$ with $\sum_i w_i = 1$ is unbiased,
including an estimator based on a single observation
($w_1 = 1$, all other $w_i = 0$),
so unbiasedness is not a very restrictive criterion
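The point can be checked numerically (a sketch with illustrative values): an estimator that uses only the first observation is also unbiased, but far more variable than the sample mean:

```python
import numpy as np

rng = np.random.default_rng(4)
mu, R, N = 5.0, 100_000, 25

x = rng.normal(mu, 1.0, size=(R, N))
full_mean = x.mean(axis=1)   # the sample mean, using all N observations
first_obs = x[:, 0]          # "estimator" using only the first observation

# Both are unbiased: their averages over replications sit near mu = 5
print(full_mean.mean(), first_obs.mean())
# ...but the single-observation estimator has about N times the variance
print(first_obs.var() / full_mean.var())   # near N = 25
```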

Criteria for ranking estimators

Efficiency (minimum variance)
The variance of the estimator should be as small as
possible (this implies a symmetric squared error
loss function). For the sample mean,
$\mathrm{Var}(\bar{x}) = \sigma^2 / N$
We cannot show that this is less than the variance of
any other estimator of the population mean, but
we can show that it is less than the variance of
any other linear unbiased estimator

Efficiency of the sample mean

Define a linear estimator $\tilde{x} = \sum_{i=1}^{N} w_i x_i$;
unbiasedness requires $\sum_i w_i = 1$, and
$\mathrm{Var}(\tilde{x}) = \sigma^2 \sum_i w_i^2$
so to minimise the variance we must minimise
$\sum_i w_i^2$ subject to $\sum_i w_i = 1$,
which gives $w_i = 1/N$ for all $i$ - the sample mean
This result can be generalised to simple and multiple
linear regression (the Gauss-Markov Theorem).
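A quick numerical check of the result, using arbitrary illustrative weights: any weights summing to one give an unbiased linear estimator, but equal weights minimise its variance:

```python
import numpy as np

rng = np.random.default_rng(5)
R, N = 50_000, 10

w_equal = np.full(N, 1 / N)                    # the sample mean's weights
w_uneq = np.array([0.3, 0.2] + [0.5 / 8] * 8)  # arbitrary weights, sum to 1

x = rng.normal(0.0, 1.0, size=(R, N))

# Variance of a linear estimator is sigma^2 * sum(w_i^2); here sigma = 1
print((x @ w_equal).var())   # near sum(w_equal**2) = 1/N = 0.1
print((x @ w_uneq).var())    # near sum(w_uneq**2), about 0.161 - larger
```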

Variance, bias and MSE

Why not choose an estimator which is as close as
possible to the truth? Such an estimator would
minimise Mean Squared Error,
$\mathrm{MSE}(\hat{\theta}) = E[(\hat{\theta} - \theta)^2]$
But MSE depends on the unknown parameter $\theta$ as well
as the data, so it is not a statistic
Also we cannot choose an estimator which minimises MSE
for all values of $\theta$ - a counterexample is
$\hat{\theta} = 0$ regardless of sample,
which minimises MSE for $\theta = 0$, but not for other values

Decomposition of Mean Squared Error

MSE can be broken down into variance and bias
components:
$E[(\hat{\theta} - \theta)^2] = \mathrm{Var}(\hat{\theta}) + [\mathrm{Bias}(\hat{\theta})]^2$
So for an unbiased estimator, minimising variance is
equivalent to minimising MSE
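The decomposition can be verified numerically; a sketch using a deliberately biased (shrunk) estimator with illustrative values:

```python
import numpy as np

rng = np.random.default_rng(6)
mu, R, N = 2.0, 100_000, 20

x = rng.normal(mu, 1.0, size=(R, N))
shrunk = 0.8 * x.mean(axis=1)   # a deliberately biased estimator of mu

mse = ((shrunk - mu) ** 2).mean()
var = shrunk.var()              # ddof=0, so the identity below is exact
bias = shrunk.mean() - mu

# MSE = variance + bias^2 (exact for sample moments with ddof=0)
print(mse, var + bias ** 2)
```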
Follow-up reading: Wooldridge Appendix C.1,C.2
Stock & Watson 2.1, 2.2, 3.1

