
Introduction to Econometrics

Email - awap@cam.ac.uk
Office - Room 46
Office hours - Tuesday 10 or by appointment
No office hour on 28th October or 4th November
Key readings
J. Wooldridge, Introductory Econometrics (5th edn)
J. Stock & M. Watson, Introduction to Econometrics (3rd edn)
S. Ross, Introductory Statistics (3rd edn)
C. Dougherty, Introduction to Econometrics (4th edn)
G. Koop, Analysis of Economic Data (4th edn)

Classes
Classes are in Weeks 2, 4, 6 and 8; they start 20 Oct
Two classes: Monday 9 and Tuesday 9
To even out attendance, please:
If your name begins with A-K, attend on Monday
If your name begins with L-Z, attend on Tuesday
Problem sheets will be handed out in lectures and
are available on the Web; answers are available after
the class

Introduction to econometrics
The application of statistical techniques to
economic data
Computer- rather than calculator-based
Reflected in the nature of the exercises and the final exam
Not just a course in econometric theory
Questions are mainly problems, not formal proofs
A take-home exam tests practical skills and counts for
40% of the total mark

Two main sections to the course

The random component is a sample of
independent observations (usually the case in
experimental sciences)
statistical basics in Lectures 1-3
multiple regression in Lectures 4-10
Additional concepts and techniques are needed
when data are time series
covered in Lectures 11-16

Statistical experiments
An observation on a random variable is the outcome of a
statistical experiment
We cannot predict the exact outcome of an individual
experiment, but we can predict the frequency of different outcomes
More usually we infer the underlying experimental mechanism
from a sample of outcomes
Here the experiment is observation of characteristics for a randomly
selected individual

The experimenter (in Economics)
does not control the characteristics
cannot measure all characteristics

Sampling distributions and Monte Carlo experiments
Generate artificial data samples from a known probability
distribution
Each experiment (replication) is a sample of N
primitive experiments
We can establish the sampling distribution of statistics
experimentally; this is useful if the sampling
distribution cannot be derived theoretically
Hence it is particularly useful for non-normal parent
distributions or statistics which are non-linear
functions of the data
The technique can be used for other statistics - e.g.
rejection probabilities
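As a sketch of the technique (an exponential parent distribution is chosen here purely for illustration), we can draw R replications of N observations each and check the Monte Carlo sampling distribution of the mean against theory:

```python
import numpy as np

rng = np.random.default_rng(0)
N, R = 100, 10_000   # observations per replication, number of replications

# Non-normal parent distribution: exponential with mean 1 and variance 1
samples = rng.exponential(scale=1.0, size=(R, N))
means = samples.mean(axis=1)          # one sample mean per replication

# Theory: E(mean) = 1 and SD(mean) = sigma / sqrt(N) = 0.1;
# the Monte Carlo estimates should be close to these values
print(means.mean())   # close to 1.0
print(means.std())    # close to 0.1
```

Even though the parent is skewed, the distribution of the sample mean is already close to normal at N = 100, which is what the figures below illustrate.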


Figure: Sampling distribution of the mean, 100 observations
(experimental vs theoretical density; x-axis from -0.4 to 0.4)

Figure: Sampling distribution of the mean, 1000 observations
(experimental vs theoretical density; x-axis from -0.4 to 0.4)

The Wage data

Data on 3296 working-age individuals in the US - covers
hourly wage
gender
years of schooling
years of work experience
We are particularly interested in
the mean wage
the male wage premium
We can treat the data as if it were the population, then
look at randomly chosen 4% samples
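The sampling exercise can be sketched as follows. Since the actual wage file is not reproduced here, the code uses a synthetic stand-in with a male premium built in; the logic (treat the data set as the population, draw repeated 4% samples) is what matters:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for the wage data (illustrative, not the real file):
# 3296 individuals, skewed wages, with a male premium built in
n_pop = 3296
male = rng.random(n_pop) < 0.5
wage = rng.lognormal(mean=1.5, sigma=0.5, size=n_pop) + 1.0 * male

pop_premium = wage[male].mean() - wage[~male].mean()

# Treat the data set as the population and draw repeated 4% samples
n_sample = int(0.04 * n_pop)
premiums = []
for _ in range(1000):
    idx = rng.choice(n_pop, size=n_sample, replace=False)
    w, m = wage[idx], male[idx]
    premiums.append(w[m].mean() - w[~m].mean())

# The Monte Carlo mean of the premium should sit near the population value
print(pop_premium, np.mean(premiums))
```

Individual 4% samples scatter widely around the population value, but their average across replications is close to it, as in the table below.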

Figure: Sampling distribution of mean wage
(histogram over 4% samples; bins from 4.75 to 7.56)

Figure: Sampling distribution of male wage premium
(histogram over 4% samples; bins from -0.82 to 4.25)

Estimating the premium

                   Population   Sample    MC mean
Mean wage          5.8164       5.9792    5.8182
Mean female wage   5.1469       4.9379    5.1436
Male premium       1.2777       1.9360    1.2889

Confidence Intervals for Male Premium

Figure: Upper bound, lower bound and point estimate of the male premium
for 12 samples (labelled Jan-Dec); 'Population' value = 1.2777
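A short simulation shows what the figure illustrates: roughly 95% of interval estimates cover the population value. Only the 'population' premium of 1.2777 is taken from the slides; the standard deviation and sample size below are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
N, R = 100, 5_000
mu, sigma = 1.2777, 2.0   # mu from the slides; sigma is illustrative

covered = 0
for _ in range(R):
    x = rng.normal(mu, sigma, size=N)
    half = 1.96 * x.std(ddof=1) / np.sqrt(N)   # approximate 95% interval
    covered += x.mean() - half <= mu <= x.mean() + half

print(covered / R)   # close to 0.95
```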

Figure: Drink consumption
(frequencies; values from 0 to 30)

Alternative Estimators
Samples of 30 drinkers - true value = 15

Estimator           Mean       Variance
Conventional mean   15.00774   1.056068
Median              15.00354   0.862354
Robust mean         15.00069   0.690245
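The pattern in the table (the median out-performing the conventional mean) is typical of heavy-tailed data. A rough sketch with an illustrative contaminated-normal parent, not the actual drink data:

```python
import numpy as np

rng = np.random.default_rng(3)
N, R = 30, 20_000   # 30 drinkers per sample, 20,000 replications

# Illustrative heavy-tailed parent centred on 15: mostly N(15, 2^2),
# with 10% contamination from a much wider N(15, 12^2) component
contaminated = rng.random((R, N)) < 0.1
x = np.where(contaminated,
             rng.normal(15, 12, size=(R, N)),
             rng.normal(15, 2, size=(R, N)))

means = x.mean(axis=1)
medians = np.median(x, axis=1)

# Both estimators are centred on 15, but the median is less sensitive
# to the outlier component, so its sampling variance is smaller here
print(means.mean(), medians.mean())
print(means.var(), medians.var())
```

For a normal parent without contamination the ranking reverses: the mean has the smaller variance, which is the efficiency result proved later in the lecture.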

Figure: Sampling densities of alternative estimators for the population mean
(mean, median and robust mean; x-axis from 10.0 to 20.0)

Estimators and Statistics

Any function of the sample data alone is a statistic
Two special types of statistic are covered in Lectures 2-3
Estimators
Test statistics
The aim is to present criteria for choosing particular
statistics, based on properties of the sampling distribution
The lecture considers simple cases (mean, t-ratio) but
the principles are general

Criteria for ranking estimators

To compare estimators we look at performance over the
average of all possible samples: the expected
performance, averaging over all values of x (with
appropriate probability weights)
Unbiasedness
To make inferences about the population parameter,
an estimator should be correct on average,
averaging over all possible samples

Unbiasedness of the sample mean

We can show the sample mean is an unbiased
estimator of the population mean
The sample mean (which is also an OLS estimator) is
$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$, so
$E(\bar{x}) = \frac{1}{N}\sum_{i=1}^{N} E(x_i) = \frac{1}{N} \cdot N\mu = \mu$
Any weighted average $\sum_i w_i x_i$ with $\sum_i w_i = 1$ is unbiased,
including an estimator based on a single observation
($w_1 = 1$, all other $w_i = 0$),
so unbiasedness is not a very restrictive criterion
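The point can be checked numerically (a sketch with illustrative values): an estimator that uses only the first observation is also unbiased, but far more variable than the sample mean:

```python
import numpy as np

rng = np.random.default_rng(4)
mu, R, N = 5.0, 100_000, 25

x = rng.normal(mu, 1.0, size=(R, N))
full_mean = x.mean(axis=1)   # the sample mean, using all N observations
first_obs = x[:, 0]          # "estimator" using only the first observation

# Both are unbiased: their averages over replications sit near mu = 5
print(full_mean.mean(), first_obs.mean())
# ...but the single-observation estimator has about N times the variance
print(first_obs.var() / full_mean.var())   # near N = 25
```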

Criteria for ranking estimators

Efficiency (minimum variance)
The variance of the estimator should be as small as
possible (this implies a symmetric squared error
loss function). For the sample mean,
$\mathrm{Var}(\bar{x}) = \sigma^2 / N$
We cannot show that this is less than the variance of
any other estimator of the population mean, but
we can show that it is less than the variance of
any other linear unbiased estimator

Efficiency of the sample mean

Define a linear estimator $\tilde{x} = \sum_{i=1}^{N} w_i x_i$;
unbiasedness requires $\sum_i w_i = 1$, and
$\mathrm{Var}(\tilde{x}) = \sigma^2 \sum_i w_i^2$
so to minimise the variance we must minimise
$\sum_i w_i^2$ subject to $\sum_i w_i = 1$,
which gives $w_i = 1/N$ for all $i$ - the sample mean
This result can be generalised to simple and multiple
linear regression (the Gauss-Markov Theorem).
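A quick numerical check of the result, using arbitrary illustrative weights: any weights summing to one give an unbiased linear estimator, but equal weights minimise its variance:

```python
import numpy as np

rng = np.random.default_rng(5)
R, N = 50_000, 10

w_equal = np.full(N, 1 / N)                    # the sample mean's weights
w_uneq = np.array([0.3, 0.2] + [0.5 / 8] * 8)  # arbitrary weights, sum to 1

x = rng.normal(0.0, 1.0, size=(R, N))

# Variance of a linear estimator is sigma^2 * sum(w_i^2); here sigma = 1
print((x @ w_equal).var())   # near sum(w_equal**2) = 1/N = 0.1
print((x @ w_uneq).var())    # near sum(w_uneq**2), about 0.161 - larger
```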

Variance, bias and MSE

Why not choose an estimator which is as close as
possible to the truth? Such an estimator would
minimise Mean Squared Error,
$\mathrm{MSE}(\hat{\theta}) = E[(\hat{\theta} - \theta)^2]$
But MSE depends on the unknown parameter $\theta$ as well
as the data, so it is not a statistic
Also we cannot choose an estimator which minimises MSE
for all values of $\theta$ - a counterexample is
$\hat{\theta} = 0$ regardless of sample,
which minimises MSE for $\theta = 0$, but not for other values

Decomposition of Mean Squared Error

MSE can be broken down into variance and bias
components:
$E[(\hat{\theta} - \theta)^2] = \mathrm{Var}(\hat{\theta}) + [\mathrm{Bias}(\hat{\theta})]^2$
So for an unbiased estimator, minimising variance is
equivalent to minimising MSE
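The decomposition can be verified numerically; a sketch using a deliberately biased (shrunk) estimator with illustrative values:

```python
import numpy as np

rng = np.random.default_rng(6)
mu, R, N = 2.0, 100_000, 20

x = rng.normal(mu, 1.0, size=(R, N))
shrunk = 0.8 * x.mean(axis=1)   # a deliberately biased estimator of mu

mse = ((shrunk - mu) ** 2).mean()
var = shrunk.var()              # ddof=0, so the identity below is exact
bias = shrunk.mean() - mu

# MSE = variance + bias^2 (exact for sample moments with ddof=0)
print(mse, var + bias ** 2)
```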
Follow-up reading: Wooldridge Appendix C.1,C.2
Stock & Watson 2.1, 2.2, 3.1

