Probability limits

$$\lim_{n\to\infty} P\big(|X_n - a| > \varepsilon\big) = 0 \quad \text{for any } \varepsilon > 0$$
We will start with an abstract definition of a probability limit and then illustrate it with a simple
example.
$$\operatorname{plim} X_n = a$$
The constant a is described as the probability limit of the sequence, usually abbreviated as
plim.
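To see the definition in action, here is a minimal simulation sketch (the sequence, parameter values, and tolerance are my own illustrative choices, not from the slides). It draws a sequence $X_n \sim N(a, 1/n)$ and estimates $P(|X_n - a| > \varepsilon)$ by Monte Carlo; the probability shrinks toward zero, so $\operatorname{plim} X_n = a$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sequence: X_n ~ Normal(a, 1/n), which should converge
# in probability to a. Estimate P(|X_n - a| > eps) by Monte Carlo.
a, eps, reps = 100.0, 0.5, 100_000

for n in [1, 10, 100, 1_000, 10_000]:
    draws = rng.normal(loc=a, scale=1.0 / np.sqrt(n), size=reps)
    tail_prob = np.mean(np.abs(draws - a) > eps)
    print(f"n = {n:6d}: estimated P(|X_n - a| > {eps}) = {tail_prob:.4f}")
```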
[Figure: probability density of the sample mean for n = 1, with standard deviation 50; horizontal axis from 50 to 200]
We will take as our example the mean, $\bar{X}$, of a sample of observations generated from a random variable X with population mean $\mu_X$ and variance $\sigma_X^2$. We will investigate how $\bar{X}$ behaves as the sample size n becomes large.
For convenience we shall assume that X has a normal distribution, but this does not affect the analysis. If X has a normal distribution with mean $\mu_X$ and variance $\sigma_X^2$, $\bar{X}$ will have a normal distribution with mean $\mu_X$ and variance $\sigma_X^2 / n$.
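For completeness, the mean and variance of $\bar{X}$ follow from the basic expectation and variance rules (a standard derivation; the variance step uses the independence of the observations):

$$E(\bar{X}) = E\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \mu_X, \qquad \operatorname{var}(\bar{X}) = \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{var}(X_i) = \frac{\sigma_X^2}{n}.$$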
For the purposes of this example, we will suppose that X has population mean 100 and
standard deviation 50, as in the diagram.
The sample mean will have the same population mean as X, but its standard deviation will be $50/\sqrt{n}$, where n is the number of observations in the sample.
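A quick check of $50/\sqrt{n}$ for the sample sizes used in the figures (a minimal sketch) reproduces the standard deviations quoted there:

```python
import math

# Standard deviation of the sample mean is 50 / sqrt(n).
for n in [1, 4, 25, 100, 1000, 5000]:
    print(f"n = {n:5d}: sd of sample mean = {50 / math.sqrt(n):.2f}")
# Prints 50.00, 25.00, 10.00, 5.00, 1.58, 0.71
```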
The larger is the sample, the smaller will be the standard deviation of the sample mean.
If n is equal to 1, the sample consists of a single observation. $\bar{X}$ is the same as X and its standard deviation is 50.
[Figure: distributions of the sample mean for n = 1 and n = 4, with standard deviations 50 and 25]
We will see how the shape of the distribution changes as the sample size is increased.
[Figure: distributions of the sample mean for n = 1, 4, and 25, with standard deviations 50, 25, and 10]
[Figure: distributions of the sample mean for n = 1, 4, 25, and 100, with standard deviations 50, 25, 10, and 5]
To see what happens for n greater than 100, we will have to change the vertical scale.
[Figure: the same distributions for n up to 100, redrawn with an enlarged vertical scale]
[Figure: distributions of the sample mean for n up to 1000; the n = 1000 curve has standard deviation 1.6]
[Figure: distributions of the sample mean for n up to 5000; the n = 5000 curve has standard deviation 0.7]
In the limit, the variance of the distribution tends to zero. The distribution collapses to a
spike at the true value. The plim of the sample mean is therefore the population mean.
$$\lim_{n\to\infty} P\big(|\bar{X} - \mu_X| > \varepsilon\big) = 0$$
Formally, the probability of $\bar{X}$ differing from $\mu_X$ by any finite amount, however small, tends to zero as n becomes large.
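Under the normality assumption made earlier, this probability can be written out explicitly for the example, where the standard deviation of X is 50 (a worked step added for completeness):

$$P\big(|\bar{X} - \mu_X| > \varepsilon\big) = 2\left(1 - \Phi\!\left(\frac{\varepsilon\sqrt{n}}{50}\right)\right) \longrightarrow 0 \quad \text{as } n \to \infty,$$

where $\Phi$ is the standard normal cumulative distribution function. For any fixed $\varepsilon > 0$, the argument $\varepsilon\sqrt{n}/50$ grows without bound, so the tail probability vanishes.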
$$\operatorname{plim} \bar{X} = \mu_X$$
Consistency
An estimator of a population characteristic is said to be consistent if it satisfies two conditions:
(1) It possesses a probability limit, and so its distribution collapses to a spike as the sample size becomes large, and
(2) The spike is located at the true value of the population characteristic.
The sample mean in our example satisfies both conditions and so it is a consistent estimator of $\mu_X$. Most standard estimators in simple applications satisfy the first condition because their variances tend to zero as the sample size becomes large.
The only issue then is whether the distribution collapses to a spike at the true value of the
population characteristic. A sufficient condition for consistency is that the estimator
should be unbiased and that its variance should tend to zero as n becomes large.
It is easy to see why this is a sufficient condition. If the estimator is unbiased for a finite
sample, it must stay unbiased as the sample size becomes large.
Meanwhile, if the variance of its distribution is decreasing, its distribution must collapse to
a spike. Since the estimator remains unbiased, this spike must be located at the true value.
The sample mean is an example of an estimator that satisfies this sufficient condition.
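The intuition can be made precise with Chebyshev's inequality (a standard argument, added here for completeness). For an unbiased estimator $\hat{\theta}$ of $\theta$,

$$P\big(|\hat{\theta} - \theta| > \varepsilon\big) \le \frac{\operatorname{var}(\hat{\theta})}{\varepsilon^2} \longrightarrow 0 \quad \text{as } n \to \infty,$$

so if the variance tends to zero, the estimator converges in probability to the true value.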
[Figure: distribution of a biased estimator for n = 20]
However, the condition is only sufficient, not necessary. It is possible that an estimator may be biased in a finite sample, with the bias becoming smaller as the sample size increases,
[Figure: distributions of the biased estimator for n = 20, 100, and 1000]
to the point where the bias disappears altogether as the sample size tends to infinity.
Such an estimator is biased for finite samples but nevertheless consistent because its
distribution collapses to a spike at the true value.
Consistency

$$Z = \frac{1}{n+1}\sum_{i=1}^{n} X_i$$
A simple example of an estimator that is biased in finite samples but consistent is shown above. We are supposing that X is a random variable with unknown population mean $\mu_X$ and that we wish to estimate $\mu_X$.
$$E(Z) = \frac{n}{n+1}\,\mu_X$$
The estimator is biased for finite samples because its expected value is $n\mu_X/(n+1)$. But as n tends to infinity, $n/(n+1)$ tends to 1 and the estimator becomes unbiased.
$$\operatorname{var}(Z) = \frac{n}{(n+1)^2}\,\sigma_X^2$$
The variance of the estimator is given by the expression shown. This tends to zero as n
tends to infinity. Thus Z is consistent because its distribution collapses to a spike at the
true value.
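Both expressions follow from the basic expectation and variance rules, assuming independent observations (a short derivation added for completeness):

$$E(Z) = \frac{1}{n+1}\sum_{i=1}^{n} E(X_i) = \frac{n}{n+1}\,\mu_X, \qquad \operatorname{var}(Z) = \frac{1}{(n+1)^2}\sum_{i=1}^{n} \operatorname{var}(X_i) = \frac{n}{(n+1)^2}\,\sigma_X^2.$$

Since $E(Z) \to \mu_X$ and $\operatorname{var}(Z) \to 0$ as $n \to \infty$, it follows that $\operatorname{plim} Z = \mu_X$.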
Consistency
In practice we deal with finite samples, not infinite ones. So why should we be interested in whether an estimator is consistent?
One reason is that sometimes it is impossible to find an estimator that is unbiased for small samples. If you can find one that is at least consistent, that may be better than having no estimate at all.
A second reason is that often we are unable to say anything at all about the expectation of an estimator. The expected value rules are weak analytical instruments that can be applied in relatively simple contexts.
In particular, the multiplicative rule E{g(X)h(Y)} = E{g(X)} E{h(Y)} applies only when X and Y are independent, and in most situations of interest this will not be the case. By contrast, we have a much more powerful set of rules for plims.
Plim rules (assuming plim X and plim Y exist)
Plim rule 1: plim (X + Y) = plim X + plim Y
Plim rule 2: plim bX = b plim X, where b is a constant
Plim rule 3: plim b = b, where b is a constant
Plim rule 4: plim XY = (plim X)(plim Y)
Plim rule 5: plim (X / Y) = plim X / plim Y, provided plim Y ≠ 0
Plim rule 6: plim g(X) = g(plim X), where g is a continuous function
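A simulation sketch of rules 4 and 6 (the variables, parameter values, and sample sizes are my own illustrative choices): here X and Y are deliberately dependent, so the multiplicative expected value rule does not apply, yet the product of the sample means and a continuous function of a sample mean converge exactly as the rules state.

```python
import numpy as np

rng = np.random.default_rng(1)

# X and Y are dependent (Y = X + noise), so E(XY) != E(X)E(Y) in general.
# The plim rules still apply to the sample means, whose plims are the
# population means mu_X = 2 and mu_Y = 5.
mu_X, mu_Y = 2.0, 5.0

for n in [10, 1_000, 100_000]:
    X = rng.normal(mu_X, 1.0, size=n)
    Y = X + rng.normal(mu_Y - mu_X, 1.0, size=n)  # correlated with X
    xbar, ybar = X.mean(), Y.mean()
    # Rule 4: plim (xbar * ybar) = mu_X * mu_Y
    # Rule 6: plim g(xbar) = g(mu_X) for continuous g, e.g. g = exp
    print(f"n = {n:7d}: xbar*ybar = {xbar * ybar:7.3f} (-> {mu_X * mu_Y}),"
          f" exp(xbar) = {np.exp(xbar):6.3f} (-> {np.exp(mu_X):.3f})")
```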
$$Y = \alpha Z$$

To illustrate how the plim rules can lead us to conclusions when the expected value rules do not, consider this example. Suppose that you know that a variable Y is a constant multiple $\alpha$ of another variable Z.
Z is generated randomly from a fixed distribution with population mean $\mu_Z$ and variance $\sigma_Z^2$. $\alpha$ is unknown and we wish to estimate it. We have a sample of n observations.
$$X = Z + w$$

Y is measured accurately, but Z is measured with random error w with population mean zero and constant variance $\sigma_w^2$. Thus in the sample we have observations on X, where X = Z + w, rather than on Z.
The estimator of $\alpha$ is the ratio of the sample means:

$$\hat{\alpha} = \frac{\bar{Y}}{\bar{X}} = \frac{\alpha\bar{Z}}{\bar{Z}+\bar{w}} = \frac{\alpha(\bar{Z}+\bar{w}) - \alpha\bar{w}}{\bar{Z}+\bar{w}} = \alpha - \frac{\alpha\bar{w}}{\bar{Z}+\bar{w}}$$
Substituting from the first two equations, the estimator can be rewritten as shown.
The expression can be simplified as shown. Hence we have decomposed the estimator into the true value, $\alpha$, and an error term. To investigate whether the estimator is biased or unbiased, we need to take the expectation of the error term.
But we cannot do this. The random quantity $\bar{w}$ appears in both the numerator and the denominator of the error term, and the expected value rules are too weak to allow us to investigate the expectation analytically.
However, we know that a sample mean tends to the corresponding population mean as the sample size tends to infinity, and so $\operatorname{plim} \bar{w} = 0$ and $\operatorname{plim} \bar{Z} = \mu_Z$.
$$\operatorname{plim} \hat{\alpha} = \alpha - \frac{\alpha \operatorname{plim} \bar{w}}{\operatorname{plim} \bar{Z} + \operatorname{plim} \bar{w}} = \alpha - \frac{\alpha \cdot 0}{\mu_Z + 0} = \alpha$$
Since the plims of the numerator and the denominator of the error term both exist, we are
able to take the plim of the error term. Thus we are able to show that the estimator is
consistent, despite the fact that we cannot say anything about its finite sample properties.
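A simulation sketch of this example ($\alpha$, the distributions of Z and w, and the sample sizes are my own illustrative choices) shows both halves of the argument: the ratio of sample means is biased in small samples, but its distribution collapses onto $\alpha$ as n grows.

```python
import numpy as np

rng = np.random.default_rng(2)

# Model: Y = alpha * Z, but we observe X = Z + w instead of Z.
alpha, mu_Z, sd_Z, sd_w = 2.0, 10.0, 2.0, 3.0
reps = 20_000  # replications used to approximate the distribution of alpha-hat

for n in [5, 50, 5_000]:
    Z = rng.normal(mu_Z, sd_Z, size=(reps, n))
    w = rng.normal(0.0, sd_w, size=(reps, n))
    Y = alpha * Z
    X = Z + w
    a_hat = Y.mean(axis=1) / X.mean(axis=1)  # ratio of the sample means
    print(f"n = {n:5d}: mean of alpha-hat = {a_hat.mean():.4f},"
          f" sd = {a_hat.std():.4f} (true alpha = {alpha})")
```

The mean of $\hat{\alpha}$ differs from $\alpha$ for small n, but the distribution collapses toward $\alpha$ as n increases, which is what the consistency argument above predicts.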