
ACTL2002/ACTL5101 Probability and Statistics: Week 6

ACTL2002/ACTL5101 Probability and Statistics


© Katja Ignatieva

School of Risk and Actuarial Studies
Australian School of Business
University of New South Wales
k.ignatieva@unsw.edu.au

[Course overview diagram: the course topics (Probability, Estimation, Hypothesis testing, Linear regression, Review) laid out over Weeks 1 to 12, with video lectures (VL) for Weeks 1 to 5. This lecture is Week 6.]


Last five weeks


Introduction to probability;
Moments: (non)-central moments, mean, variance (standard
deviation), skewness & kurtosis;
Special univariate (parametric) distributions (discrete &
continuous);
Joint distributions;
Moments & distribution for sample mean and variance.
Convergence; with applications LLN & CLT;
Estimators (MME, MLE, and Bayesian).

This week
Evaluating estimators:
- UMVUE (unbiased, lowest variance);
- Cramér-Rao lower bound;
- Rao-Blackwell Theorem.

Interval estimation (vs. the point estimates of last week):
- Pivotal quantity method;
- Confidence intervals for: mean, difference between two means,
proportions, variance, ratio of two variances, paired difference,
and MLE estimates.

Many examples: we will not cover all of them in the lecture. Know
and be able to apply the method; do not memorize the examples!

Evaluating estimators & Interval estimation using CIs


Evaluating estimators
Fisher (1922) on good estimators
UMVUEs
Cramér-Rao Lower Bound (CRLB)
Consistency
Sufficient Statistics

Interval estimation using confidence intervals


Introduction
The Pivotal Quantity Method
Examples & Exercises

Maximum Likelihood estimate


Important properties of MLE estimates
CI for Maximum Likelihood Estimates

Summary
Summary

Evaluating estimators
Fisher (1922) on good estimators


Last week we saw three estimators (MME, MLE, and Bayesian).
There are infinitely many possible estimators.
How can we tell whether an estimator is good, or better than another?
Fisher (1922) came up with three conditions for good estimators:
- Efficiency: a good estimator has smaller variance than others;
- Consistency: a good estimator converges to the true value of the
parameter;
- Sufficiency: a good estimator uses all the information
about our parameter of interest that is present in the data.

Methods to Evaluate Estimators


How good is an estimator?
One can compare estimators using:
i. The best unbiased (minimum variance) estimator;
- Unbiased, with the lowest mean squared error;
- Prove using the Cramér-Rao Lower Bound;
ii. Consistency;
iii. Sufficient statistics.

In the next slides we will discuss all three.

UMVUEs

Mean Squared Error and Bias


The mean squared error (MSE) of an estimator
$T(X_1, X_2, \ldots, X_n)$ of a parameter $\theta$ is defined as:
$$\mathrm{MSE} = E\left[(T - \theta)^2\right].$$
The MSE gives the average squared difference between the
estimator $T(X_1, X_2, \ldots, X_n)$ and $\theta$, and is given by:
$$\mathrm{MSE} = E\left[(T - \theta)^2\right] = E\left[T^2 + \theta^2 - 2\theta T\right] + E[T]^2 - E[T]^2$$
$$= E\left[T^2\right] - E[T]^2 + E[T]^2 + E\left[\theta^2\right] - E[2\theta T]$$
$$= \mathrm{Var}(T) + (E[T] - \theta)^2 = \mathrm{Var}(T) + (\mathrm{Bias}(T))^2,$$
where $\mathrm{Bias}(T) = E[T] - \theta$; * note $\theta$ is a constant.

An unbiased estimator has: $E[T] = \theta$.
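
The decomposition of the MSE into variance plus squared bias can be checked numerically. The following is a minimal sketch, assuming a hypothetical true mean `theta = 2.0` and a deliberately biased estimator (0.9 times the sample mean of 10 normal draws); neither choice comes from the lecture.

```python
import random
from statistics import fmean, pvariance

def mse_decomposition(estimates, theta):
    """Return (MSE, variance, bias) of simulated estimates around the true theta."""
    mse = fmean((t - theta) ** 2 for t in estimates)
    variance = pvariance(estimates)      # variance of the simulated estimates
    bias = fmean(estimates) - theta      # average estimate minus true value
    return mse, variance, bias

rng = random.Random(0)                   # fixed seed so the run is repeatable
theta = 2.0                              # hypothetical true mean (assumption)
# Hypothetical biased estimator: 0.9 times the mean of 10 N(theta, 1) draws.
estimates = [0.9 * fmean(rng.gauss(theta, 1.0) for _ in range(10))
             for _ in range(5000)]
mse, variance, bias = mse_decomposition(estimates, theta)
print(mse, variance + bias ** 2)         # the two numbers agree up to rounding
```

The identity holds exactly for the empirical moments, so the two printed values differ only by floating-point rounding, while the bias comes out negative because the estimator shrinks towards zero.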


Example: estimation of Poisson


Recall (last week's lecture) that the MLE of a Poisson parameter is $\hat\lambda_{ML} = \bar X$.
We know $\mu_X = E[X] = \lambda$, thus the MME estimator is $\hat\lambda_{MM} = \bar X$.
These are thus both unbiased:
$$\mathrm{Bias}(T) = E[T] - \lambda = E[\bar X] - \lambda = \frac{1}{n}\sum_{k=1}^{n} E[X_k] - \lambda = \frac{n\lambda}{n} - \lambda = 0.$$
The MSE is given by (using unbiasedness and independence):
$$\mathrm{MSE} = \mathrm{Var}(T) + (\mathrm{Bias}(T))^2 = \mathrm{Var}\left(\frac{1}{n}\sum_{k=1}^{n} X_k\right) = \frac{1}{n^2}\sum_{k=1}^{n} \mathrm{Var}(X_k) = \frac{n\lambda}{n^2} = \frac{\lambda}{n}.$$


Approximation
Note that the variance of the estimator is a function of the
parameter we are estimating.
Hence, we do not know $\mathrm{Var}(T)$, thus we approximate it by plugging in
the estimate:
$$\widehat{\mathrm{Var}}(T) = \frac{\hat\lambda}{n} = \frac{\bar X}{n}.$$
The square root of this is called the standard error of the
estimate:
$$\widehat{\mathrm{sd}}(\hat\lambda) = \sqrt{\frac{\bar X}{n}}.$$


Functions of $\theta$
Note that we defined $\tau(\theta)$ as a function of the unknown
parameters.
Question: Why might we be interested in determining an
estimate of a function of the parameters instead of an
estimate of the parameters themselves?
Solution: We might be interested in an estimate of a
non-linear transformation of the parameters.
Example: consider $\Pr(X = 0)$, where $X \sim \mathrm{Poi}(\lambda)$:
$$\Pr(X = 0) = \frac{e^{-\lambda}\lambda^0}{0!}.$$
We know that $E[\hat\lambda] = \lambda$, however
$$E\left[\Pr(X = 0 \mid \hat\lambda)\right] = E\left[\frac{e^{-\hat\lambda}\hat\lambda^0}{0!}\right] \neq \frac{e^{-\lambda}\lambda^0}{0!}.$$
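
The size of this plug-in bias can be computed exactly for the Poisson case. The sketch below assumes the estimator is the sample mean of $n$ i.i.d. Poisson draws and uses the Poisson m.g.f.; the values `lam = 2.0` and `n = 10` are hypothetical.

```python
from math import exp, expm1

def expected_plugin_prob_zero(lam, n):
    """E[exp(-Xbar)] for n i.i.d. Poisson(lam) draws, via the Poisson m.g.f."""
    # n * Xbar ~ Poisson(n * lam), and E[exp(t * S)] = exp(n * lam * (exp(t) - 1)),
    # evaluated here at t = -1/n.
    return exp(n * lam * expm1(-1.0 / n))

lam, n = 2.0, 10                    # hypothetical parameter and sample size
target = exp(-lam)                  # true Pr(X = 0)
plugin_mean = expected_plugin_prob_zero(lam, n)
print(target, plugin_mean)          # the plug-in estimator overshoots on average
```

The bias shrinks as $n$ grows, which is in line with the plug-in estimator being consistent even though it is biased in finite samples.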


UMVUEs
Consider two unbiased estimators, say $T_1$ and $T_2$. We define the
efficiency of $T_1$ relative to $T_2$ as:
$$\mathrm{eff}(T_1, T_2) = \frac{\mathrm{Var}(T_2)}{\mathrm{Var}(T_1)}.$$
It is clear that if this is larger than 1, then:
$$\mathrm{Var}(T_2) > \mathrm{Var}(T_1),$$
i.e., estimator $T_1$ has lower variance than estimator $T_2$.
Thus a high value of $\mathrm{eff}(T_1, T_2)$ means we prefer $T_1$ over $T_2$.


Unbiased Estimators with Minimum Variance (UMVUEs)

An estimator $T$ is said to be a best unbiased estimator of
$\tau(\theta)$ if it satisfies two conditions:
- The estimator $T$ is unbiased, i.e., $E[T] = \tau(\theta)$;
- The estimator $T$ has the smallest variance, i.e.,
$\mathrm{Var}(T) \le \mathrm{Var}(T^\star)$, for any other unbiased estimator $T^\star$.

Note that the best unbiased estimator $T$ is often called the
uniformly minimum variance unbiased estimator (UMVUE) of
$\tau(\theta)$.

Cramér-Rao Lower Bound (CRLB)

How do we prove that $T(X_1, X_2, \ldots, X_n)$ has the lowest variance of all
unbiased estimators?
Calculate the efficiency relative to all unbiased estimators?
That will take some time, and what is "all"?
Let $X_1, X_2, \ldots, X_n$ be a random sample from $f_X(x|\theta)$ and let
$T(X_1, X_2, \ldots, X_n)$ be an unbiased estimator of $\theta$.
The lower bound on the variance (called the
Cramér-Rao Lower Bound (CRLB)) for unbiased estimators is:
$$\mathrm{Var}(T(X_1, X_2, \ldots, X_n)) \ge \frac{1}{n\, I_{f^\star}(\theta)},$$
where $I_{f^\star}(\theta)$ is the Fisher information of the parameter (see
next slide).


Cramér-Rao Lower Bound (CRLB)

Score: $S = \partial \ell(x; \theta)/\partial\theta$. The MLE satisfies the first-order condition $S = 0$, and $E[S] = 0$ at the true $\theta$.
The Fisher information of the parameter is defined to be the
function:
$$I_{f^\star}(\theta) = E\left[\left(\frac{\partial \log f_X(x|\theta)}{\partial\theta}\right)^2\right] \overset{*}{=} -E\left[\frac{\partial^2 \log f_X(x|\theta)}{\partial\theta^2}\right]$$
$$\overset{**}{=} E\left[\left(\frac{\partial \ell(x;\theta)}{\partial\theta}\right)^2\right] \Big/ n = -E\left[\frac{\partial^2 \ell(x;\theta)}{\partial\theta^2}\right] \Big/ n,$$
* see also slides 1166-1168 (we do not need to prove it in this
course). Fisher information is the variance of the score (using
its mean of zero). ** using i.i.d. samples.
Note: asymptotically, as $n \to \infty$, the MLE attains the CRLB, so the
MLE is asymptotically UMVUE.


Exercise: Cramér-Rao Lower Bound (CRLB)

Consider $n$ draws from a $\mathrm{Bin}(m, p)$ r.v.:
$$f_X(x; p) = \binom{m}{x} p^x (1-p)^{m-x}$$
$$\log(f_X(x; p)) = \log\binom{m}{x} + x\log(p) + (m - x)\log(1-p)$$
Question: Find the CRLB.
Solution: First, the Fisher information (* $\mathrm{Var}(X) = mp(1-p)$):
$$\frac{\partial \log(f_X(x; p))}{\partial p} = \frac{x}{p} - \frac{m - x}{1 - p} = \frac{x - mp}{p(1-p)}$$
$$\left(\frac{\partial \log(f_X(x; p))}{\partial p}\right)^2 = \frac{(x - mp)^2}{p^2(1-p)^2}$$
$$I_{f^\star}(p) = E\left[\left(\frac{\partial \log(f_X(x; p))}{\partial p}\right)^2\right] = \frac{E\left[(X - mp)^2\right]}{p^2(1-p)^2} \overset{*}{=} \frac{\mathrm{Var}(X)}{p^2(1-p)^2} = \frac{m}{p(1-p)}.$$


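Exercise: Cramér-Rao Lower Bound (CRLB)

Alternatively, we can find the Fisher information by:
$$\frac{\partial^2 \log(f_X(x; p))}{\partial p^2} = -\frac{x}{p^2} - \frac{m - x}{(1-p)^2}$$
$$I_{f^\star}(p) = -E\left[\frac{\partial^2 \log(f_X(x; p))}{\partial p^2}\right] = \frac{E[X]}{p^2} + \frac{m - E[X]}{(1-p)^2} = \frac{m}{p} + \frac{m}{1-p} = \frac{m}{p(1-p)}.$$
Thus, the Cramér-Rao Lower Bound is given by:
$$\mathrm{Var}(T(X_1, \ldots, X_n)) \ge \frac{1}{n \cdot \frac{m}{p(1-p)}} = \frac{p(1-p)}{mn}.$$
Hence, the minimum of the variance of the estimate of $p$
decreases if the number of Bernoulli trials per draw (i.e., $m$) increases or the
sample size (i.e., $n$) increases.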
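
As a quick check of the bound, the sketch below compares the CRLB with the variance of the natural estimator $\hat p = \bar X / m$, which equals $\mathrm{Var}(X)/(n m^2)$. The function names and the values $m = 10$, $n = 20$, $p = 1/4$ are illustrative assumptions; exact fractions are used so that the comparison is free of rounding.

```python
from fractions import Fraction

def fisher_information_binomial(m, p):
    """Fisher information of one Bin(m, p) draw: m / (p (1 - p))."""
    return Fraction(m) / (p * (1 - p))

def crlb_binomial(m, n, p):
    """Cramér-Rao lower bound for unbiased estimators of p from n draws."""
    return 1 / (n * fisher_information_binomial(m, p))

def variance_of_sample_proportion(m, n, p):
    """Var(Xbar / m) = m p (1 - p) / (n m^2) for n i.i.d. Bin(m, p) draws."""
    return m * p * (1 - p) / (n * m ** 2)

m, n, p = 10, 20, Fraction(1, 4)     # hypothetical values
bound = crlb_binomial(m, n, p)
print(bound, variance_of_sample_proportion(m, n, p))
```

Both quantities equal $p(1-p)/(mn)$, so in this example the sample proportion attains the bound.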

Consistency
A sequence of estimators $\{T_n\}$ is a consistent sequence of
estimators of the parameter $\theta$ if for every $\epsilon > 0$ we have:
$$\lim_{n\to\infty} \Pr(|T_n - \theta| < \epsilon) = 1,$$
i.e., $T_n \xrightarrow{p} \theta$ (convergence in probability).
In particular, if $T_n$ is a sequence of estimators of a parameter $\theta$
that satisfies the following two conditions:
i) $\lim_{n\to\infty} \mathrm{Var}(T_n) = 0$ (the uncertainty in the estimate vanishes as $n \to \infty$);
ii) $\lim_{n\to\infty} \mathrm{Bias}(T_n) = 0$ (the estimator is asymptotically unbiased);
then it is a sequence of consistent estimators of $\theta$ (proof
using Chebyshev's inequality: $\Pr(|X - \mu| > \epsilon) \le \sigma^2/\epsilon^2$).


Example: consistency of MLEs


Suppose $X_1, X_2, \ldots, X_n$ is a random sample from $f_X(x|\theta)$.
Let $\hat\theta$ be the MLE of $\theta$, so that $\tau(\hat\theta)$ is the MLE of any
continuous function $\tau(\theta)$.
Under certain regularity conditions (e.g., continuous,
differentiable, no parameter on the boundaries of $x$, etc.) on
$f_X(x|\theta)$, $\tau(\hat\theta)$ is a consistent estimator of $\tau(\theta)$.
This is due to:
$$\sqrt{n}\left(\hat\theta_n - \theta\right) \xrightarrow{d} N\left(0, \frac{1}{I_{f^\star}(\theta)}\right).$$
Proof: See slide 1166.


Sufficient Statistics
Let $(X_1, X_2, \ldots, X_n)$ have joint p.d.f. $f(x; \theta)$. A statistic $S$ is
said to be sufficient for $\theta$ if for any other statistic $T$ the
conditional p.d.f. of $T$ given $S = s$, denoted by $f_{T|S}(t)$, does
not depend on $\theta$, for any value of $t$.
Idea: if $S$ is observed, additional information about $\theta$ cannot
be obtained from $T$ if the conditional distribution of $T$ given
$S = s$ is free of $\theta$.
Factorization Theorem. A necessary and sufficient condition
for $T(X_1, \ldots, X_n)$ to be a sufficient statistic for $\theta$ is that the
joint probability function (density function or frequency
function) factors in the form:
$$f_X(x_1, \ldots, x_n | \theta) = g(T(x_1, \ldots, x_n), \theta) \cdot h(x_1, \ldots, x_n).$$


The Rao-Blackwell Theorem


Let $\hat\theta$ be an estimator of $\theta$ with $E[\hat\theta^2] < \infty$ (i.e., finite) for
all $\theta$. Suppose that $T$ is sufficient for $\theta$. Define a new
estimator as:
$$\tilde\theta = E[\hat\theta \mid T].$$
Then for all $\theta$, this new estimator has an MSE that is no larger. We
have that:
$$\mathrm{MSE}(\tilde\theta) \le \mathrm{MSE}(\hat\theta),$$
or, equivalently:
$$E\left[(\tilde\theta - \theta)^2\right] \le E\left[(\hat\theta - \theta)^2\right].$$
Thus, we see from the Rao-Blackwell theorem that if an
estimator is not a function of a sufficient statistic it can be
improved in terms of MSE (proof: see next slides).


Proof: From * the law of iterated expectations (see week 4):
$$E[\tilde\theta] = E\left[E[\hat\theta \mid T]\right] \overset{*}{=} E[\hat\theta],$$
so to compare the two estimators, we need only compare their
variances. Using the conditional variance identity, we have:
$$\mathrm{Var}(\hat\theta) = \mathrm{Var}\left(E[\hat\theta \mid T]\right) + E\left[\mathrm{Var}(\hat\theta \mid T)\right] = \mathrm{Var}(\tilde\theta) + \underbrace{E\left[\mathrm{Var}(\hat\theta \mid T)\right]}_{\ge 0}.$$
Thus, $\mathrm{Var}(\hat\theta) > \mathrm{Var}(\tilde\theta)$, unless $\mathrm{Var}(\hat\theta \mid T) = 0$. This is
the case only if $\hat\theta$ is a function of $T$, which would imply
$\hat\theta = \tilde\theta$.


The Rao-Blackwell Theorem


How do we explain this last clause? Well,
$$\mathrm{Var}(\hat\theta \mid T = t) = \int \left(\hat\theta - E[\hat\theta \mid T = t]\right)^2 f_{\hat\theta|T}(\hat\theta \mid t)\, d\hat\theta = 0,$$
for all possible realizations $t$ of $T$, and so:
$$\mathrm{Var}(\hat\theta \mid T) = 0 \;\Rightarrow\; \hat\theta = E[\hat\theta \mid T = t],$$
which implies $\hat\theta$ is a function of $t$, and thus:
$$\mathrm{Var}(\hat\theta \mid T) = 0 \;\Rightarrow\; \hat\theta = E[\hat\theta \mid T] = \tilde\theta,$$
as stated above.


Example: Sufficient Statistic for Exponential distribution


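Consider a random sample $X_i \sim \mathrm{EXP}(\lambda)$ for $i = 1, \ldots, n$. The
joint p.d.f. is:
$$f_X(x_1, \ldots, x_n; \lambda) = \lambda^n \exp\left(-\lambda \sum_{i=1}^{n} x_i\right).$$
This suggests checking the statistic $S = \sum_{i=1}^{n} x_i$; we know
$S \sim \mathrm{Gamma}(n, \lambda)$ so that:
$$f_S(s; \lambda) = \frac{\lambda^n}{\Gamma(n)} s^{n-1} \exp(-\lambda s).$$
The conditional density given $S = s$ is:
$$\frac{f_X(x_1, \ldots, x_n; \lambda)}{f_S(s; \lambda)} = \frac{\lambda^n \exp\left(-\lambda \sum_{i=1}^{n} x_i\right)}{\frac{\lambda^n}{\Gamma(n)} s^{n-1} \exp(-\lambda s)} = \frac{\Gamma(n)}{s^{n-1}},$$
which is free of $\lambda$, thus $S$ is sufficient for $\lambda$.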

Interval estimation using confidence intervals
Introduction
Last week we saw point estimators;
Point estimators: use a sample to describe the
distribution of a population;
However, the sample itself is random;
This implies that parameters estimated using a sample are
uncertain!
You should take that into account, especially when you are
interested in tail risk (example for an insurer: probability of ruin).
Using a point estimate would underestimate the true risk.


Application: parameter risk


See Excel file.
We have 25 samples of 100 simulated observations of a
$N(8, 12^2)$ random variable.
For each sample we can estimate the parameters of the
normal distribution.
Using the parameters we estimate the 99.5% percentile (VaR,
required capital) for each sample, or the expected shortfall
$E[Y \mid Y > b]$ where $b = \mu_Y + \sigma_Y \Phi^{-1}(0.99)$.
Large variation in required capital between samples: between
35 and 43.
The parameters themselves are a source of uncertainty!


Parametric Interval Estimation


An interval estimate of a parameter $\theta$ has the form
$\hat\theta_1 < \theta < \hat\theta_2$, where $\hat\theta_1$ and $\hat\theta_2$ are realized values of suitable
random variables $\hat\theta_1(X_1, \ldots, X_n)$ and $\hat\theta_2(X_1, \ldots, X_n)$, which
are functions of the random sample $X_1, \ldots, X_n$.
Construct the interval:
$$\Pr\left(\hat\theta_1(X_1, \ldots, X_n) < \theta < \hat\theta_2(X_1, \ldots, X_n)\right) = 1 - \alpha,$$
for some specified $0 \le \alpha \le 1$ and then we define:
$$\left(\hat\theta_1(X_1, \ldots, X_n),\ \hat\theta_2(X_1, \ldots, X_n)\right)$$
as the $100(1 - \alpha)\%$ confidence interval for $\theta$.


Example
Consider an i.i.d. sample of size 4, $X_1, X_2, X_3, X_4$, from
$N(\mu, 1)$. Recall that we can estimate the population mean $\mu$
by $\bar X$. The probability that $\mu$ will be in the range
$(\bar X - 1, \bar X + 1)$ is:
$$\Pr(\bar X - 1 < \mu < \bar X + 1) = \Pr(-1 < \bar X - \mu < 1)$$
$$\overset{*}{=} \Pr\left(-\sqrt{4} < Z < \sqrt{4}\right)$$
$$= \Phi(2) - (1 - \Phi(2))$$
$$= 0.9545.$$
* using the m.g.f. technique we have $\bar X \sim N(\mu, \sigma^2/n)$.

Thus, $\mu$ is in the range $(\bar X - 1, \bar X + 1)$ with probability
0.9545.

Use: $\Phi(4) = 0.999968$, $\Phi(2) = 0.97725$, $\Phi(1) = 0.8413$.
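
The coverage probability in this example can be reproduced with the standard library. This is a minimal sketch; the function name `coverage_probability` is illustrative and not part of the lecture.

```python
from math import sqrt
from statistics import NormalDist

def coverage_probability(half_width, sigma, n):
    """Pr(Xbar - half_width < mu < Xbar + half_width) for a N(mu, sigma^2) sample of size n."""
    z = half_width / (sigma / sqrt(n))      # standardised half-width
    std_normal = NormalDist()               # N(0, 1)
    return std_normal.cdf(z) - std_normal.cdf(-z)

p = coverage_probability(half_width=1.0, sigma=1.0, n=4)
print(round(p, 4))
```

Changing `n` shows how quickly the same interval width gains coverage as the sample grows.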


The Pivotal Quantity Method


The general method for constructing confidence intervals is
the pivotal quantity method.
1. Find a pivot: i.e., a function of $X_1, \ldots, X_n$ and $\theta$ whose distribution
does not depend on $\theta$.
2. Find the function $g(X_1, \ldots, X_n; \theta)$:
The pivotal quantity method requires finding a function of the
form $g(X_1, \ldots, X_n; \theta)$, so that it is known that for quantiles
$q_1$ and $q_2$ we have:
$$\Pr(q_1 < g(X_1, \ldots, X_n; \theta) < q_2) = 1 - \alpha,$$
with $q_1 < q_2$.
Let $g(X_1, \ldots, X_n; \theta)$ be a monotonic function of $\theta$ and
let it have a unique inverse $g^{-1}(X_1, \ldots, X_n; q) = \theta$.
3. The $100(1 - \alpha)\%$ confidence interval of $\theta$ is given by:
$$g^{-1}(X_1, \ldots, X_n; q_1) < \theta < g^{-1}(X_1, \ldots, X_n; q_2),$$
if $g(X_1, \ldots, X_n; \theta)$ is an increasing function of $\theta$, and
$$g^{-1}(X_1, \ldots, X_n; q_2) < \theta < g^{-1}(X_1, \ldots, X_n; q_1),$$
if $g(X_1, \ldots, X_n; \theta)$ is a decreasing function of $\theta$.

See graph on slide 1128.


[Figure (slide 1128), "Confidence intervals": density $f_Y(y)$ of the pivot $Y = 2\lambda n \bar X \sim \chi^2(2n)$, with probability $\alpha/2$ in each tail, below $q_1 = \chi^2_{\alpha/2}(2n)$ and above $q_2 = \chi^2_{1-\alpha/2}(2n)$, so that $\Pr(q_1 < g(X_1, \ldots, X_n; \theta) < q_2) = 1 - \alpha$.]


Examples & Exercises

Example: Pivotal quantity method and the Exponential


Suppose $X_1, X_2, \ldots, X_n$ is a random sample from an $\mathrm{Exp}(\lambda)$
distribution (with $M_{X_i}(t) = (1 - t/\lambda)^{-1}$). We know that
(week 2):
$$n\bar X = \sum_{k=1}^{n} X_k \sim \mathrm{Gamma}(n, \lambda).$$
We know that the m.g.f. of $n\bar X$ is:
$$M_{n\bar X}(t) = E\left[e^{t\sum_{k=1}^{n} X_k}\right] = (M_{X_i}(t))^n = \left(1 - \frac{t}{\lambda}\right)^{-n},$$
and the m.g.f. of the random variable $2\lambda n \bar X$ is:
$$M_{2\lambda n\bar X}(t) = M_{n\bar X}(2\lambda t) = \left(1 - \frac{2\lambda t}{\lambda}\right)^{-n} = \left(\frac{1}{1 - 2t}\right)^{2n/2}.$$


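1. The pivot is (recall from week 5):
$$2\lambda n \bar X \sim \mathrm{Gamma}\left(n, \tfrac{1}{2}\right) = \chi^2(2n).$$
Its distribution is free of the parameter value $\lambda$, thus it is a pivot.
We denote the quantiles from the $\chi^2$ distribution
as (F&T page 164-166 & survival: 168, 169, see graph on
slide 1128):
$$q_1 = \chi^2_{\alpha/2}(2n) \quad\text{and}\quad q_2 = \chi^2_{1-\alpha/2}(2n).$$
2. The function $g(X_1, \ldots, X_n; \lambda) = 2\lambda n \bar X$ (increasing in $\lambda$):
$$\Pr\left(\chi^2_{\alpha/2}(2n) < 2\lambda n \bar X < \chi^2_{1-\alpha/2}(2n)\right) = 1 - \alpha.$$
3. Hence, a $100(1 - \alpha)\%$ confidence interval for $\lambda$ is:
$$\frac{\chi^2_{\alpha/2}(2n)}{2n\bar x} < \lambda < \frac{\chi^2_{1-\alpha/2}(2n)}{2n\bar x}.$$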
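
The three steps above can be sketched in code without statistical tables. The sketch relies on the fact that a $\chi^2$ distribution with an even number of degrees of freedom $2n$ has the closed-form c.d.f. $1 - e^{-x/2}\sum_{k=0}^{n-1} (x/2)^k/k!$, and inverts it by bisection; all function names and the example inputs are illustrative assumptions.

```python
from math import exp

def chi2_cdf_even_df(x, n):
    """CDF of a chi-square with 2n d.f.: 1 - exp(-x/2) * sum_{k<n} (x/2)^k / k!."""
    if x <= 0:
        return 0.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= half / k                 # builds (x/2)^k / k! incrementally
        total += term
    return 1.0 - exp(-half) * total

def chi2_quantile_even_df(prob, n, tol=1e-10):
    """Quantile of chi-square(2n) found by bisection on the closed-form CDF."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even_df(hi, n) < prob:    # expand until the quantile is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_cdf_even_df(mid, n) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def exponential_rate_ci(sample_mean, n, alpha=0.05):
    """Pivotal-quantity CI for lambda based on 2 * lambda * n * Xbar ~ chi-square(2n)."""
    q1 = chi2_quantile_even_df(alpha / 2, n)
    q2 = chi2_quantile_even_df(1 - alpha / 2, n)
    scale = 2 * n * sample_mean
    return q1 / scale, q2 / scale

lower, upper = exponential_rate_ci(sample_mean=0.5, n=5)   # hypothetical data summary
print(lower, upper)
```

For the hypothetical inputs the point estimate $1/\bar x = 2$ lies inside the interval, which is wide because only five observations are assumed.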


Example: Confidence Interval for the Mean


Recall from week 5.
Suppose $X_1, X_2, \ldots, X_n$ are independent, identically
distributed random variables with finite mean $\mu$ and finite
variance $\sigma^2$. As before, denote the sample mean by $\bar X_n$.
Then, the central limit theorem states:
$$\frac{\bar X_n - \mu}{\sigma/\sqrt{n}} \xrightarrow{d} N(0, 1), \quad\text{as } n \to \infty.$$
This holds for all r.v. with finite mean and variance, not only
normal r.v.!
Suppose $X_1, \ldots, X_n$ is a random sample from a population
with mean $\mu$ and known variance $\sigma^2$.
Question: Find the CI for $\mu$.


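Solution: By the central limit theorem, $\bar X$ is approximately
normally distributed with mean $\mu_{\bar X} = \mu$ and (population)
variance $\sigma^2_{\bar X} = \sigma^2/n$.
1. Our pivot is
$$Z = \frac{\bar X - \mu_{\bar X}}{\sigma_{\bar X}} = \frac{\bar X - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$
2. The function $g(X_1, \ldots, X_n; \mu) = \frac{\bar X - \mu}{\sigma/\sqrt{n}}$ (decreasing in $\mu$). Using:
$$\Pr\left(z_{\alpha/2} < Z < z_{1-\alpha/2}\right) = 1 - \alpha,$$
we then have:
$$\Pr\left(z_{\alpha/2} < \frac{\bar X - \mu}{\sigma/\sqrt{n}} < z_{1-\alpha/2}\right) = 1 - \alpha$$
$$\Pr\left(z_{\alpha/2}\frac{\sigma}{\sqrt{n}} - \bar X < -\mu < z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}} - \bar X\right) = 1 - \alpha$$
$$\Pr\left(\bar X - \frac{\sigma}{\sqrt{n}} z_{1-\alpha/2} < \mu < \bar X - \frac{\sigma}{\sqrt{n}} z_{\alpha/2}\right) = 1 - \alpha.$$
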

Example: Confidence Interval for the Mean


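3. Thus we have:
$$\bar x - z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar x + z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}},$$
where $z_{1-\alpha/2}$ is the point on the standard normal for which
the probability above it is $\alpha/2$ (note the symmetry of the standard
normal distribution: $z_{\alpha/2} = -z_{1-\alpha/2}$). This is an approximate $100(1 - \alpha)\%$
confidence interval for $\mu$ (given known population variance
$\sigma^2$).
Question: Why an approximate $100(1 - \alpha)\%$ confidence
interval?
Solution: Recall, $\bar X$ is only asymptotically normally distributed
using the CLT (except when the $X_i$ are i.i.d. normally distributed, in which case the interval is exact).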
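
A minimal sketch of this interval in code, assuming a known population standard deviation; the function name and the example numbers ($\bar x = 10$, $\sigma = 2$, $n = 100$) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci_known_sigma(sample_mean, sigma, n, alpha=0.05):
    """Approximate 100(1 - alpha)% CI for the mean with known sigma (z-interval)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)    # z_{1 - alpha/2}
    half_width = z * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

lower, upper = mean_ci_known_sigma(sample_mean=10.0, sigma=2.0, n=100)
print(lower, upper)
```

Because the half-width scales with $1/\sqrt{n}$, quadrupling the sample size halves the width of the interval.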

Confidence Interval for the Mean


For the standard normal distribution, we have the following (often
used) quantiles:

$\alpha$    two-sided $z_{1-\alpha/2}$    one-sided $z_{1-\alpha}$
1%     2.576                   2.33
5%     1.96                    1.645
10%    1.645                   1.28

Note that the above confidence interval for the mean assumes the
population variance is known. When the population is not normal, the
interval is only an approximation, which improves
with increasing sample size. This same confidence interval
formula for the mean holds even if the population variance is
replaced by the sample variance, provided the sample is large
(generally, $n > 30$ is a rule of thumb for large samples).


Exercise: CI mean, unknown Variance, Small Sample


Let $X_1, \ldots, X_n$ be a random sample from a population with
mean $\mu$ and unknown variance $\sigma^2$ (but with known sample
variance $s^2$).
a. Question: What is the pivot? See week 5 online lecture.
b. Question: Find an (approximate) $100(1 - \alpha)\%$ confidence
interval for $\mu$.
a. Solution: The pivot is:
$$T = \frac{\bar X - \mu}{S/\sqrt{n}} = \underbrace{\frac{\bar X - \mu}{\sigma/\sqrt{n}}}_{=Z} \Bigg/ \sqrt{\underbrace{\frac{(n-1)S^2}{\sigma^2}}_{\sim\chi^2(n-1)} \Big/ (n - 1)} \sim t_{n-1}.$$
The function $g(X_1, \ldots, X_n; \mu) = \frac{\bar x - \mu}{s/\sqrt{n}}$ (decreasing in $\mu$).


Exercise: CI mean, unknown Variance, Small Sample


b. Solution: an approximate $100(1 - \alpha)\%$ confidence interval
for $\mu$ is given by:
$$\bar x - t_{1-\alpha/2,n-1}\frac{s}{\sqrt{n}} < \mu < \bar x + t_{1-\alpha/2,n-1}\frac{s}{\sqrt{n}},$$
where $t_{1-\alpha/2,n-1}$ is the point on the t-distribution with $n - 1$
degrees of freedom for which the probability above it is $\alpha/2$.
Tables of percentiles (quantiles) from the t-distribution are
given in F&T page 163 (note the symmetry of the distribution).
Note: $t_{n-1} \xrightarrow{d} N(0, 1)$ as $n \to \infty$, often used for large
samples.
Interpretation: as $n \to \infty$ we have $s \to \sigma$.


Exercise: CI for the variance


Let $X_1, \ldots, X_n$ be a random sample from $N(\mu, \sigma^2)$.
We suppose that $\mu$ is not known and we wish to construct a
$100(1 - \alpha)\%$ confidence interval for $\sigma^2$.
a. Question: What is the pivot? See week 5 online lecture.
b. Question: Find a $100(1 - \alpha)\%$ confidence
interval for $\sigma^2$.


Exercise: CI for the variance


Define the quantities $\chi^2_{\alpha/2}(n - 1)$ and $\chi^2_{1-\alpha/2}(n - 1)$:
$$\Pr\left(X \le \chi^2_{\alpha/2}(n - 1)\right) = \alpha/2$$
$$\Pr\left(X \le \chi^2_{1-\alpha/2}(n - 1)\right) = 1 - \alpha/2,$$
where $X \sim \chi^2(n - 1)$. See F&T tables page 164-169.
a. Solution: We know from week 5 that the pivot is:
$$\frac{(n - 1)S^2}{\sigma^2} \sim \chi^2(n - 1).$$
The function $g(X_1, \ldots, X_n; \sigma^2) = \frac{(n - 1)s^2}{\sigma^2}$ (decreasing in $\sigma^2$).


Exercise: CI for the variance


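b. Solution:
$$\Pr\left(\chi^2_{\alpha/2}(n - 1) < \frac{(n - 1)S^2}{\sigma^2} < \chi^2_{1-\alpha/2}(n - 1)\right) = 1 - \alpha.$$
Rewriting, we obtain:
$$\Pr\left(\frac{(n - 1)S^2}{\chi^2_{1-\alpha/2}(n - 1)} < \sigma^2 < \frac{(n - 1)S^2}{\chi^2_{\alpha/2}(n - 1)}\right) = 1 - \alpha.$$
A $100(1 - \alpha)\%$ confidence interval estimate for $\sigma^2$ is:
$$\frac{(n - 1)s^2}{\chi^2_{1-\alpha/2}(n - 1)} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_{\alpha/2}(n - 1)},$$
where $s^2$ is the observed sample variance.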
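
A minimal sketch of this interval, with the $\chi^2$ quantiles passed in as arguments because they are normally read from tables. The example is hypothetical: $s^2 = 4$, $n = 10$, and the commonly tabulated values $\chi^2_{0.025}(9) \approx 2.700$ and $\chi^2_{0.975}(9) \approx 19.023$.

```python
def variance_ci(sample_variance, n, chi2_lower, chi2_upper):
    """CI for sigma^2 from the pivot (n - 1) S^2 / sigma^2 ~ chi-square(n - 1).

    chi2_lower and chi2_upper are the alpha/2 and 1 - alpha/2 quantiles of
    chi-square(n - 1), supplied by the caller (e.g. from tables).
    """
    scaled = (n - 1) * sample_variance
    # Dividing by the larger quantile gives the lower end of the interval.
    return scaled / chi2_upper, scaled / chi2_lower

lower, upper = variance_ci(4.0, 10, chi2_lower=2.700, chi2_upper=19.023)
print(lower, upper)
```

The interval is not symmetric around $s^2$, reflecting the skewness of the $\chi^2$ distribution.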

Example: CI for ratios of two variances


When comparing the variances of two populations, the ratio
of the variances (rather than the difference) is considered
because there is a pivotal quantity available for ratios of the
variances that has an F-distribution.
Assume that we have two sets of samples:
$$X_{11}, X_{12}, \ldots, X_{1n_1} \text{ from } N(\mu_1, \sigma_1^2),$$
and
$$X_{21}, X_{22}, \ldots, X_{2n_2} \text{ from } N(\mu_2, \sigma_2^2).$$
Denote the respective sample variances by $S_1^2$ and $S_2^2$.
Application: Is one portfolio riskier than another?


Example: CI for ratios of two variances


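Recall that:
$$\frac{(n_1 - 1)S_1^2}{\sigma_1^2} \sim \chi^2(n_1 - 1) \quad\text{and}\quad \frac{(n_2 - 1)S_2^2}{\sigma_2^2} \sim \chi^2(n_2 - 1).$$
1. The pivot is:
$$\frac{\frac{(n_1 - 1)S_1^2}{\sigma_1^2} \big/ (n_1 - 1)}{\frac{(n_2 - 1)S_2^2}{\sigma_2^2} \big/ (n_2 - 1)} = \frac{\chi^2(n_1 - 1)/(n_1 - 1)}{\chi^2(n_2 - 1)/(n_2 - 1)} \sim F(n_1 - 1, n_2 - 1),$$
i.e.,
$$\frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} = \frac{\sigma_2^2}{\sigma_1^2}\cdot\frac{S_1^2}{S_2^2} \sim F(n_1 - 1, n_2 - 1).$$
2. The function $g\left(X_1, \ldots, X_n; \frac{\sigma_1^2}{\sigma_2^2}\right) = \frac{\sigma_2^2}{\sigma_1^2}\cdot\frac{s_1^2}{s_2^2}$ (decreasing in $\sigma_1^2/\sigma_2^2$).
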

Example: CI for ratios of two variances


So that:
$$\Pr\left(F_{\alpha/2}(n_1 - 1, n_2 - 1) < \frac{\sigma_2^2}{\sigma_1^2}\cdot\frac{S_1^2}{S_2^2} < F_{1-\alpha/2}(n_1 - 1, n_2 - 1)\right) = 1 - \alpha$$
$$\Pr\left(\frac{S_2^2}{S_1^2} F_{\alpha/2}(n_1 - 1, n_2 - 1) < \frac{\sigma_2^2}{\sigma_1^2} < \frac{S_2^2}{S_1^2} F_{1-\alpha/2}(n_1 - 1, n_2 - 1)\right) = 1 - \alpha$$
$$\Pr\left(\frac{S_1^2}{S_2^2}\cdot\frac{1}{F_{1-\alpha/2}(n_1 - 1, n_2 - 1)} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{S_1^2}{S_2^2}\cdot\frac{1}{F_{\alpha/2}(n_1 - 1, n_2 - 1)}\right) = 1 - \alpha,$$
where $F_{\alpha/2}(n_1 - 1, n_2 - 1)$ and $F_{1-\alpha/2}(n_1 - 1, n_2 - 1)$ are
determined from the table of the F-distribution (see F&T page
170-174).

[Figure: Snedecor's F p.d.f. $f_X(x)$ and c.d.f. $F_X(x)$ for ($n_1 = 3$, $n_2 = 15$) and ($n_1 = 15$, $n_2 = 3$). The c.d.f. levels 1/8, 1/2 and 7/8 are marked at $x$ = 0.23, 0.83, 2.25 and at $x$ = 0.45, 1.21, 4.37 respectively, illustrating $F_{1-\alpha/2}(n_2 - 1, n_1 - 1) = 1/F_{\alpha/2}(n_1 - 1, n_2 - 1)$.]


Example: CI for ratios of two variances


Note that we have:
$$F_{1-\alpha/2}(n_2 - 1, n_1 - 1) = \frac{1}{F_{\alpha/2}(n_1 - 1, n_2 - 1)}.$$
Note: the F&T tables only tabulate the tail probabilities 0.1,
0.05, 0.025, and 0.01.

3. A $100(1 - \alpha)\%$ confidence interval estimate for $\frac{\sigma_1^2}{\sigma_2^2}$ is given
by:
$$\frac{s_1^2}{s_2^2}\cdot\frac{1}{F_{1-\alpha/2}(n_1 - 1, n_2 - 1)} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2}{s_2^2}\cdot F_{1-\alpha/2}(n_2 - 1, n_1 - 1),$$
where $s_1^2$ and $s_2^2$ are the observed sample variances from the
two populations.


Application: CI for Ratios of Two Variances


ABC Manufacturing Company makes computer chips in the
Asia Pacific region. It has been alleged that the price of its
computer chip is less variable in Asia than in the Pacific. A
total of 230 random purchases of ABC's computer chips were
made in the region and the following sample statistics were
determined:
Asia:    $n_1 = 179$, $S_1 = 0.68$;
Pacific: $n_2 = 51$,  $S_2 = 0.85$.
Question: Construct a 95% confidence interval for $\frac{\sigma_1^2}{\sigma_2^2}$.

One may use $F_{1-0.025}(178, 50) \approx 1.56$ and
$F_{1-0.025}(50, 178) \approx 1.435$. These are approximated from F
tables. For degrees of freedom much larger than 120 just use
the corresponding value at $\infty$.


Application: CI for Ratios of Two Variances

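Solution: A 95% confidence interval for $\sigma_1^2/\sigma_2^2$ is:
$$\left(\frac{0.68}{0.85}\right)^2 \frac{1}{F_{1-0.025}(178, 50)} < \frac{\sigma_1^2}{\sigma_2^2} < \left(\frac{0.68}{0.85}\right)^2 F_{1-0.025}(50, 178)$$
$$\left(\frac{0.68}{0.85}\right)^2 \frac{1}{1.56} < \frac{\sigma_1^2}{\sigma_2^2} < \left(\frac{0.68}{0.85}\right)^2 \cdot 1.435$$
$$0.4103 < \frac{\sigma_1^2}{\sigma_2^2} < 0.9184.$$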
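
The arithmetic of this application can be reproduced with a few lines of code. The sketch below takes the approximate F quantiles quoted in the question as inputs; the function and argument names are illustrative.

```python
def variance_ratio_ci(s1, s2, f_upper_n1_n2, f_upper_n2_n1):
    """CI for sigma1^2 / sigma2^2 from sample standard deviations s1 and s2.

    f_upper_n1_n2 is F_{1-alpha/2}(n1 - 1, n2 - 1) and
    f_upper_n2_n1 is F_{1-alpha/2}(n2 - 1, n1 - 1), both taken from tables.
    """
    ratio = (s1 / s2) ** 2                 # observed ratio of sample variances
    return ratio / f_upper_n1_n2, ratio * f_upper_n2_n1

# Asia: s1 = 0.68; Pacific: s2 = 0.85; approximate quantiles 1.56 and 1.435.
lower, upper = variance_ratio_ci(0.68, 0.85, 1.56, 1.435)
print(round(lower, 4), round(upper, 4))
```

Since the whole interval lies below 1, the data are consistent with the claim that the price is less variable in Asia than in the Pacific.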


Exercise: CI for the Difference Between Two Means


Consider two sets of independent random samples from two
different normal populations:
- $X_{11}, X_{12}, \ldots, X_{1n_1}$ from $N(\mu_1, \sigma_1^2)$ (sample size: $n_1$);
- $X_{21}, X_{22}, \ldots, X_{2n_2}$ from $N(\mu_2, \sigma_2^2)$ (sample size: $n_2$).

a. Question: What is the distribution of $\bar X_1 - \bar X_2$?
b. Question: What is the pivot based on $\bar X_1 - \bar X_2$?
c. Question: What is the (approximate) $100(1 - \alpha)\%$ confidence
interval for $(\mu_1 - \mu_2)$?


Exercise: CI for the Difference Between Two Means


a. Solution: Recall that (week 4) the statistic $\bar X_1 - \bar X_2$ is
normally distributed with mean:
$$E\left[\bar X_1 - \bar X_2\right] = \mu_1 - \mu_2,$$
and variance:
$$\mathrm{Var}\left(\bar X_1 - \bar X_2\right) = \mathrm{Var}(\bar X_1) + \mathrm{Var}(\bar X_2) - 2\,\mathrm{Cov}(\bar X_1, \bar X_2) \overset{*}{=} \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}.$$
* using $\mathrm{Cov}(\bar X_1, \bar X_2) = 0$ because the samples are independent.
b. Solution: To construct a confidence interval for $\mu_1 - \mu_2$, we
use the pivot:
$$\frac{\bar X_1 - \bar X_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0, 1).$$


Exercise: CI for the Difference Between Two Means


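We have the (decreasing) function:
$$g(X_1, \ldots, X_n; \mu_1 - \mu_2) = \frac{(\bar X_1 - \bar X_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}},$$
so that:
$$\Pr\left(\bar X_1 - \bar X_2 - \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\; z_{1-\alpha/2} < \mu_1 - \mu_2 < \bar X_1 - \bar X_2 + \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\; z_{1-\alpha/2}\right) = 1 - \alpha.$$
c. Solution: An approximate $100(1 - \alpha)\%$ confidence interval
for $(\mu_1 - \mu_2)$ is given by:
$$(\bar x_1 - \bar x_2) - \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\; z_{1-\alpha/2} < \mu_1 - \mu_2 < (\bar x_1 - \bar x_2) + \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\; z_{1-\alpha/2},$$
where $z_{1-\alpha/2}$ is the point on the standard normal for which
the probability above it is $\alpha/2$.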
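
A minimal sketch of the interval for the difference between two means with known population variances; the function name and the example inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def diff_means_ci(mean1, mean2, var1, var2, n1, n2, alpha=0.05):
    """Approximate CI for mu1 - mu2 with known population variances var1 and var2."""
    z = NormalDist().inv_cdf(1 - alpha / 2)        # z_{1 - alpha/2}
    half_width = z * sqrt(var1 / n1 + var2 / n2)   # z times the s.d. of Xbar1 - Xbar2
    diff = mean1 - mean2
    return diff - half_width, diff + half_width

lower, upper = diff_means_ci(10.0, 8.0, 4.0, 9.0, 100, 100)
print(lower, upper)
```

If the resulting interval excludes zero, as it does for these hypothetical inputs, the two sample means differ by more than sampling variation alone would suggest at the chosen level.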

CI diff means when variances are equal/not equal

The previous slides allowed the variances of the two populations to be unequal.
Sometimes only the location changes, not the volatility.
In that case we gain information by combining (pooling) the volatility information from the two samples, which leads to a more precise estimate.
Be cautious about when to use this!


Example: CI diff means when variances are equal


Consider the case where $\sigma_1 = \sigma_2 = \sigma$. Then the random variable:
$$Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
has an approximate standard normal distribution.
$\sigma^2$ can be estimated by pooling the squared deviations from the means of the two samples with the pooled estimator:
$$S_p^2 = \frac{(n_1 - 1)\,S_1^2 + (n_2 - 1)\,S_2^2}{n_1 + n_2 - 2}.$$
This is unbiased, that is, $E[S_p^2] = \sigma^2$.


1151/1175



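Also we have:
$$\frac{(n_1 - 1)\,S_1^2}{\sigma^2} \sim \chi^2(n_1 - 1) \quad\text{and}\quad \frac{(n_2 - 1)\,S_2^2}{\sigma^2} \sim \chi^2(n_2 - 1).$$
Hence, the weighted average:
$$Y = \underbrace{\frac{(n_1 - 1)\,S_1^2}{\sigma^2}}_{=\sum_{i=1}^{n_1-1} Z_i^2 \,\sim\, \chi^2(n_1-1)} + \underbrace{\frac{(n_2 - 1)\,S_2^2}{\sigma^2}}_{=\sum_{i=1}^{n_2-1} Z_i^2 \,\sim\, \chi^2(n_2-1)} = \underbrace{\frac{(n_1 + n_2 - 2)\,S_p^2}{\sigma^2}}_{=\sum_{i=1}^{n_1+n_2-2} Z_i^2} \sim \chi^2(n_1 + n_2 - 2),$$
since the sum of two independent chi-square random variables is another chi-square random variable with d.f. equal to the sum of the d.f.s.

Question: Find the CI for $\mu_1 - \mu_2$.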



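Solution: Recall the t-distribution definition (week 5).
1. Use as pivot the random variable:
$$T = \frac{Z}{\sqrt{\dfrac{Y}{n_1 + n_2 - 2}}} = \frac{\dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sigma\sqrt{1/n_1 + 1/n_2}}}{\sqrt{\dfrac{S_p^2}{\sigma^2}}} = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t_{n_1 + n_2 - 2}.$$
Here $S_p$ is the pooled standard deviation (see slide 1151).
2. We have the (decreasing) function:
$$g(X_1, \ldots, X_n; \mu_1 - \mu_2) = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{S_p\sqrt{1/n_1 + 1/n_2}}.$$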



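We have:
$$\Pr\left(\bar{X}_1 - \bar{X}_2 - t_{1-\alpha/2,\,n_1+n_2-2}\; S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < \mu_1 - \mu_2 < \bar{X}_1 - \bar{X}_2 + t_{1-\alpha/2,\,n_1+n_2-2}\; S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\right) = 1 - \alpha.$$

3. An approximate $100(1-\alpha)\%$ confidence interval for $(\mu_1 - \mu_2)$ is given by:
$$(\bar{x}_1 - \bar{x}_2) - t_{1-\alpha/2,\,n_1+n_2-2}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{1-\alpha/2,\,n_1+n_2-2}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}},$$
where $t_{1-\alpha/2,\,n_1+n_2-2}$ is the point on the t-distribution (with $n_1 + n_2 - 2$ degrees of freedom) for which the probability above it is $\alpha/2$.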


Application: CI difference in means


An insurance company offers marine insurance.
Up to two years ago, the insurer had 150 contracts, with sample mean claims $150 and sample standard deviation $25.
Last year, the insurer introduced a small deductible in the contract. The number of contracts after the introduction was 25, with sample mean claims $140 and sample standard deviation $21.
Question: What is the 95% confidence interval for the change in the mean claim due to the introduction of the deductible?

Solution 1 (unequal variances): $z_{1-0.025} = 1.96$, $\sqrt{25^2/150 + 21^2/25} = 4.66976$.
CI: $(-19.15;\ -0.85)$.

Solution 2 (pooled variance): $t_{1-0.025,\,173} = 1.96$,
$$s_p\sqrt{\frac{1}{150} + \frac{1}{25}} = \sqrt{\frac{149\cdot 25^2 + 24\cdot 21^2}{173}}\;\sqrt{\frac{1}{150} + \frac{1}{25}} = 5.289183.$$
CI: $(-20.37;\ 0.37)$.
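Both solutions of the marine insurance application can be reproduced numerically. The following is a sketch using only the figures quoted in the application; the variable names are illustrative, and the critical value 1.96 is taken from the slide for both the normal and the $t_{173}$ case.

```python
from math import sqrt

# Figures from the application: before and after the deductible.
n1, mean1, s1 = 150, 150.0, 25.0
n2, mean2, s2 = 25, 140.0, 21.0
crit = 1.96  # critical value used on the slide for both solutions

diff = mean2 - mean1  # change in the sample mean after the deductible

# Solution 1: variances allowed to differ.
se_unpooled = sqrt(s1**2 / n1 + s2**2 / n2)
ci_unpooled = (diff - crit * se_unpooled, diff + crit * se_unpooled)

# Solution 2: common variance estimated by the pooled estimator.
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
se_pooled = sqrt(sp2) * sqrt(1 / n1 + 1 / n2)
ci_pooled = (diff - crit * se_pooled, diff + crit * se_pooled)

print(f"unpooled: se={se_unpooled:.5f}, CI=({ci_unpooled[0]:.2f}; {ci_unpooled[1]:.2f})")
print(f"pooled:   se={se_pooled:.6f}, CI=({ci_pooled[0]:.2f}; {ci_pooled[1]:.2f})")
```

The two conclusions differ: the unpooled interval excludes zero while the pooled one does not. With sample sizes as unbalanced as 150 and 25, pooling gives most of the weight to the larger sample's standard deviation, so the choice of assumption matters here.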


Example: Confidence Interval for proportions


Confidence interval estimates for $p$, the proportion of successes in a population, can be found using the sampling distribution of proportions.
Let $X$ be the random variable denoting the number of successes in an experiment of $n$ trials.
Then $X \sim \mathrm{Bin}(n, p)$ and an estimator for $p$ is $\hat{p} = X/n$.
It is unbiased, because $E[\hat{p}] = p$.
Its variance is $\mathrm{Var}(\hat{p}) = \dfrac{p(1-p)}{n}$.

Application: Probability of issuing a claim.

Question: How to construct a pivotal quantity?



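1. Solution: The pivot is (using CLT):
$$Z = \frac{\hat{p} - p}{\sqrt{\dfrac{\hat{p}\,(1 - \hat{p})}{n}}} \overset{\text{approx}}{\sim} N(0, 1).$$
2. Thus, $g(X_1, \ldots, X_n; p) = (\hat{p} - p)\big/\sqrt{\hat{p}(1 - \hat{p})/n}$ and
$$\Pr\left(\hat{p} - \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}\; z_{1-\alpha/2} < p < \hat{p} + \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}\; z_{1-\alpha/2}\right) = 1 - \alpha.$$
3. A $100(1-\alpha)\%$ confidence interval for $p$ is given by:
$$\hat{p} - \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}\; z_{1-\alpha/2} < p < \hat{p} + \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}\; z_{1-\alpha/2}.$$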
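The three steps above translate directly into code. This is a minimal sketch, assuming the CLT approximation is adequate for the sample at hand; the function name and the example counts (30 claims out of 200 policies) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, alpha=0.05):
    """Approximate 100(1 - alpha)% CI for p based on the CLT pivot."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Hypothetical portfolio: 30 policies with a claim out of 200.
lo, hi = proportion_ci(successes=30, n=200)
print(f"95% CI for p: ({lo:.4f}, {hi:.4f})")
```

The interval relies on the normal approximation, so it becomes unreliable when $n\hat{p}$ or $n(1-\hat{p})$ is small.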


Exercise: CI for difference of proportions


For two population proportions, say $p_1$ and $p_2$, the statistic $(\hat{p}_1 - \hat{p}_2)$ is the unbiased point estimator for the difference between $p_1$ and $p_2$.
The variance of the sampling distribution is given by the sum of the variances as:
$$\hat{\sigma}^2_{p_1 - p_2} = \mathrm{Var}(\hat{p}_1 - \hat{p}_2) = \mathrm{Var}(\hat{p}_1) + \mathrm{Var}(\hat{p}_2) - 2\,\mathrm{Cov}(\hat{p}_1, \hat{p}_2) = \frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}.$$
Question: Why is $\mathrm{Cov}(\hat{p}_1, \hat{p}_2) = 0$?
Solution: We draw from two different populations, hence the samples are independent.

Question: Find a CI for $p_1 - p_2$.



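1. Solution: The pivot is (using CLT):
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{\sigma}^2_{p_1 - p_2}}} \overset{\text{approx}}{\sim} N(0, 1).$$
2. Thus, $g(X_1, \ldots, X_n; p_1 - p_2) = \big((\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)\big)\big/\hat{\sigma}_{p_1 - p_2}$ and
$$\Pr\left((\hat{p}_1 - \hat{p}_2) - \hat{\sigma}_{p_1 - p_2}\, z_{1-\alpha/2} < p_1 - p_2 < (\hat{p}_1 - \hat{p}_2) + \hat{\sigma}_{p_1 - p_2}\, z_{1-\alpha/2}\right) = 1 - \alpha.$$
3. A $100(1-\alpha)\%$ confidence interval estimate for $p_1 - p_2$ is given by:
$$(\hat{p}_1 - \hat{p}_2) - z_{1-\alpha/2}\,\hat{\sigma}_{p_1 - p_2} < p_1 - p_2 < (\hat{p}_1 - \hat{p}_2) + z_{1-\alpha/2}\,\hat{\sigma}_{p_1 - p_2}.$$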


Application: CI for difference of proportions


A motor vehicle insurer is interested in the difference in claim rates between males and females. Each year the insurer had 300 males and 270 females insured.
The yearly numbers of claims in the past five years were:

year      2011  2010  2009  2008  2007  total
Males       45    46    31    49    45    216
Females     37    42    41    36    32    188

Question: Is there a difference in the claim rate between males and females?

Solution: $\hat{p}_M = \frac{216}{1500} = 0.144$, $\hat{p}_F = \frac{188}{1350} = 0.139259$, $\hat{p}_M - \hat{p}_F = 0.004740741$. Note: $p_M = 0.15$ and $p_F = 0.13$!
$$\hat{\sigma}^2_{p_M - p_F} = \frac{0.144\,(1 - 0.144)}{1500} + \frac{0.139259\,(1 - 0.139259)}{1350} = 1.71 \cdot 10^{-4}, \qquad \hat{\sigma}_{p_M - p_F} = 0.0131.$$
$Z = 0.004740741 / 0.01307538 = 0.362569852$, so $\alpha = 1 - \Phi(0.362569852) = 1 - 0.641536883 = 0.358463117$.
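The numbers in this solution can be reproduced as follows. This is a sketch built only from the claim counts and exposures in the application; the variable names are illustrative, and the 95% interval at the end is an addition that uses the same pivot.

```python
from math import sqrt
from statistics import NormalDist

# Five years of exposure: 300 males and 270 females insured per year.
claims_m, exposure_m = 216, 5 * 300
claims_f, exposure_f = 188, 5 * 270

p_m = claims_m / exposure_m
p_f = claims_f / exposure_f
diff = p_m - p_f

# Estimated standard deviation of the difference (independent samples).
se = sqrt(p_m * (1 - p_m) / exposure_m + p_f * (1 - p_f) / exposure_f)

z_stat = diff / se
upper_tail = 1 - NormalDist().cdf(z_stat)  # the alpha computed on the slide

# 95% confidence interval for p_M - p_F.
z = NormalDist().inv_cdf(0.975)
ci = (diff - z * se, diff + z * se)

print(f"diff={diff:.9f}, se={se:.8f}, Z={z_stat:.6f}, tail={upper_tail:.6f}")
print(f"95% CI: ({ci[0]:.4f}, {ci[1]:.4f})")
```

The interval contains zero, so these five years of data give no evidence of a difference in claim rates, even though the underlying rates quoted in the note (0.15 and 0.13) do differ.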


Example: CI for paired difference

Sometimes we are interested in the comparison of two samples, but the samples are not independent.
Let us investigate data which come in pairs, i.e.:
$$(X_{11}, X_{21}),\ (X_{12}, X_{22}),\ \ldots,\ (X_{1n}, X_{2n}).$$
In the case of paired or matched data, we are interested in analysing the differences in the sample, $D_i = X_{1i} - X_{2i}$, and therefore in estimating the difference in the means, $\mu_D = \mu_1 - \mu_2$.



Define:
$$\bar{D} = \frac{1}{n}\sum_{k=1}^{n} D_k = \frac{1}{n}\sum_{k=1}^{n} (X_{1k} - X_{2k}),$$
and define:
$$S_D = \sqrt{\frac{\sum_{k=1}^{n} \left(D_k - \bar{D}\right)^2}{n - 1}},$$
which are respectively the sample mean and sample standard deviation of the differences in the sample.

Question: How to construct a pivotal quantity?



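1. Solution: The pivot is (using CLT):
$$\frac{\bar{D} - \mu_D}{S_D/\sqrt{n}} = \underbrace{\frac{\bar{D} - \mu_D}{\sigma_D/\sqrt{n}}}_{=Z} \Bigg/ \underbrace{\sqrt{\frac{(n-1)\,S_D^2}{\sigma_D^2}\Big/(n-1)}}_{=\sqrt{\chi^2(n-1)/(n-1)}} \overset{\text{approx}}{\sim} t(n-1).$$
2. Thus, $g(D_1, \ldots, D_n; \mu_D) = \sqrt{n}\,(\bar{D} - \mu_D)/S_D$ and
$$\Pr\left(\bar{D} - t_{1-\alpha/2,\,n-1}\,\frac{S_D}{\sqrt{n}} < \mu_D < \bar{D} + t_{1-\alpha/2,\,n-1}\,\frac{S_D}{\sqrt{n}}\right) = 1 - \alpha.$$
3. A $100(1-\alpha)\%$ confidence interval estimate for $\mu_D$ is:
$$\bar{d} - t_{1-\alpha/2,\,n-1}\,\frac{s_D}{\sqrt{n}} < \mu_D < \bar{d} + t_{1-\alpha/2,\,n-1}\,\frac{s_D}{\sqrt{n}},$$
where $\bar{d}$ is the observed sample mean of the differences and $s_D$ is the observed sample standard deviation of the differences.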


Application: CI for paired difference


An insurance company offers directors and officers liability insurance (D&O) in Australia and China. The yearly claim sizes in the past ten years were:

year    2011  2010  2009  2008  2007  2006  2005  2004  2003  2002
Aus       93   113    93   115   103   111   136    86   133   121
China    137   116   126   117   118   140   122   108   130   127
Differ   -44    -3   -33    -2   -15   -29    14   -22     3    -6

Moreover, the total claim size in Australia is $1,104 and in China $1,241. The sum of the squared yearly claim sizes, divided by 10, is 12,444 in Australia and 15,489 in China, and the sum of the products of the Australian and Chinese yearly claim sizes, divided by 10, is 13,725.

The total difference in claim size (Australia minus China) is -$137 and the sum of the yearly squared differences is 4,829.



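Question: Find the correlation coefficient between the claims in Australia and China.
Solution:
$$\mathrm{Cov}(\text{Aus}, \text{Ch}) = \frac{137{,}253}{10} - \frac{1{,}104}{10}\cdot\frac{1{,}241}{10} = 24.66 \quad \text{(using the unrounded sum of products, 137,253)},$$
$$\mathrm{Var}(\text{Aus}) = \frac{1}{9}\left(12{,}444 \cdot 10 - 10\left(\frac{1{,}104}{10}\right)^2\right) \approx 285, \qquad \mathrm{Var}(\text{Ch}) = \frac{1}{9}\left(15{,}489 \cdot 10 - 10\left(\frac{1{,}241}{10}\right)^2\right) \approx 98.$$
Hence $\rho = \dfrac{24.66}{\sqrt{285 \cdot 98}} = 0.15$.

Question: Find the probability that the claims in China, on average, are $10 larger than in Australia.

Solution: $\bar{d} = \frac{-137}{10} = -13.7$, $s_d = \sqrt{\frac{1}{9}\left(4{,}829 - 13.7^2 \cdot 10\right)} = 18.11$.
$$\Pr\left(\mu_D < -10 = -13.7 + t_{1-\alpha,\,9}\,\frac{18.11}{\sqrt{10}}\right) = 1 - \alpha,$$
so $t_{1-\alpha,\,9} = 3.7\big/\left(18.11/\sqrt{10}\right) \approx 0.646$, which gives $\alpha \approx 0.27$ and $1 - \alpha \approx 0.73$.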
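The summary statistics of the paired example can be recomputed from the table of yearly claims. The sketch below uses only the tabulated data; the variable names are illustrative, and the critical value $t_{0.975,\,9} = 2.262$ is an assumption taken from standard t tables, since the Python standard library has no t quantile function.

```python
from math import sqrt
from statistics import mean, stdev

# Yearly claim sizes 2011..2002 from the table.
aus = [93, 113, 93, 115, 103, 111, 136, 86, 133, 121]
china = [137, 116, 126, 117, 118, 140, 122, 108, 130, 127]

diffs = [a - c for a, c in zip(aus, china)]  # D_k = Aus_k - China_k
n = len(diffs)

d_bar = mean(diffs)      # sample mean of the differences
s_d = stdev(diffs)       # sample standard deviation (divisor n - 1)
se = s_d / sqrt(n)       # standard error used in the pivot

t_crit = 2.262           # t_{0.975, 9}, taken from tables (assumption)
ci = (d_bar - t_crit * se, d_bar + t_crit * se)

# Standardised distance between d_bar and the threshold -10.
t_stat = (-10 - d_bar) / se

print(f"d_bar={d_bar:.1f}, s_d={s_d:.2f}, se={se:.3f}, t={t_stat:.3f}")
print(f"95% CI for mu_D: ({ci[0]:.2f}, {ci[1]:.2f})")
```

Working with the differences removes the need to model the dependence between the two countries explicitly: whatever the correlation, it is already reflected in the sample standard deviation of the $D_k$.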



Maximum Likelihood estimate
Important properties of MLE estimates


Important properties of the MLEs


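Suppose the density $f_X(x|\theta)$ satisfies certain regularity conditions (e.g., continuous, differentiable, no parameter on the boundaries of $x$, etc.) and suppose $\hat{\theta}_n$ is the MLE of $\theta$ for a random sample of size $n$ from $f_X(x|\theta)$. Then the $\hat{\theta}_n$ are asymptotically normally distributed with mean:
$$E\left[\hat{\theta}_n\right] = \theta,$$
and variance:
$$\mathrm{Var}\left(\hat{\theta}_n\right) = \frac{1}{n}\left(E\left[\left(\frac{\partial}{\partial\theta}\log f_X(x|\theta)\right)^2\right]\right)^{-1} = \frac{1}{n\, I_{f^\star}(\theta)}.$$
We write this as:
$$\sqrt{n}\left(\hat{\theta}_n - \theta\right) \overset{d}{\to} N\left(0, \frac{1}{I_{f^\star}(\theta)}\right).$$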



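Note that:
$$I_{f^\star}(\theta) = E\left[\left(\frac{\partial}{\partial\theta}\log f_X(x|\theta)\right)^2\right].$$
It can be shown (not required for this course) that:
$$E\left[\left(\frac{\partial}{\partial\theta}\log f_X(x|\theta)\right)^2\right] = -E\left[\frac{\partial^2}{\partial\theta^2}\log f_X(x|\theta)\right].$$
In evaluating the variance of the MLE, you can therefore use either form of this variance formula.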
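To see that the two forms agree in a concrete case, here is a short check for the Poisson($\lambda$) model. The choice of model and the symbol $\lambda$ are assumptions made for illustration; the slides do not work this case.

```latex
\begin{align*}
\log f_X(x|\lambda) &= -\lambda + x\log\lambda - \log x!, \\
\frac{\partial}{\partial\lambda}\log f_X(x|\lambda) &= \frac{x}{\lambda} - 1,
\qquad
\frac{\partial^2}{\partial\lambda^2}\log f_X(x|\lambda) = -\frac{x}{\lambda^2}, \\
E\left[\left(\frac{X}{\lambda} - 1\right)^2\right] &= \frac{\mathrm{Var}(X)}{\lambda^2} = \frac{1}{\lambda},
\qquad
-E\left[-\frac{X}{\lambda^2}\right] = \frac{E[X]}{\lambda^2} = \frac{1}{\lambda}.
\end{align*}
```

Both forms give a Fisher information of $1/\lambda$ per observation, so the asymptotic variance of the MLE $\hat{\lambda}_n = \bar{X}$ is $\lambda/n$. The second-derivative form is usually the quicker one to evaluate.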



For functions of the parameter, say $g(\theta)$, we can easily extend the theorem, except there is a delta-method adjustment to the variance.
Thus, assuming $g(\theta)$ is a differentiable function of $\theta$, then:
$$\sqrt{n}\left(g\left(\hat{\theta}_n\right) - g(\theta)\right) \overset{d}{\to} N\left(0, \frac{\left(g'(\theta)\right)^2}{I_{f^\star}(\theta)}\right),$$
where $g'(\theta)$ is the first derivative of $g$ with respect to the parameter $\theta$.
This follows from the week 4 approximation method based on a Taylor series.
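The delta-method adjustment amounts to multiplying the standard error of $\hat{\theta}_n$ by $|g'(\theta)|$. The sketch below illustrates this for an exponential model with rate $\lambda$, where the Fisher information per observation is $1/\lambda^2$ and $g(\lambda) = 1/\lambda$ is the mean. The helper name `delta_method_se` and the values $\hat{\lambda} = 0.5$, $n = 100$ are hypothetical.

```python
from math import sqrt

def delta_method_se(g_prime, fisher_info, theta_hat, n):
    """Approximate s.e. of g(theta_hat): |g'(theta)| / sqrt(n * I(theta)), at theta_hat."""
    return abs(g_prime(theta_hat)) / sqrt(n * fisher_info(theta_hat))

# Exponential(rate lam): I(lam) = 1 / lam^2, and g(lam) = 1 / lam is the mean.
lam_hat, n = 0.5, 100  # hypothetical estimate and sample size

se_mean = delta_method_se(
    g_prime=lambda lam: -1 / lam**2,      # derivative of g(lam) = 1 / lam
    fisher_info=lambda lam: 1 / lam**2,   # Fisher information per observation
    theta_hat=lam_hat,
    n=n,
)
print(se_mean)
```

Here the result can be checked by hand: $g(\hat{\lambda}) = \bar{X}$, whose standard deviation is exactly $1/(\lambda\sqrt{n})$, and the delta method returns the same expression with $\lambda$ replaced by $\hat{\lambda}$.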



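Asymptotic properties of MLEs work well for $N(\theta, 1)$, $N(0, \theta)$, $\mathrm{Exp}(\theta)$, $\mathrm{Poisson}(\theta)$, $\mathrm{Bernoulli}(\theta)$.
What about the Cauchy density:
$$f_Y(y|\theta) = \frac{1}{\pi\left(1 + (y - \theta)^2\right)}?$$
It's notorious for having no mean (hence no variance). What about the asymptotic behavior of MLEs?

Theory above suggests: $\hat{\theta}_n \overset{d}{\to} N\left(\theta, \dfrac{2}{n}\right)$.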
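The variance $2/n$ corresponds to a Fisher information of $1/2$ per observation for the Cauchy location model. The sketch below checks that value numerically; it is an illustration rather than part of the slides, and the function name and the midpoint-rule integration with the substitution $y - \theta = \tan u$ are choices made here.

```python
from math import pi, tan

def cauchy_fisher_information(steps=100_000):
    """Midpoint-rule approximation of E[(d/dtheta log f)^2] for the Cauchy location model."""
    h = pi / steps
    total = 0.0
    for k in range(steps):
        u = -pi / 2 + (k + 0.5) * h   # substitution y - theta = tan(u)
        y = tan(u)
        # Squared score: d/dtheta log f = 2(y - theta) / (1 + (y - theta)^2).
        score_sq = (2 * y / (1 + y * y)) ** 2
        # density * dy/du = [1 / (pi (1 + y^2))] * (1 + y^2) = 1 / pi
        total += score_sq / pi * h
    return total

info = cauchy_fisher_information()
print(f"I(theta) ~ {info:.6f}, asymptotic variance ~ {1 / info:.4f} / n")
```

So the information is finite even though the distribution has no mean, which is why the asymptotic theory for the MLE still applies here while the sample mean is not a useful estimator of $\theta$.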



Sample of size 40 drawn from a Cauchy distribution with $\theta = 0$. We compare the Bayesian posterior density with the normal approximation, based on the first $n$ observations for $n = 3$, 15 and 40. The first three simulated values were 5.01, 0.40 and -8.75: pretty spread out. Here, for large $n$, the posterior density is more concentrated around the mean than the normal, but the normal must necessarily tail off more quickly.



Maximum Likelihood estimate
CI for Maximum Likelihood Estimates


CI for Maximum Likelihood Estimates


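Suppose we are interested in constructing a confidence interval for $\theta$ and let $\hat{\theta}$ denote its maximum likelihood estimate.
Recall that $\hat{\theta}$ is asymptotically normally distributed with mean:
$$E\left[\hat{\theta}\right] = \theta,$$
and variance:
$$\mathrm{Var}\left(\hat{\theta}\right) = \frac{1}{n}\left(E\left[\left(\frac{\partial}{\partial\theta}\log f_X(x|\theta)\right)^2\right]\right)^{-1} = \frac{1}{n\, I_{f^\star}(\theta)},$$
where
$$I_{f^\star}(\theta) = E\left[\left(\frac{\partial}{\partial\theta}\log f_X(x|\theta)\right)^2\right] = -E\left[\frac{\partial^2}{\partial\theta^2}\log f_X(x|\theta)\right].$$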



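We then have:
$$\sqrt{n\, I_{f^\star}(\theta)}\,\left(\hat{\theta} - \theta\right) \overset{d}{\to} N(0, 1).$$
Using this as a pivotal quantity, we have approximately:
$$\Pr\left(-z_{1-\alpha/2} < \sqrt{n\, I_{f^\star}\left(\hat{\theta}\right)}\,\left(\hat{\theta} - \theta\right) < z_{1-\alpha/2}\right) \approx 1 - \alpha,$$
or, equivalently:
$$\hat{\theta} \pm z_{1-\alpha/2}\,\frac{1}{\sqrt{n\, I_{f^\star}\left(\hat{\theta}\right)}}$$
is an approximate $100(1-\alpha)\%$ confidence interval for the parameter $\theta$.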



 
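Note that the variance $\mathrm{Var}(\hat{\theta})$ actually depends on the parameter $\theta$ and is being estimated by replacing $\theta$ by $\hat{\theta}$, so that:
$$\widehat{\mathrm{Var}}\left(\hat{\theta}\right) = \frac{1}{n\, I_{f^\star}\left(\hat{\theta}\right)}.$$
The standard error is usually defined to be:
$$\mathrm{s.e.}\left(\hat{\theta}\right) = \sqrt{\widehat{\mathrm{Var}}\left(\hat{\theta}\right)}.$$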
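As an illustration of this recipe, the sketch below builds the MLE-based interval for a Poisson mean, for which the MLE is the sample mean and the Fisher information per observation is $1/\lambda$, so the standard error is $\sqrt{\hat{\lambda}/n}$. The function name and the claim counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean

def poisson_mle_ci(counts, alpha=0.05):
    """MLE, standard error and approximate CI for a Poisson mean."""
    n = len(counts)
    lam_hat = fmean(counts)       # MLE of the Poisson mean
    se = sqrt(lam_hat / n)        # 1 / sqrt(n * I(lam_hat)) with I(lam) = 1 / lam
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return lam_hat, se, (lam_hat - z * se, lam_hat + z * se)

# Hypothetical yearly claim counts for ten policies.
counts = [2, 0, 3, 1, 4, 2, 1, 3, 2, 2]
lam_hat, se, ci = poisson_mle_ci(counts)
print(f"lambda_hat={lam_hat:.2f}, s.e.={se:.4f}, 95% CI=({ci[0]:.4f}, {ci[1]:.4f})")
```

With only ten observations the normal approximation is rough; the interval is meant to show the mechanics of plugging $\hat{\theta}$ into the information, not to suggest that $n = 10$ is large enough.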



Summary

Evaluating estimators
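1. UMVUE estimator: unbiased ($E[T] = \theta$), minimum variance ($\mathrm{Var}(T) \le \mathrm{Var}(T^\star)$ for all unbiased $T^\star$). If estimator $T$ is on the CRLB, then $T$ is UMVUE.
CRLB: $\mathrm{Var}\left(T(X_1, X_2, \ldots, X_n)\right) \ge \dfrac{1}{n\, I_{f^\star}(\theta)}$.

2. Consistent estimator: $\lim_{n\to\infty} \Pr\left(|T_n - \theta| < \varepsilon\right) = 1$ for every $\varepsilon > 0$, i.e., $T_n \overset{p}{\to} \theta$.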

3. Sufficient statistic: $T$ is sufficient for $\theta$ if the conditional distribution of $X_1, X_2, \ldots, X_n$ given $T = t$ does not depend on $\theta$ for any value of $t$.


Interval estimators
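Pivotal quantity method:
1. Find the pivot;
2. Find the function $g(X_1, \ldots, X_n; \theta)$ such that $\Pr\left(q_1 < g(X_1, \ldots, X_n; \theta) < q_2\right) = 1 - \alpha$;
3. The $100(1-\alpha)\%$ confidence interval of $\theta$ (for $g$ increasing in $\theta$; the endpoints swap if $g$ is decreasing):
$$g^{-1}(X_1, \ldots, X_n; q_1) < \theta < g^{-1}(X_1, \ldots, X_n; q_2).$$

Properties of MLE: Asymptotically normally distributed, with $E\left[\hat{\theta}_n^{ML}\right] \to \theta$ and $\mathrm{Var}\left(\hat{\theta}_n^{ML}\right) \to \left(n\, I_{f^\star}(\theta)\right)^{-1}$ as $n \to \infty$.

1. Asymptotically unbiased and asymptotically on the CRLB, hence asymptotically UMVUE;
2. Asymptotically consistent;
3. Asymptotically sufficient.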
