The setting is the same. Given a population that follows a distribution $P$, where $P$ contains one or more unknown parameters, we want to construct an estimator for each of them. In this course, I consider the simple case, where there is only one unknown parameter $\theta$. To do this, we proceed by collecting an i.i.d. sample $X_1, \ldots, X_n \sim P$. By Bayes' theorem,
$$\pi(\theta \mid x_1, \ldots, x_n) = \frac{f_{X_1, \ldots, X_n}(x_1, \ldots, x_n \mid \theta)\, \pi(\theta)}{f_{X_1, \ldots, X_n}(x_1, \ldots, x_n)} \propto f_{X_1, \ldots, X_n}(x_1, \ldots, x_n \mid \theta)\, \pi(\theta),$$
where $\pi(\theta \mid x_1, \ldots, x_n)$ is the pdf of $\theta$ given the sample data. This is called the posterior distribution of $\theta$.
Let me clarify the last step further. The symbol $\propto$ means "proportional to". Since the left-hand side is the distribution of $\theta$ conditional on the sample data $\{x_1, \ldots, x_n\}$, all the $x_i$ are assumed to be known and the denominator $f_{X_1, \ldots, X_n}(x_1, \ldots, x_n)$ is, therefore, no more than a constant.
In this setting, we are given the population distribution $P$ and the prior distribution $\pi(\theta)$. We have to find the posterior distribution $\pi(\theta \mid x_1, \ldots, x_n)$. We then use the posterior mean $E[\theta \mid x_1, \ldots, x_n]$ to estimate the unknown parameter $\theta$. That is,
$$\hat{\theta} = E[\theta \mid x_1, \ldots, x_n].$$
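As a concrete illustration of this recipe, the posterior mean can be approximated numerically by discretising the parameter on a grid; this grid approximation is our own illustration, not part of the notes:

```python
def grid_posterior_mean(likelihood, prior, grid):
    """Approximate E[theta | data] on a discrete grid of parameter values."""
    # Unnormalised posterior weights: likelihood times prior at each grid point.
    weights = [likelihood(t) * prior(t) for t in grid]
    total = sum(weights)  # plays the role of the constant denominator
    return sum(t * w for t, w in zip(grid, weights)) / total

# Bernoulli data with a flat prior on (0, 1); the exact posterior mean
# here is 0.7, and the grid estimate lands very close to it.
data = [1, 0, 1, 1, 0, 1, 1, 1]
s, n = sum(data), len(data)
likelihood = lambda p: p ** s * (1 - p) ** (n - s)
grid = [(i + 0.5) / 1000 for i in range(1000)]
print(grid_posterior_mean(likelihood, lambda p: 1.0, grid))  # close to 0.7
```

The normalising denominator never has to be computed in closed form: dividing by the total weight plays exactly the role of the constant $f_{X_1, \ldots, X_n}(x_1, \ldots, x_n)$.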
Example 1
Let $X_1, \ldots, X_n$ be an i.i.d. sample from Bernoulli($p$), and take the uniform prior $\pi(p) = 1$ for $0 < p < 1$. Then
$$\pi(p \mid x_1, \ldots, x_n) \propto \prod_{i=1}^{n} p^{x_i} (1-p)^{1-x_i} = p^{\sum x_i} (1-p)^{n - \sum x_i}, \quad \text{for } 0 < p < 1.$$
Recall the pdf of Beta($\alpha$, $\beta$):
$$f_X(x) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)} x^{\alpha - 1} (1-x)^{\beta - 1}, \quad \text{for } 0 < x < 1.$$
By comparing, we can see that the posterior distribution of $p$ is Beta($\sum x_i + 1$, $n - \sum x_i + 1$).
We know that the expectation of Beta($\alpha$, $\beta$) is $\frac{\alpha}{\alpha + \beta}$. Therefore, the posterior mean is given by:
$$E[p \mid x_1, \ldots, x_n] = \frac{\sum x_i + 1}{n + 2}.$$
Thus, $\hat{p} = \frac{\sum X_i + 1}{n + 2}$ is the Bayesian estimator for $p$.
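A quick numerical check of this estimator; the simulation below is a minimal sketch of ours, with an arbitrarily chosen true $p$:

```python
import random

def bayes_estimate_uniform_prior(xs):
    """Posterior mean (sum(x) + 1) / (n + 2) under a Uniform(0, 1) prior."""
    return (sum(xs) + 1) / (len(xs) + 2)

random.seed(0)
p_true = 0.3  # illustrative value, not from the notes
sample = [1 if random.random() < p_true else 0 for _ in range(10_000)]
print(bayes_estimate_uniform_prior(sample))  # should be close to 0.3
```

Note that with no data at all the estimator returns $1/2$, the mean of the uniform prior.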
Example 2
Same as example 1, except that $p \sim$ Beta($a$, $b$), where both $a$ and $b$ are given constants.
$$\pi(p \mid x_1, \ldots, x_n) \propto p^{\sum x_i} (1-p)^{n - \sum x_i} \cdot p^{a-1} (1-p)^{b-1} = p^{\sum x_i + a - 1} (1-p)^{n - \sum x_i + b - 1}, \quad \text{for } 0 < p < 1.$$
We recognise this as Beta($\sum x_i + a$, $n - \sum x_i + b$).
The posterior mean is
$$E[p \mid x_1, \ldots, x_n] = \frac{\sum x_i + a}{n + a + b}.$$
The Bayesian estimator for $p$ is given by:
$$\hat{p} = \frac{\sum X_i + a}{n + a + b}.$$
Again, you can recover the normalising constant in the pdf of the posterior distribution:
$$c = \frac{\Gamma(n + a + b)}{\Gamma\!\left(\sum x_i + a\right) \Gamma\!\left(n - \sum x_i + b\right)}.$$
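The conjugate update in this example can be sketched in a few lines (a minimal sketch; the function name is ours):

```python
def beta_posterior(xs, a, b):
    """Beta(a, b) prior + Bernoulli sample -> posterior Beta parameters and mean."""
    s, n = sum(xs), len(xs)
    # Posterior is Beta(s + a, n - s + b); its mean is the Bayesian estimator.
    return s + a, n - s + b, (s + a) / (n + a + b)

# With a = b = 1 (a uniform prior) this reduces to Example 1.
alpha_post, beta_post, mean = beta_posterior([1, 1, 0, 1], a=2, b=2)
print(alpha_post, beta_post, mean)  # 5 3 0.625
```

The posterior mean is a weighted average of the sample mean $\bar{x}$ and the prior mean $\frac{a}{a+b}$, with weights $\frac{n}{n+a+b}$ and $\frac{a+b}{n+a+b}$ respectively, so more data pulls the estimate toward $\bar{x}$.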
Example 3
Let $X_1, \ldots, X_n$ be an i.i.d. sample from N($\theta$, $\sigma^2$), where $\sigma^2$ is known, and take the prior $\theta \sim$ N($\mu$, $\tau^2$), where $\mu$ and $\tau^2$ are given constants. The likelihood is
$$f_{X_1, \ldots, X_n}(x_1, \ldots, x_n \mid \theta) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x_i - \theta)^2}{2\sigma^2}\right) \propto \prod_{i=1}^{n} \exp\left(-\frac{(x_i - \theta)^2}{2\sigma^2}\right), \quad \text{because } \sigma^2 \text{ is known,}$$
and similarly
$$\pi(\theta) \propto \exp\left(-\frac{(\theta - \mu)^2}{2\tau^2}\right), \quad \text{because } \tau^2 \text{ is known.}$$
Then, calculate the pdf of the posterior distribution:
$$\pi(\theta \mid x_1, \ldots, x_n) \propto \prod_{i=1}^{n} \exp\left(-\frac{(x_i - \theta)^2}{2\sigma^2}\right) \exp\left(-\frac{(\theta - \mu)^2}{2\tau^2}\right)$$
$$= \exp\left(-\frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \theta)^2\right) \exp\left(-\frac{1}{2\tau^2}\left(\theta^2 - 2\mu\theta + \mu^2\right)\right)$$
$$\propto \exp\left(-\frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \theta)^2\right) \exp\left(-\frac{1}{2\tau^2}\left(\theta^2 - 2\mu\theta\right)\right), \quad \text{by removing } \exp\left(-\frac{\mu^2}{2\tau^2}\right)$$
$$= \exp\left(-\frac{1}{2\sigma^2} \sum_{i=1}^{n} \left(x_i^2 - 2x_i\theta + \theta^2\right)\right) \exp\left(-\frac{1}{2\tau^2}\left(\theta^2 - 2\mu\theta\right)\right)$$
$$\propto \exp\left(-\frac{1}{2\sigma^2} \sum_{i=1}^{n} \left(-2x_i\theta + \theta^2\right)\right) \exp\left(-\frac{1}{2\tau^2}\left(\theta^2 - 2\mu\theta\right)\right), \quad \text{by removing } \exp\left(-\frac{1}{2\sigma^2} \sum_{i=1}^{n} x_i^2\right)$$
$$= \exp\left(\frac{\theta}{\sigma^2} \sum_{i=1}^{n} x_i - \frac{n\theta^2}{2\sigma^2}\right) \exp\left(-\frac{1}{2\tau^2}\left(\theta^2 - 2\mu\theta\right)\right)$$
$$= \exp\left[-\left(\frac{n}{2\sigma^2} + \frac{1}{2\tau^2}\right)\theta^2 + \left(\frac{1}{\sigma^2} \sum_{i=1}^{n} x_i + \frac{\mu}{\tau^2}\right)\theta\right]$$
$$= \exp\left(-A\theta^2 + B\theta\right), \quad \text{where } A = \frac{n}{2\sigma^2} + \frac{1}{2\tau^2} \text{ and } B = \frac{1}{\sigma^2} \sum_{i=1}^{n} x_i + \frac{\mu}{\tau^2}$$
$$= \exp\left(-\frac{\theta^2 - (B/A)\theta}{1/A}\right)$$
$$\propto \exp\left(-\frac{\theta^2 - (B/A)\theta + B^2/4A^2}{1/A}\right)$$
$$= \exp\left[-\frac{(\theta - B/2A)^2}{1/A}\right]$$
Comparing with the pdf of a Normal distribution, we deduce that the posterior distribution of $\theta$ is given by:
$$\theta \mid x_1, \ldots, x_n \sim \mathrm{N}\left(\frac{B}{2A}, \frac{1}{2A}\right)$$
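The closed-form result can be wrapped in a small helper; this is a sketch under the assumptions above, with argument names of our choosing:

```python
def normal_posterior(xs, sigma2, mu, tau2):
    """Posterior mean B/(2A) and variance 1/(2A) for N(theta, sigma2) data
    with known variance sigma2 and prior theta ~ N(mu, tau2)."""
    n = len(xs)
    A = n / (2 * sigma2) + 1 / (2 * tau2)
    B = sum(xs) / sigma2 + mu / tau2
    return B / (2 * A), 1 / (2 * A)

# With no data the posterior is just the prior, N(mu, tau2);
# with lots of data the posterior mean approaches the sample mean.
print(normal_posterior([], 1.0, 2.0, 3.0))            # (2.0, 3.0)
print(normal_posterior([5.0] * 1000, 1.0, 2.0, 3.0))  # mean near 5, tiny variance
```

The two prints show the familiar Bayesian trade-off: the prior dominates when $n$ is small and the data dominate when $n$ is large.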
Example 4
Treatment 1 cured 100% of the patients, but the sample was so small that we should cast doubt on the result. On the other hand, Treatment 3 was very reassuring, but the percentage was a bit lower.
Let $p$ be the probability that a patient is cured. Then, the probability that a patient is not cured is $1 - p$.
We can see that $\hat{p}$ for Treatment 2 is the highest. Therefore, we predict that Treatment 2 is the best one. Treatment 1, despite curing everyone in the sample, is predicted to be the worst due to its small sample size.
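The effect can be made concrete with the uniform-prior estimator from Example 1 and some hypothetical cure counts (the actual counts from the notes are not reproduced here):

```python
def p_hat(cured, n):
    """Bayesian estimator (cured + 1) / (n + 2) with a uniform prior."""
    return (cured + 1) / (n + 2)

# Hypothetical counts, chosen only to illustrate the shrinkage effect:
print(p_hat(3, 3))      # 100% cured, tiny sample   -> 0.8
print(p_hat(18, 20))    # 90% cured, medium sample  -> about 0.86
print(p_hat(85, 100))   # 85% cured, large sample   -> about 0.84
```

With these numbers the middle treatment wins and the tiny sample that cured everyone comes last, mirroring the conclusion above: the estimator shrinks small-sample results toward the prior mean $1/2$.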