Bayesian Statistical Analysis

Chapter 2: Single-Parameter Models

Tang Yin-cai
yctang@stat.ecnu.edu.cn

SCHOOL OF FINANCE AND STATISTICS

March 23, 2009


Estimating a probability from binomial data (Section 2.1-2.5)

■ Problem
■ The Binomial Model
■ Example
■ Discrete prior
■ Uniform prior
■ Beta dist'n
■ Posterior Beta
■ A compromise
■ Choice of prior
■ Conjugate prior
■ Estimating the posterior dist'n
■ Using simulation
■ Example


Problem

Problem: Estimate an unknown population proportion θ from the results of a sequence of 'Bernoulli trials'; that is, data y_1, ..., y_n, each either 0 or 1.

This problem provides a relatively simple but important starting point for the discussion of Bayesian inference.

The binomial distribution provides a natural model for a sequence of n exchangeable trials, each giving rise to one of two possible outcomes, conventionally labeled 'success' and 'failure'.


The Binomial Model

Summarize the data by:

y = total number of successes in the n trials.

Parameterize the model using θ, which represents:

■ the proportion of successes in the population
■ the probability of success in each trial


Example: Birth 'ratio'

Consider estimating the sex ratio in a population of human births: Is Pr{female birth} = 0.5?

We define the parameter

θ = proportion of female births

Note: We may work with a transformation, e.g. the ratio of male to female birth rates, φ = (1 − θ)/θ.

Let y = number of girls in n births.

Q: When would the binomial model be appropriate?

Ans: Exchangeability.


First analysis: discrete prior

Suppose only two values of θ are considered possible, e.g.

■ θ = 0.5 (what we always thought), or
■ θ = 0.485 (someone told us, but we're not sure whether to believe it).

Posterior distribution:

p(θ|y) ∝ p(θ) p(y|θ)

This is best obtained by a table with one line per value of θ:


Suppose we specify the prior distribution uniformly across the 2 values.

If n = 100, y = 48, then

 θ       p(θ)   p(y|θ) = θ^y (1 − θ)^{n−y}   p(θ|y)
 0.485   0.5    8.503 × 10^{−31}             0.52
 0.50    0.5    7.889 × 10^{−31}             0.48
 (sum)          16.392 × 10^{−31}

Conclusion: These data don't shift our prior distribution much.


Now suppose that n = 1000, y = 480. We have

 θ       p(θ)   log p(y|θ)   p(θ|y)
 0.485   0.5    −692.397     0.68
 0.50    0.5    −693.147     0.32

Conclusion: Data and prior now favor θ = 0.485 by about 2:1 (but there is still substantial probability on θ = 0.5).

Q: Is a discrete prior distribution reasonable?
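A short R sketch (not from the course files) reproducing the two tables above; the likelihood is evaluated on the log scale to avoid underflow:

  # Discrete prior on two candidate values of theta (illustrative sketch)
  theta <- c(0.485, 0.5)
  prior <- c(0.5, 0.5)

  discrete_posterior <- function(n, y) {
    loglik <- y * log(theta) + (n - y) * log(1 - theta)  # log of theta^y (1-theta)^(n-y)
    post <- prior * exp(loglik - max(loglik))            # rescale before exponentiating
    post / sum(post)
  }

  round(discrete_posterior(100, 48), 2)    # 0.52 0.48
  round(discrete_posterior(1000, 480), 2)  # 0.68 0.32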


Second analysis: uniform continuous prior

The simplest example of a prior distribution is to assume p(θ) ∝ 1 (in fact p(θ) = 1!)

Bayes' rule gives

p(θ|y) ∝ θ^y (1 − θ)^{n−y}   (unnormalized posterior density)

[Q: What happened to the binomial coefficient C(n, y)?] Ans: It is a constant!

This is a beta distribution:

θ|y ∼ Beta(y + 1, n − y + 1).   (normalized posterior density)


What is a beta distribution?

The beta distribution is a continuous distribution on [0, 1] with a wide variety of shapes, determined by 2 parameters:

p(θ|α, β) ∝ θ^{α−1} (1 − θ)^{β−1}

where α > 0, β > 0.

It is unimodal with mode ∈ (0, 1) if α > 1, β > 1, and approaches a normal curve as α, β → ∞.
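To illustrate the range of shapes (this plot is my own addition, not part of the original slides), a few Beta(α, β) densities can be drawn in R with curve() and dbeta():

  # A few Beta(alpha, beta) densities, chosen to illustrate the variety of shapes
  curve(dbeta(x, 0.5, 0.5), from = 0, to = 1, ylab = "density", ylim = c(0, 7))  # U-shaped
  curve(dbeta(x, 1, 1), add = TRUE, lty = 2)     # uniform
  curve(dbeta(x, 2, 5), add = TRUE, lty = 3)     # skewed, mode in (0, 1)
  curve(dbeta(x, 30, 30), add = TRUE, lty = 4)   # nearly normal for large alpha, beta
  legend("top", legend = c("Beta(0.5,0.5)", "Beta(1,1)", "Beta(2,5)", "Beta(30,30)"), lty = 1:4)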


Shape of Beta(y + 1, n − y + 1)

Consider different values of n and y, but with the same proportion of successes: 0.6.

■ n = 5, y = 3
■ n = 20, y = 12
■ n = 100, y = 60
■ n = 1000, y = 600

Q: What do you observe? Explain.

R code: fig2_1.R (a sketch is given below)
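The course file fig2_1.R is not reproduced here; the following is a sketch of how Figure 1 can be produced (panel layout and labels are my own choices):

  # Sketch of the Figure 1 panels: Beta(y+1, n-y+1) posteriors under a uniform prior
  cases <- list(c(5, 3), c(20, 12), c(100, 60), c(1000, 600))
  op <- par(mfrow = c(2, 2))
  for (ny in cases) {
    n <- ny[1]; y <- ny[2]
    curve(dbeta(x, y + 1, n - y + 1), from = 0, to = 1,
          xlab = expression(theta), ylab = "posterior density",
          main = paste0("n=", n, ", y=", y))
  }
  par(op)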


[Figure 1: Posterior density for the binomial parameter θ, based on a uniform prior distribution and y successes out of n trials. Panels: n = 5, y = 3; n = 20, y = 12; n = 100, y = 60; n = 1000, y = 600.]


Prior Prediction: Review

Before any data is observed, the distribution of the unknown but observable y is

p(y) = ∫ p(y, θ) dθ = ∫ p(θ) p(y|θ) dθ

This is the marginal or prior predictive distribution of y.


Posterior Prediction: Review

After y is observed, we can derive the dist'n of the unknown but potentially observable ỹ using the same process:

p(ỹ|y) = ∫ p(ỹ, θ|y) dθ
       = ∫ p(θ|y) p(ỹ|θ, y) dθ
       = ∫ p(θ|y) p(ỹ|θ) dθ

This is the posterior predictive distribution of ỹ.


Posterior Prediction: The sex ratio example

The natural application here is for ỹ to be the result of one new trial, exchangeable with the first n.

Pr(ỹ = 1|y) = ∫_0^1 Pr(ỹ = 1|θ, y) p(θ|y) dθ
            = ∫_0^1 θ p(θ|y) dθ = E(θ|y) = (y + 1)/(n + 2).

This is Laplace's notorious 'Law of Succession'.
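A quick numerical check of this identity (my own illustration, with arbitrary example values), using R's integrate() with the Beta(y + 1, n − y + 1) posterior:

  # Check Pr(y_tilde = 1 | y) = E(theta | y) = (y+1)/(n+2) for, say, n = 10, y = 6
  n <- 10; y <- 6
  integrate(function(t) t * dbeta(t, y + 1, n - y + 1), lower = 0, upper = 1)$value  # 0.5833...
  (y + 1) / (n + 2)                                                                  # 0.5833...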


Posterior distribution as compromise

In the binomial model with a uniform prior distribution:

Prior mean = 1/2  →  posterior mean = (y + 1)/(n + 2)

A compromise between the prior mean, 1/2, and the sample proportion, y/n.

This is a general feature of Bayesian inference: the posterior distribution is centered at a compromise between the prior information and the data, with the 'compromise' increasingly controlled by the data as the sample size increases (this can be investigated more formally using conditional expectation formulae).


Results for the uniform prior distribution

Back to the sex ratio example: Under the uniform prior dist'n,

E(θ|y) = (y + 1)/(n + 2)

Var(θ|y) = (y + 1)(n − y + 1) / [(n + 2)^2 (n + 3)]

Note (from Appendix A): If X ∼ Beta(α, β), then E(X) = α/(α + β) and Var(X) = αβ/[(α + β)^2 (α + β + 1)].


In 2 hypothetical cases:

            n = 100   n = 1000
            y = 48    y = 480
 E(θ|y)     0.4804    0.4800
 SD(θ|y)    0.049     0.016

Q: What would our conclusions about the sex ratio in each case be?
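These entries follow directly from the Beta(y + 1, n − y + 1) posterior moments; a small R sketch (not from the course files):

  # Posterior mean and SD under the uniform prior: theta | y ~ Beta(y+1, n-y+1)
  post_summary <- function(n, y) {
    a <- y + 1; b <- n - y + 1
    c(mean = a / (a + b), sd = sqrt(a * b / ((a + b)^2 * (a + b + 1))))
  }
  round(post_summary(100, 48), 4)    # mean 0.4804, sd 0.0492
  round(post_summary(1000, 480), 4)  # mean 0.4800, sd 0.0158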


Choice of prior distribution

Bayes is believed to have justified choosing the uniform prior dist'n in the binomial model because the prior predictive dist'n is

p(y) = ∫_0^1 C(n, y) θ^y (1 − θ)^{n−y} dθ = 1/(n + 1),   y = 0, 1, ..., n.

Thus all possible values of y are equally likely a priori.

Not necessarily a compelling argument: a 'true' Bayesian analysis must use a subjectively assessed prior dist'n.
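The identity can be checked numerically (an illustration of my own, with an arbitrary n) by integrating the binomial likelihood against the uniform prior:

  # Prior predictive p(y) under a uniform prior: each value of y has probability 1/(n+1)
  n <- 10
  sapply(0:n, function(y) integrate(function(t) dbinom(y, n, t), 0, 1)$value)  # each approx. 1/11
  1 / (n + 1)                                                                  # 0.0909...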


Third analysis: conjugate prior

Based on the binomial likelihood

p(y|θ) = Bin(y|n, θ) = C(n, y) θ^y (1 − θ)^{n−y},

suppose that the prior density has the form

p(θ) ∝ θ^{α−1} (1 − θ)^{β−1},

that is, θ ∼ Beta(α, β) (equivalent to the binomial likelihood with α − 1 prior successes and β − 1 prior failures).

Parameters of the prior distribution are called hyperparameters.

The two hyperparameters of the beta prior distribution can be fixed by specifying two features of the distribution, e.g. its mean and variance.


If α, β are fixed at reasonable choices, we obtain

p(θ|y) ∝ θ^y (1 − θ)^{n−y} θ^{α−1} (1 − θ)^{β−1}
       = θ^{y+α−1} (1 − θ)^{n−y+β−1}
       = Beta(θ|α + y, β + n − y).

The property that the posterior distribution follows the same parametric form as the prior distribution is called conjugacy; the beta prior distribution is a conjugate family for the binomial likelihood.

Remark: The maths is convenient, but not necessarily a good model!
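A minimal sketch of the conjugate update in R (the function name and example numbers are my own; the data are those of the placenta praevia example below):

  # Beta-binomial conjugate update: Beta(alpha, beta) prior plus y successes in n trials
  beta_update <- function(alpha, beta, n, y) {
    c(alpha = alpha + y, beta = beta + n - y)
  }
  beta_update(alpha = 1, beta = 1, n = 980, y = 437)  # uniform prior -> Beta(438, 544)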


Estimating properties of the posterior dist'n

Beta distribution: exact summaries (mean, SD, etc.) can be obtained, but how do we determine quantiles and hence posterior probability intervals?

We can do one of the following:

■ Use numerical integration (incomplete beta integral).
■ Approximate the beta integral (normal distribution?).
■ Resort to simulation: obtain a random sample from the dist'n and obtain all desired summaries by numerically summarizing this sample.


Using simulation to estimate posterior dist'n

The last strategy is the most general (and requires the least analytical effort: computers replace algebra!).

We can simulate from the beta distribution using either R or BUGS.

A further advantage of simulation: the distribution of functions of θ can be obtained with little further effort, e.g. the sex ratio φ = (1 − θ)/θ.


Female births and placenta praevia

As a specific example of a factor that may influence the sex ratio, we consider the maternal condition placenta praevia, an unusual condition of pregnancy in which the placenta is implanted very low in the uterus, obstructing the fetus from a normal vaginal delivery. An early study concerning the sex of placenta praevia births in Germany found that of a total of 980 births, 437 were female. How much evidence does this provide for the claim that the proportion of female births in the population of placenta praevia births is less than 0.485, the proportion of female births in the general population?


Uniform prior → p(θ|y) = Beta(438, 544).

■ Mean = 0.446
■ SD = 0.016
■ Median = 0.446
■ Central 95% posterior interval = (0.415, 0.477).

The following figure is a histogram of 1000 simulated values. These give a sample mean, median and SD almost identical to the exact values, and a 95% posterior interval of (0.415, 0.476).
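These numbers can be reproduced in R, both exactly with the beta quantile function and approximately by simulation (a sketch; the 1000 draws follow the slide, the seed is an arbitrary choice of mine):

  # Exact summaries of the Beta(438, 544) posterior
  a <- 438; b <- 544
  c(mean = a / (a + b), sd = sqrt(a * b / ((a + b)^2 * (a + b + 1))))
  qbeta(c(0.025, 0.5, 0.975), a, b)        # median and central 95% interval

  # Simulation-based summaries, including the sex ratio phi = (1 - theta)/theta
  set.seed(1)
  theta <- rbeta(1000, a, b)
  phi <- (1 - theta) / theta
  quantile(theta, c(0.025, 0.975))
  quantile(phi, c(0.025, 0.5, 0.975))      # approx. (1.10, 1.24, 1.41)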


[Figure 2: Draws from the posterior distribution of (a) the probability of female birth, θ; (b) the logit transform, logit(θ) = log(θ/(1 − θ)); (c) the male-to-female sex ratio, φ = (1 − θ)/θ.]


Figure (c) shows the histogram of the simulated sex ratio. The 95% posterior interval for this is (1.10, 1.41); the median is 1.24.

Note: the intervals are well away from θ = 0.485 (ratio = 1.06), implying that the probability of female birth in placenta praevia births is lower than in the general population.

See Table 2.1 in Gelman et al. for a sensitivity analysis to varying the prior distribution.


Analysis for the normal distribution (Section 2.6)

■ Unknown mean, known variance
   One data point; Conjugate prior; Posterior density; Precisions; Interpreting µ1; More on µ1; Posterior prediction; Multiple observations
■ Known mean, unknown variance
   Normal likelihood; Conjugate prior; Posterior distribution


Normal dist'n: Unknown mean, known variance

The normal model underlies much statistical modelling. Why?

■ The CLT justifies the normal approximation;
■ it is a useful component of more complicated models (Student-t or mixture distributions).

Model development:
1. Just one data point.
2. The general case of a 'sample' of data with many data points.


Likelihood of one data point

Consider a single observation y from a normal distribution N(θ, σ^2) with mean θ and variance σ^2, with σ^2 known.

The sampling distribution is:

p(y|θ) = (1/(√(2π) σ)) exp(−(y − θ)^2 / (2σ^2)).   (2.1)


Conjugate prior

This likelihood is the exponential of a quadratic form in θ, so the conjugate prior density must have the same form.

Parameterize this family of conjugate densities as

p(θ) ∝ exp(−(θ − µ0)^2 / (2τ0^2));   (2.2)

i.e., θ ∼ N(µ0, τ0^2), with hyperparameters µ0 and τ0^2. For now we assume that the hyperparameters are known.


Posterior density

From the conjugate form of the prior density, the posterior distribution for θ is also normal:

p(θ|y) ∝ exp(−(1/2)[(y − θ)^2/σ^2 + (θ − µ0)^2/τ0^2]).   (2.3)

Some algebra is required, however, to reveal its form.

Remark: Recall that in the posterior density everything except θ is regarded as constant.


Parameters of the posterior density

Algebraic rearrangement gives

p(θ|y) ∝ exp(−(θ − µ1)^2 / (2τ1^2)),   (2.4)

that is, θ|y ∼ N(µ1, τ1^2), where

µ1 = (µ0/τ0^2 + y/σ^2) / (1/τ0^2 + 1/σ^2)   (2.5)

and

1/τ1^2 = 1/τ0^2 + 1/σ^2.   (2.6)
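A small R sketch of equations (2.5) and (2.6) (the function and example values are my own, not from the slides):

  # Normal posterior for one observation: prior N(mu0, tau0^2), likelihood N(theta, sigma^2)
  normal_update <- function(y, sigma2, mu0, tau02) {
    prec1 <- 1 / tau02 + 1 / sigma2                # posterior precision, eq. (2.6)
    mu1   <- (mu0 / tau02 + y / sigma2) / prec1    # posterior mean, eq. (2.5)
    c(mu1 = mu1, tau12 = 1 / prec1)
  }
  normal_update(y = 3, sigma2 = 4, mu0 = 0, tau02 = 1)  # mu1 = 0.6, tau1^2 = 0.8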


Precisions of prior and posterior distributions

In manipulating normal distributions, the inverse of the variance plays a prominent role and is called the precision. For normal data and a normal prior distribution, each with known precision, we have

1/τ1^2 = 1/τ0^2 + 1/σ^2,

i.e. posterior precision = prior precision + data precision.


Interpreting the posterior mean µ1

There are several ways of interpreting the form of the posterior mean, µ1. In equation (2.5),

µ1 = ( µ0/τ0² + y/σ² ) / ( 1/τ0² + 1/σ² ),

posterior mean = weighted average of the prior mean and the observed value, y, with weights proportional to the precisions.


More interpretation of the posterior mean, µ1

Alternatively, µ1 = prior mean adjusted toward the observed y:

µ1 = µ0 + (y − µ0) τ0²/(σ² + τ0²),

or µ1 = the data ’shrunk’ toward the prior mean:

µ1 = y − (y − µ0) σ²/(σ² + τ0²).

Each formulation represents the posterior mean as a compromise between the prior mean and the observed value.
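A quick numerical check (same hypothetical values as in the earlier sketch) that the weighted-average, prior-adjustment, and shrinkage forms of µ1 coincide:

    # Three equivalent expressions for the posterior mean; illustrative numbers only.
    mu0, tau0_sq, sigma_sq, y = 0.0, 4.0, 1.0, 2.5

    mu1_weighted = (mu0/tau0_sq + y/sigma_sq) / (1/tau0_sq + 1/sigma_sq)
    mu1_adjusted = mu0 + (y - mu0) * tau0_sq / (sigma_sq + tau0_sq)   # prior mean pulled toward y
    mu1_shrunk   = y   - (y - mu0) * sigma_sq / (sigma_sq + tau0_sq)  # data shrunk toward prior mean

    print(mu1_weighted, mu1_adjusted, mu1_shrunk)                     # all equal 2.0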


Posterior predictive distribution 1

The posterior predictive distribution of a future observation, ỹ, can be calculated directly by integration:

p(ỹ|y) = ∫ p(ỹ|θ) p(θ|y) dθ
       ∝ ∫ exp( −(1/(2σ²)) (ỹ − θ)² ) exp( −(1/(2τ1²)) (θ − µ1)² ) dθ.

Avoid algebra in simplifying this by using properties of the bivariate normal distribution.


Posterior predictive distribution 2

ỹ and θ have a joint normal posterior distribution.

Why?
Because the product in the integrand is the exponential of a quadratic function of (ỹ, θ); thus, the marginal posterior distribution of ỹ is normal.

We can now determine the mean and variance of p(ỹ|y) based on

■ E(ỹ|θ) = θ and Var(ỹ|θ) = σ²
■ E(θ) = E(E(θ|y)) and Var(θ) = E(Var(θ|y)) + Var(E(θ|y)).


The mean and variance

E(ỹ|y) = E(E(ỹ|θ, y)|y) = E(θ|y) = µ1,

and

Var(ỹ|y) = E(Var(ỹ|θ, y)|y) + Var(E(ỹ|θ, y)|y)
         = E(σ²|y) + Var(θ|y)
         = σ² + τ1².

This shows that the posterior predictive distribution of the unobserved ỹ has mean equal to the posterior mean of θ and two components of variance:

■ the predictive variance σ² from the sampling model
■ the variance τ1² due to posterior uncertainty in θ.
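A simulation sketch (illustrative posterior values, using numpy) that draws θ from its posterior and then ỹ given θ; the sample mean and variance of the ỹ draws should approach µ1 and σ² + τ1²:

    import numpy as np

    # Illustrative posterior parameters from the earlier sketch: mu1 = 2, tau1^2 = 0.8, sigma^2 = 1.
    rng = np.random.default_rng(0)
    mu1, tau1_sq, sigma_sq = 2.0, 0.8, 1.0

    theta  = rng.normal(mu1, np.sqrt(tau1_sq), size=100_000)   # draw theta | y
    y_pred = rng.normal(theta, np.sqrt(sigma_sq))              # draw y_tilde | theta for each draw

    print(y_pred.mean(), y_pred.var())   # roughly 2.0 and 1.8 = sigma^2 + tau1^2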


Normal model with multiple observations

The normal model with a single observation can easily be extended to the more realistic situation where we have a sample of independent and identically distributed observations y = (y1, · · · , yn).

We can proceed formally, as in the single-observation case:


Normal model with multiple observations

p(θ|y) ∝ p(θ)p(y|θ) = p(θ) ∏_{i=1}^n p(yi|θ)
       ∝ exp( −(1/(2τ0²)) (θ − µ0)² ) ∏_{i=1}^n exp( −(1/(2σ²)) (yi − θ)² )
       = exp( −(1/2) [ (θ − µ0)²/τ0² + Σ_{i=1}^n (yi − θ)²/σ² ] ).

The posterior distribution depends on y only through the sample mean, ȳ = (1/n) Σ_i yi, which is a sufficient statistic in this model.


Normal model via the sample mean

In fact, since ȳ|θ, σ² ∼ N(θ, σ²/n), we can apply the results for the single normal observation:

p(θ|y1, · · · , yn) = p(θ|ȳ) = N(θ|µn, τn²),

where

µn = ( µ0/τ0² + nȳ/σ² ) / ( 1/τ0² + n/σ² )

and

1/τn² = 1/τ0² + n/σ².
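A sketch (simulated data, illustrative prior) checking that updating one observation at a time with (2.5)–(2.6) gives the same posterior as the batch update through the sufficient statistic ȳ:

    import numpy as np

    rng = np.random.default_rng(1)
    mu0, tau0_sq, sigma_sq = 0.0, 4.0, 1.0            # hypothetical prior and known variance
    y = rng.normal(1.5, np.sqrt(sigma_sq), size=20)   # simulated sample

    mu, tau_sq = mu0, tau0_sq
    for yi in y:                                      # sequential one-observation updates
        prec = 1/tau_sq + 1/sigma_sq
        mu, tau_sq = (mu/tau_sq + yi/sigma_sq) / prec, 1/prec

    n = len(y)                                        # batch update via ybar
    prec_n = 1/tau0_sq + n/sigma_sq
    mu_n   = (mu0/tau0_sq + n*y.mean()/sigma_sq) / prec_n

    print(mu, mu_n)          # the two routes agree (up to floating point)
    print(tau_sq, 1/prec_n)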


Limits for large n and large τ0²

The prior precision, 1/τ0², and the data precision, n/σ², play equivalent roles; if n is large, the posterior distribution is largely determined by σ² and the sample value ȳ.

As τ0² → ∞ with n fixed, or as n → ∞ with τ0² fixed, we have:

p(θ|y) ≈ N(θ|ȳ, σ²/n).   (2.7)

A prior distribution with large τ0², and thus low precision, captures prior beliefs diffuse over the range of θ where the likelihood is substantial.


Compare the well-known result of classical statistics:

ȳ|θ, σ² ∼ N(θ, σ²/n)

leads to the use of

ȳ ± 1.96 σ/√n

as a 95% confidence interval for θ.

The Bayesian approach gives the same result for a diffuse prior.
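A sketch (illustrative summary statistics) comparing the classical 95% interval with the central 95% posterior interval as the prior variance τ0² grows; the two coincide in the diffuse-prior limit:

    import numpy as np

    n, ybar, sigma = 25, 10.0, 2.0                          # hypothetical summary statistics
    classical = (ybar - 1.96*sigma/np.sqrt(n), ybar + 1.96*sigma/np.sqrt(n))

    for tau0_sq in (1.0, 100.0, 1e6):                       # increasingly diffuse prior, mu0 = 0
        prec_n = 1/tau0_sq + n/sigma**2
        mu_n   = (0.0/tau0_sq + n*ybar/sigma**2) / prec_n
        tau_n  = np.sqrt(1/prec_n)
        print(tau0_sq, (mu_n - 1.96*tau_n, mu_n + 1.96*tau_n))

    print(classical)   # essentially identical to the last (most diffuse) posterior interval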


Normal dist’n: Known mean, unknown variance

■ not directly useful for applications, but
■ an important building block, especially for the normal distribution with unknown mean and unknown variance.
■ It also introduces estimation of a scale parameter, a role played by σ² for the normal distribution.


Normal likelihood

For p(y|θ, σ²) = N(y|θ, σ²), with θ known and σ² unknown, the likelihood for a vector y of n iid observations is

p(y|σ²) ∝ σ^{−n} exp( −(1/(2σ²)) Σ_{i=1}^n (yi − θ)² )
        = (σ²)^{−n/2} exp( −(n/(2σ²)) v ).

The sufficient statistic is

v = (1/n) Σ_{i=1}^n (yi − θ)².


Conjugate prior for σ²

The conjugate prior density is the inverse-gamma:

p(σ²) ∝ (σ²)^{−(α+1)} e^{−β/σ²},

which has hyperparameters (α, β).

A convenient parameterization is as a scaled inverse-χ² distribution with scale σ0² and ν0 degrees of freedom.

That is, the prior distribution of σ² is the distribution of σ0²ν0/X, where X is a χ²_{ν0} random variable. We use the convenient (but nonstandard) notation σ² ∼ Inv-χ²(ν0, σ0²).
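A sketch of the σ0²ν0/X construction (hypothetical hyperparameters), drawing from the scaled inverse-χ² prior with numpy and checking the implied prior mean:

    import numpy as np

    rng = np.random.default_rng(2)
    nu0, sigma0_sq = 5, 2.0                      # hypothetical degrees of freedom and scale

    x = rng.chisquare(nu0, size=100_000)         # X ~ chi^2_{nu0}
    sigma_sq_draws = nu0 * sigma0_sq / x         # sigma^2 ~ Inv-chi^2(nu0, sigma0^2)

    print(sigma_sq_draws.mean())                 # about nu0*sigma0^2/(nu0 - 2) = 10/3 for nu0 > 2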


Posterior distribution for σ²

The resulting posterior density:

p(σ²|y) ∝ p(σ²)p(y|σ²)
        ∝ (σ0²/σ²)^{ν0/2+1} exp( −ν0σ0²/(2σ²) ) · (σ²)^{−n/2} exp( −nv/(2σ²) )
        ∝ (σ²)^{−((n+ν0)/2+1)} exp( −(ν0σ0² + nv)/(2σ²) ).

Thus ...


σ²|y ∼ Inv-χ²( ν0 + n, (ν0σ0² + nv)/(ν0 + n) ),

that is, a scaled inverse-χ² distribution.

■ Posterior scale = precision-weighted average of prior and data scales.
■ Posterior degrees of freedom = sum of prior and data degrees of freedom.
■ Prior distribution ≈ information equivalent to ν0 observations with average squared deviation σ0².
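A small sketch of the conjugate update for σ² (simulated data, hypothetical hyperparameters), returning the posterior degrees of freedom and scale:

    import numpy as np

    def sigma2_posterior(y, theta, nu0, sigma0_sq):
        # Known mean theta; returns (nu_n, scale_n) of the scaled inverse-chi^2 posterior.
        n = len(y)
        v = np.mean((y - theta)**2)                     # sufficient statistic
        return nu0 + n, (nu0*sigma0_sq + n*v) / (nu0 + n)

    rng = np.random.default_rng(3)
    y = rng.normal(0.0, 1.5, size=50)                   # known mean 0, true variance 2.25
    print(sigma2_posterior(y, 0.0, nu0=5, sigma0_sq=2.0))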


Exercises

1. Ex 2.1 – 2.5
2. Ex 2.8, 2.9, 2.11


The Standard Single-parameter Models
(Section 2.7)

The exponential family
Poisson Model
Exponential Model


The exponential family

The familiar ’exponential family’ of distributions includes

■ binomial,
■ normal,
■ Poisson, and
■ exponential.


Poisson Model

■ Background: counts of events occurring exchangeably in all time intervals (independent in time with a constant rate of occurrence)
■ Distribution for a data point y:

  p(y|θ) = θ^y e^{−θ} / y!,   y = 0, 1, 2, . . .

■ Likelihood for iid observations y = (y1, . . . , yn):

  p(y|θ) = ∏_{i=1}^n (1/yi!) θ^{yi} e^{−θ} ∝ θ^{t(y)} e^{−nθ},

  where t(y) = Σ_{i=1}^n yi is the sufficient statistic.


Prior and posterior

■ Conjugate prior: Gamma(α, β):

  p(θ) ∝ e^{−βθ} θ^{α−1}

■ Meaning: a total count of α − 1 in β prior observations
  (compare with p(y|θ)).
■ Posterior distribution:

  θ|y ∼ Gamma(α + nȳ, β + n)

■ Intuitive explanation: ?
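A sketch of the Poisson–Gamma update (simulated counts, hypothetical prior), comparing the posterior mean (α + nȳ)/(β + n) with the rate used to simulate the data:

    import numpy as np

    rng = np.random.default_rng(4)
    y = rng.poisson(3.0, size=40)                       # simulated counts, true rate 3
    alpha, beta = 2.0, 1.0                              # hypothetical Gamma(alpha, beta) prior

    alpha_n, beta_n = alpha + y.sum(), beta + len(y)    # posterior Gamma(alpha + n*ybar, beta + n)
    print(alpha_n / beta_n)                             # posterior mean, close to 3 with this much data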


Extension

■ yi ∼ Poisson(xi θ)
■ θ: rate; xi: exposure of unit i (in epidemiology!)
■ not exchangeable in the yi’s; exchangeable in the (x, y)i
■ Likelihood:

  p(y|θ) ∝ θ^{Σ_{i=1}^n yi} e^{−(Σ_{i=1}^n xi) θ}

■ Conjugate prior: Gamma(α, β)
■ Posterior distribution:

  θ|y ∼ Gamma( α + Σ_{i=1}^n yi, β + Σ_{i=1}^n xi ).
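The same update with exposures xi (all quantities illustrative): only the two sums Σ yi and Σ xi are needed for the Gamma posterior of the rate.

    import numpy as np

    rng = np.random.default_rng(5)
    x = rng.uniform(0.5, 5.0, size=30)                  # exposures of 30 units (illustrative)
    y = rng.poisson(2.0 * x)                            # counts with true rate 2 per unit of exposure
    alpha, beta = 1.0, 1.0                              # hypothetical Gamma prior on the rate

    alpha_n, beta_n = alpha + y.sum(), beta + x.sum()   # Gamma(alpha + sum y, beta + sum x)
    print(alpha_n / beta_n)                             # posterior mean of the rate, near 2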


Exponential Model

■ Background: waiting times for events occurring exchangeably in all time intervals (independent in time with a constant rate of occurrence)
■ Distribution for a data point y:

  p(y|θ) = θ exp{−yθ},   y > 0

■ Memoryless property:

  Pr(y > t + s | y > s, θ) = Pr(y > t | θ)   ∀ s, t.


■ Conjugate prior: Gamma(α, β)
■ Posterior distribution (single observation):

  θ|y ∼ Gamma(α + 1, β + y).

■ Extension to iid observations y = (y1, . . . , yn):

  θ|y ∼ Gamma( α + n, β + Σ_{i=1}^n yi ).
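A sketch for exponential waiting times (simulated data, hypothetical prior): the posterior Gamma(α + n, β + Σ yi) concentrates around the rate used to simulate the data.

    import numpy as np

    rng = np.random.default_rng(6)
    y = rng.exponential(scale=1/0.5, size=60)           # waiting times with true rate theta = 0.5
    alpha, beta = 1.0, 1.0                              # hypothetical Gamma(alpha, beta) prior

    alpha_n, beta_n = alpha + len(y), beta + y.sum()    # conjugate update
    print(alpha_n / beta_n)                             # posterior mean, close to 0.5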


Self-reading

■ Example: informative prior distribution and multilevel structure for estimating cancer rates
■ Noninformative prior distributions
