Tang Yin-cai
yctang@stat.ecnu.edu.cn
Estimating a probability from binomial data (Section 2.1–2.5)

Outline: Problem; The binomial model; Example; Discrete prior; Uniform prior; Beta distribution; Posterior Beta; A compromise; Choice of prior; Conjugate prior; Estimating the posterior distribution; Using simulation; Example.

Problem: Estimate an unknown population proportion θ from the results of a sequence of "Bernoulli trials"; that is, data y1, …, yn, each equal to either 0 or 1.
This problem provides a relatively simple but important starting point for the discussion of Bayesian inference.
The binomial distribution provides a natural model for a sequence of n exchangeable trials, each giving rise to one of two possible outcomes, conventionally labeled "success" and "failure."

Ans: exchangeability.
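As a concrete sketch of this model, the binomial likelihood p(y|θ) = C(n, y) θ^y (1 − θ)^(n−y) can be evaluated directly with the standard library; the trial counts below are hypothetical illustration values, not data from the text:

```python
import math

def binom_pmf(y, n, theta):
    """Binomial probability of y successes in n exchangeable trials."""
    return math.comb(n, y) * theta**y * (1 - theta)**(n - y)

# Hypothetical data: n = 20 trials, y = 12 successes.
n, y = 20, 12

# Viewed as a function of theta, the likelihood peaks at the
# sample proportion y/n = 0.6.
grid = [i / 100 for i in range(1, 100)]
like = [binom_pmf(y, n, t) for t in grid]
mle = grid[like.index(max(like))]
print(mle)  # 0.6
```

The grid search is only for illustration; the maximizing value y/n can of course be read off analytically.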
[Figure: posterior densities of θ for (n = 5, y = 3), (n = 20, y = 12), (n = 100, y = 60), and (n = 1000, y = 600), each plotted over θ ∈ [0, 1]. As n grows, the posterior concentrates around the sample proportion, y/n = 0.6.]
This is a general feature of Bayesian inference: the posterior distribution is centered at a compromise between the prior information and the data, and the compromise is increasingly controlled by the data as the sample size increases. (This can be investigated more formally using conditional expectation formulae.)
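The compromise can be made concrete with the conjugate Beta prior: if θ ∼ Beta(α, β) and y successes are observed in n trials, the posterior is Beta(α + y, β + n − y), whose mean is a weighted average of the prior mean α/(α + β) and the sample proportion y/n. A minimal sketch (the prior parameters below are hypothetical):

```python
def posterior_mean(alpha, beta, y, n):
    """Mean of the Beta(alpha + y, beta + n - y) posterior."""
    return (alpha + y) / (alpha + beta + n)

# Hypothetical prior Beta(4, 6): prior mean 0.4.
alpha, beta = 4.0, 6.0
prior_mean = alpha / (alpha + beta)

# Fixed sample proportion y/n = 0.6 at increasing sample sizes:
for n, y in [(5, 3), (20, 12), (100, 60), (1000, 600)]:
    m = posterior_mean(alpha, beta, y, n)
    # The weight on the data grows with n: n / (alpha + beta + n).
    w = n / (alpha + beta + n)
    assert abs(m - ((1 - w) * prior_mean + w * (y / n))) < 1e-12
    print(n, round(m, 4))  # drifts from 0.4 toward 0.6 as n grows
```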
SCHOOL OF FINANCE AND STATISTICS

Not necessarily a compelling argument: a "true" Bayesian analysis must use a subjectively assessed prior distribution.
Example (placenta previa). As a specific example of a factor that may influence the sex ratio, we consider the maternal condition placenta previa, an unusual condition of pregnancy in which the placenta is implanted very low in the uterus, obstructing the fetus from a normal vaginal delivery. An early study concerning the sex of placenta previa births in Germany found that of a total of 980 births, 437 were female. How much evidence does this provide for the claim that the proportion of female births in the population of placenta previa births is less than 0.485, the proportion of female births in the general population?
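A sketch of a simulation-based answer, assuming a uniform Beta(1, 1) prior (the choice of prior is an assumption here, one of several the slides consider): the posterior is then Beta(438, 544), and Pr(θ < 0.485 | y) can be estimated from posterior draws, which can also be transformed to the logit and ratio scales.

```python
import math
import random

random.seed(1)

# Data: 437 female births out of 980 placenta previa births.
y, n = 437, 980

# Under a uniform Beta(1, 1) prior the posterior is Beta(y + 1, n - y + 1).
a_post, b_post = y + 1, n - y + 1  # Beta(438, 544)

draws = [random.betavariate(a_post, b_post) for _ in range(100_000)]

# Posterior probability that the female-birth proportion is below 0.485.
p_less = sum(t < 0.485 for t in draws) / len(draws)
print(round(p_less, 3))  # strong evidence that theta < 0.485

# Transformed scales: logit(theta) and the ratio (1 - theta)/theta.
logit_draws = [math.log(t / (1 - t)) for t in draws]
ratio_draws = [(1 - t) / t for t in draws]
```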
[Figure: histograms of posterior draws for the placenta previa example on three scales: θ; logit(θ) = log(θ/(1 − θ)); and the ratio φ = (1 − θ)/θ.]
Analysis for the normal distribution (Section 2.6)

Outline: Unknown mean, known variance; One data point; Conjugate prior; Posterior density; Precisions; Interpreting µ1; More on µ1; Posterior prediction; Multiple observations; Known mean, unknown variance; Normal likelihood; Conjugate prior; Posterior distribution.

From the conjugate form of the prior density, the posterior distribution for θ is also normal:

    p(θ|y) ∝ exp( −(1/2) [ (y − θ)²/σ² + (θ − µ0)²/τ0² ] ).   (2.3)

Some algebra is required, however, to reveal its form.

Remark: recall that in the posterior density everything except θ is regarded as constant.
The posterior is N(θ|µ1, τ1²), with

    µ1 = (µ0/τ0² + y/σ²) / (1/τ0² + 1/σ²)   (2.5)

and

    1/τ1² = 1/τ0² + 1/σ².   (2.6)

In words: posterior precision = prior precision + data precision.
There are several ways of interpreting the form of the posterior mean, µ1. In equation (2.5),

    µ1 = (µ0/τ0² + y/σ²) / (1/τ0² + 1/σ²),

the posterior mean is a weighted average of the prior mean and the observed value y, with weights proportional to the precisions.
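The single-observation update translates directly into a few lines of code; a minimal sketch, with hypothetical prior and data values:

```python
def normal_update(mu0, tau0_sq, y, sigma_sq):
    """Posterior N(mu1, tau1_sq) for a normal mean with known variance,
    from a N(mu0, tau0_sq) prior and one observation y ~ N(theta, sigma_sq)."""
    prior_prec = 1.0 / tau0_sq
    data_prec = 1.0 / sigma_sq
    post_prec = prior_prec + data_prec                     # precisions add
    mu1 = (prior_prec * mu0 + data_prec * y) / post_prec   # weighted average
    return mu1, 1.0 / post_prec

# Hypothetical values: prior N(0, 4), observation y = 3 with sigma^2 = 1.
mu1, tau1_sq = normal_update(0.0, 4.0, 3.0, 1.0)
print(mu1, tau1_sq)  # 2.4 0.8
```

The data precision (1) is four times the prior precision (0.25), so the posterior mean sits four-fifths of the way from the prior mean toward y.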
The normal model with a single observation can easily be extended to the more realistic situation where we have a sample of independent and identically distributed observations y = (y1, …, yn). We can proceed formally, using steps similar to those in the single-observation case:
    p(θ|y1, …, yn) ∝ exp( −(1/2) [ (θ − µ0)²/τ0² + (1/σ²) Σᵢ₌₁ⁿ (yᵢ − θ)² ] ).

The posterior distribution depends on y only through the sample mean ȳ.
In fact, since ȳ | θ, σ² ∼ N(θ, σ²/n), we can apply the results for a single normal observation:

    p(θ|y1, …, yn) = p(θ|ȳ) = N(θ|µn, τn²),

where

    µn = (µ0/τ0² + nȳ/σ²) / (1/τ0² + n/σ²)

and

    1/τn² = 1/τ0² + n/σ².
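Since the data enter only through ȳ, the n-observation posterior is just the single-observation update applied to ȳ with variance σ²/n. A sketch with simulated data (the true mean, variances, and sample size below are hypothetical):

```python
import random
import statistics

random.seed(0)

def normal_posterior(mu0, tau0_sq, ybar, sigma_sq, n):
    """Posterior N(mu_n, tau_n_sq) for theta given n iid N(theta, sigma_sq)
    observations with sample mean ybar, under a N(mu0, tau0_sq) prior."""
    post_prec = 1.0 / tau0_sq + n / sigma_sq
    mu_n = (mu0 / tau0_sq + n * ybar / sigma_sq) / post_prec
    return mu_n, 1.0 / post_prec

# Hypothetical setup: true theta = 1.5, sigma^2 = 4, prior N(0, 1), n = 200.
sigma_sq = 4.0
ys = [random.gauss(1.5, sigma_sq**0.5) for _ in range(200)]
ybar = statistics.fmean(ys)

mu_n, tau_n_sq = normal_posterior(0.0, 1.0, ybar, sigma_sq, len(ys))
print(round(mu_n, 3), round(tau_n_sq, 4))  # mu_n near ybar; tau_n_sq ~ 1/51
```

With n = 200 the data precision n/σ² = 50 dwarfs the prior precision 1, so µn shrinks ȳ toward the prior mean only slightly.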
The prior precision, 1/τ0², and the data precision, n/σ², play equivalent roles; if n is large, the posterior distribution is largely determined by σ² and the sample value ȳ.

As τ0² → ∞ with n fixed, or as n → ∞ with τ0² fixed, we have

    p(θ|y) ≈ N(θ|ȳ, σ²/n).   (2.7)

A prior distribution with large τ0², and thus low precision, captures prior beliefs that are diffuse over the range of θ where the likelihood is substantial.
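The limiting result (2.7) can be checked numerically: with a very diffuse prior (huge τ0²), µn and τn² become indistinguishable from ȳ and σ²/n. A small sketch with hypothetical values:

```python
def normal_posterior(mu0, tau0_sq, ybar, sigma_sq, n):
    """Posterior N(mu_n, tau_n_sq) under a N(mu0, tau0_sq) prior."""
    post_prec = 1.0 / tau0_sq + n / sigma_sq
    return (mu0 / tau0_sq + n * ybar / sigma_sq) / post_prec, 1.0 / post_prec

# Hypothetical values: ybar = 2.7, sigma^2 = 1, n = 50.
ybar, sigma_sq, n = 2.7, 1.0, 50

# Diffuse prior: tau0^2 = 1e8 contributes essentially zero precision.
mu_n, tau_n_sq = normal_posterior(0.0, 1e8, ybar, sigma_sq, n)
print(mu_n, tau_n_sq)  # approximately ybar = 2.7 and sigma^2/n = 0.02
```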
The standard single-parameter models (Section 2.7)

Outline: The exponential family; Poisson model; Exponential model.

■ The Poisson model:

    p(y|θ) = θ^y e^(−θ) / y!,   y = 0, 1, 2, …
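The conjugate prior for the Poisson rate is the Gamma distribution, a standard result for this family (not derived on this slide): with θ ∼ Gamma(α, β) and counts y1, …, yn, the posterior is Gamma(α + Σyᵢ, β + n). A minimal sketch with hypothetical prior parameters and counts:

```python
def gamma_poisson_update(alpha, beta, counts):
    """Posterior Gamma(alpha + sum(y), beta + n) for a Poisson rate theta,
    under a Gamma(alpha, beta) prior (beta = rate parameter)."""
    return alpha + sum(counts), beta + len(counts)

# Hypothetical prior Gamma(2, 1) and observed counts:
counts = [3, 0, 2, 4, 1]
a_post, b_post = gamma_poisson_update(2.0, 1.0, counts)

post_mean = a_post / b_post  # Gamma mean alpha/beta
print(a_post, b_post, post_mean)  # 12.0 6.0 2.0
```

The posterior mean (α + Σyᵢ)/(β + n) is again a compromise between the prior mean α/β and the sample mean ȳ.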
■ Intuitive explanation: ?
■ For counts y1, …, yn with exposures x1, …, xn (yᵢ ∼ Poisson(xᵢθ)), the likelihood is

    p(y|θ) ∝ θ^(Σᵢ yᵢ) e^(−(Σᵢ xᵢ)θ).
■ Memoryless property:

    Pr(y > t + s | y > s, θ) = Pr(y > t | θ)   for all s, t.
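The memoryless property of the exponential model can be checked by simulation with the standard library's random.expovariate; the rate and thresholds below are arbitrary choices:

```python
import random

random.seed(0)

theta = 1.0       # exponential rate (arbitrary choice)
s, t = 0.5, 1.0   # arbitrary thresholds
draws = [random.expovariate(theta) for _ in range(200_000)]

# Unconditional tail probability Pr(y > t).
p_t = sum(y > t for y in draws) / len(draws)

# Conditional tail probability Pr(y > t + s | y > s): having survived past s
# does not change the distribution of the remaining waiting time.
survivors = [y for y in draws if y > s]
p_cond = sum(y > t + s for y in survivors) / len(survivors)

print(round(p_t, 3), round(p_cond, 3))  # both near exp(-1) ~ 0.368
```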