Bayesian Statistical Analysis

Chapter 3: Introduction to Multiparameter Models

Tang Yin-cai
yctang@stat.ecnu.edu.cn

SCHOOL OF FINANCE AND STATISTICS

March 30, 2009


Introduction to Multiparameter Models (I)

— Noninformative prior distributions
(Reference: Gelman et al. 2.9)




Noninformative prior distributions

■ Prior distributions may be hard to construct if there is no population basis.

■ Statisticians have long sought prior distributions guaranteed to play a minimal role in determining the posterior distribution.

■ The rationale is that we should "let the data speak for themselves".
  Such an "objective" Bayesian analysis would use a reference or noninformative prior with a density described as vague, flat, or diffuse.




Proper and improper priors

■ Recall: In estimating the mean θ of a normal model with known variance σ², if the prior precision, 1/τ₀², is small relative to the data precision, n/σ², then the posterior distribution is approximately as if τ₀² = ∞:

  p(θ|y) ≈ N(θ|ȳ, σ²/n)

■ Conclusion: the posterior distribution is approximately what would result from assuming

  p(θ) ∝ constant for θ ∈ (−∞, ∞).
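As a numerical check of this limit (a sketch with made-up data values, not from the slides), the conjugate normal posterior with a very diffuse prior collapses to N(ȳ, σ²/n):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2 = 4.0                               # known sampling variance
y = rng.normal(10.0, np.sqrt(sigma2), size=50)
n, ybar = len(y), y.mean()

# Conjugate prior theta ~ N(mu0, tau0^2) with a very diffuse tau0^2.
mu0, tau02 = 0.0, 1e8
post_prec = 1 / tau02 + n / sigma2         # posterior precision
post_mean = (mu0 / tau02 + n * ybar / sigma2) / post_prec
post_var = 1 / post_prec

# Flat-prior limit: p(theta | y) ≈ N(ybar, sigma^2 / n).
print(post_mean, ybar)                     # nearly identical
print(post_var, sigma2 / n)                # nearly identical
```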




■ The integral of the "flat" distribution p(θ) ∝ 1 for θ ∈ (−∞, ∞) is not finite. In this case the distribution is referred to as improper.

A prior density p(θ) is proper if it does not depend on the data and integrates to 1.
(Provided the integral is finite, the density can always be normalized to integrate to 1.)

Despite the impropriety of the prior distribution in the example with the normal sampling model, the posterior distribution is proper, given at least one data point.




Second example (Section 2.7)

■ Normal model with known mean θ but unknown variance, with the conjugate scaled inverse-χ² prior distribution σ² ∼ Inv-χ²(ν₀, σ₀²) (that is, σ₀²ν₀/σ² ∼ χ²_ν₀).

■ If the prior degrees of freedom, ν₀, are small relative to the data degrees of freedom, n, then the posterior distribution (with v = (1/n) Σᵢ₌₁ⁿ (yᵢ − θ)²)

  σ²|y ∼ Inv-χ²(ν₀ + n, (ν₀σ₀² + nv)/(ν₀ + n))

  is approximately as if ν₀ = 0:

  p(σ²|y) ≈ Inv-χ²(n, v).
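The limiting posterior can be sampled directly from its definition (a sketch with simulated data; the data values are made up): σ²|y ∼ Inv-χ²(n, v) means nv/σ² ∼ χ²_n.

```python
import numpy as np

rng = np.random.default_rng(1)
theta = 2.0                                # known mean
y = rng.normal(theta, 3.0, size=200)
n = len(y)
v = np.mean((y - theta) ** 2)              # sufficient statistic

# sigma^2 | y ~ Inv-chi^2(n, v) means n*v / sigma^2 ~ chi^2_n,
# so posterior draws are n*v divided by chi^2_n variates.
draws = n * v / rng.chisquare(df=n, size=100_000)

# Exact posterior mean of sigma^2 is n*v / (n - 2); compare with Monte Carlo.
print(draws.mean(), n * v / (n - 2))
```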




Improper prior can lead to proper posterior

Note:

  p(σ²) ∝ 1/σ² (improper) ⇒ σ²|y ∼ Inv-χ²(n, v).

■ The combination of an improper prior density and a normal likelihood does not define a proper joint probability model, p(y, θ).

■ But applying

  p(θ|y) ∝ p(y|θ)p(θ),

  still gives a proper posterior density, since ∫ p(θ|y)dθ is finite for all y.

This is not always true! Posterior distributions obtained from improper priors must be treated with great care.




Noninformative prior distributions for the binomial parameter

Binomial sampling model y|θ ∼ Bin(n, θ)

Various noninformative prior distributions for θ (see Section 2.9, page 63):

1. Bayes–Laplace uniform prior density θ ∼ Beta(1, 1), that is p(θ) ∝ 1, θ ∈ (0, 1).

2. Jeffreys' rule suggests Beta(1/2, 1/2): p(θ) ∝ θ^(−1/2)(1 − θ)^(−1/2).

3. From the exponential family representation of the binomial distribution:
   p(logit(θ)) ∝ constant.
   That is, p(θ) ∝ θ^(−1)(1 − θ)^(−1): the improper Beta(0, 0).
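Under any Beta(α, β) prior the posterior is Beta(α + y, β + n − y), so the three choices can be compared directly (a sketch; the data values 7 successes in 20 trials are illustrative, not from the slides):

```python
n, y = 20, 7                                # illustrative data

priors = {
    "Bayes-Laplace Beta(1,1)": (1.0, 1.0),
    "Jeffreys Beta(1/2,1/2)":  (0.5, 0.5),
    "logit-flat Beta(0,0)":    (0.0, 0.0),  # improper prior
}

for name, (a, b) in priors.items():
    a_post, b_post = a + y, b + n - y       # conjugate Beta update
    mean = a_post / (a_post + b_post)       # posterior mean of theta
    print(f"{name}: Beta({a_post}, {b_post}), mean = {mean:.4f}")
```

The three posterior means differ only slightly, which previews the point made on the next slide.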




Difference is small

Note that:

1. For the binomial and other single-parameter models, different principles give (slightly) different noninformative prior distributions.

2. The difference between these alternatives is generally small (getting from θ ∼ Beta(0, 0) to θ ∼ Beta(1, 1) requires just one more success and one more failure).

3. If y = 0 or n, the Beta(0, 0) prior leads to an improper posterior!

4. But for two cases—location parameters and scale parameters—all principles seem to agree.




1. θ is a location parameter, that is, p(y − θ|θ) is free of θ and y.

   Noninformative prior:

   p(θ) ∝ constant, θ ∈ (−∞, ∞).

2. θ is a scale parameter, that is, p(y/θ|θ) is free of θ and y.

   Noninformative prior:

   p(θ) ∝ 1/θ, θ ∈ (0, ∞).

   Equivalently, p(log(θ)) ∝ 1, and p(θ²) ∝ 1/θ².
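The equivalence of the three scale-prior statements follows from the change-of-variables formula (a short derivation, not on the slides). With φ = log θ, so θ = e^φ:

```latex
p_{\phi}(\phi) = p_{\theta}(e^{\phi})\left|\frac{d\theta}{d\phi}\right|
  \propto \frac{1}{e^{\phi}}\, e^{\phi} = 1,
\qquad \phi = \log\theta,
```

so log θ is uniform on (−∞, ∞); and with ψ = θ², so θ = √ψ:

```latex
p_{\psi}(\psi) = p_{\theta}(\sqrt{\psi})\left|\frac{d\theta}{d\psi}\right|
  \propto \frac{1}{\sqrt{\psi}}\cdot\frac{1}{2\sqrt{\psi}}
  \propto \frac{1}{\psi},
\qquad \psi = \theta^{2},
```

recovering p(θ²) ∝ 1/θ².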




General principles?

But beware that even these principles can be misleading in some problems:

(Noninformative/improper) prior distributions can lead to improper and thus uninterpretable posterior distributions.

All noninformative prior specifications are arbitrary, and if the results are sensitive to the particular choice, then more effort in specifying genuine prior information is required to justify any particular inference.


Multiparameter models: Introduction

(Reference: Gelman et al. 3.1)




Introduction

■ The reality of applied statistics: there are always several (maybe many) unknown parameters!

■ BUT the interest usually lies in only a few of these (parameters of interest), while the others are regarded as nuisance parameters, for which we have no interest in making inferences but which are required in order to construct a realistic model.

■ At this point the simple conceptual framework of the Bayesian approach reveals its principal advantage over other forms of inference.




The Bayesian approach

The Bayesian approach is clear:

1. Obtain the joint posterior distribution of all unknowns.

2. Integrate over the nuisance parameters to leave the marginal posterior distribution for the parameters of interest.

Alternatively, using simulation:

1. Draw samples from the entire joint posterior distribution (difficult?).

2. Look at the parameters of interest and ignore the rest.




Averaging over nuisance parameters

■ Suppose that θ has two parts: θ = (θ₁, θ₂)
  θ₁ — parameter of interest;
  θ₂ — nuisance parameter.

■ For example:

  y|µ, σ² ∼ N(µ, σ²),

  with both µ (= θ₁) and σ² (= θ₂) unknown.

■ Parameter of interest: µ.




Averaging over nuisance parameters

■ AIM: To obtain the marginal posterior distribution p(θ₁|y).

■ Joint posterior density:

  p(θ₁, θ₂|y) ∝ p(y|θ₁, θ₂)p(θ₁, θ₂),

■ Averaging or integrating over θ₂:

  p(θ₁|y) = ∫ p(θ₁, θ₂|y)dθ₂.




Factoring the joint posterior

Alternatively,

  p(θ₁|y) = ∫ p(θ₁|θ₂, y)p(θ₂|y)dθ₂.    (3.1)

This shows p(θ₁|y) as a mixture of the conditional posterior distributions given the nuisance parameter, θ₂, where p(θ₂|y) is a weighting function for the different possible values of θ₂.




Mixtures of conditionals

■ The weights depend on the posterior density of θ₂—so on a combination of evidence from the data and the prior model.

■ What if θ₂ is known to have a particular value?

The averaging over nuisance parameters can be interpreted very generally: θ₂ can be categorical (discrete) and may take only a few possible values representing, for example, different sub-models.




A strategy for computation

We rarely evaluate integral (3.1) explicitly, but it suggests an important strategy for constructing and computing with multiparameter models, using simulation:

■ Draw θ₂ from its marginal posterior distribution p(θ₂|y).

■ Draw θ₁ from the conditional posterior distribution p(θ₁|θ₂, y), given the drawn value of θ₂.


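For the normal model treated later in this chapter (Gelman et al., Section 3.2), this two-step strategy is exact: σ²|y ∼ Inv-χ²(n−1, s²) and µ|σ², y ∼ N(ȳ, σ²/n). A sketch with simulated data (the data values are made up):

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.normal(5.0, 2.0, size=100)          # simulated data
n, ybar, s2 = len(y), y.mean(), y.var(ddof=1)

m = 50_000
# Step 1: draw sigma^2 from its marginal posterior Inv-chi^2(n-1, s^2),
# i.e. (n-1)*s^2 divided by chi^2_{n-1} variates.
sigma2 = (n - 1) * s2 / rng.chisquare(df=n - 1, size=m)
# Step 2: draw mu from its conditional posterior N(ybar, sigma^2/n).
mu = rng.normal(ybar, np.sqrt(sigma2 / n))

# The resulting draws of mu follow its marginal posterior, centered at ybar.
print(mu.mean(), ybar)
```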




Conditional simulation

Conditional simulation is performed indirectly. Altering step 1 to draw θ₂ from its conditional posterior distribution given θ₁ leads to the Gibbs sampler, often used in Bayesian analysis.

1. Give an initial value of θ₁.

2. Draw θ₂ from p(θ₂|θ₁, y), given the drawn value of θ₁.

3. Draw θ₁ from p(θ₁|θ₂, y), given θ₂.

4. Go back to Step 2, and iterate for a certain number of loops.

The procedure will ultimately generate samples from the marginal posterior distributions of both θ₁ and θ₂—the Gibbs sampler.


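A minimal Gibbs sampler for the normal model sketches these four steps, alternating the full conditionals µ|σ², y ∼ N(ȳ, σ²/n) and σ²|µ, y ∼ Inv-χ²(n, v(µ)) with v(µ) = (1/n)Σ(yᵢ − µ)², which hold under the p(µ, σ²) ∝ 1/σ² prior (the data are simulated; values are made up):

```python
import numpy as np

rng = np.random.default_rng(3)
y = rng.normal(5.0, 2.0, size=100)          # simulated data
n, ybar = len(y), y.mean()

iters, burn = 5_000, 500
mu, sigma2 = 0.0, 1.0                       # step 1: initial values
mu_draws, s2_draws = [], []

for t in range(iters):
    # Step 2: sigma^2 | mu, y ~ Inv-chi^2(n, v(mu)), v(mu) = mean((y - mu)^2)
    v = np.mean((y - mu) ** 2)
    sigma2 = n * v / rng.chisquare(df=n)
    # Step 3: mu | sigma^2, y ~ N(ybar, sigma^2 / n)
    mu = rng.normal(ybar, np.sqrt(sigma2 / n))
    # Step 4: iterate; keep draws after a burn-in period
    if t >= burn:
        mu_draws.append(mu)
        s2_draws.append(sigma2)

print(np.mean(mu_draws), np.mean(s2_draws))
```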


Multiparameter models: Normal mean & variance

(Reference: Gelman et al. 3.2)




Normal data with noninformative prior

■ The data: yᵢ iid ∼ N(µ, σ²).

■ Prior distribution: noninformative (which is easily extended to informative priors).

  We assume prior independence of the location and scale parameters and take p(µ, σ²) to be uniform on (µ, log(σ²)):

  p(µ, σ²) ∝ (σ²)⁻¹.


Normal data with noninformative prior

■ The data: y1 , . . . , yn ∼ N (µ, σ 2 ), iid.

■ Prior distribution: noninformative (which is easily extended to informative priors).

  We assume prior independence of location and scale parameters and take p(µ, σ 2 ) to be uniform on (µ, log(σ 2 )):

      p(µ, σ 2 ) ∝ (σ 2 )−1 .



The joint posterior distribution p(µ, σ 2 |y)

Under the improper prior distribution,

      joint posterior distribution ∝ likelihood × 1/σ 2 :

      p(µ, σ 2 |y) ∝ σ −n−2 exp( −[1/(2σ 2 )] Σ_{i=1}^n (yi − µ)2 )
                   = σ −n−2 exp( −[1/(2σ 2 )] [ Σ_{i=1}^n (yi − ȳ)2 + n(ȳ − µ)2 ] )
                   = σ −n−2 exp( −[1/(2σ 2 )] [ (n − 1)s2 + n(ȳ − µ)2 ] ),     (3.2)

where s2 = [1/(n − 1)] Σ_{i=1}^n (yi − ȳ)2 .

Sufficient statistics: ȳ and s2 .
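The key algebraic step behind (3.2) is the sum-of-squares decomposition Σ(yi − µ)2 = (n − 1)s2 + n(ȳ − µ)2 . A quick numerical check of this identity (the data and the value of µ below are made up, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(5.0, 2.0, size=20)   # illustrative data
mu = 4.2                            # any fixed value of mu
n, ybar = len(y), y.mean()
s2 = y.var(ddof=1)                  # s^2 = [1/(n-1)] * sum((y - ybar)^2)

lhs = np.sum((y - mu) ** 2)                  # sum((y_i - mu)^2)
rhs = (n - 1) * s2 + n * (ybar - mu) ** 2    # (n-1)s^2 + n(ybar - mu)^2
print(lhs, rhs)                              # identical up to rounding
```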



The conditional posterior dist’n p(µ|σ 2 , y)

■ We now factor the joint posterior density as the product of the conditional p(µ|σ 2 , y) and the marginal p(σ 2 |y):

      p(µ, σ 2 |y) = p(µ|σ 2 , y) × p(σ 2 |y)

■ For the conditional, we can use a previous result for the mean of a normal distribution with known variance:

      µ|σ 2 , y ∼ N (ȳ, σ 2 /n).     (3.3)



The marginal posterior dist’n p(σ 2 |y)

This requires averaging the joint distribution (3.2) over µ, that is, evaluating the simple normal integral

      ∫ exp( −[1/(2σ 2 )] n(ȳ − µ)2 ) dµ = √(2πσ 2 /n).

Thus,

      p(σ 2 |y) ∝ (σ 2 )−[(n−1)/2+1] exp( −(n − 1)s2 /(2σ 2 ) ),     (3.4)

which is a scaled inverse-χ2 density:

      σ 2 |y ∼ Inv-χ2 (n − 1, s2 ).     (3.5)
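A draw from the scaled inverse-χ2 in (3.5) is conventionally obtained by drawing X ∼ χ2 with n − 1 degrees of freedom and setting σ 2 = (n − 1)s2 /X. A hedged Python sketch (the values of n and s2 below are illustrative):

```python
import numpy as np

def draw_sigma2(n, s2, size, rng):
    """Draw from Inv-chi^2(n-1, s^2): sigma^2 = (n-1)*s^2 / X with X ~ chi^2_{n-1}."""
    x = rng.chisquare(df=n - 1, size=size)
    return (n - 1) * s2 / x

rng = np.random.default_rng(2)
sigma2 = draw_sigma2(n=25, s2=4.0, size=100_000, rng=rng)
# Sanity check against the exact mean of Inv-chi^2(nu, s^2), nu*s^2/(nu-2) for nu > 2:
print(sigma2.mean(), 24 * 4.0 / 22)
```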



Parallel between Bayes & frequentist results

As with one-parameter normal results, there is a remarkable parallel with sampling theory:

Bayes:
      (n − 1)s2 /σ 2 | y ∼ χ2n−1 .

Frequentist:
      (n − 1)s2 /σ 2 | µ, σ 2 ∼ χ2n−1 .



Marginal posterior distribution of µ

■ µ is typically the estimand of interest, so the ultimate objective of the Bayesian analysis is the marginal posterior distribution of µ.

■ Analytically: this can be obtained by integrating σ 2 out of the joint posterior distribution.

■ Easily done by simulation: first draw σ 2 from (3.5), then draw µ from (3.3).

The posterior distribution of µ can be thought of as a mixture of normal distributions, mixed over the scaled inverse-χ2 distribution for the variance.
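The simulation recipe just described takes only a few lines. The slides use R; the following is a hedged Python sketch (the data vector and the number of draws are illustrative assumptions):

```python
import numpy as np

def sample_mu_sigma2(y, n_draws, seed=0):
    """Simulate from p(mu, sigma^2 | y) under the prior p(mu, sigma^2) ∝ 1/sigma^2:
    first draw sigma^2 ~ Inv-chi^2(n-1, s^2) as in (3.5),
    then draw mu | sigma^2 ~ N(ybar, sigma^2/n) as in (3.3)."""
    rng = np.random.default_rng(seed)
    n, ybar, s2 = len(y), np.mean(y), np.var(y, ddof=1)
    sigma2 = (n - 1) * s2 / rng.chisquare(n - 1, size=n_draws)
    mu = rng.normal(ybar, np.sqrt(sigma2 / n))
    return mu, sigma2

y = np.array([9.8, 10.4, 10.1, 9.5, 10.3, 9.9, 10.0, 10.2])  # made-up data
mu, sigma2 = sample_mu_sigma2(y, n_draws=50_000)
print(mu.mean())  # centers on ybar = 10.025
```

Because each µ is drawn conditionally on its own σ 2 draw, the resulting µ values are exactly the normal mixture described above.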



Analytic form?

We start by integrating the joint posterior density (3.2) over σ 2 :

      p(µ|y) = ∫_0^∞ p(µ, σ 2 |y) dσ 2 .

This can be evaluated using the substitution

      z = A/(2σ 2 ),   where A = (n − 1)s2 + n(µ − ȳ)2 .



Marginal posterior distribution of µ

We recognize (!) that the result is an unnormalized gamma integral:

      p(µ|y) ∝ A−n/2 ∫_0^∞ z (n−2)/2 exp(−z) dz
             ∝ [ (n − 1)s2 + n(µ − ȳ)2 ]−n/2
             ∝ [ 1 + n(µ − ȳ)2 /((n − 1)s2 ) ]−n/2 .

That is the tn−1 (ȳ, s2 /n) density (see Appendix A).



Equivalently, under the noninformative uniform prior distribution on (µ, log(σ)), the posterior distribution of µ is

      (µ − ȳ)/(s/√n) | y ∼ tn−1 ,

where tn−1 is the standard Student-t density (location 0, scale 1) with n − 1 degrees of freedom.

Comparing with sampling theory:

      (µ − ȳ)/(s/√n) | µ, σ 2 ∼ tn−1 ,

■ the sampling distribution does not depend on the nuisance parameter σ 2 ;
■ the posterior distribution of the pivot does not depend on the data.
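The claim that the pivot (µ − ȳ)/(s/√n) has a tn−1 posterior distribution can be checked by simulation: draw (σ 2 , µ) from the joint posterior and compare a quantile of the pivot with the corresponding Student-t quantile. The data below are illustrative; t0.975 with 7 degrees of freedom ≈ 2.365 is a standard table value.

```python
import numpy as np

rng = np.random.default_rng(3)
y = np.array([1.2, 0.8, 1.5, 0.9, 1.1, 1.4, 0.7, 1.0])  # made-up data, n = 8
n, ybar, s2 = len(y), y.mean(), y.var(ddof=1)
s = np.sqrt(s2)

# Draw (sigma^2, mu) from the joint posterior, then form the pivot.
sigma2 = (n - 1) * s2 / rng.chisquare(n - 1, size=200_000)
mu = rng.normal(ybar, np.sqrt(sigma2 / n))
pivot = (mu - ybar) / (s / np.sqrt(n))

# For t with 7 degrees of freedom, the 97.5% quantile is about 2.365.
print(np.quantile(pivot, 0.975))
```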
Exercises

Nov. 14: Ex 3.1, 3.2, 3.5




Introduction to Multiparameter Models (II)

— Example: Bioassay experiment

References:
(1) Gelman et. al. 3.7
(2) Jim Albert, Chapter 4



Multiparameter models

■ Few multiparameter sampling models allow explicit calculation of the posterior distribution.

■ Data analysis for such models is usually achieved with simulation (especially MCMC methods).

■ We will illustrate with a nonconjugate model for a bioassay experiment, using a two-parameter generalized linear model.



Bioassay example

■ In drug development, acute toxicity tests are performed in animals.

■ Various dose levels of the compound are administered to batches of animals.

■ Animals’ responses are typically characterized by a binary outcome: alive or dead, tumor or no tumor, response or no response, etc.



Data Structure

■ Such an experiment gives rise to data of the form

      (xi , ni , yi ),   i = 1, 2, · · · , k,

  where
  ◆ xi is the ith dose level (often measured on a logarithmic scale);
  ◆ ni is the number of animals given the ith dose level;
  ◆ yi is the number of animals with a positive outcome (tumor, death, response).



Example Data

For the example data, 20 animals were tested, 5 at each of 4 dose levels.

      Dose, xi      Number of      Number of
      (log g/ml)    animals, ni    deaths, yi
      -0.86         5              0
      -0.30         5              1
      -0.05         5              3
       0.73         5              5

Source: Racine A, Grieve AP, Fluhler H, Smith AFM. (1986). Bayesian methods in practice: experiences in the pharmaceutical industry (with discussion). Applied Statistics 35, 93-150.



Sampling model at each dose level

Within dosage level i:

■ The animals are assumed to be exchangeable (there is no information to distinguish among them).

■ We model the outcomes as independent given the same probability of death θi , which leads to the familiar binomial sampling model:

      yi |θi ∼ Bin(ni , θi ).



Setting up a model across dose levels

■ Modeling the response at several dosage levels requires a relationship between the θi ’s and the xi ’s.

■ Note: we start by assuming that each θi is an independent parameter. We relax this assumption when we develop hierarchical models (see Chapter 5).

■ There are many possibilities for relating the θi ’s to the xi ’s, but a popular and reasonable choice is a logistic regression model:

      logit(θi ) = log( θi /(1 − θi ) ) = α + βxi .



Prior distribution for (α, β):

■ We assume that p(α, β) is independent and locally uniform in the two parameters, that is,

      p(α, β) ∝ 1.

  This is an improper “noninformative” distribution.

■ We need to check that the posterior distribution is proper (details not shown).

Noninformative/improper prior distributions can lead to improper, and thus uninterpretable, posterior distributions.



Describing the posterior distribution

The form of the posterior distribution:

      p(α, β|y) ∝ p(α, β) p(y|α, β)
                ∝ ∏_{i=1}^k [ e^{α+βxi} /(1 + e^{α+βxi} ) ]^{yi} [ 1/(1 + e^{α+βxi} ) ]^{ni −yi}

Parameter Estimation:

■ One approach would be to use a normal approximation (see Chapter 4) centered at the posterior mode (α̃ = 0.87, β̃ = 7.91).

■ A second approach (similar to the above) is to obtain maximum likelihood estimates (e.g. by running glm in R). Asymptotic standard errors can be obtained via ML theory.
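With the flat prior, the unnormalized log posterior is just the binomial log likelihood evaluated along the logistic curve. A hedged Python sketch for the example data, which also checks that the stated mode (0.87, 7.91) scores higher than some arbitrary alternative points (the comparison points are illustrative):

```python
import numpy as np

x = np.array([-0.86, -0.30, -0.05, 0.73])  # log dose
n = np.array([5, 5, 5, 5])                  # animals per dose
y = np.array([0, 1, 3, 5])                  # deaths per dose

def log_post(alpha, beta):
    """Unnormalized log posterior under p(alpha, beta) ∝ 1:
    sum_i [ y_i*eta_i - n_i*log(1 + e^{eta_i}) ] with eta_i = alpha + beta*x_i,
    which equals sum_i [ y_i*log(theta_i) + (n_i - y_i)*log(1 - theta_i) ]."""
    eta = alpha + beta * x
    return float(np.sum(y * eta - n * np.log1p(np.exp(eta))))

# The stated posterior mode should score higher than arbitrary alternatives.
print(log_post(0.87, 7.91), log_post(0.0, 0.0), log_post(2.0, 2.0))
```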


Figure 1: (a) Contour plot of the posterior density of the parameters α and β. (b) Scatterplot of 1000 draws from the posterior distribution.



Discrete approx. to the posterior density

We illustrate computing the joint posterior distribution for (α, β) at a grid of points in two dimensions:

1. We begin with a rough estimate of the parameters.

   ■ Since
         logit(E[(yi /ni )|α, β]) = α + βxi ,
     we obtain rough estimates of α and β using a linear regression of logit(yi /ni ) on xi .
   ■ Set y1 = 0.5, y4 = 4.5 to enable the calculation.
   ■ α̂ = 0.1, β̂ = 2.9 (standard errors 0.3 and 0.5).



Discrete approx. to the posterior density

2. Evaluate the posterior on a 200 × 200 grid; use Introduction to


Multiparameter
range [−5, 10] × [−10, 40]. Models (II)

— Example:
3. Use R to produce a contour plot (lines of equal Bioassay
experiment
posterior density). References:
P P (1) Gelman
4. Renormalize on grid so that α β p(α, β|y) = 1 et. al. 3.7
(2) Jim
(i.e., create discrete approx to posterior) Albert, Chapter 4
Multiparameter
5. Sample from marginal Pdistribution of one models
Bioassay example
parameter: p(α|y) = β p(α, β|y) Example Data
Sampling Model
6. Sample from conditional distribution of second Model across dose
levels
parameter: p(β|α, y) Approximation
Approximation
7. We can improve sampling slightly by drawing Posterior inference
from linear interpolation between grid points. Results

SCHOOL OF FINANCE AND S TAT I S T I C S

March 30, 2009 Chapter 2 - p. 43/55


Discrete approx. to the posterior density

(Example: Bioassay experiment. References: (1) Gelman et. al. 3.7; (2) Jim Albert, Chapter 4)

2. Evaluate the posterior on a 200 × 200 grid; use range [−5, 10] × [−10, 40].

3. Use R to produce a contour plot (lines of equal posterior density).

4. Renormalize on the grid so that Σα Σβ p(α, β|y) = 1 (i.e., create a discrete approximation to the posterior).

5. Sample from the marginal distribution of one parameter: p(α|y) = Σβ p(α, β|y).

6. Sample from the conditional distribution of the second parameter: p(β|α, y).

7. We can improve sampling slightly by drawing from a linear interpolation between grid points.

Alternative: exact posterior using advanced computation (methods covered later)

SCHOOL OF FINANCE AND STATISTICS

March 30, 2009 Chapter 2 - p. 43/55
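Steps 2–7 above can be sketched in code (Python here, rather than the R mentioned in step 3). The dose–death data below are the standard bioassay values from Gelman et al., and a flat prior p(α, β) ∝ 1 is assumed.

```python
import numpy as np

# Bioassay data (dose x_i, animals n_i, deaths y_i) -- the standard
# values from Gelman et al.
x = np.array([-0.86, -0.30, -0.05, 0.73])
n = np.array([5, 5, 5, 5])
y = np.array([0, 1, 3, 5])

# Step 2: evaluate the log posterior (flat prior) on a 200 x 200 grid
# over [-5, 10] x [-10, 40].
alpha = np.linspace(-5, 10, 200)
beta = np.linspace(-10, 40, 200)
A, B = np.meshgrid(alpha, beta, indexing="ij")  # A[i, j] = alpha_i, B[i, j] = beta_j

log_post = np.zeros_like(A)
for xi, ni, yi in zip(x, n, y):
    t = A + B * xi
    # y*log(theta) + (n-y)*log(1-theta) with theta = logit^{-1}(t),
    # written via logaddexp for numerical stability
    log_post += -yi * np.logaddexp(0, -t) - (ni - yi) * np.logaddexp(0, t)

# Step 4: renormalize so the grid probabilities sum to 1.
post = np.exp(log_post - log_post.max())
post /= post.sum()

# Steps 5-6: sample alpha from its marginal, then beta given alpha.
rng = np.random.default_rng(0)
marg_alpha = post.sum(axis=1)
draws = np.empty((1000, 2))
for s in range(1000):
    i = rng.choice(200, p=marg_alpha)
    j = rng.choice(200, p=post[i] / post[i].sum())
    draws[s] = alpha[i], beta[j]

print(draws.mean(axis=0))  # rough posterior means of (alpha, beta)
```

Step 7 (interpolation between grid points) is omitted here; the discrete draws are usually adequate for a grid this fine.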


Posterior inference

Quantities of interest: (α, β)

LD50 = dose at which Pr(death) is 0.5 = −α/β

Why?

Because E(yi /ni) = θi = logit⁻¹(α + βxi) = 0.5 gives α + βxi = logit(0.5) = 0, thus xi = −α/β.

■ This is meaningless if β ≤ 0 (substance not harmful).
■ We perform inference in two steps:
  ◆ Pr(β > 0|y)
  ◆ posterior distribution of LD50 conditional on β > 0
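The two-step inference can be sketched as follows. The (α, β) draws below are illustrative values generated from a rough normal approximation (in practice they would come from the grid sampler), so the printed numbers are not the book's results.

```python
import numpy as np

rng = np.random.default_rng(1)
# Illustrative (alpha, beta) posterior draws; the mean and covariance
# below are invented to loosely mimic the bioassay posterior, purely
# so the example is self-contained.
draws = rng.multivariate_normal([1.3, 11.0], [[1.2, 2.0], [2.0, 30.0]], size=1000)
a, b = draws[:, 0], draws[:, 1]

# Step 1: posterior probability that the substance is harmful.
p_harmful = (b > 0).mean()

# Step 2: LD50 = -alpha/beta, conditional on beta > 0.
ld50 = -a[b > 0] / b[b > 0]

print(f"Pr(beta > 0 | y) approx {p_harmful:.3f}")
print("LD50 posterior quantiles (2.5%, 50%, 97.5%):",
      np.percentile(ld50, [2.5, 50, 97.5]).round(3))
```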


Results

We take 1000 simulation draws of (α, β) from the grid
(a different posterior sample from the results in the book).

Note that β > 0 for all 1000 draws.

Summary of the posterior distribution:

                posterior quantiles
         2.5%     25%     50%     75%   97.5%
α        -0.6     0.6     1.3     2.0     4.1
β         3.5     7.5    11.0    15.2    26.0
LD50    -0.28   -0.16   -0.11   -0.06    0.12


Lessons from simple examples

(Reference: Gelman et. al. 3.8)


Lessons from simple examples

The lack of multiparameter models with explicit posterior distributions is not necessarily a barrier to analysis.

We can use simulation, perhaps after replacing sophisticated models with hierarchical or conditional models (possibly invoking a normal approximation in some cases).


■ Inference from the posterior distribution
  ◆ one parameter at a time
  ◆ simple graphical methods
  ◆ analytic methods
  ◆ simulation

■ No need to rely on asymptotics for inference

■ The real advantage of the Bayesian approach is to be seen in more complex settings (but computational strategies will need to be extended)...


The five steps of Bayesian inference

1. Write the likelihood p(y|θ).

2. Generate the posterior as p(θ|y) ∝ p(θ)p(y|θ), either by including well-formulated information in p(θ) or else by using p(θ) = constant.

3. Get crude estimates for θ as a starting point or for comparison.

4. Draw simulations θ^(1), · · · , θ^(L) from the posterior distribution and compute the posterior density of any function of θ that may be of interest.

5. Simulate ỹ^(1), · · · , ỹ^(L) by drawing each ỹ^(l) from the predictive distribution p(ỹ|θ^(l)) for any predictive quantities ỹ of interest.
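As a minimal illustration of the five steps, take a Binomial likelihood with a flat prior, so the posterior is Beta in closed form; the data (y = 14 successes in n = 20 trials) are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
n, y = 20, 14  # invented data: y successes in n Bernoulli trials

# Steps 1-2: Binomial likelihood with flat prior p(theta) = constant
# gives the posterior theta | y ~ Beta(y + 1, n - y + 1).
a_post, b_post = y + 1, n - y + 1

# Step 3: crude estimate for comparison (here the MLE y/n).
theta_hat = y / n

# Step 4: draw theta^(1), ..., theta^(L) and summarize any function of
# theta, e.g. the odds theta / (1 - theta).
L = 5000
theta = rng.beta(a_post, b_post, size=L)
odds_interval = np.percentile(theta / (1 - theta), [2.5, 97.5])

# Step 5: posterior predictive draws ytilde^(l) ~ Binomial(n, theta^(l)).
y_rep = rng.binomial(n, theta)

print(theta_hat, theta.mean().round(3), odds_interval.round(2), y_rep.mean().round(2))
```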


APPENDIX: Large sample Bayesian inference

(Reference: Gelman et. al. Chapter 4)


Bayesian and noninformative priors

■ Many simple Bayesian analyses using noninformative priors give results similar to standard non-Bayesian approaches; e.g. the posterior t interval for the normal distribution with unknown mean and variance.

■ The extent to which a "noninformative" prior can be justified as objective depends on a judgement that the "data dominate the prior".

■ As the sample size increases, the amount of information available in the data increases, and the influence of the prior distribution on posterior inference decreases.


Normal approximations to the joint posterior distribution

■ If p(θ|y) is unimodal and roughly symmetric, it is often convenient to approximate it by a normal distribution centred at the mode; that is, we approximate the logarithm of the posterior density by a quadratic function.

■ Let θ̂ denote the posterior mode. Then a Taylor series expansion of log p(θ|y) centered at θ̂ (θ may be a vector, and θ̂ is assumed to be in the interior of the parameter space) gives

    p(θ|y) ≈ N(θ̂, [I(θ̂)]⁻¹),

where I(θ) is the observed information,

    I(θ) = −(d²/dθ²) log p(θ|y).
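A sketch of the approximation in one dimension: take a Beta(15, 7) density as the "posterior" (an arbitrary choice for illustration), find the mode, and estimate I(θ̂) by a finite-difference second derivative of the log density.

```python
import numpy as np

# Example "posterior": a Beta(15, 7) density (arbitrary choice).
a, b = 15, 7
log_post = lambda t: (a - 1) * np.log(t) + (b - 1) * np.log(1 - t)

# Posterior mode: for a Beta(a, b) density this is (a-1)/(a+b-2).
theta_hat = (a - 1) / (a + b - 2)

# Observed information: minus the second derivative of the LOG posterior
# at the mode, here estimated with a central finite difference.
h = 1e-4
info = -(log_post(theta_hat + h) - 2 * log_post(theta_hat)
         + log_post(theta_hat - h)) / h**2

# Normal approximation: p(theta|y) is approximated by N(theta_hat, 1/info).
approx_sd = 1.0 / np.sqrt(info)
print(theta_hat, approx_sd)
```

For the Beta case the information is also available analytically, I(θ̂) = (a−1)/θ̂² + (b−1)/(1−θ̂)², which the finite difference should match closely.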


Note:

■ If θ̂ is in the interior of the parameter space, then I(θ̂) > 0.
■ If θ is a vector, then I(θ) is a matrix.

Example 1: Normal distribution with unknown mean and variance; see Gelman et al., page 102.
Example 2: Bioassay experiment; see Gelman et al., page 104.


Transformations

Transformations: in many cases, convergence to normality of the posterior distribution of θ can be dramatically improved by transformation.

If φ is a continuous transformation of θ, then both p(φ|x) and p(θ|x) converge to normal distributions, but the closeness of the approximation for finite n may be very different.

Most commonly used transformations:
■ logarithmic: transforms (0, ∞) to (−∞, ∞)
■ logistic: transforms (0, 1) to (−∞, ∞)
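A small sketch of why the logistic transformation can help: draws from a skewed Beta posterior (parameters chosen arbitrarily for illustration) are noticeably less skewed after mapping through the logit.

```python
import numpy as np

rng = np.random.default_rng(3)
# A right-skewed posterior on (0, 1): Beta(2, 20), chosen for illustration.
theta = rng.beta(2, 20, size=20000)

# Logistic (logit) transformation maps (0, 1) onto (-inf, inf).
phi = np.log(theta / (1 - theta))

# Sample skewness as a crude measure of non-normality.
skew = lambda z: float(np.mean(((z - z.mean()) / z.std()) ** 3))
print(f"skewness of theta: {skew(theta):.2f}; of logit(theta): {skew(phi):.2f}")
```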


Exercises

Nov. 21:
Ex 3.10, 3.11
