Tang Yin-cai
yctang@stat.ecnu.edu.cn
Hierarchical Models — Setting-up and Examples

Hierarchical models (I) — introduction
(Ref.: Gelman et al., 5.1-5.3; Berger, 4.6; J. Albert, 7)
Outline:
• Review 1
• Review 2
• Empirical Bayes
• Why Hierarchical?
• Hierarchical Model
• Hierarchical approach
• Exchangeability
• Basic exchangeable model
• General exchangeable model
• Typical structure
• Posterior distribution
• Predictive distribution
Strategy of Computation (for all Bayesian analysis!)

Simulation — Draw samples from univariate distributions:
draw θ2, then given θ2 draw θ1.
■ If θ2 is given, then the model degenerates into a one-parameter model.
■ If direct sampling is not possible, then sample from a discretized (grid) approximation of the distribution.
■ For more complex models, advanced computational methods can be used (see Part III, Gelman et al.).
■ ......
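The two-stage strategy (draw θ2, then θ1 given θ2) can be sketched in a few lines; the particular gamma-normal joint used here is a hypothetical illustration, not a model from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical joint model: theta2 ~ Gamma(2, 1) (a precision),
# theta1 | theta2 ~ N(0, 1/theta2).  Drawing theta2 first and then
# theta1 given theta2 yields draws from the joint p(theta1, theta2).
n_draws = 10_000
theta2 = rng.gamma(shape=2.0, scale=1.0, size=n_draws)
theta1 = rng.normal(loc=0.0, scale=1.0 / np.sqrt(theta2))

# Each (theta1[i], theta2[i]) pair is a joint draw.
print(theta1.shape, theta2.mean())
```

The same factorization, p(θ1, θ2) = p(θ2) p(θ1 | θ2), is what the more advanced samplers in Part III of Gelman et al. exploit repeatedly.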
Empirical Bayes

… method as above. This method is called Empirical Bayes.

How?

■ Example. Estimating the risk of tumor in rats—θ:
◆ Current experiment: y = 4 of n = 14 rats developed tumors.
◆ Bayesian model:
  y|θ ∼ Bin(n, θ)
  θ ∼ Beta(α, β)
◆ Posterior: θ|y ∼ Beta(α + 4, β + 10).
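The conjugate update in this example is easy to verify numerically; the uniform Beta(1, 1) prior below is a hypothetical choice, since the slide leaves (α, β) unspecified:

```python
import numpy as np

# Rat-tumor example from the slide: y = 4 of n = 14 rats developed
# tumors, y|theta ~ Bin(n, theta), theta ~ Beta(alpha, beta), so by
# conjugacy theta|y ~ Beta(alpha + 4, beta + 10).
alpha, beta = 1.0, 1.0          # hypothetical prior choice (uniform)
y, n = 4, 14
a_post, b_post = alpha + y, beta + (n - y)

# Closed-form posterior mean, plus a Monte Carlo check by sampling.
post_mean = a_post / (a_post + b_post)
draws = np.random.default_rng(1).beta(a_post, b_post, size=100_000)
print(post_mean, draws.mean())
```

Empirical Bayes would estimate (α, β) from the data of many such experiments before applying this update.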
SCHOOL OF FINANCE AND STATISTICS
Why Hierarchical?

■ Many problems have multiple parameters that are related.
■ Use a joint probability model to reflect this dependence.
■ It is useful to think hierarchically:
Let's generalize our simple bioassay example:
■ Imagine repeated bioassays with the same compound, where (αj, βj) are the parameters from different bioassays.
■ A single (α, β) may be inadequate to fit a combined data set (several experiments) (⇒ pooled estimate).
■ Separate, unrelated (αj, βj) are likely to overfit the data (only 4 points in each data set).
■ Think: Is there a compromise?
■ — Hierarchical Model: a compromise between the single-data-set estimates and the pooled estimate.
Hierarchical approach:
■ We'd be better off estimating the parameters, say φ, governing the population distribution of (αj, βj), rather than each (αj, βj) separately.
■ This introduces new parameters that govern this population distribution, called hyperparameters.

Hierarchical models use many parameters, but imposing a population distribution induces enough structure (dependence) to avoid overfitting.
Exchangeability — Examples (see pages 122-123):
1. The simplest form: i.i.d. given some unknown parameter.
2. Seemingly non-exchangeable random variables may become exchangeable if we condition on all available information (e.g., covariates in regression analysis).
3. Hierarchical models often use exchangeable models for the prior distribution of model parameters.
This mixture of i.i.d.'s is usually all we need to capture exchangeability in practice.

Bruno de Finetti's Theorem: As J → ∞, any suitably well-behaved exchangeable distribution on θ1, · · · , θJ can be written as an i.i.d. mixture.
The general exchangeable model, with covariates:

  p(θ1, . . . , θJ | x1, . . . , xJ) = ∫ [ ∏_{j=1}^{J} p(θj | φ, xj) ] p(φ | x) dφ,

where x = (x1, . . . , xJ) represents the available information.

■ In this way, exchangeable models become almost universally applicable.
Typical structure of a hierarchical model:
1. p(y|θ) = the sampling distribution of the data.
2. p(θ|φ) = the prior distribution for θ = (θ1, . . . , θJ) given φ — called the population distribution.
3. p(φ) = the prior distribution for φ — called the hyperprior distribution.
4. More levels are possible!
5. The hyperprior at the highest level is often diffuse.
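The three-level structure can be sketched generatively; every distributional choice below (normal hyperprior, normal population, normal sampling) is a hypothetical illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Generative sketch of the three-level structure:
#   phi     ~ p(phi)            hyperprior
#   theta_j ~ p(theta | phi)    population distribution, j = 1..J
#   y_j     ~ p(y | theta_j)    sampling distribution
J, n = 8, 20
phi = rng.normal(0.0, 5.0)                        # hyperparameter
theta = rng.normal(phi, 1.0, size=J)              # group-level parameters
y = rng.normal(theta[:, None], 2.0, size=(J, n))  # n observations per group
print(y.shape)
```

Adding a fourth level would simply mean drawing the parameters of p(φ) from yet another prior.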
The joint posterior distribution:

  p(θ, φ|y) ∝ p(θ, φ) p(y|θ, φ)
            ∝ p(θ, φ) p(y|θ)   (y ind. of φ given θ)
            ∝ p(φ) p(θ|φ) p(y|θ).

♣ Inference (and computation) is often carried out in two steps:
1. Inference for θ as if we knew φ, using the conditional posterior distribution p(θ|y, φ);
2. Inference for φ based on the marginal posterior distribution p(φ|y).
3. Treat θ as the vector parameter of interest and φ as nuisance parameter(s), though they are both of interest.

SCHOOL OF FINANCE AND STATISTICS
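The two-step scheme can be sketched for a normal-normal model with known σ, fixed population sd τ, and a flat hyperprior on the population mean φ (all modeling choices and numbers below are hypothetical, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate from p(theta, phi | y) in two steps: draw phi from the
# marginal posterior p(phi|y), then theta from the conditional
# posterior p(theta|y, phi).  Model: y_j|theta_j ~ N(theta_j, sigma^2),
# theta_j|phi ~ N(phi, tau^2), flat prior on phi.
y = np.array([28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0])
sigma, tau, J = 10.0, 5.0, len(y)

n_draws = 5000
# Marginally y_j|phi ~ N(phi, sigma^2 + tau^2), so
# phi|y ~ N(ybar, (sigma^2 + tau^2)/J) under the flat hyperprior.
phi = rng.normal(y.mean(), np.sqrt((sigma**2 + tau**2) / J), size=n_draws)

# theta_j|y, phi ~ N(precision-weighted mean, 1/total precision).
prec = 1.0 / sigma**2 + 1.0 / tau**2
theta_mean = (y[None, :] / sigma**2 + phi[:, None] / tau**2) / prec
theta = rng.normal(theta_mean, np.sqrt(1.0 / prec))
print(theta.shape)  # one row of (theta_1..theta_J) per phi draw
```

Pairing each row of `theta` with the corresponding entry of `phi` gives joint draws from p(θ, φ|y).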
Hierarchical models are characterized both by hyperparameters φ and parameters θ.

Two posterior predictive distributions:
■ the distribution of future observations ỹ corresponding to an existing θj (experiment), based on the posterior draws of θj (and/or φ).
■ the distribution of observations ỹ corresponding to a future (experiment) θj drawn from p(θj|φ):
  ◆ draw θ̃ from p(θj|φ);
  ◆ draw ỹ from p(y|θ̃).
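Both predictive distributions can be sketched from posterior draws; the draws below are synthetic stand-ins for the output of a fitted model, and the values τ = 5 and σ = 10 are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical posterior draws (in practice these come from the
# fitted hierarchical model, e.g. via the two-step simulation above).
n_draws, J, tau, sigma = 4000, 8, 5.0, 10.0
phi_draws = rng.normal(8.0, 4.0, size=n_draws)                        # phi | y
theta_draws = rng.normal(phi_draws[:, None], tau, size=(n_draws, J))  # theta | y

# (1) Future observation from an EXISTING experiment j (here j = 0):
y_tilde_existing = rng.normal(theta_draws[:, 0], sigma)

# (2) Observation from a FUTURE experiment: draw theta~ from
#     p(theta|phi), then y~ from p(y|theta~).
theta_new = rng.normal(phi_draws, tau)
y_tilde_future = rng.normal(theta_new, sigma)
print(y_tilde_existing.shape, y_tilde_future.shape)
```

The second distribution is wider, since it also carries the between-experiment uncertainty in θ.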
(Ref.: Gelman et al., 5.4-5.5)

Outline:
• A special case
• SAT coaching example
• Non-hierarchical approaches
• Model specification
• Joint posterior
• Computation
• Normal-Normal summary
• SAT example results
• HM summary
• Computation overview
• Exercises
■ Sample mean: ȳ.j = (1/nj) Σ_{i=1}^{nj} yij, with sampling variance σj² = σ²/nj.
■ Likelihood for θj in terms of the sufficient statistic ȳ.j:
  ȳ.j | θj ∼ N(θj, σj²).
■ This model ("normal with known variance") is appropriate when nj is large enough.
■ Purpose: estimating θj. How?
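A minimal numerical check of the sufficient-statistic claim, with hypothetical values for σ, θj, and nj:

```python
import numpy as np

rng = np.random.default_rng(5)

# For group j with n_j observations and known sigma,
# ybar_j | theta_j ~ N(theta_j, sigma^2 / n_j).
sigma, theta_j, n_j = 4.0, 2.0, 50
y_ij = rng.normal(theta_j, sigma, size=n_j)   # hypothetical group-j data

ybar_j = y_ij.mean()
sampling_var_j = sigma**2 / n_j  # variance of the MEAN, not of the data
print(ybar_j, sampling_var_j)
```

Note that σj² shrinks with nj: the group mean is a far more precise summary than any single observation.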
Hierarchical models (II) — A N-N HM with SAT coaching example
(Ref.: Gelman et al., 5.4-5.5)

ANOVA table:

Source    df        SS                     MS             E(MS | σ², τ)
Between   J − 1     Σi Σj (ȳ.j − ȳ..)²    SS/(J − 1)     nτ² + σ²
Within    J(n − 1)  Σi Σj (yij − ȳ.j)²    SS/(J(n − 1))  σ²
Total     Jn − 1    Σi Σj (yij − ȳ..)²    SS/(Jn − 1)    σ²

■ Conclusions:
◆ If the ratio of the "between" to the "within" mean square is significantly greater than 1, then θ̂j = ȳ.j;
◆ If the ratio of the mean squares is not statistically significant, then the F test cannot reject H0: τ = 0, and θ̂j = ȳ.. .
■ Method 3: weighted combination:
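The slide's formula for the weighted combination is cut off; a standard candidate is the precision-weighted (normal-normal posterior-mean) form sketched below, where the hyperparameter values µ and τ² are assumptions for illustration:

```python
# Precision-weighted compromise between the separate estimate ybar_j
# and a pooled center mu: the standard normal-normal posterior-mean
# weights (an assumed form -- the slide's own formula is not shown).
def weighted_estimate(ybar_j, sigma2_j, mu, tau2):
    w = (1.0 / sigma2_j) / (1.0 / sigma2_j + 1.0 / tau2)
    return w * ybar_j + (1.0 - w) * mu

# tau2 -> 0 recovers the pooled estimate; tau2 -> inf the separate one.
print(weighted_estimate(10.0, 4.0, 2.0, 1e-9))   # ~ pooled center (2.0)
print(weighted_estimate(10.0, 4.0, 2.0, 1e9))    # ~ separate estimate (10.0)
```

This interpolation between the two extremes of the F-test conclusions is exactly the compromise the hierarchical model formalizes.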
SAT: Scholastic Aptitude Test — a standardized test used by American universities in admissions decisions.
■ Purpose: Analyze the effects of special coaching programs on test scores.
■ All students in the experiments had taken the …
■ Parameters θj: the average "true effects" of the coaching programs in each school.
■ Data yj: separately estimated treatment effects for each school.
■ The standard errors σj are assumed known (large samples).
■ This is a randomized experiment with large samples (over 32 students in each school) and no outliers, so we appeal to the central limit theorem:
  yj | θj ∼ N(θj, σj²).
Non-hierarchical approaches:

■ Separate estimates: estimate each θj by its own yj (no pooling).
■ Pooled estimate: ȳ = [ Σ_{j=1}^J yj /σj² ] / [ Σ_{j=1}^J 1/σj² ] applies to each school.

Separate and pooled estimates are both unreasonable! See further on pages 139–141.
■ Prior model for the θj 's is a normal population distribution (conjugate):

    p(θ1 , . . . , θJ |µ, τ ) = ∏_{j=1}^J N(θj |µ, τ²).
■ Marginally (averaging over θj ): yj |µ, τ ∼ N(µ, σj² + τ²), j = 1, . . . , J.
■ Result: µ|τ, y ∼ N(µ̂, Vµ ) with

    µ̂ = [ Σ_{j=1}^J yj /(σj² + τ²) ] / [ Σ_{j=1}^J 1/(σj² + τ²) ]   and   Vµ = 1 / [ Σ_{j=1}^J 1/(σj² + τ²) ].
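As a sanity check on these formulas, a minimal sketch in Python. It uses the eight-school estimates yj from the results table later in these slides; the standard errors σj = (15, 10, 16, 11, 9, 11, 10, 18) are not listed here and are taken from Gelman et al., Table 5.2:

```python
import numpy as np

# Eight-school SAT coaching data (y_j from these slides; sigma_j from
# Gelman et al., Table 5.2).
y = np.array([28., 8., -3., 7., -1., 1., 18., 12.])
sigma = np.array([15., 10., 16., 11., 9., 11., 10., 18.])

def mu_posterior(tau):
    """Conditional posterior mu | tau, y ~ N(mu_hat, V_mu)."""
    w = 1.0 / (sigma**2 + tau**2)        # precision of each y_j given tau
    mu_hat = np.sum(w * y) / np.sum(w)   # precision-weighted mean
    V_mu = 1.0 / np.sum(w)
    return mu_hat, V_mu

# tau = 0 reproduces the classical pooled estimate (about 7.7 with sd
# about 4.1); very large tau approaches the simple average of the y_j.
print(mu_posterior(0.0))
print(mu_posterior(1e6)[0])
```

At τ = 0 the formula collapses to the pooled estimate of the non-hierarchical approach, which is a useful consistency check.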
[Figure: marginal posterior density p(τ |y), plotted for 0 ≤ τ ≤ 30.]
[Figure: conditional posterior mean E(θj |τ, y) (left panel) and conditional posterior SD sd(θj |τ, y) (right panel) for schools A–H, plotted against τ for 0 ≤ τ ≤ 30.]
Posterior quantiles of the treatment effects θj (and of µ), with the raw estimates yj for comparison:

School   2.5%   25%   50%   75%   97.5%   yj
A         -2      7    10    16     31    28
B         -5      3     8    12     23     8
C        -11      2     7    11     19    -3
D         -7      4     8    11     21     7
E         -9      1     5    10     18    -1
F         -7      2     6    10     28     1
G         -1      7    10    15     26    18
H         -6      3     8    13     33    12
µ         -2      5     8    11     18
Hierarchical models (III) — Meta-analysis of clinical trials
(Ref.: Gelman et al., 5.6, 19.4)
■ We'll reinforce some of the concepts of hierarchical modelling in a meta-analysis of clinical-trials data.

Meta-analysis is a statistical method, widely used in medicine and other fields, for quantitatively combining and analyzing the results of multiple independent studies of the same research question.
■ Data: mortality in two groups of heart attack (myocardial infarction) patients, receiving (or not) beta-blockers, in each of 22 clinical trials (sample sizes from 100 to almost 2000; mortality from 3% to 21%; showing a modest, though not 'statistically significant,' benefit from use of beta-blockers).

■ Aim: use a combined analysis of the studies to measure the strength of evidence for (and magnitude of) any beneficial effect of the treatment under study.

■ Note: any formal analysis must be preceded by the application of rigorous inclusion criteria.
Study j   Control (deaths/total)   Treated (deaths/total)   log-odds yj   sd σj
6         6/52                     4/59                     -0.584        0.676
···       ···                      ···                      ···           ···
21        43/364                   27/391                   -0.591        0.257
22        39/674                   22/680                   -0.608        0.272
■ In trial j there are n0j control subjects and n1j treatment subjects, with y0j and y1j deaths respectively.

■ Sampling model: y0j and y1j have independent binomial sampling distributions with probabilities of death p0j and p1j respectively.
We'll use the natural logarithm of the odds ratio,

    θj = log ρj , where ρj = [ p1j /(1 − p1j ) ] / [ p0j /(1 − p0j ) ],

as a measure of effect size comparing treatment to control groups. The reasons are:

■ Interpretability in a range of study designs (cohort studies, case-control studies and clinical trials).
■ The posterior distribution of θj = log ρj is close to normality even for small sample sizes.
■ It is the canonical (natural) parameter for logistic regression.
Normal approximation: the empirical log-odds for trial j,

    yj = log( y1j /(n1j − y1j ) ) − log( y0j /(n0j − y0j ) ),

has approximate sampling variance

    σj² = 1/y1j + 1/(n1j − y1j ) + 1/y0j + 1/(n0j − y0j ).
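These two formulas can be checked against the data table above. A quick sketch, using only the three trials shown there (the helper function name is my own, not from the slides):

```python
import math

def log_odds_and_sd(y1, n1, y0, n0):
    """Empirical log odds ratio and its approximate sd for one trial."""
    y = math.log(y1 / (n1 - y1)) - math.log(y0 / (n0 - y0))
    var = 1 / y1 + 1 / (n1 - y1) + 1 / y0 + 1 / (n0 - y0)
    return y, math.sqrt(var)

# (treated deaths, treated n, control deaths, control n) for studies 6, 21, 22
trials = {6: (4, 59, 6, 52), 21: (27, 391, 43, 364), 22: (22, 680, 39, 674)}
for j, (y1, n1, y0, n0) in trials.items():
    y, sd = log_odds_and_sd(y1, n1, y0, n0)
    print(j, round(y, 3), round(sd, 3))  # reproduces the y_j, sigma_j columns
```

The rounded values agree with the yj and σj columns of the table.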
■ The assumptions imply that the pooled estimate µ̂ is normal with variance 1/ Σ_{j=1}^J (1/σj²).
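A minimal illustration of the classical pooled (fixed-effect) estimate, assuming only the three studies visible in the data table above rather than the full data set:

```python
import math

# (y_j, sigma_j) for studies 6, 21, 22 from the data table
data = [(-0.584, 0.676), (-0.591, 0.257), (-0.608, 0.272)]

w = [1 / s**2 for _, s in data]  # precision weights 1/sigma_j^2
mu_pooled = sum(wj * yj for (yj, _), wj in zip(data, w)) / sum(w)
se_pooled = math.sqrt(1 / sum(w))  # sd of the pooled estimate
print(round(mu_pooled, 3), round(se_pooled, 3))
```

With all 22 studies the pooled estimate would of course differ; this only demonstrates the precision-weighting mechanics.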
■ (An alternative would be methods of multivariate analysis that use the true binomial sampling distribution.)
■ Note that the posterior mean of θj is a precision-weighted average of the prior population mean µ and the observed yj representing the treatment effect in the jth group.
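This precision weighting can be sketched directly. The function name and the illustrative values of µ are my own; the symbols follow the normal-normal model above:

```python
# Conditional posterior mean of theta_j given mu and tau:
# a precision-weighted average of y_j and mu.
def theta_post_mean(y_j, sigma_j, mu, tau):
    w_data, w_prior = 1 / sigma_j**2, 1 / tau**2
    return (w_data * y_j + w_prior * mu) / (w_data + w_prior)

# With y_j = -0.584, sigma_j = 0.676 (study 6) and a hypothetical mu = -0.25:
# small tau shrinks the estimate toward mu, large tau leaves it near y_j.
print(theta_post_mean(-0.584, 0.676, -0.25, 0.01))   # close to mu
print(theta_post_mean(-0.584, 0.676, -0.25, 100.0))  # close to y_j
```

The two limiting cases correspond to complete pooling (τ² → 0) and separate estimates (τ² → ∞).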
■ The classical pooled result is obtained in the limit τ² → 0, that is, Bj = 1.
■ The hierarchical model is completed by specifying a prior distribution for τ ; we'll use the noninformative prior p(τ ) ∝ 1.

■ Nevertheless, p(τ |y) is a complicated function of τ :

    p(τ |y) ∝ [ ∏_{j=1}^J N(yj |µ̂, σj² + τ²) ] / N(µ̂|µ̂, Vµ )
            ∝ Vµ^{1/2} ∏_{j=1}^J (σj² + τ²)^{−1/2} exp( −(yj − µ̂)² / (2(σj² + τ²)) ),

where µ̂ = µ̂(τ ) and Vµ = Vµ (τ ).
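The computation strategy described earlier (grid approximation for τ, then µ|τ, then the θj) can be sketched end to end. This is an illustrative sketch run on the eight-school SAT data from the previous section, since the full 22-trial data set is not listed in these slides; the σj = (15, 10, 16, 11, 9, 11, 10, 18) are taken from Gelman et al., Table 5.2:

```python
import numpy as np

rng = np.random.default_rng(0)
y = np.array([28., 8., -3., 7., -1., 1., 18., 12.])
sigma = np.array([15., 10., 16., 11., 9., 11., 10., 18.])

def mu_hat_V(tau):
    w = 1.0 / (sigma**2 + tau**2)
    return np.sum(w * y) / np.sum(w), 1.0 / np.sum(w)

def log_p_tau(tau):
    """Unnormalized log p(tau | y) under p(tau) propto 1 (formula above)."""
    mu_hat, V = mu_hat_V(tau)
    s2 = sigma**2 + tau**2
    return (0.5 * np.log(V)
            - 0.5 * np.sum(np.log(s2))
            - 0.5 * np.sum((y - mu_hat)**2 / s2))

# 1. Discretize tau on a grid and normalize.
grid = np.linspace(0.01, 30.0, 1000)
logp = np.array([log_p_tau(t) for t in grid])
p = np.exp(logp - logp.max())
p /= p.sum()

# 2. Draw tau from the grid, then mu | tau, then theta_j | mu, tau.
n = 5000
tau_s = rng.choice(grid, size=n, p=p)
mu_s = np.empty(n)
theta_s = np.empty((n, len(y)))
for i, t in enumerate(tau_s):
    mu_hat, V = mu_hat_V(t)
    mu_s[i] = rng.normal(mu_hat, np.sqrt(V))
    w_d, w_p = 1.0 / sigma**2, 1.0 / t**2
    mean = (w_d * y + w_p * mu_s[i]) / (w_d + w_p)
    theta_s[i] = rng.normal(mean, np.sqrt(1.0 / (w_d + w_p)))

print(np.round(np.percentile(mu_s, [2.5, 50, 97.5])))
```

The simulated quantiles of µ and the θj can be compared with the results table of the SAT example (e.g., the posterior median of µ should be near 8).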
■ For estimates of µ and τ , and the predicted θj , see Table 5.5 on page 149.
Another example (Ex 5.10):

Trial     Treated (deaths/total)   Control (deaths/total)   yj      σj
LIMIT-2   90/1159                  118/1157                 -0.30   0.15

Please analyze the data using a normal-normal hierarchical Bayesian model.