Introduction to Longitudinal Data Analysis

Geert Verbeke
http://perswww.kuleuven.be/geert verbeke

Geert Molenberghs
geert.molenberghs@uhasselt.be
www.censtat.uhasselt.be

Master of Statistics
Contents

Related References
Introduction
Simple Methods
Incomplete Data
Chapter 1
Related References
Aerts, M., Geys, H., Molenberghs, G., and Ryan, L.M. (2002). Topics in
Modelling of Clustered Data. London: Chapman & Hall.
Brown, H. and Prescott, R. (1999). Applied Mixed Models in Medicine. New
York: John Wiley & Sons.
Crowder, M.J. and Hand, D.J. (1990). Analysis of Repeated Measures. London:
Chapman & Hall.
Davidian, M. and Giltinan, D.M. (1995). Nonlinear Models For Repeated
Measurement Data. London: Chapman & Hall.
Littell, R.C., Milliken, G.A., Stroup, W.W., Wolfinger, R.D., and Schabenberger,
O. (2005). SAS for Mixed Models (2nd ed.). Cary: SAS Press.
Longford, N.T. (1993). Random Coefficient Models. Oxford: Oxford University
Press.
McCullagh, P. and Nelder, J.A. (1989). Generalized Linear Models (second
edition). London: Chapman & Hall.
Molenberghs, G. and Verbeke, G. (2005). Models for Discrete Longitudinal Data.
New York: Springer-Verlag.
Pinheiro, J.C. and Bates, D.M. (2000). Mixed-Effects Models in S and S-PLUS.
New York: Springer.
Searle, S.R., Casella, G., and McCulloch, C.E. (1992). Variance Components.
New York: Wiley.
Senn, S.J. (1993). Cross-over Trials in Clinical Research. Chichester: Wiley.
Verbeke, G. and Molenberghs, G. (1997). Linear Mixed Models in Practice: A
SAS-Oriented Approach, Lecture Notes in Statistics 126. New York: Springer.
Verbeke, G. and Molenberghs, G. (2000). Linear Mixed Models for Longitudinal
Data. Springer Series in Statistics. New York: Springer.
Vonesh, E.F. and Chinchilli, V.M. (1997). Linear and Non-linear Models for the
Analysis of Repeated Measurements. Basel: Marcel Dekker.
Weiss, R.E. (2005). Modeling Longitudinal Data. New York: Springer.
West, B.T., Welch, K.B., and Galecki, A.T. (2007). Linear Mixed Models: A
Practical Guide Using Statistical Software. Boca Raton: Chapman & Hall/CRC.
Part I
Continuous Longitudinal Data
Chapter 2
Introduction
2.1
Units:
2.2 Captopril Data

Patient    Before (SBP/DBP)    After (SBP/DBP)
 1         210/130             201/125
 2         169/122             165/121
 3         187/124             166/121
 4         160/104             157/106
 5         167/112             147/101
 6         176/101             145/85
 7         185/121             168/98
 8         206/124             180/105
 9         173/115             147/103
10         146/102             136/98
11         174/98              151/90
12         201/119             168/98
13         198/106             179/110
14         148/107             129/103
15         154/100             131/82
Research question:

Remarks:

. Paired observations: the simplest example of longitudinal data
2.3 Growth Curves
< 155 cm
> 164 cm
16
7 13
14 20
Research question:
Is growth related to the height of the mother?
Individual profiles:
Remarks:
2.4 Growth Data

Individual profiles:

Remarks:
2.5 Rat Data
Treatment starts at the age of 45 days; measurements are taken every 10 days, from day 50 onwards.

The responses are distances (in pixels) between well-defined points on X-ray pictures of the skull of each rat:
Measurements are taken with respect to the roof, base, and height of the skull. Here, we consider only one response, reflecting the height of the skull.

Individual profiles:
# Observations

Control   Low   High   Total
15        18    17     50
13        17    16     46
13        15    15     43
10        15    13     38
 7        12    10     29
 4        10    10     24
 4         8    10     22

Remarks:
2.6 Toenail Data
Research question:
Are both treatments equally effective for the treatment of TDO?

As the response is related to toe size, we restrict to patients with the big toenail as target nail: 150 and 148 subjects.

30 randomly selected profiles, in each group:
# Observations

Time (months)   Treatment A   Treatment B   Total
0               150           148           298
1               149           142           291
2               146           138           284
3               140           131           271
6               131           124           255
9               120           109           229
12              118           108           226

Remarks:
2.7 Mastitis in Dairy Cattle

Hence, is the probability of occurrence of mastitis related to the yield that would have been observed had mastitis not occurred?

This hypothesis cannot be tested directly, since the covariate is missing for all events.
Individual profiles:
Remarks:
2.8 The Baltimore Longitudinal Study of Aging (BLSA)

Reference: Shock, Greulich, Andres, Arenberg, Costa, Lakatta, and Tobin (1984). National Institutes of Health Publication, Washington, DC: National Institutes of Health.

BLSA: ongoing, multidisciplinary observational study, started in 1958, with the study of normal human aging as its primary objective.

Participants:
2.8.1 Prostate Data

References:

Individual profiles:

Remarks:
2.8.2 Hearing Data

References:
. 6170 observations (3089 left ear, 3081 right ear) from 681 males without any otologic disease
. followed for up to 22 years, with a maximum of 15 measurements per subject

30 randomly selected profiles, for each ear:

Remarks:
Chapter 3
Cross-sectional versus Longitudinal Data

. Introduction
. Paired versus unpaired t-test
. Cross-sectional versus longitudinal data
3.1 Introduction
For example, for the growth curves, the correlation matrix of the 5 repeated measurements equals

1.00  0.95  0.96  0.93  0.87
0.95  1.00  0.97  0.96  0.89
0.96  0.97  1.00  0.98  0.94
0.93  0.96  0.98  1.00  0.98
0.87  0.89  0.94  0.98  1.00
3.2 Paired versus Unpaired t-test

3.2.1 Paired t-test
Subject-specific differences: Δi = Yi1 − Yi2,  i = 1, . . . , 15
Statistica output:
44
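The paired t-test can be verified in plain Python using the before/after systolic values of the 15 captopril patients (an added illustration; the original notes use statistical software output instead):

```python
import numpy as np

# Before/after systolic blood pressure (SBP) for the 15 captopril patients.
before = np.array([210, 169, 187, 160, 167, 176, 185, 206, 173,
                   146, 174, 201, 198, 148, 154], dtype=float)
after = np.array([201, 165, 166, 157, 147, 145, 168, 180, 147,
                  136, 151, 168, 179, 129, 131], dtype=float)

d = before - after                                  # subject-specific differences
t = d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))    # paired t statistic, 14 df
print(round(t, 2))                                  # approximately 8.12
```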
3.2.2 Two-sample (unpaired) t-test

Note that 15 × 2 measurements ≠ 30 × 1:
The two-sample t-test does not take into account that the 30 measurements are not independent observations.

This illustrates that classical statistical models which assume independent observations are not valid for the analysis of longitudinal data.
3.3 Cross-sectional versus Longitudinal Data

Suppose it is of interest to study the relation between some response Y and age.

A cross-sectional study yields the following data:
Exactly the same observations could also have been obtained in a longitudinal study, with 2 measurements per subject.

First case:
Conclusion:
Chapter 4
Simple Methods
. Introduction
. Overview of frequently used methods
. Summary statistics
4.1 Introduction
The reason why classical statistical techniques fail in the context of longitudinal data is that observations within subjects are correlated.

In many cases the correlation between two repeated measurements decreases as the time span between them increases.

A correct analysis should account for this.

The paired t-test accounts for this by considering subject-specific differences Δi = Yi1 − Yi2.

This reduces the number of measurements to just one per subject, which implies that classical techniques can be applied again.
In the case of more than 2 measurements per subject, similar simple techniques are often applied to reduce the number of measurements for the ith subject from ni to 1.

Some examples:
4.2 Overview of Frequently Used Methods

4.2.1

Advantages:
. Simple to interpret
. Uses all available data

Disadvantages:
4.2.2
4.2.3 Analysis of Endpoints
Disadvantages:
4.2.4 Analysis of Increments
4.2.5 Analysis of Covariance
Disadvantages:
4.3 Summary Statistics
However, all these methods have the disadvantage that (lots of) information is lost.

Further, they often do not allow one to draw conclusions about the way the endpoint has been reached:
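A minimal sketch of the summary-statistics idea in Python (the data, group sizes, and variances below are hypothetical, chosen for illustration only): each subject's ni measurements are reduced to one number, after which a classical two-sample test applies.

```python
import numpy as np

# Hypothetical two-group data: 12 subjects per group, 5 measurements per subject,
# with a subject-specific random effect inducing within-subject correlation.
rng = np.random.default_rng(1)
group_a = 10.0 + rng.normal(0, 1, size=(12, 1)) + rng.normal(0, 0.5, size=(12, 5))
group_b = 12.0 + rng.normal(0, 1, size=(12, 1)) + rng.normal(0, 0.5, size=(12, 5))

# Reduce each subject's profile to a single summary statistic: the subject mean.
mean_a, mean_b = group_a.mean(axis=1), group_b.mean(axis=1)

# Classical two-sample t statistic on the per-subject summaries (equal group sizes).
pooled_var = (mean_a.var(ddof=1) + mean_b.var(ddof=1)) / 2
t = (mean_b.mean() - mean_a.mean()) / np.sqrt(pooled_var * (2 / 12))
print(round(t, 2))
```

Note that the within-subject correlation is no longer a problem here, precisely because each subject contributes a single independent summary value.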
Chapter 5
The Multivariate Regression Model
5.1
Yi = Xiβ + εi,  with

. Xi : matrix of covariates
. β : vector of regression parameters
. εi : vector of error components, εi ∼ N(0, Σ)

Hence, Yi ∼ N(Xiβ, Σ), and the likelihood equals

L = Πi=1..N (2π)^(-n/2) |Σ|^(-1/2) exp( -1/2 (yi - Xiβ)' Σ^(-1) (yi - Xiβ) )
5.2

5.2.1 Model Parameterization

With xi the sex indicator, the mean at age t equals

E(Yij) = β0,t (1 − xi) + β1,t xi,   t = 8, 10, 12, 14

In matrix notation:

Yi = Xiβ + εi,

with

Xi =
(1−xi)  0       0       0       xi  0   0   0
0       (1−xi)  0       0       0   xi  0   0
0       0       (1−xi)  0       0   0   xi  0
0       0       0       (1−xi)  0   0   0   xi

and with

β = (β0,8 , β0,10 , β0,12 , β0,14 , β1,8 , β1,10 , β1,12 , β1,14)'
5.2.2 SAS Program

SAS syntax:
proc mixed data = growth method = ml;
class idnr sex age;
model measure = age*sex / noint s;
repeated age / type = un subject = idnr;
run;
Data structure: one record per observation:

idnr    age    sex    measure
1       8      1      21.0
1       10     1      20.0
1       12     1      21.5
1       14     1      23.0
2       8      1      21.0
2       10     1      21.5
...
26      12     0      26.0
26      14     0      30.0
27      8      0      22.0
27      10     0      21.5
27      12     0      23.5
27      14     0      25.0
The mean is modeled in the MODEL statement, as in other SAS procedures for linear models.

The covariance matrix is modeled in the REPEATED statement:
. the option type= specifies the covariance structure
5.2.3 Results

Parameter   MLE       (s.e.)
β0,8        22.8750   (0.5598)
β0,10       23.8125   (0.4921)
β0,12       25.7188   (0.6112)
β0,14       27.4688   (0.5371)
β1,8        21.1818   (0.6752)
β1,10       22.2273   (0.5935)
β1,12       23.0909   (0.7372)
β1,14       24.0909   (0.6478)
5.3 Model Reduction

5.3.1

Xi =
1  xi  8(1−xi)   8xi
1  xi  10(1−xi)  10xi
1  xi  12(1−xi)  12xi
1  xi  14(1−xi)  14xi
LR test:

Mean   Covar    #par   −2ℓ       Ref   G²      df   p-value
1      unstr.   18     416.509
2      unstr.                    1     2.968   4    0.5632

Predicted trends:
Xi =
1  xi  8
1  xi  10
1  xi  12
1  xi  14
LR test:

Mean   Covar    #par   −2ℓ       Ref   G²      df   p-value
1      unstr.   18     416.509
2      unstr.                    1     2.968   4    0.5632
3      unstr.                    2     6.676   1    0.0098
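The G² entries above are compared with chi-squared tail probabilities. A quick check (an added illustration, using closed-form tail formulas for 4 and 1 degrees of freedom) reproduces the tabulated p-values:

```python
import math

def chi2_sf_df4(x):
    # P(chi2_4 >= x) = exp(-x/2) * (1 + x/2)
    return math.exp(-x / 2) * (1 + x / 2)

def chi2_sf_df1(x):
    # P(chi2_1 >= x) = erfc(sqrt(x/2))
    return math.erfc(math.sqrt(x / 2))

print(round(chi2_sf_df4(2.968), 4))   # approximately 0.5632, as in the table
print(round(chi2_sf_df1(6.676), 4))   # approximately 0.0098
```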
5.3.2

Some examples:

Structure (SAS type=)                 Example (3 × 3)

Unstructured (type=UN)                σ1²    σ12    σ13
                                      σ12    σ2²    σ23
                                      σ13    σ23    σ3²

Simple (type=SIMPLE)                  σ²     0      0
                                      0      σ²     0
                                      0      0      σ²

Compound symmetry (type=CS)           σ1²+σ²   σ1²      σ1²
                                      σ1²      σ1²+σ²   σ1²
                                      σ1²      σ1²      σ1²+σ²

Banded (type=UN(2))                   σ1²    σ12    0
                                      σ12    σ2²    σ23
                                      0      σ23    σ3²

First-order autoregressive            σ²     σ²ρ    σ²ρ²
(type=AR(1))                          σ²ρ    σ²     σ²ρ
                                      σ²ρ²   σ²ρ    σ²

Toeplitz (type=TOEP)                  σ²     σ1     σ2
                                      σ1     σ²     σ1
                                      σ2     σ1     σ²

Toeplitz (1) (type=TOEP(1))           σ²     0      0
                                      0      σ²     0
                                      0      0      σ²

Heterogeneous compound                σ1²       σ1σ2ρ    σ1σ3ρ
symmetry (type=CSH)                   σ1σ2ρ     σ2²      σ2σ3ρ
                                      σ1σ3ρ     σ2σ3ρ    σ3²

Heterogeneous first-order             σ1²       σ1σ2ρ    σ1σ3ρ²
autoregressive (type=ARH(1))          σ1σ2ρ     σ2²      σ2σ3ρ
                                      σ1σ3ρ²    σ2σ3ρ    σ3²

Heterogeneous Toeplitz                σ1²       σ1σ2ρ1   σ1σ3ρ2
(type=TOEPH)                          σ1σ2ρ1    σ2²      σ2σ3ρ1
                                      σ1σ3ρ2    σ2σ3ρ1   σ3²
This suggests that a possible model reduction could consist of assuming equal variances and banded covariances.

This is the so-called Toeplitz covariance matrix, with elements of the form σjk = σ|j−k| :

σ0  σ1  σ2  σ3
σ1  σ0  σ1  σ2
σ2  σ1  σ0  σ1
σ3  σ2  σ1  σ0

Note that this is only really meaningful when the time points at which measurements are taken are equally spaced, as in the current example.
SAS program :
proc mixed data = growth method = ml;
class sex idnr ageclss;
model measure = sex age*sex / s;
repeated ageclss / type = toep subject = idnr;
run;
LR test:

Mean          Covar    #par   −2ℓ       Ref   G²      df   p-value
1             unstr.   18     416.509
2 ≠ slopes    unstr.   14     419.477   1     2.968   4    0.5632
4 ≠ slopes    banded    8     424.643   2     5.166   6    0.5227
σ² ×
1    ρ    ρ²   ρ³
ρ    1    ρ    ρ²
ρ²   ρ    1    ρ
ρ³   ρ²   ρ    1

Note that this is also only really meaningful when the time points at which measurements are taken are equally spaced.
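For equally spaced occasions, such an AR(1) correlation matrix can be generated directly; a small NumPy helper (an added illustration, with arbitrary values n = 4 and ρ = 0.6):

```python
import numpy as np

def ar1_corr(n, rho):
    # AR(1) correlation for n equally spaced occasions: corr(j, k) = rho**|j-k|.
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

R = ar1_corr(4, 0.6)
print(np.round(R, 3))
```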
SAS program:
proc mixed data = growth method = ml;
class sex idnr ageclss;
model measure = sex age*sex / s;
repeated ageclss / type = AR(1) subject = idnr;
run;
LR test:

Mean          Covar    #par   −2ℓ       Ref   G²       df   p-value
1             unstr.   18     416.509
2 ≠ slopes    unstr.   14     419.477   1     2.968    4    0.5632
4 ≠ slopes    banded    8     424.643   2     5.166    6    0.5227
5 ≠ slopes    AR(1)     6     440.681   2     21.204   8    0.0066
                                        4     16.038   2    0.0003
5.4 Remarks

from which it can be derived which outcomes have been observed, and which ones are missing.

. Multivariate regression models can only be applied under very specific mean and covariance structures, even in case of complete balance.
. For example, Toeplitz and AR(1) covariances are not meaningful when the time points are not equally spaced.
. For example, compound-symmetric covariances are meaningful, but based on very strong assumptions.
Chapter 6
A Model for Longitudinal Data
. Introduction
. The 2-stage model formulation
. Examples: Rat and prostate data
. The general linear mixed-effects model
. Hierarchical versus marginal model
. Examples: Rat and prostate data
. A model for the residual covariance structure
6.1 Introduction
6.2 The 2-Stage Model Formulation

6.2.1 Stage 1

Stage 1 model:

Yi = Ziβi + εi
6.2.2 Stage 2
6.3 Example: The Rat Data

Individual profiles:

Stage 1 model:

Yij = β1i + β2i tij + εij,   j = 1, . . . , ni

Matrix notation:

Yi = Ziβi + εi,  with

Zi =
1   ti1
1   ti2
..  ..
1   ti,ni
In the second stage, the subject-specific intercepts and time effects are related to the treatment of the rats.

Stage 2 model:

β1i = β0 + b1i
β2i = β1 Li + β2 Hi + β3 Ci + b2i

with
Li = 1 if low dose, 0 otherwise
Hi = 1 if high dose, 0 otherwise
Ci = 1 if control, 0 otherwise

Parameter interpretation:
6.4 Example: The Prostate Data

Individual profiles:

Stage 1 model:

Yij = β1i + β2i tij + β3i t²ij + εij,   j = 1, . . . , ni

Matrix notation:

Yi = Ziβi + εi,  with

Zi =
1   ti1     t²i1
1   ti2     t²i2
..  ..      ..
1   ti,ni   t²i,ni

In the second stage, the subject-specific intercepts and time effects are related to the age (at diagnosis) and the disease status.
Stage 2 model:

Ci = 1 if control, 0 otherwise
Bi = 1 if BPH case, 0 otherwise
Li = 1 if L/R cancer case, 0 otherwise
Mi = 1 if metastatic cancer case, 0 otherwise

Parameter interpretation:
6.5 The General Linear Mixed-Effects Model

. summary statistics β̂i analysed in the second stage

The associated drawbacks can be avoided by combining the two stages into one model:

Yi = Ziβi + εi,   βi = Kiβ + bi

⇒  Yi = ZiKiβ + Zibi + εi = Xiβ + Zibi + εi,  with Xi = ZiKi
Yi = Xiβ + Zibi + εi
bi ∼ N(0, D),  εi ∼ N(0, Σi),
b1, . . . , bN , ε1, . . . , εN independent

Terminology:
. Fixed effects: β
. Random effects: bi
. Variance components: elements in D and Σi
6.6 Hierarchical versus Marginal Model

The linear mixed model

Yi = Xiβ + Zibi + εi
bi ∼ N(0, D),  εi ∼ N(0, Σi),
b1, . . . , bN , ε1, . . . , εN independent

can be interpreted hierarchically: Yi | bi ∼ N(Xiβ + Zibi, Σi), with bi ∼ N(0, D).
6.7 Example: The Rat Data

Stage 1 model:  Yij = β1i + β2i tij + εij,   j = 1, . . . , ni

Stage 2 model:  β1i = β0 + b1i,   β2i = β1 Li + β2 Hi + β3 Ci + b2i

Combined, the marginal covariance of two measurements taken at times t1 and t2 equals

Cov(Yi(t1), Yi(t2)) = (1  t1) D (1  t2)' + σ² δ{t1 , t2}

with δ{t1 , t2} equal to 1 if t1 = t2 and 0 otherwise.
6.8 Example: The Prostate Data

Stage 1 model:
Stage 2 model:
Combined:   j = 1, . . . , ni

Cov(Yi(t1), Yi(t2)) = (1  t1  t1²) D (1  t2  t2²)' + σ² δ{t1 , t2}
6.9

Balanced data, two measurements per subject (ni = 2), two models:

Model 1: random intercepts + heterogeneous errors:

V = (1  1)' d (1  1) + diag(σ1², σ2²) =
d + σ1²   d
d         d + σ2²

Model 2: uncorrelated intercepts and slopes + measurement error:

V =
1  0     d1  0     1  1          σ²  0
1  1     0   d2    0  1     +    0   σ²

=
d1 + σ²   d1
d1        d1 + d2 + σ²
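The fact that two different hierarchical models can imply marginal covariances of the same form can be checked numerically; the parameter values below are made up purely for illustration:

```python
import numpy as np

# Model 1: random intercept (variance d) plus heterogeneous errors (sigma1^2, sigma2^2).
d, s1sq, s2sq = 2.0, 1.0, 3.0
V1 = d * np.ones((2, 2)) + np.diag([s1sq, s2sq])

# Model 2: uncorrelated random intercept (d1) and slope (d2) plus measurement error.
d1, d2, ssq = 2.0, 2.0, 1.0
Z = np.array([[1.0, 0.0], [1.0, 1.0]])
V2 = Z @ np.diag([d1, d2]) @ Z.T + ssq * np.eye(2)

print(V1)
print(np.allclose(V1, V2))   # the two hierarchical models imply the same V here
```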
6.10 A Model for the Residual Covariance Structure

Hence, when there is no evidence for (additional) random effects, or if they would have no substantive meaning, the correlation structure in the data can be accounted for in an appropriate model for Σi.
Frequently used model:

Yi = Xiβ + Zibi + ε(1)i + ε(2)i

3 stochastic components:
ε(2)i represents the belief that part of an individual's observed profile is a response to time-varying stochastic processes operating within that individual.

This results in a correlation between serial measurements, which is usually a decreasing function of the time separation between these measurements.

The correlation matrix Hi of ε(2)i is assumed to have (j, k) element of the form hijk = g(|tij − tik|), for some decreasing function g(·) with g(0) = 1.

Frequently used functions g(·):

Graphically, for φ = 1:

Extreme cases:
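The two serial correlation functions most commonly assumed, exponential and Gaussian decay, can be sketched as follows (an added illustration; the decay-rate name phi is an assumption):

```python
import numpy as np

def g_exponential(u, phi=1.0):
    # Exponential serial correlation: g(u) = exp(-phi * u), with g(0) = 1.
    return np.exp(-phi * np.abs(u))

def g_gaussian(u, phi=1.0):
    # Gaussian serial correlation: g(u) = exp(-phi * u^2), with g(0) = 1.
    return np.exp(-phi * np.asarray(u) ** 2)

u = np.array([0.0, 0.5, 1.0, 2.0])
print(np.round(g_exponential(u), 3))
print(np.round(g_gaussian(u), 3))
```

Both satisfy g(0) = 1 and decrease with the time lag, so distant measurements become nearly uncorrelated.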
Chapter 7
Exploratory Data Analysis
. Introduction
. Mean structure
. Variance function
. Correlation structure
. Individual profiles
7.1 Introduction
7.2 Exploring the Mean Structure

For balanced data, averages can be calculated for each occasion separately, and standard errors for the means can be added.

Example: rat data:

. SAS program:
. SAS output:

. Discretize the time scale and use simple averaging within intervals
. SAS output:
If (important) covariates or factors are known, similar plots can be constructed for subgroups with different values for these covariates or factors.

Example for the rat data:
7.3 Exploring the Variance Function
7.4 Exploring the Correlation Structure

7.4.1 Scatterplot Matrix

For balanced longitudinal data, the correlation structure can be studied through the correlation matrix, or a scatterplot matrix.

Graphically, pairwise scatterplots can be used for exploring the correlation between any two repeated measurements.
7.4.2 Semi-variogram
For unbalanced data, the same approach can be used after discretizing the time scale.

An alternative method, in case the variance function suggests constant variance, is the semi-variogram.

Re-consider the general linear mixed model:

Yi = Xiβ + Zibi + ε(1)i + ε(2)i,   bi ∼ N(0, D)
Based on a mean function exploration, residuals rij = yij − μ̂(tij) can be obtained.

The semi-variogram approach assumes constant variance, which implies that the only random effects are intercepts. Write ν² for the random-intercept variance, σ² for the variance of the serial process ε(2)i, and τ² for the measurement-error variance.

The correlation between any two residuals rij and rik from the same subject i is then given by

ρ(|tij − tik|) = ( ν² + σ² g(|tij − tik|) ) / ( ν² + σ² + τ² )

One can show that, for j ≠ k,

(1/2) E(rij − rik)² = τ² + σ² (1 − g(|tij − tik|)) = v(uijk)

The function v(u) is called the semi-variogram, and it only depends on the time points tij through the time lags uijk = |tij − tik|.

Decreasing serial correlation functions g(·) yield increasing semi-variograms v(u), with v(0) = τ², which converge to τ² + σ² as u grows to infinity.
An estimate of v(u) is obtained from plotting half the squared differences (1/2)(rij − rik)² between residuals within subjects versus the corresponding time lags uijk = |tij − tik|.

For two residuals from different subjects i ≠ k,

(1/2) E(rij − rkl)² = ν² + σ² + τ²

Hence, the total variability in the data (assumed constant) can be estimated by

ν̂² + σ̂² + τ̂² = (1/(2N*)) Σi≠k Σj,l (rij − rkl)²,

with N* the number of pairs in the sum.
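The within-subject half of this computation can be sketched in Python (the helper name and the tiny residual data set below are made up for illustration): every within-subject pair contributes half its squared residual difference at the corresponding time lag.

```python
import numpy as np

def semivariogram_cloud(times, residuals, subjects):
    # For each pair of residuals within the same subject, collect the time lag
    # u = |t_ij - t_ik| and the half squared difference 0.5*(r_ij - r_ik)**2.
    lags, halfsq = [], []
    for s in np.unique(subjects):
        idx = np.where(subjects == s)[0]
        for a in range(len(idx)):
            for b in range(a + 1, len(idx)):
                i, j = idx[a], idx[b]
                lags.append(abs(times[i] - times[j]))
                halfsq.append(0.5 * (residuals[i] - residuals[j]) ** 2)
    return np.array(lags), np.array(halfsq)

# Tiny made-up example: two subjects, three residuals each.
t = np.array([0.0, 1.0, 2.0, 0.0, 1.0, 2.0])
r = np.array([0.2, -0.1, 0.4, -0.3, 0.1, 0.0])
subj = np.array([1, 1, 1, 2, 2, 2])
u, v = semivariogram_cloud(t, r, subj)
print(u, np.round(v, 3))
```

Smoothing this cloud of points against the lags gives the empirical semi-variogram v̂(u).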
. SAS output:
7.5 Exploring the Individual Profiles

7.5.1 Introduction
As discussed before, linear mixed models are often obtained from a two-stage model formulation.

This is based on a good approximation of the subject-specific profiles by linear regression models.

This requires methods for the exploration of longitudinal profiles.
7.5.2 Graphical Exploration

. SAS output:
7.5.3

Some ad hoc statistical procedures for checking the linear regression models

Yi = Ziβi + εi

used in the first stage of the model formulation.

Extensions of classical linear regression techniques:
. Coefficient R² of multiple determination
Overall coefficient:

R² = (SSTO − SSE) / SSTO

Subject-specific coefficients:

Ri² = (SSTOi − SSEi) / SSTOi

Meta-coefficient:

R²meta = Σi=1..N (SSTOi − SSEi) / Σi=1..N SSTOi
An approximate F-test compares a reduced stage-1 model (p parameters) with a full model (p + p* parameters), restricted to subjects with ni ≥ p + p*:

F = [ Σ{i: ni ≥ p+p*} (SSEi(R) − SSEi(F)) / Σ{i: ni ≥ p+p*} p* ] / [ Σ{i: ni ≥ p+p*} SSEi(F) / Σ{i: ni ≥ p+p*} (ni − p − p*) ]

The null distribution is F with Σ{i: ni ≥ p+p*} p* and Σ{i: ni ≥ p+p*} (ni − p − p*) degrees of freedom.
Linear model:  R²meta = 0.8188
Full model:    R²meta = 0.9143
Chapter 8
Estimation of the Marginal Model

. Introduction
. Maximum likelihood estimation
. Restricted maximum likelihood estimation
. Fitting linear mixed models in SAS
. Negative variance components

8.1 Introduction

The marginal model:

Yi ∼ N(Xiβ, ZiDZi' + Σi),   i = 1, . . . , N, independent

Note that inferences based on the marginal model do not explicitly assume the presence of random effects representing the natural heterogeneity between subjects.
Notation:

LML(θ) = Πi=1..N (2π)^(-ni/2) |Vi(α)|^(-1/2) exp( -1/2 (Yi - Xiβ)' Vi^(-1)(α) (Yi - Xiβ) )

For known α, the MLE of β equals

β̂(α) = ( Σi=1..N Xi'WiXi )^(-1) Σi=1..N Xi'Wiyi,   with Wi = Vi^(-1)

In most cases, α is not known and needs to be replaced by an estimate α̂.
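For known α, the estimator β̂(α) is just generalized least squares. A self-contained NumPy sketch with simulated data (all parameter values and the compound-symmetric Vi below are hypothetical):

```python
import numpy as np

# beta_hat = (sum_i Xi' Wi Xi)^(-1) sum_i Xi' Wi yi, with Wi = Vi^(-1).
rng = np.random.default_rng(7)
beta_true = np.array([1.0, 2.0])
V = 0.5 * np.ones((4, 4)) + 0.5 * np.eye(4)   # assumed known covariance per subject
W = np.linalg.inv(V)

lhs = np.zeros((2, 2))
rhs = np.zeros(2)
for _ in range(200):                           # 200 subjects, 4 measurements each
    Xi = np.column_stack([np.ones(4), rng.normal(size=4)])
    yi = Xi @ beta_true + rng.multivariate_normal(np.zeros(4), V)
    lhs += Xi.T @ W @ Xi
    rhs += Xi.T @ W @ yi

beta_hat = np.linalg.solve(lhs, rhs)
print(np.round(beta_hat, 2))                   # close to (1, 2)
```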
8.2 Maximum Likelihood Estimation

α̂ML is obtained from maximizing LML(β̂(α), α) with respect to α.

The resulting estimates β̂ML = β̂(α̂ML) and α̂ML can also be obtained from maximizing LML(θ) with respect to θ, i.e., with respect to α and β simultaneously.
8.3 Restricted Maximum Likelihood Estimation

8.3.1

Consider Y1, . . . , YN ∼ N(μ, σ²). When μ is known, the MLE

σ̂² = Σi (Yi − μ)² / N

is unbiased for σ².

When μ is not known, the MLE of σ² equals

σ̂² = Σi (Yi − Ȳ)² / N,   with E(σ̂²) = (N − 1) σ² / N
The unbiased estimate

S² = Σi (Yi − Ȳ)² / (N − 1)

can be obtained from error contrasts, whose distribution does not depend on μ. With

Y = (Y1, . . . , YN)' ∼ N(μ 1N, σ² IN),

consider, for example, the successive differences

U = (Y1 − Y2, Y2 − Y3, . . . , YN−1 − YN)'

The MLE of σ² based on U equals S², the REML estimate of σ².
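The N versus N − 1 divisor is easy to see numerically (the data values below are illustrative):

```python
import numpy as np

# The ML estimate of sigma^2 divides by N and is biased downward;
# the REML estimate divides by N - 1 and is unbiased (a single unknown mean).
y = np.array([3.1, 2.7, 3.4, 2.9, 3.3, 2.6])
N = len(y)
ss = ((y - y.mean()) ** 2).sum()
sigma2_ml = ss / N
sigma2_reml = ss / (N - 1)
print(round(sigma2_ml, 4), round(sigma2_reml, 4))
```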
8.3.2

Y = (Y1, . . . , YN)' ∼ N(Xβ, σ²I)

MLE of σ²:

σ̂² = (Y − Xβ̂)' (Y − Xβ̂) / N,   with E(σ̂²) = (N − p) σ² / N
The MSE can also be obtained from transforming the data orthogonal to X:

U = A'Y ∼ N(0, σ² A'A)

The MLE of σ², based on U, now equals the mean squared error, MSE.

The MSE is therefore called the REML estimate of σ².
8.3.3

Y = (Y1', . . . , YN')',  X = (X1', . . . , XN')',  V(α) = diag(V1, . . . , VN)

The MLE of α, based on error contrasts U = A'Y, is called the REML estimate, and is denoted by α̂REML.

The resulting estimate β̂(α̂REML) for β is denoted by β̂REML.

One can show that

LREML(α) = | Σi=1..N Xi'Wi(α)Xi |^(-1/2) LML(β̂(α), α)

LREML(β, α) is the likelihood of the error contrasts U, and is often called the REML likelihood function.

Note that LREML(α) is NOT the likelihood for our original data Y.
8.4

ln(PSAij + 1)
= β1 Agei + β2 Ci + β3 Bi + β4 Li + β5 Mi
+ (β6 Agei + β7 Ci + β8 Bi + β9 Li + β10 Mi) tij
+ (β11 Agei + β12 Ci + β13 Bi + β14 Li + β15 Mi) t²ij
+ b1i + b2i tij + b3i t²ij + εij

. BPH : group = 2
. local cancer : group = 3
. metastatic cancer : group = 4
MODEL statement:
. response variable
. fixed effects
. options similar to SAS regression procedures

RANDOM statement:
Covariance structures for the REPEATED statement (3 × 3 examples):

Structure (SAS type=)                 Example (3 × 3)

Unstructured (type=UN)                σ1²    σ12    σ13
                                      σ12    σ2²    σ23
                                      σ13    σ23    σ3²

Simple (type=SIMPLE)                  σ²     0      0
                                      0      σ²     0
                                      0      0      σ²

Compound symmetry (type=CS)           σ1²+σ²   σ1²      σ1²
                                      σ1²      σ1²+σ²   σ1²
                                      σ1²      σ1²      σ1²+σ²

Banded (type=UN(2))                   σ1²    σ12    0
                                      σ12    σ2²    σ23
                                      0      σ23    σ3²

First-order autoregressive            σ²     σ²ρ    σ²ρ²
(type=AR(1))                          σ²ρ    σ²     σ²ρ
                                      σ²ρ²   σ²ρ    σ²

Toeplitz (type=TOEP)                  σ²     σ1     σ2
                                      σ1     σ²     σ1
                                      σ2     σ1     σ²

Toeplitz (1) (type=TOEP(1))           σ²     0      0
                                      0      σ²     0
                                      0      0      σ²

Heterogeneous compound                σ1²       σ1σ2ρ    σ1σ3ρ
symmetry (type=CSH)                   σ1σ2ρ     σ2²      σ2σ3ρ
                                      σ1σ3ρ     σ2σ3ρ    σ3²

Heterogeneous first-order             σ1²       σ1σ2ρ    σ1σ3ρ²
autoregressive (type=ARH(1))          σ1σ2ρ     σ2²      σ2σ3ρ
                                      σ1σ3ρ²    σ2σ3ρ    σ3²

Heterogeneous Toeplitz                σ1²       σ1σ2ρ1   σ1σ3ρ2
(type=TOEPH)                          σ1σ2ρ1    σ2²      σ2σ3ρ1
                                      σ1σ3ρ2    σ2σ3ρ1   σ3²

Spatial structures, with djk the distance between tij and tik:

Exponential                           σ² ×
(type=SP(EXP)(list))                  1              exp(−d12/ρ)    exp(−d13/ρ)
                                      exp(−d12/ρ)    1              exp(−d23/ρ)
                                      exp(−d13/ρ)    exp(−d23/ρ)    1

Gaussian                              σ² ×
(type=SP(GAU)(list))                  1                exp(−d12²/ρ²)   exp(−d13²/ρ²)
                                      exp(−d12²/ρ²)    1               exp(−d23²/ρ²)
                                      exp(−d13²/ρ²)    exp(−d23²/ρ²)   1
8.5 Negative Variance Components

This suggests that the REML likelihood could be further increased by allowing negative estimates for d22.

In SAS, this can be done by adding the option nobound to the PROC MIXED statement.
Results:

Effect               Parameter    default           nobound
Intercept            β0
Time effects:
  Low dose           β1           7.503 (0.228)     7.475 (0.198)
  High dose          β2           6.877 (0.231)     6.890 (0.198)
  Control            β3           7.319 (0.285)     7.284 (0.254)
Covariance of bi:
  var(b1i)           d11          3.369 (1.123)     2.921 (1.019)
  var(b2i)           d22          0.000 (  )        −0.287 (0.169)
  cov(b1i, b2i)      d12 = d21    0.090 (0.381)     −0.462 (0.357)
Residual variance:
  var(εij)           σ²           1.445 (0.145)     1.522 (0.165)
REML log-likelihood               −466.173          −465.193

Note that the REML log-likelihood value has been further increased, and a negative estimate for d22 is obtained.

Brown & Prescott (1999, p. 237):
(1  t) D (1  t)' + σ̂²
This again shows that the hierarchical and marginal models are not equivalent:

. The marginal model allows negative variance components, as long as the marginal covariances Vi = ZiDZi' + σ²Ini are positive definite.
. The hierarchical interpretation of the model does not allow negative variance components.
Chapter 9
Inference for the Marginal Model
9.1

Estimate for β:

β̂(α) = ( Σi=1..N Xi'WiXi )^(-1) Σi=1..N Xi'Wiyi,   with Wi = Vi^(-1)

with variance

Var(β̂) = ( Σi Xi'WiXi )^(-1) ( Σi Xi'Wi var(Yi) WiXi ) ( Σi Xi'WiXi )^(-1)
9.1.1 Approximate Wald Test

H0 : Lβ = 0   versus   HA : Lβ ≠ 0

Wald statistic:

G = β̂' L' [ L ( Σi=1..N Xi'Vi^(-1)(α̂)Xi )^(-1) L' ]^(-1) L β̂

which follows, asymptotically under H0, a chi-squared distribution with rank(L) degrees of freedom.
9.1.2 Approximate t-test and F-test

The Wald test is based on

Var(β̂) = ( Σi=1..N Xi'Wi(α)Xi )^(-1)

For H0 : Lβ = 0 versus HA : Lβ ≠ 0, the F test statistic equals

F = β̂' L' [ L ( Σi=1..N Xi'Vi^(-1)(α̂)Xi )^(-1) L' ]^(-1) L β̂ / rank(L)

with estimated denominator degrees of freedom:
. Satterthwaite approximation
. Kenward and Roger approximation
. . . .

In the context of longitudinal data, all methods typically lead to large numbers of degrees of freedom, and therefore also to very similar p-values.

For univariate hypotheses (rank(L) = 1) the F-test reduces to a t-test.
9.1.3

H0 : β4 = β5,  β9 = β10,  β14 = β15,   i.e.  H0 : Lβ = 0,  with

L =
0 0 0 1 −1 0 0 0 0  0 0 0 0 0  0
0 0 0 0  0 0 0 0 1 −1 0 0 0 0  0
0 0 0 0  0 0 0 0 0  0 0 0 0 1 −1
Remarks:

Source   NDF   DDF    ChiSq   F      Pr > ChiSq   Pr > F
         3     24.4   17.57   5.86   0.0005       0.0037
H0 : β6 = β7 = β11 = β12 = β13 = 0 and β14 = β15,   i.e.  H0 : Lβ = 0,  with

L =
0 0 0 0 0 1 0 0 0 0 0 0 0 0  0
0 0 0 0 0 0 1 0 0 0 0 0 0 0  0
0 0 0 0 0 0 0 0 0 0 1 0 0 0  0
0 0 0 0 0 0 0 0 0 0 0 1 0 0  0
0 0 0 0 0 0 0 0 0 0 0 0 1 0  0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 −1

NDF   DDF    ChiSq   F      Pr > ChiSq   Pr > F
6     46.7   3.39    0.56   0.7587       0.7561
9.1.4 Robust Inference

Estimate for β:

β̂(α) = ( Σi=1..N Xi'WiXi )^(-1) Σi=1..N Xi'WiYi,   Wi = Vi^(-1)

β̂ is unbiased:

E(β̂) = ( Σi Xi'WiXi )^(-1) Σi Xi'Wi E(Yi) = ( Σi Xi'WiXi )^(-1) Σi Xi'WiXi β = β

Conditional on α, β̂ has covariance

Var(β̂) = ( Σi Xi'WiXi )^(-1) ( Σi Xi'Wi Var(Yi) WiXi ) ( Σi Xi'WiXi )^(-1)
        = ( Σi Xi'WiXi )^(-1)

Note that this assumes that the covariance matrix Var(Yi) is correctly modelled as Vi = ZiDZi' + Σi.

This covariance estimate is therefore often called the naive estimate.

The so-called robust estimate for Var(β̂), which does not assume the covariance matrix to be correctly specified, is obtained from replacing Var(Yi) by

(Yi − Xiβ̂)(Yi − Xiβ̂)'

rather than Vi:

Var(β̂) = ( Σi Xi'WiXi )^(-1) ( Σi Xi'Wi Var(Yi) WiXi ) ( Σi Xi'WiXi )^(-1)
          [--- BREAD ---]      [------ MEAT ------]      [--- BREAD ---]

Based on this sandwich estimate, robust versions of the Wald test as well as of the approximate t-test and F-test can be obtained.

Note that this suggests that, as long as interest is only in inferences for the mean structure, little effort should be spent in modeling the covariance structure, provided that the data set is sufficiently large.

Extreme point of view: OLS with robust standard errors.

Appropriate covariance modeling may still be of interest:
. for the interpretation of random variation in the data
. for gaining efficiency
. in the presence of missing data, where robust inference is only valid under very severe assumptions about the underlying missingness process (see later)
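The sandwich construction can be sketched in NumPy (simulated data; the working weights below deliberately mis-specify the true within-subject correlation, which is exactly the situation the robust estimate is designed for):

```python
import numpy as np

# bread = (sum Xi' Wi Xi)^(-1); meat = sum Xi' Wi (ri ri') Wi Xi, ri = yi - Xi beta_hat.
rng = np.random.default_rng(3)
W = np.eye(3)                                  # working weights (here OLS: Wi = I)
subjects = []
for _ in range(100):
    Xi = np.column_stack([np.ones(3), rng.normal(size=3)])
    # True errors are correlated within subject, so the working model is wrong.
    ei = rng.multivariate_normal(np.zeros(3), 0.4 * np.ones((3, 3)) + 0.6 * np.eye(3))
    yi = Xi @ np.array([1.0, 2.0]) + ei
    subjects.append((Xi, yi))

bread = np.linalg.inv(sum(Xi.T @ W @ Xi for Xi, _ in subjects))
beta_hat = bread @ sum(Xi.T @ W @ yi for Xi, yi in subjects)
meat = sum(
    Xi.T @ W @ np.outer(yi - Xi @ beta_hat, yi - Xi @ beta_hat) @ W @ Xi
    for Xi, yi in subjects
)
robust_cov = bread @ meat @ bread
print(np.round(np.sqrt(np.diag(robust_cov)), 3))   # robust standard errors
```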
9.1.5

Comparison of naive and robust standard errors (only fixed effects!):

Effect            Parameter    Estimate (naive s.e.; robust s.e.)
Age effect        β1
Intercepts:
  Control         β2           (0.428; 0.404)
  BPH             β3           (0.488; 0.486)
  L/R cancer      β4           (0.486; 0.499)
  Met. cancer     β5           (0.531; 0.507)
Time effects:
  BPH             β8           0.410 (0.068; 0.067)
  L/R cancer      β9           1.870 (0.233; 0.360)
  Met. cancer     β10          2.303 (0.262; 0.391)
Time² effects:
  Cancer          β14 = β15    0.510 (0.088; 0.128)

For some parameters, the robust standard error is smaller than the naive, model-based one. For other parameters, the opposite is true.
9.1.6

Parameter   MLE       (naive s.e.)   (robust s.e.)
β0,8        22.8750   (0.5598)       (0.5938)
β0,10       23.8125   (0.4921)       (0.5170)
β0,12       25.7188   (0.6112)       (0.6419)
β0,14       27.4688   (0.5371)       (0.5048)
β1,8        21.1818   (0.6752)       (0.6108)
β1,10       22.2273   (0.5935)       (0.5468)
β1,12       23.0909   (0.7372)       (0.6797)
β1,14       24.0909   (0.6478)       (0.7007)

We fit a model with a separate covariance structure for each group (Model 0).

SAS program:

proc mixed data=test method=ml ;
class idnr sex age;
model measure = age*sex / noint s;
repeated age / type=un subject=idnr r rcorr group=sex;
run;
9.1.7 Likelihood Ratio Test

Comparison of nested models with different mean structures, but equal covariance structure.

The null hypothesis of interest equals H0 : β ∈ Θβ,0, for some subspace Θβ,0 of the parameter space Θβ of the fixed effects β.

Notation:
. θ̂ML,0 : MLE under H0
. θ̂ML : MLE under the general model

Test statistic:

−2 ln λN = −2 ln [ LML(θ̂ML,0) / LML(θ̂ML) ]
9.1.8

Reduced model:

ln(PSAij + 1)
= β1 Agei + β2 Ci + β3 Bi + β4 Li + β5 Mi + (β8 Bi + β9 Li + β10 Mi) tij
+ β14 (Li + Mi) t²ij + b1i + b2i tij + b3i t²ij + εij

Under ML:  −2 ln λN = 6.602,  p-value = 0.010   (ℓML = 3.575)

Under REML:  −2 ln λN = 2.324   (ℓREML = 20.165)
9.1.9

Models with different mean structures lead to different sets of error contrasts.

Hence, the corresponding REML likelihoods are based on different observations, which makes them no longer comparable.

Conclusion:
LR tests for the mean structure are not valid under REML.
9.2

9.2.1

9.2.2
Related output:

Covariance Parameter Estimates

Cov Parm    Subject   Estimate   Standard Error   Z Value   Pr Z
UN(1,1)     XRAY      0.4432     0.02837??
9.2.3

Cov Parm    Subject   Estimate   Standard Error   Z Value   Pr Z
UN(3,3)     XRAY      0.1142     0.03454          3.31      0.0005
Boundary Problems

The quality of the normal approximation for the ML or REML estimates strongly depends on the true value of α.

The normal approximation is poor if α is relatively close to the boundary of the parameter space; if α is a boundary value, the normal approximation completely fails.

One of the Wald tests for the variance components in the reduced model for the prostate data was:

Cov Parm    Subject   Estimate   Standard Error   Z Value   Pr Z
UN(3,3)     XRAY      0.1142     0.03454          3.31      0.0005
9.2.4 Likelihood Ratio Test

Comparison of nested models with equal mean structures, but different covariance structure.

The null hypothesis of interest equals H0 : α ∈ Θα,0, for some subspace Θα,0 of the parameter space Θα of the variance components α.

Notation:
. θ̂ML,0 : MLE under H0
. θ̂ML : MLE under the general model

Test statistic:

−2 ln λN = −2 ln [ LML(θ̂ML,0) / LML(θ̂ML) ]
9.2.5

Testing for the need of a random effect puts the null hypothesis on the boundary of the parameter space. For example,

H0 : D = (d11)   versus   HA : D =
d11  d12
d12  d22
(positive semi-definite)

. Under H0, the estimate d̂22 is on the boundary in 50% of the cases
. hence, LML(θ̂ML,0) = LML(θ̂ML) in 50% of the cases

Graphically (σ² = d11):

Similarly, for an additional random effect,

H0 : D =
D11  0
0'   0
Conclusions:

. Correcting for the boundary problem reduces p-values.
. Thus, ignoring the boundary problem too often leads to over-simplified covariance structures.
. Hence, ignoring the boundary problem may invalidate inferences, even for the mean structure.
9.2.6

We reconsider the model with random intercepts and slopes for the rat data:

Yij = (β0 + b1i) + (β1 Li + β2 Hi + β3 Ci + b2i) tij + εij

in which tij equals ln[1 + (Ageij − 45)/10].

The marginal model assumes linear average trends with common intercept for the 3 groups, and covariance structure:

Cov(Yi(t1), Yi(t2)) = (1  t1) D (1  t2)' + σ² δ{t1 , t2}

This suggested earlier that the above random-effects model might not be valid, as it does not allow negative curvature in the variance function.

It is therefore of interest to test whether the random slopes b2i may be left out of the model.

Interpretation:
. On the hierarchical level: all rats receiving the same treatment have the same slope.
REML estimates (s.e.) for the full model (default and with nobound) and under H0 (no random slopes):

Effect               Parameter    default           nobound           under H0
Intercept            β0                                               68.607 (0.331)
Time effects:
  Low dose           β1           7.503 (0.228)     7.475 (0.198)     7.507 (0.225)
  High dose          β2           6.877 (0.231)     6.890 (0.198)     6.871 (0.228)
  Control            β3           7.319 (0.285)     7.284 (0.254)     7.507 (0.225)
Covariance of bi:
  var(b1i)           d11          3.369 (1.123)     2.921 (1.019)     3.565 (0.808)
  var(b2i)           d22          0.000 (  )        −0.287 (0.169)
  cov(b1i, b2i)      d12 = d21    0.090 (0.381)     −0.462 (0.357)
Residual variance:
  var(εij)           σ²           1.445 (0.145)     1.522 (0.165)     1.445 (0.145)
REML log-likelihood               −466.173          −465.193          −466.202
Wald test:

. Test statistic:

( d̂12  d̂22 )  [ Var(d̂12)         Cov(d̂12, d̂22) ]^(-1)  ( d̂12 )
               [ Cov(d̂12, d̂22)   Var(d̂22)       ]        ( d̂22 )

= ( −0.462  −0.287 )  [ 0.127  0.038 ]^(-1)  ( −0.462 )
                      [ 0.038  0.029 ]       ( −0.287 )

= 2.936

. p-value:

P(χ²2 ≥ 2.936 | H0) = 0.2304
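The Wald statistic and its naive p-value can be reproduced from the reported estimates and their covariance matrix (an added numerical check):

```python
import math
import numpy as np

# Wald statistic for H0: d12 = d22 = 0, from the estimates above.
v = np.array([-0.462, -0.287])                     # (d12_hat, d22_hat) under nobound
S = np.array([[0.127, 0.038], [0.038, 0.029]])     # their estimated covariance matrix
stat = float(v @ np.linalg.solve(S, v))
p = math.exp(-stat / 2)                            # P(chi2_2 >= stat) = exp(-stat/2)
print(round(stat, 3), round(p, 4))                 # close to 2.936 and 0.2304
```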
LR test:

. Test statistic:

−2 ln λN = 2 (466.202 − 465.193) = 2.018

. p-value:

P(χ²2 ≥ 2.018 | H0) = 0.3646
For the REML statistic 2 (466.202 − 466.173) = 0.058, the correct mixture null distribution gives

P(χ²1:2 ≥ 0.058 | H0) = 0.5 P(χ²1 ≥ 0.058 | H0) + 0.5 P(χ²2 ≥ 0.058 | H0) = 0.8906

Note that the naive p-value, obtained from ignoring the boundary problem, is indeed larger:

P(χ²2 ≥ 0.058 | H0) = 0.9714
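The mixture p-value can be reproduced with closed-form chi-squared tails (an added check):

```python
import math

# Corrected null distribution: the 50:50 mixture of chi2_1 and chi2_2.
def chi2_sf_df1(x):
    return math.erfc(math.sqrt(x / 2))

def chi2_sf_df2(x):
    return math.exp(-x / 2)

stat = 0.058                                       # REML LR statistic from the text
p_mixture = 0.5 * chi2_sf_df1(stat) + 0.5 * chi2_sf_df2(stat)
p_naive = chi2_sf_df2(stat)
print(round(p_mixture, 4), round(p_naive, 4))      # close to 0.8906 and 0.9714
```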
Reduced Model

Under both model interpretations, H0 was accepted, leading to the reduced model:

Yij = (β0 + b1i) + (β1 Li + β2 Hi + β3 Ci) tij + εij

Marginal interpretation: constant intraclass correlation

ρI = d11 / (d11 + σ̂²) = 0.712
9.3

9.3.1 Information Criteria

One then selects the model with the largest (log-)likelihood, provided it is not (too) complex.

The model with the highest penalized log-likelihood ℓ − F(#) is selected, for some function F(·) of the number # of parameters in the model.

Different functions F(·) lead to different criteria:

Criterion                  F(#)
Akaike (AIC)               F(#) = #
Schwarz (SBC)              F(#) = (# ln n*)/2
Hannan and Quinn (HQIC)    F(#) = # ln(ln n*)
Bozdogan (CAIC)            F(#) = #(ln n* + 1)/2

with n* = Σi=1..N ni under ML, and n* = Σi=1..N ni − p under REML.
9.3.2
Summary of results:
Mean structure
`ML
AIC
SBC
470.326 480.914
470.622 477.681
Selected models:

Based on a Wald test, the average slopes are found not to be significantly different from each other (p = 0.0987).
Chapter 10
Inference for the Random Effects
10.1

Random effects bi reflect how the evolution for the ith subject deviates from the expected evolution Xiβ.

Estimation of the bi is helpful for detecting outlying profiles.

This is only meaningful under the hierarchical model interpretation:

Yi | bi ∼ N(Xiβ + Zibi, Σi),   bi ∼ N(0, D)
Posterior density:

f(bi | yi) ≡ f(bi | Yi = yi) = f(yi | bi) f(bi) / ∫ f(yi | bi) f(bi) dbi

∝ f(yi | bi) f(bi)

∝ . . .

∝ exp( -1/2 (bi − DZi'Wi(yi − Xiβ))' Λi^(-1) (bi − DZi'Wi(yi − Xiβ)) )

for some positive-definite matrix Λi, hence

bi | yi ∼ N( DZi'Wi(yi − Xiβ), Λi )

Empirical Bayes (EB) estimate:

b̂i = DZi'Wi(α̂)(yi − Xiβ̂)
10.2
. unbiased for u
. minimum variance among all unbiased linear estimators
10.3
The ODS statements are used to write the EB estimates into a SAS output data
set, and to prevent SAS from printing them in the output window.
In practice, histograms and scatterplots of certain components of b̂i are used to detect model deviations or subjects with exceptional evolutions over time.
Dcorr =
  1.000   0.803   0.658
  0.803   1.000   0.968
  0.658   0.968   1.000
10.4
Shrinkage Estimators b̂i
Ŷi ≡ Xi β̂ + Zi b̂i
   = Xi β̂ + Zi D Zi′ Vi⁻¹ (yi − Xi β̂)
   = (Ini − Zi D Zi′ Vi⁻¹) Xi β̂ + Zi D Zi′ Vi⁻¹ yi
   = Σi Vi⁻¹ Xi β̂ + (Ini − Σi Vi⁻¹) yi

Hence, Ŷi is a weighted mean of the population-averaged profile Xi β̂ and the observed data yi, with weights Σi Vi⁻¹ and Ini − Σi Vi⁻¹, respectively.
Note that Xi β̂ gets much weight if the residual variability is large in comparison to the total variability.
This is also reflected in the fact that for any linear combination λ′bi of the random effects,

var(λ′b̂i) ≤ var(λ′bi).
10.5
b̂i = σb² 1′ni ( σb² 1ni 1′ni + σ² Ini )⁻¹ (yi − Xi β̂)

   = [ ni σb² / (σ² + ni σb²) ] × (1/ni) Σ_{j=1}^{ni} (yij − x′ij β̂)
Remarks:
. b̂i is a weighted average of 0 (the prior mean) and the average residual for subject i
. less shrinkage the larger ni
. less shrinkage the smaller σ² relative to σb²
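The shrinkage formula for a random intercept is a one-liner; a minimal Python sketch with hypothetical values for d11 (= σb²), σ², and the subject's residuals:

```python
# Empirical Bayes estimate of a random intercept:
#   b_hat_i = [n_i * d11 / (sigma2 + n_i * d11)] * (average residual of subject i)
def eb_random_intercept(residuals, d11, sigma2):
    n_i = len(residuals)
    shrinkage = n_i * d11 / (sigma2 + n_i * d11)  # in (0, 1): grows with n_i and d11
    return shrinkage * (sum(residuals) / n_i)

# hypothetical subject: 4 residuals averaging 2.0, d11 = 1.0, sigma2 = 1.0
print(eb_random_intercept([1.0, 2.0, 2.0, 3.0], d11=1.0, sigma2=1.0))  # 1.6
```

With these numbers the shrinkage factor is 4/(1 + 4) = 0.8, so the average residual 2.0 is pulled toward the prior mean 0, giving 1.6; with more measurements or a larger d11 the factor approaches 1 and shrinkage disappears.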
10.6
Comparison of predicted, average, and observed profiles for the subjects #15 and
#28, obtained under the reduced model:
var(b̂i) =
  0.403   0.440   0.131
  0.440   0.729   0.253
  0.131   0.253   0.092
,
D̂ =
  0.443   0.490   0.148
  0.490   0.842   0.300
  0.148   0.300   0.114
10.7
var(b̂i) = D Zi′ { Wi − Wi Xi ( Σ_{i=1}^{N} Xi′ Wi Xi )⁻¹ Xi′ Wi } Zi D
Conclusion:
EB estimates obtained under normality
cannot be used to check normality
This suggests that the only possibility to check the normality assumption is to fit
a more general model, with the classical linear mixed model as special case, and to
compare both models using LR methods
10.8
One possible extension of the linear mixed model is to assume a finite mixture as
random-effects distribution:
bi ∼ Σ_{j=1}^{g} pj N(μj, D),   with   Σ_{j=1}^{g} pj = 1   and   Σ_{j=1}^{g} pj μj = 0
Interpretation:
Chapter 11
General Guidelines for Model Building
. Introduction
. General strategy
. Example: The prostate data
11.1
Introduction
. Time effects
Covariance structure:
. Random effects
. Serial correlation
. Interactions
[Diagram: the mean structure Xi β and the covariance structure Vi together drive the estimation of β and the covariance matrix for β̂]
The aim of this chapter is to discuss some general guidelines for model building.
11.2
General Strategy
Yi = Xi β + Zi bi + εi
11.3
11.3.1
Strategy
Remove all systematic trends from the data, by calculating OLS residual profiles :
r̂i = yi − Xi β̂OLS ≈ Zi bi + εi
For balanced designs with many covariates, or for highly unbalanced data sets:
Selection of preliminary mean structures will be based on exploratory tools for the
mean.
Note that the calculation of β̂OLS ignores the longitudinal structure, and can be obtained in any regression module.

Provided the preliminary mean structure is sufficiently rich, consistency of β̂OLS follows from the theory on robust inference for the fixed effects.
11.3.2
11.4
11.4.1
Strategy

r̂i ≈ Zi bi + εi

Explore the residual profiles.

Any structure left may indicate the presence of subject-specific regression coefficients.

Try to describe each residual profile with a (relatively) simple model.
Do not include covariates in Zi which are not included in Xi. Otherwise, it is not
justified to assume E(bi ) = 0.
Use well-formulated models: Do not include higher-order terms unless all
lower-order terms are included as well.
Compare implied variance and covariance functions with results from exploratory
tools for covariance structure
11.4.2
Variance function:

Var(Yi(t)) = (1, t, t²) D (1, t, t²)′ + σ²
. Small t: some subjects have extremely large responses close to diagnosis. This
may have inflated the fitted variance
11.5
11.5.1
Strategy
r̂i ≈ Zi bi + εi

Which covariance matrix Σi for εi?

In many applications, random effects explain most of the variability.

Therefore, in the presence of random effects other than intercepts, often Σi = σ² Ini is assumed.
When only random intercepts are included, the semi-variogram can be used to
explore the presence and the nature of serial correlation
When other random effects are present as well, an extension of the variogram is
needed.
Also, a variety of serial correlation functions can be fitted and compared.
11.5.2
REPEATED statement:
REML log-likelihood
Measurement error
31.235
24.787
24.266
Variance function:

Var(Yi(t)) = (1, t, t²) D (1, t, t²)′ + τ² + σ²
11.6
Once an appropriate residual covariance model is obtained, one can try to reduce
the number of random effects in the preliminary random-effects structure
This is done based on inferential tools for variance components
11.7
Once an appropriate covariance model is obtained, one can try to reduce the
number of covariates in the preliminary mean structure
This is done based on inferential tools for fixed effects
In case there is still some doubt about the validity of the marginal covariance
structure, robust inference can be used to still obtain correct inferences.
11.8
Fixed effects estimates from the final model, under Gaussian serial correlation, and
without serial correlation:
Serial corr.
Effect
Age effect
No serial corr.
0.015 (0.006)
0.016 (0.006)
Intercepts:
0.496 (0.411) 0.564 (0.428)
0.320 (0.470) 0.275 (0.488)
Control
BPH
2
3
L/R cancer
1.216 (0.469)
1.099 (0.486)
Met. cancer
Time effects:
2.353 (0.518)
2.284 (0.531)
BPH
L/R cancer
Met. cancer
9
10
Time2 effects:
Cancer
14 = 15
0.510 (0.088)
Variance components estimates from the final model, under Gaussian serial correlation, and without serial correlation:

Effect                 Parameter    Serial corr.      No serial corr.
                                    Estimate (s.e.)   Estimate (s.e.)
Covariance of bi:
  var(b1i)             d11          0.393 (0.093)     0.443 (0.093)
  var(b2i)             d22          0.550 (0.187)     0.842 (0.203)
  var(b3i)             d33          0.056 (0.028)     0.114 (0.035)
  cov(b1i, b2i)        d12 = d21    0.382 (0.114)     0.490 (0.124)
  cov(b2i, b3i)        d23 = d32    0.170 (0.070)     0.300 (0.082)
  cov(b3i, b1i)        d13 = d31    0.098 (0.039)     0.148 (0.047)
Measurement error:
  var(εij)             σ²           0.023 (0.002)     0.028 (0.002)
Gaussian serial correlation:
  τ²                                0.029 (0.018)
  1/φ                               0.599 (0.192)
REML log-likelihood                 13.704            20.165
Many standard errors are smaller under the model which includes the Gaussian
serial correlation component
Hence, adding the serial correlation leads to more efficient inferences for most
parameters in the marginal model.
11.9
As an example, consider the final model for the prostate data, with Gaussian serial
correlation
Estimated correlation matrix for variance components estimates:
c2
1.00 0.87
0.87
0.62
c 2
0.70 0.49
0.62 0.85
1.00
0.88 0.97
0.02
0.70 0.94
0.88
1.00 0.82
0.05
0.49
0.39 0.63
0.91
1.00 0.97
0.72 0.97
0.51
0.33 0.02
0.01
0.18
0.51 0.57
1.00
0.81
0.04
0.10
0.33 0.38
0.81
1.00
0.32
0.04
0.32
1.00
0.00 0.03
0.02
0.05 0.02
0.01
Small correlations between τ̂² and the other estimates, except for 1/φ̂.

Indeed, the serial correlation component vanishes for φ becoming infinitely large.
Chapter 12
Power Analyses under Linear Mixed Models
12.1
H0 : Lβ = 0   versus   HA : Lβ ≠ 0

F = β̂′ L′ [ L ( Σ_{i=1}^{N} Xi′ Vi⁻¹(α̂) Xi )⁻¹ L′ ]⁻¹ L β̂ / rank(L)
. Satterthwaite approximation
. Kenward and Roger approximation
. ...
In general (not necessarily under H0), F is approximately F-distributed with the same numbers of degrees of freedom, but with non-centrality parameter

δ = β′ L′ [ L ( Σ_{i=1}^{N} Xi′ Vi⁻¹(α) Xi )⁻¹ L′ ]⁻¹ L β
Note that δ is equal to rank(L) × F, with β̂ replaced by β.

The SAS procedure MIXED can therefore be used for the calculation of δ and the related numbers of degrees of freedom.
12.2
Calculation in SAS
Construct a data set of the same dimension and with the same covariates and
factor values as the design for which power is to be calculated
Use as responses yi the average values Xi β under the alternative model
The fixed effects estimate will then be equal to

β̂(α) = ( Σ_{i=1}^{N} Xi′ Wi(α) Xi )⁻¹ Σ_{i=1}^{N} Xi′ Wi(α) yi
      = ( Σ_{i=1}^{N} Xi′ Wi(α) Xi )⁻¹ Σ_{i=1}^{N} Xi′ Wi(α) Xi β = β
This calculated F value, and the associated numbers of degrees of freedom can be
saved and used afterwards for calculation of the power.
Note that this requires keeping the variance components in α fixed, equal to the assumed population values.
Steps in calculations:
The SAS functions finv and probf are used to calculate Fc and the power.
12.3
Example 1
Re-consider the random-intercepts model previously discussed for the rat data:
Yij = (β0 + b1i) + (β1 Li + β2 Hi + β3 Ci) tij + εij
in which tij equals ln[1 + (Ageij − 45)/10]
This model is fitted in SAS as follows:
proc mixed data = test;
class treat rat;
model y = treat*t / solution ddfm=kr;
random intercept / subject=rat;
contrast 'Equal slopes' treat*t 1 -1 0, treat*t 1 0 -1;
run;
Effect              Parameter   Value
Intercept           β0          68
Time effects:
  Low dose          β1          7.5
  High dose         β2          7.0
  Control           β3          6.5
Covariance of bi:
  var(b1i)          d11         3.6
Residual variance:
  var(εij)          σ²          1.4
The power of a design with 10 rats per treatment group is calculated as follows:
. Construction of data set with expected averages as response values:
data power;
do treat=1 to 3;
do rat=1 to 10;
do age=50 to 110 by 10;
t=log(1+(age-45)/10);
if treat=1 then y=68 + 7.5*t;
if treat=2 then y=68 + 7.0*t;
if treat=3 then y=68 + 6.5*t;
output;
end;
end;
end;
. Fit model, keeping the variance components equal to their true values:
proc mixed data = power noprofile;
class treat rat;
model y = treat*t ;
random intercept / subject=rat(treat);
parms (3.6) (1.4) / noiter;
contrast 'Equal slopes' treat*t 1 -1 0,
                        treat*t 1 0 -1;
ods output contrasts=c;
run;
Output:

Label          Num DF   Den DF   FValue   ProbF    alpha   ncparm    fc        power
Equal slopes   2        177      4.73     0.0100   0.05    9.46367   3.04701   0.78515
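The power that SAS computes from fc and ncparm can be cross-checked outside SAS. A minimal Python sketch (not from the original slides), using the large-sample χ² approximation to the F test with rank(L) = 2 numerator degrees of freedom and the non-centrality parameter from the output above, so the result is close to, though not identical with, the F-based 0.78515:

```python
import math

def chi2_sf_even_df(x, df):
    # P(chi^2_df >= x) for even df, via the closed-form truncated exponential sum
    k = df // 2
    term, total = 1.0, 1.0
    for m in range(1, k):
        term *= (x / 2) / m
        total += term
    return math.exp(-x / 2) * total

def noncentral_chi2_sf(x, df, ncp, terms=200):
    # Poisson mixture of central chi-squares (df kept even here)
    sf, weight = 0.0, math.exp(-ncp / 2)
    for j in range(terms):
        sf += weight * chi2_sf_even_df(x, df + 2 * j)
        weight *= (ncp / 2) / (j + 1)
    return sf

alpha = 0.05
crit = -2 * math.log(alpha)          # chi-square critical value for df = 2 (about 5.99)
power = noncentral_chi2_sf(crit, df=2, ncp=9.46367)
print(round(power, 3))
```

The χ² approximation replaces the denominator degrees of freedom (177) by infinity, which is why the answer differs from the SAS value only in the third decimal.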
Power
78.5%
82.5%
85.9%
88.7%
91.0%
92.9%
97.9%
12.4
Example 2
We continue the previous random-intercepts model and study the effect of varying
the variance components values
            d11 = 3.2   d11 = 3.6   d11 = 4.0
σ² = 1.0    89.3%       88.5%       87.9%
σ² = 1.4    79.8%       78.5%       77.4%
σ² = 1.8    71.9%       70.3%       68.9%
Conclusions:
12.5
Example 3
12.5.1
Introduction
304
Yij = μ1 + b1i + εij   if treatment A
Yij = μ2 + b1i + εij   if treatment B
Fixed effects:
  Average treatment A    μ1         ?
  Average treatment B    μ2         ?
Variance components:
  var(b1i)               d11        ?
  var(εij)               σ²         ?
  Total variability      d11 + σ²
Hence, the individual variance components are unknown. Only the total variability
is known to equal 4.
Power analyses will be performed for several values of the intraclass correlation

ρI = d11 / (d11 + σ²)
12.5.2
We now consider the situation in which the treatments will be randomly assigned
to GPs, and all subjects with the same GP will be treated identically.
Powers for 2 × 25 = 50 GPs, each treating 10 subjects (α = 0.05):

ρI       0.25   0.50   0.75
Power    86%    65%    50%
12.5.3
We now consider the situation in which the treatments will be randomly assigned
to subjects within GPs, with the same number n/2 of subjects assigned to both
treatments
Powers for 2 × 5 = 10 subjects within 10 GPs (α = 0.05):

ρI       0.25   0.50   0.75
Power    81%    94%    100%
12.5.4
Conclusion
Within-subject correlation
increases power for inferences on within-subject effects,
but decreases power for inferences on between-subject effects
Part II
Marginal Models for Non-Gaussian Longitudinal Data
Chapter 13
The Toenail Data
Research question:
Severity relative to treatment of TDO ?
Chapter 14
The Analgesic Trial
single-arm trial with 530 patients recruited (491 selected for analysis)
analgesic treatment for pain caused by chronic nonmalignant disease
treatment was to be administered for 12 months
we will focus on Global Satisfaction Assessment (GSA)
GSA scale goes from 1=very good to 5=very bad
GSA was rated by each subject 4 times during the trial, at months 3, 6, 9, and 12.
Research questions:
. Investigation of dropout
Frequencies:

GSA    Month 3       Month 6       Month 9       Month 12
1      55 (14.3%)    38 (12.6%)    40 (17.6%)    30 (13.5%)
2      112 (29.1%)   84 (27.8%)    67 (29.5%)    66 (29.6%)
3      151 (39.2%)   115 (38.1%)   76 (33.5%)    97 (43.5%)
4      52 (13.5%)    51 (16.9%)    33 (14.5%)    27 (12.1%)
5      15 (3.9%)     14 (4.6%)     11 (4.9%)     3 (1.4%)
Tot    385           302           227           223
Missingness:
Measurement occasion
Month 3 Month 6 Month 9 Month 12
Number
163
41.2
Completers
O
O
Dropouts
51
12.91
51
12.91
63
15.95
Non-monotone missingness
30
7.59
1.77
0.51
18
4.56
0.51
0.25
0.25
0.76
Chapter 15
The National Toxicology Program (NTP) Data
. DEHP: di(2-ethylhexyl)-phthalate
. EG: ethylene glycol
. DYME: diethylene glycol dimethyl ether
Implanted fetuses:
. death/resorbed
. viable: weight; malformations (visceral, skeletal, external)

Data structure:

[Diagram: each dam carries implants 1 ... mi; every implant is either death/resorbed or viable, and each viable fetus contributes the outcomes 1 ... K]
Exposure   Dose    # Dams       # Dams       Viab.   Litter   Malformations (%)
                   (≥1 Impl.)   (≥1 Viab.)           Size
EG         0       25           25           297     11.9     0.0    0.0    0.3
EG         750     24           24           276     11.5     1.1    0.0    8.7
EG         1500    23           22           229     10.4     1.7    0.9    36.7
EG         3000    23           23           226     9.8      7.1    4.0    55.8
DEHP       0       30           30           330     13.2     0.0    1.5    1.2
DEHP       44      26           26           288     11.1     1.0    0.4    0.4
DEHP       91      26           26           277     10.7     5.4    7.2    4.3
DEHP       191     24           17           137     8.1      17.5   15.3   18.3
DEHP       292     25           9            50      5.6      54.0   50.0   48.0
DYME       0       21           21           282     13.4     0.0    0.0    0.0
DYME       62.5    20           20           225     11.3     0.0    0.0    0.0
DYME       125     24           24           290     12.1     1.0    0.0    1.0
DYME       250     23           23           261     11.3     2.7    0.1    20.0
DYME       500     22           22           141     6.1      66.0   19.9   79.4
Chapter 16
Generalized Linear Models
. The model
. Maximum likelihood estimation
. Examples
. McCullagh and Nelder (1989)
16.1
16.2
∫ f(y|θ, φ) dy = 1

∂/∂θ ∫ f(y|θ, φ) dy = 0

∂²/∂θ² ∫ f(y|θ, φ) dy = 0

⇒ ∫ [y − ψ′(θ)] f(y|θ, φ) dy = 0

⇒ ∫ [φ⁻¹ (y − ψ′(θ))² − ψ″(θ)] f(y|θ, φ) dy = 0

E(Y) = ψ′(θ)
Var(Y) = φ ψ″(θ)

Note that, in general, the mean and the variance are related:

Var(Y) = φ ψ″[ψ′⁻¹(μ)] = φ v(μ)
16.3
Examples
16.3.1
Model:

Y ∼ N(μ, σ²)

Density function:

f(y|θ, φ) = (2πσ²)^{−1/2} exp{ −(y − μ)²/(2σ²) }
          = exp{ [yμ − μ²/2]/σ² − y²/(2σ²) − ½ ln(2πσ²) }

Exponential family:
. θ = μ
. φ = σ²
. ψ(θ) = θ²/2
. c(y, φ) = −½ ln(2πφ) − y²/(2φ)
. v(μ) = 1

Note that, under this normal model, the mean and variance are not related:

φ v(μ) = σ²

The link function is here the identity function: η = μ
16.3.2
Model:

Y ∼ Bernoulli(π)

Density function:

f(y|θ, φ) = π^y (1 − π)^{1−y}
          = exp{ y ln π + (1 − y) ln(1 − π) }
          = exp{ y ln[π/(1 − π)] + ln(1 − π) }

Exponential family:
. θ = ln[π/(1 − π)]
. φ = 1
. ψ(θ) = −ln(1 − π) = ln(1 + exp θ)
. c(y, φ) = 0

Mean and variance function:
. μ = exp θ / (1 + exp θ)
. v(μ) = exp θ / (1 + exp θ)² = μ(1 − μ)

Note that, under this model, the mean and variance are related:

φ v(μ) = μ(1 − μ)

The link function here is the logit link: η = ln[μ/(1 − μ)]
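These logit-link identities are easy to verify numerically; a small Python sketch (illustration only, not part of the slides):

```python
import math

def logit(mu):
    # eta = ln[mu/(1-mu)]: the natural parameter theta of the Bernoulli family
    return math.log(mu / (1 - mu))

def expit(eta):
    # inverse link: mu = exp(eta)/(1+exp(eta))
    return math.exp(eta) / (1 + math.exp(eta))

mu = 0.3
eta = logit(mu)
assert abs(expit(eta) - mu) < 1e-12               # inverse link recovers the mean
v = math.exp(eta) / (1 + math.exp(eta)) ** 2      # v(mu) from the exponential family
assert abs(v - mu * (1 - mu)) < 1e-12             # equals mu(1-mu)
```

The check that v(μ) = μ(1 − μ) is exactly the mean-variance relation that distinguishes the Bernoulli family from the normal model above.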
16.3.3
Model:

Y ∼ Poisson(λ)

Density function:

f(y|θ, φ) = e^{−λ} λ^y / y!
          = exp{ y ln λ − λ − ln y! }

Exponential family:
. θ = ln λ
. φ = 1
. ψ(θ) = λ = exp θ
. c(y, φ) = −ln y!

Mean and variance function:
. μ = exp θ = λ
. v(μ) = exp θ = λ

Note that, under this model, the mean and variance are related:

φ v(μ) = μ

The link function is here the log link: η = ln μ
16.4
16.5
Log-likelihood:

ℓ(β, φ) = (1/φ) Σi [yi θi − ψ(θi)] + Σi c(yi, φ)

Score equations:

Σi (∂θi/∂β) [yi − ψ′(θi)] = 0

or, with μi = ψ′(θi) and vi = v(μi),

Σi (∂μi/∂β) vi⁻¹ (yi − μi) = 0

Note that the estimation of β depends on the density only through the means μi and the variance functions vi = v(μi).
The dispersion parameter can be estimated by

φ̂ = [1/(N − p)] Σ_{i=1}^{N} (yi − μ̂i)² / v(μ̂i)
16.6
Early dropout (did the subject drop out after the first or the second visit) ?
Binary response
PROC GENMOD can fit GLMs in general
PROC LOGISTIC can fit models for binary (and ordered) responses
SAS code for logit link:
proc genmod data=earlydrp;
model earlydrp = pca0 weight psychiat physfct / dist=b;
run;
proc logistic data=earlydrp descending;
model earlydrp = pca0 weight psychiat physfct;
run;
Selected output:

Analysis Of Parameter Estimates

Parameter   DF   Estimate   Standard   Wald 95%              Pr > ChiSq
                            Error      Confidence Limits
Intercept   1    -1.0673    0.7328     -2.5037    0.3690     0.1453
PCA0        1     0.3981    0.1343      0.1349    0.6614     0.0030
WEIGHT      1    -0.0211    0.0072     -0.0353   -0.0070     0.0034
PSYCHIAT    1     0.7169    0.2871      0.1541    1.2796     0.0125
PHYSFCT     1     0.0121    0.0050      0.0024    0.0219     0.0145
Scale       0     1.0000    0.0000      1.0000    1.0000
Chapter 17
Parametric Modeling Families
. Continuous outcomes
. Longitudinal generalized linear models
. Notation
17.1
Continuous Outcomes
Marginal Models:         E(Yij | xij) = x′ij β

Random-Effects Models:   E(Yij | bi, xij) = x′ij β + z′ij bi

Transition Models:       E(Yij | Yi,j−1, ..., Yi1, xij) = x′ij β + α Yi,j−1
17.2
. Introduction of non-linearity
. No easy transfer between model families
                     cross-sectional   longitudinal
normal outcome       linear model      LMM
non-normal outcome   GLM               ?
17.3
Notation
Chapter 18
Conditional Models
. A log-linear model
. Quadratic version of the model
. Linear version of the model
. Clustered-data versions of the model
. Transition models
18.1
A Log-linear Model
Cox (1972)
Joint distribution of Yi in terms of a multivariate exponential family:

f(yi; Θi) = exp{ Σ_{j=1}^{n} θij yij + Σ_{j1<j2} ωij1j2 yij1 yij2 + ... − A(Θi) }

          = c(Θi) exp{ Σ_{j=1}^{n} θij yij + Σ_{j1<j2} ωij1j2 yij1 yij2 + ... }
Interpretation of Parameters:
. The parameters have a conditional interpretation:
θij = ln [ Pr(Yij = 1 | Yik = 0, k ≠ j) / Pr(Yij = 0 | Yik = 0, k ≠ j) ]

and ωijk is the conditional log odds ratio for the pair (Yij, Yik), given that all other outcomes are zero:

ωijk = ln [ Pr(Yij = 1, Yik = 1 | ·) Pr(Yij = 0, Yik = 0 | ·) / ( Pr(Yij = 1, Yik = 0 | ·) Pr(Yij = 0, Yik = 1 | ·) ) ],  conditioning on Yil = 0, l ≠ j, k
Advantages:
Drawbacks:
. Due to the above conditional interpretation, the models are less useful for regression: the dependence of E(Yij) on covariates involves all parameters, not only the main effects.
18.2
Cox (1972) and others suggest that often the higher order interactions can be
neglected. This claim is supported by empirical evidence.
The quadratic exponential model:

f(yi; Θi) = exp{ Σ_{j=1}^{n} θij yij + Σ_{j1<j2} ωij1j2 yij1 yij2 − A(Θi) }

          = c(Θi) exp{ Σ_{j=1}^{n} θij yij + Σ_{j1<j2} ωij1j2 yij1 yij2 }
18.3
Assume θij = θi = β0 + βd di   and   ωij1j2 = α

Using

Zi = Σ_{j=1}^{ni} Yij

we obtain

f(zi | θi, α) = (ni choose zi) exp{ θi zi + α zi (ni − zi) − A(θi) }
18.4
Transition Models
Assume

εi1 ∼ N(0, σ²)   and   εij ∼ N(0, σ²(1 − α²))

then

cov(Yij, Yij′) = α^{|j′−j|} σ²

⇒ a marginal multivariate normal model with AR(1) variance-covariance matrix.

For non-Gaussian outcomes, first write

μ^c_ij = E(Yij | hij)   and   φ v^c(μ^c_ij) = var(Yij | hij)
f(yi1, ..., yini) = f(yi1, ..., yiq) Π_{j=q+1}^{ni} f(yij | hij)
The marginal means and variances do not follow easily, except in the normal case.
Recursive formulas are:
18.4.1

Yij ∼ Bernoulli(μij)

First-order model:

ln[ μij / (1 − μij) ] = β0 + β1 Ti + β2 tij + β3 Ti tij + α1 yi,j−1

Second-order model:

ln[ μij / (1 − μij) ] = β0 + β1 Ti + β2 tij + β3 Ti tij + α1 yi,j−1 + α2 yi,j−2
Fitted models:

                       First order                    Second order
Effect          Par.   I               II
Intercept       β0     -3.14 (0.27)    -3.77 (0.34)   -3.28 (0.34)
Ti              β1      0.00 (0.31)    -0.08 (0.32)    0.13 (0.39)
tij             β2     -0.09 (0.04)     0.03 (0.05)   -0.05 (0.04)
Ti tij          β3     -0.08 (0.06)    -0.06 (0.06)   -0.09 (0.07)
Dep. on Yi,j−1  α1      4.48 (0.22)     3.59 (0.29)    4.01 (0.39)
Dep. on Yi,j−1  α1a                     1.56 (0.35)
Dep. on Yi,j−2  α2                                     0.25 (0.38)
ln[ μij / (1 − μij) ] = (β00 + β10 Ti + β20 tij + β30 Ti tij) I(Yi,j−1 = 0)
                      + (β01 + β11 Ti + β21 tij + β31 Ti tij) I(Yi,j−1 = 1)

Fitted model:

              Yi,j−1 = 0                 Yi,j−1 = 1
Effect        Par.   Estimate (s.e.)     Par.   Estimate (s.e.)
Intercept     β00    -3.92 (0.56)        β01     1.56 (1.26)
Ti            β10     0.45 (0.70)        β11    -0.01 (0.37)
tij           β20    -0.06 (0.09)        β21    -0.20 (0.06)
Ti tij        β30     0.07 (0.10)        β31     0.04 (0.07)
18.4.2
Prepare models so that the previous outcome can be used as a covariate (using the same code as used to fit a model for dropout; see Part V):
%dropout(data=test,id=idnum,time=time,response=onyresp,out=test2);
data test2a;
set test2;
prev1=prev;
drop prev;
run;
%dropout(data=test2a,id=idnum,time=time,response=prev1,out=test3);
data test3a;
set test3;
prev2=prev;
drop prev;
run;
Obs   idnum   time   onyresp   prev1   prev2
1     1       0      1         .       .
2     1       1      1         1       .
3     1       2      1         1       1
4     1       3      0         1       1
5     1       6      0         0       1
6     1       9      0         0       0
7     1       12     0         0       0
When both predecessors are used, one merely adds prev2 to MODEL statement:
model onyresp = prev1 treatn*prev1 time*prev1 treatn*time*prev1
/ noint dist=binomial;
Chapter 19
Full Marginal Models
. Introduction
. Link functions
. Associations
. Bahadur model
. Multivariate Probit model
. Example: POPS data
19.1
Introduction
Choices to make:
. Degree of modeling:
joint distribution fully specified likelihood procedures
only a limited number of moments e.g., generalized estimating equations
Minimally, one specifies the marginal mean structure:

ηi(μi) = Xi β
19.2
e.g., the probit link:

ηij = Φ₁⁻¹(μij)
19.3
Pairwise Association

Marginal correlation (with Fisher's z transform):

ρijk = (μijk − μij μik) / [ μij (1 − μij) μik (1 − μik) ]^{1/2}

Marginal odds ratio, on the log scale:

ηijk = ln(ψijk)
19.4
Corr(Yij, Yik) = (μijk − μij μik) / [ μij (1 − μij) μik (1 − μik) ]^{1/2}
Let

εij = (Yij − μij) / [ μij (1 − μij) ]^{1/2}   and   eij = (yij − μij) / [ μij (1 − μij) ]^{1/2},

and

ρijk = E(εij εik),   ρijkl = E(εij εik εil),   ...,   ρi12...ni = E(εi1 εi2 ... εini).
A general expression:

f(yi) = f1(yi) c(yi),

with

f1(yi) = Π_{j=1}^{ni} μij^{yij} (1 − μij)^{1−yij}

and

c(yi) = 1 + Σ_{j<k} ρijk eij eik + Σ_{j<k<l} ρijkl eij eik eil + ... + ρi12...ni ei1 ei2 ... eini
19.5
19.6
. Bilirubin value
. Neonatal seizures
. Congenital malformations
19.7

Parameter estimates (standard errors), per outcome:

                     Bahadur        Probit         Dale-Norm      Dale-Logist
First outcome:
  Intercept           3.67 (0.49)    2.01 (0.26)    2.03 (0.27)    3.68 (0.52)
  Neonatal seiz.     -1.94 (0.42)   -1.12 (0.26)   -1.16 (0.26)   -2.06 (0.44)
  Congenital malf.   -1.21 (0.31)   -0.61 (0.18)   -0.62 (0.18)   -1.17 (0.33)
  100 × Bilirubin    -0.69 (0.25)   -0.32 (0.14)   -0.32 (0.14)   -0.64 (0.27)
Second outcome:
  Intercept           4.03 (0.51)    2.19 (0.27)    2.21 (0.27)    4.01 (0.54)
  Neonatal seiz.     -2.26 (0.43)   -1.27 (0.26)   -1.29 (0.26)   -2.28 (0.44)
  Congenital malf.   -1.08 (0.32)   -0.56 (0.19)   -0.59 (0.19)   -1.11 (0.34)
  100 × Bilirubin    -0.85 (0.26)   -0.42 (0.14)   -0.41 (0.14)   -0.80 (0.27)
Third outcome:
  Intercept           3.32 (0.50)    1.84 (0.27)    1.91 (0.27)    3.49 (0.54)
  Neonatal seiz.     -1.55 (0.44)   -0.88 (0.27)   -0.93 (0.27)   -1.70 (0.46)
  Congenital malf.   -0.96 (0.32)   -0.47 (0.19)   -0.49 (0.19)   -0.96 (0.35)
  100 × Bilirubin    -0.44 (0.26)   -0.21 (0.14)   -0.24 (0.14)   -0.49 (0.28)
                        Bahadur        Probit         Dale-Norm      Dale-Logist
Association parameters
(1,2): ρ or ψ           0.27 (0.05)    0.73 (0.05)    17.37 (5.19)   17.35 (5.19)
(1,2): z(ρ) or ln ψ     0.55 (0.11)    1.85 (0.23)    2.85 (0.30)    2.85 (0.30)
(1,3): ρ or ψ           0.39 (0.05)    0.81 (0.04)    30.64 (9.78)   30.61 (9.78)
(1,3): z(ρ) or ln ψ     0.83 (0.12)    2.27 (0.25)    3.42 (0.32)    3.42 (0.32)
(2,3): ρ or ψ           0.23 (0.05)    0.72 (0.05)    17.70 (5.47)   17.65 (5.47)
(2,3): z(ρ) or ln ψ     0.47 (0.10)    1.83 (0.23)    2.87 (0.31)    2.87 (0.31)
(1,2,3): ρ or ψ                                       0.91 (0.69)    0.92 (0.69)
(1,2,3): z(ρ) or ln ψ                                -0.09 (0.76)   -0.09 (0.76)
Log-likelihood         -598.44        -570.69        -567.11        -567.09
Chapter 20
Generalized Estimating Equations
. General idea
. Asymptotic properties
. Working correlation
. Special case and application
. SAS code and output
20.1
General Idea
with vi = Var(Yi).

In the longitudinal setting, Y = (Y1, ..., YN):

S(β) = Σi Σj (∂μij/∂β) vij⁻¹ (yij − μij) = Σ_{i=1}^{N} Di′ [Vi(α)]⁻¹ (yi − μi) = 0

where

. Di is an ni × p matrix with (i, j)th element ∂μij/∂β
Vi = Ai^{1/2}(β) Ri(α) Ai^{1/2}(β), with Ai(β) the diagonal matrix holding the variance functions v(μij(β)), and Ri(α) the correlation matrix of Yi.
Same form as for full likelihood procedure, but we restrict specification to the first
moment only
Liang and Zeger (1986)
20.2
As N → ∞:

√N (β̂ − β) ∼ N(0, I0⁻¹)

where

I0 = Σ_{i=1}^{N} Di′ [Vi(α)]⁻¹ Di

(Unrealistic) Conditions:
. α is known
20.3
Σ_{i=1}^{N} Di′ [Vi(α)]⁻¹ (yi − μi) = 0

BUT

. suppose Vi(·) is not the true variance of Yi but only a plausible guess, a so-called working correlation matrix
. specify correlations and not covariances, because the variances follow from the mean structure
. the score equations are solved as before

Then:

√N (β̂ − β) ∼ N(0, I0⁻¹ I1 I0⁻¹)

where

I0 = Σ_{i=1}^{N} Di′ [Vi(α)]⁻¹ Di

I1 = Σ_{i=1}^{N} Di′ [Vi(α)]⁻¹ Var(Yi) [Vi(α)]⁻¹ Di
20.4
20.5
Liang and Zeger (1986) proposed moment-based estimates for the working correlation.

Structure       Corr(Yij, Yik)   Estimate
Independence    0                (none)
Exchangeable    α                α̂ = (1/N) Σi [1/(ni(ni − 1))] Σ_{j≠k} eij eik
AR(1)           α^{|k−j|}        α̂ = (1/N) Σi [1/(ni − 1)] Σ_{j≤ni−1} eij ei,j+1
Unstructured    αjk              α̂jk = (1/N) Σi eij eik

Dispersion parameter:

φ̂ = (1/N) Σ_{i=1}^{N} (1/ni) Σ_{j=1}^{ni} e²ij
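These moment estimators are simple to code; a Python sketch for the exchangeable structure, with hypothetical Pearson residuals eij (illustration only; the dispersion estimate is taken without a degrees-of-freedom correction for simplicity):

```python
def dispersion(residuals):
    # phi_hat = (1/N) sum_i (1/n_i) sum_j e_ij^2
    N = len(residuals)
    return sum(sum(e * e for e in ei) / len(ei) for ei in residuals) / N

def exchangeable_alpha(residuals, phi):
    # alpha_hat = (1/N) sum_i [1/(n_i(n_i-1))] sum_{j != k} e_ij e_ik / phi
    N = len(residuals)
    total = 0.0
    for ei in residuals:
        n = len(ei)
        s = sum(ei)
        # sum over j != k of e_j e_k equals (sum e)^2 - sum e^2
        total += (s * s - sum(e * e for e in ei)) / (n * (n - 1))
    return total / (N * phi)

# hypothetical residuals for 3 subjects (clusters of unequal size are allowed)
e = [[1.0, 1.0], [0.5, -0.5], [1.0, 1.0, 1.0]]
phi = dispersion(e)
alpha = exchangeable_alpha(e, phi)
print(phi, alpha)
```

The `(s*s - sum of squares)` identity avoids the explicit double loop over pairs, which matters when cluster sizes are large.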
20.6
Fitting GEE
Iterate until convergence:

β̂^(t+1) = β̂^(t) + ( Σ_{i=1}^{N} Di′ Vi⁻¹ Di )⁻¹ Σ_{i=1}^{N} Di′ Vi⁻¹ (yi − μi)
20.7
Estimate for β:

β̂(α) = ( Σ_{i=1}^{N} Xi′ Wi Xi )⁻¹ Σ_{i=1}^{N} Xi′ Wi Yi

E[β̂(α)] = ( Σ_{i=1}^{N} Xi′ Wi Xi )⁻¹ Σ_{i=1}^{N} Xi′ Wi Xi β = β
Conditional on α, β̂ has covariance

Var(β̂) = ( Σ_{i=1}^{N} Xi′ Wi Xi )⁻¹ ( Σ_{i=1}^{N} Xi′ Wi Var(Yi) Wi Xi ) ( Σ_{i=1}^{N} Xi′ Wi Xi )⁻¹

Note that this model-based version assumes that the covariance matrix Var(Yi) is correctly modelled as Vi = Zi D Zi′ + Σi, in which case it reduces to ( Σ_{i=1}^{N} Xi′ Wi Xi )⁻¹.

An empirically corrected version is:

Var(β̂) = ( Σ_{i=1}^{N} Xi′ Wi Xi )⁻¹ × Σ_{i=1}^{N} Xi′ Wi V̂ar(Yi) Wi Xi × ( Σ_{i=1}^{N} Xi′ Wi Xi )⁻¹
          [BREAD]                       [MEAT]                               [BREAD]
Chapter 21
A Family of GEE Methods
. Classical approach
. Prentice's two sets of GEE
. Linearization-based version
. GEE2
. Alternating logistic regressions
21.1
Prentice's GEE

Σ_{i=1}^{N} Di′ Vi⁻¹ (Yi − μi) = 0,    Σ_{i=1}^{N} Ei′ Wi⁻¹ (Zi − δi) = 0

where

Zijk = (Yij − μij)(Yik − μik) / [ μij (1 − μij) μik (1 − μik) ]^{1/2},   δijk = E(Zijk)

The joint asymptotic distribution of √N (β̂ − β) and √N (α̂ − α) is normal with variance-covariance matrix consistently estimated by

    ( A  0 ) ( Λ11  Λ12 ) ( A  B′ )
N   ( B  C ) ( Λ21  Λ22 ) ( 0  C  )
where
A =
B =
C =
N
X
i=1
N
X
i=1
N
X
i=1
Di0 Vi1 Di
Ei0Wi1 Ei
Ei0Wi1 Ei
11 =
N
X
N
X
i=1
N
X
N
Z i X
Ei0 Wi1
Di0 Vi1 Di
i=1
i=1
12 =
i=1
21 = 12,
22 =
N
X
i=1
and

Statistic    Estimator
Var(Yi)      (Yi − μi)(Yi − μi)′
Var(Zi)      (Zi − δi)(Zi − δi)′
21.2
21.2.1
Model formulation
yi = μi + εi

with ηi = g(μi), ηi = Xi β, and Var(yi) = Var(εi) = Σi.
21.2.2
Estimation
Solve iteratively:

( Σ_{i=1}^{N} Xi′ Wi Xi ) β = Σ_{i=1}^{N} Xi′ Wi yi*,

where

Wi = Fi′ Σi⁻¹ Fi,    Fi = ∂μi/∂ηi,

yi* = η̂i + (yi − μ̂i) F̂i⁻¹,

Σi = Var(εi),    μi = E(yi).

Remarks:
21.2.3
Σi = Ai^{1/2}(β) Ri(α) Ai^{1/2}(β)
21.3
GEE2
Model:
21.4
Diggle, Heagerty, Liang, and Zeger (2002) and Molenberghs and Verbeke (2005)

When marginal odds ratios are used to model association, α can be estimated using ALR, which is

. almost as efficient as GEE2
. almost as easy (computationally) as GEE1

With μijk as before and αijk = ln(ψijk) the marginal log odds ratio:

logit Pr(Yij = 1 | xij) = xij β

logit Pr(Yij = 1 | Yik = yik) = αijk yik + ln[ (μij − μijk) / (1 − μij − μik + μijk) ]
21.5
21.5.1
The model
log[ μij / (1 − μij) ] = β0 + β1 Ti + β2 tij + β3 Ti tij
21.5.2
Standard GEE
SAS Code:
proc genmod data=test descending;
class idnum timeclss;
model onyresp = treatn time treatn*time
/ dist=binomial;
repeated subject=idnum / withinsubject=timeclss
type=exch covb corrw modelse;
run;
SAS statements:
Selected output:

. Regression parameters:

Analysis Of Initial Parameter Estimates

Parameter     DF   Estimate   Standard   Wald 95%              Chi-Square
                              Error      Confidence Limits
Intercept     1    -0.5571    0.1090     -0.7708   -0.3433     26.10
treatn        1     0.0240    0.1565     -0.2827    0.3307      0.02
time          1    -0.1769    0.0246     -0.2251   -0.1288     51.91
treatn*time   1    -0.0783    0.0394     -0.1556   -0.0010      3.95
Scale         0     1.0000    0.0000      1.0000    1.0000
. Estimates from fitting the model, ignoring the correlation structure, i.e.,
from fitting a classical GLM to the data, using proc GENMOD.
. The reported log-likelihood also corresponds to this model, and therefore
should not be interpreted.
. The reported estimates are used as starting values in the iterative estimation
procedure for fitting the GEEs.
Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates

Parameter     Estimate   Standard   95% Confidence        Z       Pr > |Z|
                         Error      Limits
Intercept     -0.5840    0.1734     -0.9238   -0.2441     -3.37   0.0008
treatn         0.0120    0.2613     -0.5001    0.5241      0.05   0.9633
time          -0.1770    0.0311     -0.2380   -0.1161     -5.69   <.0001
treatn*time   -0.0886    0.0571     -0.2006    0.0233     -1.55   0.1208

Model-Based Standard Error Estimates

Parameter     Estimate   Standard   95% Confidence        Z       Pr > |Z|
                         Error      Limits
Intercept     -0.5840    0.1344     -0.8475   -0.3204     -4.34   <.0001
treatn         0.0120    0.1866     -0.3537    0.3777      0.06   0.9486
time          -0.1770    0.0209     -0.2180   -0.1361     -8.47   <.0001
treatn*time   -0.0886    0.0362     -0.1596   -0.0177     -2.45   0.0143

Exchangeable working correlation: α̂ = 0.420259237
21.5.3
type=exch logor=exch
Note that α is now a genuine parameter
Selected output:

Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates

Parameter     Estimate   Standard   95% Confidence        Z       Pr > |Z|
                         Error      Limits
Intercept     -0.5244    0.1686     -0.8548   -0.1940     -3.11   0.0019
treatn         0.0168    0.2432     -0.4599    0.4935      0.07   0.9448
time          -0.1781    0.0296     -0.2361   -0.1200     -6.01   <.0001
treatn*time   -0.0837    0.0520     -0.1856    0.0182     -1.61   0.1076
Alpha1         3.2218    0.2908      2.6519    3.7917     11.08   <.0001

Model-Based Standard Error Estimates

Parameter     Estimate   Standard   95% Confidence        Z       Pr > |Z|
                         Error      Limits
Intercept     -0.5244    0.1567     -0.8315   -0.2173     -3.35   0.0008
treatn         0.0168    0.2220     -0.4182    0.4519      0.08   0.9395
time          -0.1781    0.0233     -0.2238   -0.1323     -7.63   <.0001
treatn*time   -0.0837    0.0392     -0.1606   -0.0068     -2.13   0.0329
21.5.4
GLIMMIX macro:
%glimmix(data=test, procopt=%str(method=ml empirical),
stmts=%str(
class idnum timeclss;
model onyresp = treatn time treatn*time / solution;
repeated timeclss / subject=idnum type=cs rcorr;
),
error=binomial,
link=logit);
GLIMMIX procedure:
proc glimmix data=test method=RSPL empirical;
class idnum;
model onyresp (event=1) = treatn time treatn*time
/ dist=binary solution;
random _residual_ / subject=idnum type=cs;
run;
21.5.5

Par.     IND                     EXCH                    UN
GEE1
Int.     -0.557 (0.109;0.171)   -0.584 (0.134;0.173)    -0.720 (0.166;0.173)
Ti        0.024 (0.157;0.251)    0.012 (0.187;0.261)     0.072 (0.235;0.246)
tij      -0.177 (0.025;0.030)   -0.177 (0.021;0.031)    -0.141 (0.028;0.029)
Ti tij   -0.078 (0.039;0.055)   -0.089 (0.036;0.057)    -0.114 (0.047;0.052)
ALR
Int.     -0.524 (0.157;0.169)
Ti        0.017 (0.222;0.243)
tij      -0.178 (0.023;0.030)
Ti tij   -0.084 (0.039;0.052)
Ass.      3.222 (     ;0.291)
Linearization-based method
Int.     -0.557 (0.112;0.171)   -0.585 (0.142;0.174)    -0.630 (0.171;0.172)
Ti        0.024 (0.160;0.251)    0.011 (0.196;0.262)     0.036 (0.242;0.242)
tij      -0.177 (0.025;0.030)   -0.177 (0.022;0.031)    -0.204 (0.038;0.034)
Ti tij   -0.078 (0.040;0.055)   -0.089 (0.038;0.057)    -0.106 (0.058;0.058)
21.5.6
Discussion
GEE1: All empirical standard errors are correct, but the efficiency is higher for the
more complex working correlation structure, as seen in p-values for Ti tij effect:
Structure   p-value
IND         0.1515
EXCH        0.1208
UN          0.0275
Thus, opting for reasonably adequate correlation assumptions still pays off, in
spite of the fact that all are consistent and asymptotically normal
Similar conclusions for linearization-based method
Part III
Generalized Linear Mixed Models for Non-Gaussian
Longitudinal Data
Chapter 22
The Beta-binomial Model
22.1
Zi = Σ_{j=1}^{ni} Yij

The density of Zi, given bi, is binomial with ni trials and success probability bi.
The beta-binomial model assumes the bi to come from a beta distribution with
parameters and :
f(bi | α, β) = bi^{α−1} (1 − bi)^{β−1} / B(α, β)
B(., .): the beta function
α and β can depend on covariates, but this dependence is temporarily dropped from the notation.
22.2
fi(zi | α, β) = (ni choose zi) B(zi + α, ni − zi + β) / B(α, β)
Mean:          μi = E(Zi) = ni π,   with π = α / (α + β)
Variance:      Var(Zi) = ni π (1 − π) [ 1 + (ni − 1) ρ ]
Correlation:   ρ = Corr(Yij, Yik) = 1 / (α + β + 1)
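These moment formulas can be checked against the beta-binomial density directly; a Python sketch (illustration only, with hypothetical parameter values), using log-gamma for the beta function:

```python
import math

def log_beta(a, b):
    # ln B(a, b) via log-gamma, to avoid overflow for larger arguments
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binomial_pmf(z, n, a, b):
    # f(z) = C(n, z) B(z + a, n - z + b) / B(a, b)
    return math.comb(n, z) * math.exp(log_beta(z + a, n - z + b) - log_beta(a, b))

n, a, b = 10, 2.0, 3.0               # hypothetical litter size and beta parameters
pi = a / (a + b)                     # marginal success probability
rho = 1 / (a + b + 1)                # intra-litter correlation
mean = sum(z * beta_binomial_pmf(z, n, a, b) for z in range(n + 1))
var = sum((z - mean) ** 2 * beta_binomial_pmf(z, n, a, b) for z in range(n + 1))
assert abs(mean - n * pi) < 1e-9                                # E(Z) = n * pi
assert abs(var - n * pi * (1 - pi) * (1 + (n - 1) * rho)) < 1e-9
```

The factor 1 + (ni − 1)ρ is the overdispersion relative to an ordinary binomial, which is why the beta-binomial model suits clustered teratology data such as the NTP studies.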
The density fi(zi | α, β) can equivalently be written in terms of the marginal parameters πi and ρi.
When there are covariates (e.g., sub-populations, dose groups), rewrite α and/or β as αi and/or βi, respectively.
It is then easy to formulate a model through the marginal parameters i and i:
. i can be modeled through, e.g., a logit link
Chapter 23
Generalized Linear Mixed Models (GLMM)
23.1
bi ∼ N(0, D)
23.2
Given a vector bi of random effects for cluster i, it is assumed that all responses Yij are independent, with density fij(yij | bi, β, φ).

The conditional density of Yi given bi then equals

fi(yi | bi, β, φ) = Π_{j=1}^{ni} fij(yij | bi, β, φ)

and the marginal likelihood is

L(β, D, φ) = Π_{i=1}^{N} fi(yi | β, D, φ) = Π_{i=1}^{N} ∫ Π_{j=1}^{ni} fij(yij | bi, β, φ) f(bi | D) dbi
Under the normal linear model, the integral can be worked out analytically.
In general, approximations are required:
. Approximation of integrand
. Approximation of data
. Approximation of integral
Predictions of random effects can be based on the posterior distribution
f (bi|Yi = yi)
Empirical Bayes (EB) estimate:
Posterior mode, with unknown parameters replaced by their MLE
23.3
∫ e^{Q(b)} db ≈ (2π)^{q/2} | −Q″(b̂) |^{−1/2} e^{Q(b̂)},   with b̂ the mode of Q(b)
23.4
Approximation of Data
23.4.1
General Idea
. Penalized quasi-likelihood (PQL): expansion around the current β̂ and b̂i
. Marginal quasi-likelihood (MQL): expansion around the current β̂ and bi = 0
23.4.2
Linear Taylor expansion around the current β̂ and b̂i:

Yij ≈ h(x′ij β̂ + z′ij b̂i) + h′(x′ij β̂ + z′ij b̂i) x′ij (β − β̂) + h′(x′ij β̂ + z′ij b̂i) z′ij (bi − b̂i) + εij

    ≡ μ̂ij + v(μ̂ij) x′ij (β − β̂) + v(μ̂ij) z′ij (bi − b̂i) + εij

In vector notation: Yi ≈ μ̂i + V̂i Xi (β − β̂) + V̂i Zi (bi − b̂i) + εi, such that

Yi* ≡ V̂i⁻¹ (Yi − μ̂i) + Xi β̂ + Zi b̂i ≈ Xi β + Zi bi + εi*

Model fitting by iterating between updating the pseudo responses Yi* and fitting the above linear mixed model to them.
23.4.3
Linear Taylor expansion around the current β̂ and bi = 0:

Yij ≈ h(x′ij β̂) + h′(x′ij β̂) x′ij (β − β̂) + h′(x′ij β̂) z′ij bi + εij

    ≡ μ̂ij + v(μ̂ij) x′ij (β − β̂) + v(μ̂ij) z′ij bi + εij

In vector notation: Yi ≈ μ̂i + V̂i Xi (β − β̂) + V̂i Zi bi + εi, such that

Yi* ≡ V̂i⁻¹ (Yi − μ̂i) + Xi β̂ ≈ Xi β + Zi bi + εi*

Model fitting by iterating between updating the pseudo responses Yi* and fitting the above linear mixed model to them.
23.4.4
23.5
Approximation of Integral
∫ f(z) φ(z) dz ≈ Σ_{q=1}^{Q} wq f(zq)

Q is the order of the approximation. The higher Q, the more accurate the approximation will be.
The nodes (or quadrature points) zq are solutions to the Qth order Hermite
polynomial
The wq are well-chosen weights
The nodes zq and weights wq are reported in tables. Alternatively, an algorithm is
available for calculating all zq and wq for any value Q.
With Gaussian quadrature, the nodes and weights are fixed, independent of
f (z)(z).
With adaptive Gaussian quadrature, the nodes and weights are adapted to
the support of f (z)(z).
420
Graphically (Q = 10):

[Figure: quadrature nodes for Q = 10, classical versus adaptive Gaussian quadrature]
Typically, adaptive Gaussian quadrature needs (many) fewer quadrature points than
classical Gaussian quadrature. On the other hand, adaptive Gaussian quadrature is much more time consuming.

Adaptive Gaussian quadrature of order one is equivalent to the Laplace approximation.

Ample detail can be found in Molenberghs and Verbeke (2005, Sections 14.3-14.5).
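A small numerical sketch of (non-adaptive) Gaussian quadrature, using NumPy's Gauss-Hermite nodes and weights; the integrand below is an illustrative assumption:

```python
import numpy as np

# Gauss-Hermite nodes/weights approximate integrals of the form
# int f(x) exp(-x^2) dx.  For E[f(Z)] with Z ~ N(0,1), substitute
# z = sqrt(2)*x and divide by sqrt(pi).
Q = 10
x, w = np.polynomial.hermite.hermgauss(Q)
z = np.sqrt(2.0) * x

approx = np.sum(w * z**2) / np.sqrt(np.pi)   # approximates E[Z^2] = 1
```

Since a rule of order Q integrates polynomials up to degree 2Q - 1 exactly, Q = 10 recovers E[Z^2] = 1 up to rounding error.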
23.6

Example: a random-intercept logistic model for the toenail data,

log[πij / (1 − πij)] = β0 + bi + β1 Ti + β2 tij + β3 Ti tij

Notation:
Results:

Gaussian quadrature (non-adaptive):

        Q = 3          Q = 5          Q = 10         Q = 20         Q = 50
β0     -1.52 (0.31)   -2.49 (0.39)   -0.99 (0.32)   -1.54 (0.69)   -1.65 (0.43)
β1     -0.39 (0.38)    0.19 (0.36)
β2     -0.32 (0.03)   -0.38 (0.04)   -0.38 (0.05)   -0.40 (0.05)   -0.40 (0.05)
β3     -0.09 (0.05)   -0.12 (0.07)   -0.15 (0.07)   -0.14 (0.07)   -0.16 (0.07)
τ       2.26 (0.12)    3.09 (0.21)    4.53 (0.39)    3.86 (0.33)    4.04 (0.39)
-2ℓ    1344.1         1259.6         1254.4         1249.6         1247.7

Adaptive Gaussian quadrature:

        Q = 3          Q = 5          Q = 10         Q = 20         Q = 50
β0     -2.05 (0.59)   -1.47 (0.40)   -1.65 (0.45)   -1.63 (0.43)   -1.63 (0.44)
β1     -0.16 (0.64)   -0.09 (0.54)   -0.12 (0.59)   -0.11 (0.59)   -0.11 (0.59)
β2     -0.42 (0.05)   -0.40 (0.04)   -0.41 (0.05)   -0.40 (0.05)   -0.40 (0.05)
β3     -0.17 (0.07)   -0.16 (0.07)   -0.16 (0.07)   -0.16 (0.07)   -0.16 (0.07)
τ       4.51 (0.62)    3.70 (0.34)    4.07 (0.43)    4.01 (0.38)    4.02 (0.38)
-2ℓ    1259.1         1257.1         1248.2         1247.8         1247.8
Conclusions:

Comparison of the intercepts and slopes in groups A and B, and of the random-intercept
variance, under QUAD, PQL, and MQL. Variance of the random intercepts, estimate (s.e.):

                        QUAD           PQL           MQL
Var. random intercepts  15.99 (3.02)   4.71 (0.60)   2.49 (0.29)
Chapter 24

Fitting GLMMs in SAS

24.1

Selected output:

Covariance Parameter Estimates

Cov Parm    Subject   Estimate   Standard Error
Intercept   idnum     4.7095     0.6024

Solutions for Fixed Effects

Effect         Estimate    Standard Error   DF      t Value   Pr > |t|
Intercept      -0.7204     0.2370            292    -3.04     0.0026
treatn         -0.02594    0.3360           1612    -0.08     0.9385
time           -0.2782     0.03222          1612    -8.64     <.0001
treatn*time    -0.09583    0.05105          1612    -1.88     0.0607
24.2

Selected output (parameter labels follow the model of Section 23.6):

Parameter   Estimate   Standard Error   DF    t Value   Pr > |t|   Alpha   Lower     Upper      Gradient
β0          -1.5311    0.2961           293   -5.17     <.0001     0.05    -2.1139   -0.9483    2.879E-7
β1          -0.4294    0.3728           293   -1.15     0.2503     0.05    -1.1631    0.3043    -2.11E-6
β2          -0.3107    0.03373          293   -9.21     <.0001     0.05    -0.3771   -0.2443    -0.00003
β3          -0.07539   0.04998          293   -1.51     0.1325     0.05    -0.1738    0.02298   -0.00003
τ            2.2681    0.1220           293   18.58     <.0001     0.05     2.0279    2.5083    -3.6E-6
24.2.1
24.2.2

NLMIXED statement:

MODEL statement:
. valid distributions:
  normal(m,v): normal with mean m and variance v
  binary(p): Bernoulli with probability p
  binomial(n,p): binomial with count n and probability p
  poisson(m): Poisson with mean m
  general(ll): general model with log-likelihood ll
. since no factors can be defined, explicit creation of dummies is required

RANDOM statement:
Part IV
Marginal Versus Random-effects Models and Case Studies
Chapter 25
Marginal Versus Random-effects Models
25.1

We compare our GLMM results for the toenail data with those from fitting GEEs
(unstructured working correlation): estimates (s.e.) per parameter under both approaches.
Consider the random-intercept logistic model

Yij|bi ~ Bernoulli(πij),   log[πij / (1 − πij)] = β0 + bi + β1 tij

. The conditional means E(Yij|bi), as functions of tij, are given by

E(Yij|bi) = exp(β0 + bi + β1 tij) / [1 + exp(β0 + bi + β1 tij)]

. The marginal average evolution is now obtained from averaging over the random effects:

E(Yij) = E[ exp(β0 + bi + β1 tij) / (1 + exp(β0 + bi + β1 tij)) ]
       ≠ exp(β0 + β1 tij) / [1 + exp(β0 + β1 tij)]

The random-effects and marginal parameters approximately satisfy

βRE / βM = √(c² τ² + 1) > 1,   c = 16√3 / (15π)

For the toenail application, τ was estimated as 4.0164, such that the ratio equals
√(c² τ² + 1) = 2.5649.
The ratios between the GLMM and GEE estimates can be compared, parameter by parameter,
with this approximate value.

Note that this problem does not occur in linear mixed models:
. Conditional mean: E(Yi|bi) = Xi β + Zi bi
. Marginal mean: E(Yi) = Xi β
. Specifically: E(Yi|bi = 0) = E(Yi) = Xi β
In general, E[g(Y)] ≠ g[E(Y)].

So, whenever the random effects enter the conditional mean in a non-linear way,
the regression parameters in the marginal model need to be interpreted differently
from the regression parameters in the mixed model.

In practice, the marginal mean can be derived from the GLMM output by
integrating out the random effects. This can be done numerically via Gaussian
quadrature, or based on sampling methods.
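A numerical sketch of this marginalization for a random-intercept logistic model, using Gauss-Hermite quadrature together with the approximate attenuation factor √(c²τ² + 1); the value of the linear predictor is an illustrative assumption, while τ is the toenail estimate quoted above:

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

tau = 4.0164    # estimated random-intercept s.d. (toenail data)
eta = 1.0       # illustrative value of the fixed-effects linear predictor

# marginal mean E[expit(eta + b)], b ~ N(0, tau^2), via Gauss-Hermite quadrature
x, w = np.polynomial.hermite.hermgauss(50)
p_marginal = np.sum(w * expit(eta + np.sqrt(2.0) * tau * x)) / np.sqrt(np.pi)

# approximate attenuation: marginal coefficients = RE coefficients / sqrt(c^2 tau^2 + 1)
c = 16.0 * np.sqrt(3.0) / (15.0 * np.pi)
ratio = np.sqrt(c**2 * tau**2 + 1.0)    # about 2.5649 for the toenail fit
p_approx = expit(eta / ratio)
```

The quadrature value and the closed-form approximation agree closely, and both are shrunk toward 0.5 relative to the naive expit(eta).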
25.2

The fitted GLMM yields, for the two treatment groups,

P(Yij = 1|bi) = exp(−1.6308 + bi − 0.4043 tij) / [1 + exp(−1.6308 + bi − 0.4043 tij)],
P(Yij = 1|bi) = exp(−1.7454 + bi − 0.5657 tij) / [1 + exp(−1.7454 + bi − 0.5657 tij)]

The corresponding marginal average evolutions are approximately

P(Yij = 1) = exp(−0.7219 − 0.1409 tij) / [1 + exp(−0.7219 − 0.1409 tij)],
P(Yij = 1) = exp(−0.6493 − 0.2548 tij) / [1 + exp(−0.6493 − 0.2548 tij)]

In a GLMM context, rather than plotting the marginal averages, one can also plot
the profile for an average subject, i.e., a subject with random effect bi = 0:

P(Yij = 1|bi = 0) = exp(−1.6308 − 0.4043 tij) / [1 + exp(−1.6308 − 0.4043 tij)],
P(Yij = 1|bi = 0) = exp(−1.7454 − 0.5657 tij) / [1 + exp(−1.7454 − 0.5657 tij)]
25.3

Overview of the estimates under QUAD, PQL, MQL, and GEE for the intercepts and slopes
in groups A and B. Variance of the random intercepts, estimate (s.e.):

QUAD: 15.99 (3.02)    PQL: 4.71 (0.60)    MQL: 2.49 (0.29)

Conclusion:

|GEE| < |MQL| < |PQL| < |QUAD|
Model Family

marginal model
  inference: likelihood or GEE  →  βM

random-effects model
  inference: marginal or hierarchical  →  βRE or (βRE, bi)
Chapter 26
Case Study: The NTP Data
. Research question
. Conditional model
. Bahadur model
. GEE1 analyses
. GEE2 analysis
. Alternating logistic regressions
. Beta-binomial model
. Generalized linear mixed model
. Discussion
26.1
Research Question
26.2

Conditional Model

Regression relationship:

logit[P(Yij = 1|di, Yik = 0, k ≠ j)] = β0 + βd di

The association parameter is the conditional log odds ratio (quadratic loglinear model).

Maximum likelihood estimates (model-based standard errors; empirically corrected
standard errors):

[Table: estimates per outcome (external, visceral, skeletal, collapsed) and compound (DEHP, EG, DYME)]
26.3

Bahadur Model

Regression relationship:

logit[P(Yij = 1|di)] = β0 + βd di

ρa: Fisher's z transformed correlation
ρ: correlation

Estimates (standard errors):

Outcome    Par.   DEHP          EG            DYME
External   βd     5.15 (0.56)   2.63 (0.76)   7.94 (0.77)
           ρa     0.11 (0.03)   0.12 (0.03)   0.11 (0.04)
           ρ      0.05 (0.01)   0.06 (0.01)   0.05 (0.02)
Visceral   βd     4.38 (0.49)   4.25 (1.39)   5.49 (0.87)
           ρa     0.11 (0.02)   0.05 (0.08)   0.08 (0.04)
           ρ      0.05 (0.01)   0.02 (0.04)   0.04 (0.02)
Skeletal   βd     4.68 (0.56)   2.96 (0.18)   5.79 (0.80)
           ρa     0.13 (0.03)   0.27 (0.02)   0.22 (0.05)
           ρ      0.06 (0.01)   0.13 (0.01)   0.11 (0.02)
Collapsed  βd     5.38 (0.47)   3.05 (0.17)   8.18 (0.69)
           ρa     0.12 (0.03)   0.28 (0.02)   0.12 (0.03)
           ρ      0.06 (0.01)   0.14 (0.01)   0.06 (0.01)
26.4

GEE1

Regression relationship:

logit[P(Yij = 1|di)] = β0 + βd di

φ: overdispersion parameter
ρ: working correlation

Parameter estimates (model-based standard errors; empirically corrected standard
errors), for two sets of working assumptions.

[Table: estimated overdispersion parameters φ per outcome under standard, Prentice, and linearized GEE]

[Table: estimates of βd, φ, and ρ per outcome (external, visceral, skeletal, collapsed) under standard, Prentice, and linearized GEE]
26.5

GEE2

Regression relationship:

logit[P(Yij = 1|di)] = β0 + βd di

ρa: Fisher's z transformed correlation
ρ: correlation

Working assumption: third- and fourth-order correlations are zero.

Parameter estimates (empirically corrected standard errors):

Outcome    Par.   DEHP          EG            DYME
External   βd     5.29 (0.55)   3.10 (0.81)   8.15 (0.83)
           ρa     0.15 (0.05)   0.15 (0.05)   0.13 (0.05)
           ρ      0.07 (0.02)   0.07 (0.02)   0.06 (0.02)
Visceral   βd     4.52 (0.59)   4.37 (1.14)   5.51 (0.89)
           ρa     0.15 (0.06)   0.02 (0.02)   0.11 (0.07)
           ρ      0.07 (0.03)   0.01 (0.01)   0.05 (0.03)
Skeletal   β0    -5.23 (0.40)  -4.05 (0.33)
           βd     5.35 (0.60)   4.77 (0.43)
           ρa     0.18 (0.02)   0.30 (0.03)
           ρ      0.09 (0.01)   0.15 (0.01)
Collapsed  βd     5.35 (0.60)   4.89 (0.90)   8.82 (0.91)
           ρa     0.18 (0.02)   0.26 (0.14)   0.18 (0.12)
           ρ      0.09 (0.01)   0.13 (0.07)   0.09 (0.06)
26.6

Alternating Logistic Regressions

Regression relationship:

logit[P(Yij = 1|di)] = β0 + βd di

Exchangeable association structure
α: log odds ratio
OR = exp(α): odds ratio

Parameter estimates (empirically corrected standard errors):

Outcome    Par.   DEHP          EG            DYME
External   βd     5.64 (0.52)   3.28 (0.72)   8.25 (0.87)
           α      0.96 (0.30)   1.45 (0.45)   0.79 (0.31)
           OR     2.61 (0.78)   4.26 (1.92)   2.20 (0.68)
Visceral   βd     4.72 (0.57)   4.50 (1.13)   6.05 (1.04)
           α      1.12 (0.30)   0.49 (0.42)   1.76 (0.59)
           OR     3.06 (0.92)   1.63 (0.69)   5.81 (3.43)
Skeletal   βd     4.90 (0.70)   3.85 (0.39)   6.73 (0.65)
           α      1.05 (0.40)   1.43 (0.22)   1.62 (0.37)
           OR     2.86 (1.14)   4.18 (0.92)   5.05 (1.87)
Collapsed  βd     5.93 (0.63)   3.86 (0.40)   7.98 (0.75)
           α      1.17 (0.29)   1.40 (0.22)   1.26 (0.31)
           OR     3.22 (0.93)   4.06 (0.89)   3.53 (1.09)
26.7

Beta-binomial Model

Regression relationship:

logit[P(Yij = 1|di)] = β0 + βd di

ρa: Fisher's z transformed correlation
ρ: correlation

Parameter estimates (standard errors):

Outcome    Par.   DEHP          EG            DYME
External   βd     5.20 (0.59)   2.78 (0.81)   8.01 (0.82)
           ρa     0.21 (0.09)   0.28 (0.14)   0.21 (0.12)
           ρ      0.10 (0.04)   0.14 (0.07)   0.10 (0.06)
Visceral   βd     4.42 (0.54)   4.33 (1.26)   4.94 (0.90)
           ρa     0.22 (0.09)   0.04 (0.09)   0.45 (0.21)
           ρ      0.11 (0.04)   0.02 (0.04)   0.22 (0.10)
Skeletal   βd     4.92 (0.63)   3.42 (0.40)   6.99 (0.71)
           ρa     0.27 (0.11)   0.54 (0.09)   0.61 (0.14)
           ρ      0.13 (0.05)   0.26 (0.04)   0.30 (0.06)
Collapsed  βd     5.59 (0.56)   3.05 (0.17)   8.29 (0.79)
           ρa     0.32 (0.10)   0.28 (0.02)   0.33 (0.10)
           ρ      0.16 (0.05)   0.14 (0.01)   0.16 (0.05)
26.8

Regression relationship:

logit[P(Yij = 1|di, bi)] = β0 + bi + βd di,   bi ~ N(0, τ²)

Estimates (s.e.):

Effect           Par.   Laplace        QUAD
Dose effect      βd     6.50 (0.86)    6.45 (0.84)
Intercept var.   τ²     1.42 (0.70)    1.27 (0.62)

Dose effect      βd     5.73 (0.65)    5.71 (0.64)
Intercept var.   τ²     0.95 (0.40)    0.89 (0.38)

Dose effect      βd     5.70 (0.66)    5.67 (0.65)
Intercept var.   τ²     1.20 (0.53)    1.10 (0.50)
26.9

Summary Table

Family          Model           β0             βd            Association
Conditional                    -2.81 (0.58)    3.07 (0.65)   log OR  0.18 (0.04)
                               -2.85 (0.53)    3.24 (0.60)   log OR  0.18 (0.04)
Marginal        Lik. Bahadur   -4.93 (0.39)    5.15 (0.56)   ρ       0.05 (0.01)
                GEE1           -4.98 (0.37)    5.33 (0.55)   ρ       0.11
                GEE1           -5.06 (0.38)    5.31 (0.57)
                GEE1           -4.99 (0.37)    5.32 (0.55)   ρ       0.11 (0.04)
                GEE1           -5.06 (0.38)    5.31 (0.57)
                GEE1           -5.00 (0.37)    5.32 (0.55)   ρ       0.06
                GEE1           -5.06 (0.38)    5.31 (0.57)
                GEE2           -4.98 (0.37)    5.29 (0.55)   ρ       0.07 (0.02)
                ALR            -5.16 (0.35)    5.64 (0.52)   log OR  0.96 (0.30)
                Beta-binomial  -4.91 (0.42)    5.20 (0.59)   ρ       0.10 (0.04)
Random-effects  GLMM (MQL)     -5.18 (0.40)    5.70 (0.66)   int. var. τ²  1.20 (0.53)
                GLMM (PQL)     -5.32 (0.40)    5.73 (0.65)   int. var. τ²  0.95 (0.40)
                GLMM (QUAD)    -5.97 (0.57)    6.45 (0.84)   int. var. τ²  1.27 (0.62)
26.10

Discussion

P(Yij = 1|di, bi) = exp(β0 + bi + β1 di) / [1 + exp(β0 + bi + β1 di)]
Chapter 27
Case Study: Binary Analysis of Analgesic Trial
. Research question
. GEE
. Alternating logistic regressions
. Further GEE analyses
. Generalized linear mixed model
. Discussion
27.1
Research Question
GSABIN =
27.2
GEE1
. Exchangeable
. AR(1)
. Unstructured
Effect       Par.   IND                   EXCH
Intercept    β1     2.80 (0.49; 0.47)     2.92 (0.49; 0.46)
Time         β2    -0.79 (0.39; 0.34)    -0.83 (0.34; 0.33)
Time²        β3     0.18 (0.08; 0.07)     0.18 (0.07; 0.07)
Basel. PCA   β4    -0.21 (0.09; 0.10)    -0.23 (0.10; 0.10)
Correlation  ρ                            0.22

Effect       Par.   AR                    UN
Intercept    β1     2.94 (0.49; 0.47)     2.87 (0.48; 0.46)
Time         β2    -0.90 (0.35; 0.33)    -0.78 (0.33; 0.32)
Time²        β3     0.20 (0.07; 0.07)     0.17 (0.07; 0.07)
Basel. PCA   β4    -0.22 (0.10; 0.10)    -0.23 (0.10; 0.10)
Correlation  ρ      0.25

Correlation (1,2)  ρ12   0.18
Correlation (1,3)  ρ13   0.25
Correlation (1,4)  ρ14   0.20
Correlation (2,3)  ρ23   0.18
Correlation (2,4)  ρ24   0.18
Correlation (3,4)  ρ34   0.46

Fitted working correlation matrices:

R(EXCH) =
  1     0.22  0.22  0.22
  0.22  1     0.22  0.22
  0.22  0.22  1     0.22
  0.22  0.22  0.22  1

R(AR) =
  1     0.25  0.06  0.02
  0.25  1     0.25  0.06
  0.06  0.25  1     0.25
  0.02  0.06  0.25  1

R(UN) =
  1     0.18  0.25  0.20
  0.18  1     0.18  0.18
  0.25  0.18  1     0.46
  0.20  0.18  0.46  1
27.3

Effect       Par.   EXCH           FULLCLUST      ZREP
Intercept    β1     2.98 (0.46)    2.92 (0.46)    2.92 (0.46)
Time         β2    -0.87 (0.32)   -0.80 (0.32)   -0.80 (0.32)
Time²        β3     0.18 (0.07)    0.17 (0.06)    0.17 (0.07)
Basel. PCA   β4    -0.23 (0.22)   -0.24 (0.10)   -0.24 (0.10)
Log OR       α      1.43 (0.22)
Log OR(1,2)  α12                   1.13 (0.33)
Log OR(1,3)  α13                   1.56 (0.39)
Log OR(1,4)  α14                   1.60 (0.42)
Log OR(2,3)  α23                   1.19 (0.37)
Log OR(2,4)  α24                   0.93 (0.42)
Log OR(3,4)  α34                   2.44 (0.48)
Log OR par.  α0                                   1.26 (0.23)
Log OR par.  α1                                   1.17 (0.47)

In the FULLCLUST structure, there is a hint that α34 is different from the others,
with all others being equal.

To confirm this, a Wald test can be used for the null hypothesis

H0: α12 = α13 = α14 = α23 = α24

Details on the test: Molenberghs and Verbeke (2005, pp. 312-313).

The reduced structure, fitted with ZREP, is

α12 = α13 = α14 = α23 = α24 = α0,    α34 = α0 + α1

At the odds ratio level, with fitted values (OR = exp(α)):

EXCH:
  .     4.18  4.18  4.18
  4.18  .     4.18  4.18
  4.18  4.18  .     4.18
  4.18  4.18  4.18  .

UN (FULLCLUST):
  .     3.10  4.76  4.95
  3.10  .     3.29  2.53
  4.76  3.29  .     11.47
  4.95  2.53  11.47 .

ZREP:
  .     3.53  3.53  3.53
  3.53  .     3.53  3.53
  3.53  3.53  .     11.36
  3.53  3.53  11.36 .
481
27.4

Methods used: ordinary logistic regression, standard GEE, Prentice's GEE,
linearization-based GEE, and ALR.

[Table: estimates (s.e.) under ordinary logistic regression, standard GEE, and Prentice's GEE]

Effect       Par.   Lineariz.      ALR
Intercept    β1     2.94 (0.46)    2.98 (0.46)
Time         β2    -0.84 (0.33)   -0.87 (0.32)
Time²        β3     0.18 (0.07)    0.18 (0.07)
Basel. PCA   β4    -0.23 (0.10)   -0.23 (0.10)
Corr.        ρ      0.26 (0.05)
Log OR       α                     1.43 (0.22)
27.5

Four tools:
. SAS procedure GLIMMIX:
  MQL (= MQL1)
  PQL (= PQL1)
. SAS procedure NLMIXED:
  I: non-adaptive (Q = 10)
  II: non-adaptive (Q = 10)
  adaptive (Q = 10)
  adaptive (Q = 20)
. MLwiN:
  PQL1
  PQL2
. MIXOR

Integrand approximation:

                     SAS GLIMMIX                 MLwiN
Effect       Par.    MQL           PQL1          PQL1          PQL2
Intercept    β1      2.91 (0.53)   3.03 (0.55)   3.02 (0.55)   4.07 (0.70)

[remaining rows: Time, Time², Basel. PCA, and random-intercept parameters]

Numerical integration:

                     SAS NLMIXED                 MIXOR
Effect       Par.    I             II
Intercept    β1      4.07 (0.71)   4.05 (0.71)   4.05 (0.55)

[remaining rows: Time, Time², Basel. PCA, and random-intercept parameters]
27.6
Discussion
486
Chapter 28
Case Study: Ordinal Analysis of Analgesic Trial
487
28.1

Models for an ordinal outcome with c categories involve cutpoint-specific intercepts
αk (k = 1, . . . , c − 1).
It is of use only when there is one natural directionality in the data: subjects go
from the lowest category to higher categories, without ever returning. This is
often not satisfied.
Proportional-odds model for the 5-point GSA outcome in the analgesic trial:
logit[P (Yij k|tij , Xi)] = k + 2tij + 3t2ij + 4Xi,
(k = 1, . . . , 4)
SAS code:
proc genmod data=m.gsa2;
title Analgesic, logistic regression, Ordinal;
class patid timecls;
model gsa = time|time pca0 / dist=multinomial link=cumlogit;
run;
Note that the dist and link options have been adapted
489
Selected output:

The GENMOD Procedure

Analysis Of Parameter Estimates

Parameter    DF   Estimate   Standard Error   Wald 95% Confidence Limits   ChiSquare   Pr > ChiSq
Intercept1   1    -1.0048    0.3437           -1.6785    -0.3312             8.55      0.0035
Intercept2   1     0.5225    0.3407           -0.1452     1.1903             2.35      0.1251
Intercept3   1     2.3171    0.3481            1.6349     2.9994            44.31      <.0001
Intercept4   1     4.0525    0.3754            3.3166     4.7884           116.51      <.0001
TIME         1    -0.2027    0.2706           -0.7330     0.3277             0.56      0.4539
TIME*TIME    1     0.0479    0.0545           -0.0590     0.1547             0.77      0.3798
PCA0         1    -0.2141    0.0622           -0.3361    -0.0922            11.84      0.0006
28.2

The same proportional-odds model (k = 1, . . . , 4), now fitted with GEE.

Selected output (empirically corrected standard errors):

Parameter    Estimate   Standard Error   95% Confidence Limits    Z       Pr > |Z|
Intercept1   -1.0048    0.3549           -1.7004    -0.3092      -2.83    0.0046
Intercept2    0.5225    0.3568           -0.1767     1.2218       1.46    0.1430
Intercept3    2.3171    0.3669            1.5980     3.0363       6.31    <.0001
Intercept4    4.0525    0.3938            3.2807     4.8243      10.29    <.0001
TIME         -0.2027    0.2028           -0.6001     0.1948      -1.00    0.3176
TIME*TIME     0.0479    0.0399           -0.0304     0.1261       1.20    0.2304
PCA0         -0.2141    0.0911           -0.3927    -0.0356      -2.35    0.0187
28.3

The random-effects counterpart (k = 1, . . . , c − 1) is the obvious analogue of the PO
logistic and GEE marginal models considered above.

For the case of the 5-point GSA outcome in the analgesic study:

logit[P(Yij ≤ k|tij, Xi, bi)] = αk + bi + β2 tij + β3 t²ij + β4 Xi,   (k = 1, . . . , 4)

Also here, the dist and link functions have to be adapted to the ordinal setting.
Selected output:

Covariance Parameter Estimates

Cov Parm   Subject   Estimate   Standard Error
UN(1,1)    PATID     3.5348     0.4240

Solutions for Fixed Effects

Effect      GSA   Estimate   Standard Error   DF    t Value   Pr > |t|
Intercept   1     -1.4352    0.5033           393   -2.85     0.0046
Intercept   2      0.9101    0.4999           393    1.82     0.0694
Intercept   3      3.4720    0.5084           393    6.83     <.0001
Intercept   4      5.6263    0.5358           393   10.50     <.0001
TIME              -0.4825    0.2958           737   -1.63     0.1033
TIME*TIME          0.1009    0.05972          737    1.69     0.0916
PCA0              -0.2843    0.1249           737   -2.28     0.0231
In case the procedure NLMIXED is used, more drastic changes are needed:
proc nlmixed data=m.gsa2 qpoints=20;
title 'Analgesic, PROC NLMIXED, ordinal, adaptive, q=20';
parms int1=-1.5585 int2=1.0292 int3=3.8916 int4=6.2144
beta1=0.5410 beta2=-0.1123 beta3=0.3173 d=2.1082;
eta = beta1*time + beta2*time*time + beta3*pca0 + b1;
if gsa=1 then z = 1/(1+exp(-(int1-eta)));
else if gsa=2 then z = 1/(1+exp(-(int2-eta))) - 1/(1+exp(-(int1-eta)));
else if gsa=3 then z = 1/(1+exp(-(int3-eta))) - 1/(1+exp(-(int2-eta)));
else if gsa=4 then z = 1/(1+exp(-(int4-eta))) - 1/(1+exp(-(int3-eta)));
else z = 1 - 1/(1+exp(-(int4-eta)));
if z > 1e-8 then ll = log(z);
else ll = -1e100;
model gsa ~ general(ll);
random b1 ~ normal(0,d*d) subject=patid;
estimate var(d) d*d;
run;
496
For the five categories (k = 1, . . . , 5):
. P(Yij ≤ 0) = 0 and P(Yij ≤ 5) = 1
. η is the part of the linear predictor excluding the intercepts
Selected output:

Parameter Estimates

Parameter   Estimate   Standard Error   DF    t Value   Pr > |t|   Alpha   Lower      Upper      Gradient
int1        -1.5585    0.5481           394   -2.84     0.0047     0.05    -2.6360    -0.4810    0.000235
int2         1.0292    0.5442           394    1.89     0.0593     0.05    -0.04063    2.0991    -0.00004
int3         3.8916    0.5624           394    6.92     <.0001     0.05     2.7860     4.9973    -0.00017
int4         6.2144    0.5990           394   10.37     <.0001     0.05     5.0368     7.3920    -0.00004
beta1        0.5410    0.3078           394    1.76     0.0796     0.05    -0.06421    1.1462    -0.00008
beta2       -0.1123    0.06187          394   -1.82     0.0702     0.05    -0.2340     0.009311  0.000019
beta3        0.3173    0.1386           394    2.29     0.0226     0.05     0.04475    0.5898    0.000013
d            2.1082    0.1412           394   14.94     <.0001     0.05     1.8307     2.3858    0.000331

Additional Estimates

Label    Estimate   Standard Error   DF    t Value   Pr > |t|   Alpha   Lower    Upper
var(d)   4.4447     0.5952           394   7.47      <.0001     0.05    3.2746   5.6148
28.4

Three approaches:
. Logistic regression
. GEE
. GLMM

For GEE: (model-based standard errors; empirically corrected standard errors)
MQL again performs rather poorly.

Marginal models:

Effect        Par.   OLR            GEE
Intercept 1   α1    -1.00 (0.34)   -1.00 (0.34; 0.35)
Intercept 2   α2     0.52 (0.34)    0.52 (0.34; 0.36)
Intercept 3   α3     2.32 (0.35)    2.32 (0.34; 0.37)
Intercept 4   α4     4.05 (0.38)    4.05 (0.37; 0.39)
Time          β2    -0.20 (0.27)   -0.20 (0.27; 0.20)
Time²         β3     0.05 (0.05)    0.05 (0.05; 0.04)
Basel. PCA    β4    -0.21 (0.06)   -0.21 (0.06; 0.09)
Random-effects models:

Effect        Par.   MQL            PQL            Num. Int.
Intercept 1   α1    -0.93 (0.40)   -1.44 (0.50)   -1.56 (0.55)
Intercept 2   α2     0.60 (0.39)    0.51 (0.50)    1.03 (0.54)
Intercept 3   α3     2.39 (0.40)    3.47 (0.51)    3.89 (0.56)
Intercept 4   α4     4.13 (0.42)    5.63 (0.54)    6.21 (0.60)
Time          β2    -0.30 (0.28)   -0.48 (0.30)    0.54 (0.31)
Time²         β3     0.06 (0.06)    0.10 (0.06)   -0.11 (0.06)
Basel. PCA    β4    -0.21 (0.09)   -0.28 (0.12)    0.32 (0.14)
R.I. s.d.     τ      1.06 (0.08)    1.88 (0.11)    2.11 (0.14)
R.I. var.     τ²     1.13 (0.16)    3.53 (0.42)    4.44 (0.60)
Chapter 29
Count Data: The Epilepsy Study
502
29.1
We want to test for a treatment effect on the number of seizures, correcting for the
average number of seizures during the 12-week baseline phase prior to the treatment.

The response considered now is the total number of seizures a patient
experienced, i.e., the sum of all weekly measurements.

Let Yi be the total number of seizures for subject i:

Yi = Σ_{j=1}^{ni} Yij
Histogram:

[Figure: histogram of the total number of seizures]

As these sums are not taken over an equal number of visits for all subjects, the
above histogram is not entirely fair, as it does not account for the differences in ni.
Since ni is the number of weeks for which the number of seizures was recorded for
subject i, the model log E(Yi) = ln ni + x'i β implies that exp(x'i β) is the average
number of seizures per week.

ln ni is called an offset in the above model.

In our application, the covariates in xi are the treatment as well as the baseline
seizure rate.
SAS statements for the calculation of outcome, offset, and for fitting the Poisson
model:
proc sort data=test;
by id studyweek;
run;
proc means data=test sum n nmiss;
var nseizw;
by id;
output out=result
n=n
nmiss=nmiss
sum=sum;
run;
data result;
set result;
offset=log(n-nmiss);
keep id offset sum;
run;
data first;
set test;
by id;
if first.id;
keep id bserate trt;
run;
data result;
merge result first;
by id;
run;
proc genmod data=result;
model sum=bserate trt
/ dist=poisson offset=offset;
run;
507
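Outside SAS, the same offset model can be sketched with a minimal Fisher-scoring (IRLS) fit for Poisson regression; the data below are simulated, and all values and names are illustrative assumptions:

```python
import numpy as np

def poisson_irls(X, y, offset, n_iter=25):
    """Fit log E[y] = offset + X @ beta by iteratively reweighted least squares."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = offset + X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu          # working response
        W = mu                           # Poisson working weights
        XtW = X.T * W                    # broadcasts W across rows of X.T
        beta = np.linalg.solve(XtW @ X, XtW @ (z - offset))
    return beta

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
offset = np.log(rng.integers(20, 28, size=n).astype(float))   # weeks observed
X = np.column_stack([np.ones(n), x])
beta_true = np.array([0.5, 0.3])
y = rng.poisson(np.exp(offset + X @ beta_true))

beta_hat = poisson_irls(X, y, offset)
```

With the offset on the log scale, the fitted exp(x'β) is a weekly rate, exactly as in the GENMOD fit above.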
The treatment variable trt is coded as 0 for placebo and 1 for treated
Output from the GENMOD procedure:
Analysis Of Parameter Estimates

Parameter   DF   Estimate   Standard Error   Wald 95% Confidence Limits   ChiSquare
Intercept   1     0.8710    0.0218            0.8283    0.9138            1596.16
bserate     1     0.0172    0.0002            0.0167    0.0177            4826.14
trt         1    -0.4987    0.0341           -0.5655   -0.4319             214.18
Scale       0     1.0000    0.0000            1.0000    1.0000
508
A more general model would allow the treatment effect to depend on the baseline
average number of seizures:
proc genmod data=result;
model sum=bserate trt bserate*trt
/ dist=poisson offset=offset;
run;
Parameter     DF   Estimate   Standard Error   Wald 95% Confidence Limits   ChiSquare
Intercept     1     0.2107    0.0353            0.1415    0.2799              35.60
bserate       1     0.0450    0.0009            0.0432    0.0469            2286.94
trt           1     0.2938    0.0454            0.2047    0.3829              41.81
bserate*trt   1    -0.0295    0.0010           -0.0314   -0.0276             911.43
Scale         0     1.0000    0.0000            1.0000    1.0000
509
510
Additional output:

Contrast Estimate Results

Label               Estimate   Standard Error   Alpha   Confidence Limits    ChiSquare   Pr > ChiSq
trt, bserate=6       0.1167    0.0415           0.05     0.0355    0.1980     7.93       0.0049
trt, bserate=10.5   -0.0161    0.0388           0.05    -0.0921    0.0600     0.17       0.6786
trt, bserate=21     -0.3260    0.0340           0.05    -0.3926   -0.2593    91.86       <.0001

On average, there are more seizures in the treatment group when there are few
seizures at baseline. The opposite is true for patients with many seizures at
baseline.
29.2

Poisson regression models will be used to describe the marginal distributions, i.e.,
the distribution of the outcome at each time point separately:

Yij ~ Poisson(λij)
log(λij) = β0 + β1 Ti + β2 tij + β3 Ti tij

Notation:

More complex mean models can again be considered (e.g., including polynomial
time effects, or including covariates).

As the response is now the number of seizures during a fixed period of one week,
we do not need to include an offset, as was needed in the GLM fitted previously to
the totals over observation periods of unequal length.

Given the long observation period, an unstructured working correlation would
require estimation of many correlation parameters. Further, the long observation
period makes the assumption of an exchangeable correlation structure quite unrealistic.

We therefore use the AR(1) working correlation structure, which makes sense
since we have equally spaced time points at which measurements have been taken.
SAS code:
proc genmod data=test;
class id timeclss;
model nseizw = trt time trt*time / dist=poisson;
repeated subject=id / withinsubject=timeclss type=AR(1) corrw modelse;
run;
Fitted AR(1) working correlation matrix (27 × 27):

         Col1     Col2     Col3     .....   Col26    Col27
Row1     1.0000   0.5946   0.3535   .....
Row2     0.5946   1.0000   0.5946   .....
Row3     0.3535   0.5946   1.0000   .....
.....
Row26                               .....   1.0000   0.5946
Row27                               .....   0.5946   1.0000
Model-based inference:

Effect      Estimate   s.e.     95% Confidence Limits    Z       Pr > |Z|
Intercept    1.2259    0.1778    0.8774    1.5743        6.90    <.0001
trt          0.1681    0.2785   -0.3777    0.7138        0.60    0.5461
time        -0.0071    0.0229   -0.0519    0.0378       -0.31    0.7574
trt*time    -0.0183    0.0279   -0.0730    0.0364       -0.66    0.5124

Empirically corrected inference:

Effect      Estimate   s.e.     95% Confidence Limits    Z       Pr > |Z|
Intercept    1.2259    0.2349    0.7655    1.6862        5.22    <.0001
trt          0.1681    0.3197   -0.4585    0.7947        0.53    0.5991
time        -0.0071    0.0230   -0.0521    0.0380       -0.31    0.7585
trt*time    -0.0183    0.0310   -0.0790    0.0425       -0.59    0.5553
516
29.3

Random-effects Model

Similar to our GEE analysis, we do not need to include an offset, because the
response is the number of seizures during a fixed period of one week.
. Let θ̂ be the MLE of θ.
. θ̂ is asymptotically normally distributed with mean θ and covariance matrix
var(θ̂) (the inverse Fisher information matrix).
. The delta method then implies that any function F(θ̂) of θ̂ is asymptotically
normally distributed with mean F(θ) and covariance matrix equal to

var(F(θ̂)) = (∂F(θ)/∂θ)' var(θ̂) (∂F(θ)/∂θ)
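A minimal sketch of this delta-method computation for the ratio of the two slopes, using the point estimates reported below; treating the slope covariance as zero is an assumption made only for the sketch (the actual NLMIXED computation uses the full covariance matrix, which is why its standard error differs):

```python
import numpy as np

# point estimates from the fitted model: slopes for the two groups
slope0, slope1 = -0.01429, -0.01200
se0, se1 = 0.004404, 0.004318

theta = np.array([slope0, slope1])
V = np.diag([se0**2, se1**2])          # assumed diagonal for the sketch

# F(theta) = slope1 / slope0, with gradient dF/dtheta
F = theta[1] / theta[0]
grad = np.array([-theta[1] / theta[0]**2, 1.0 / theta[0]])
se_F = np.sqrt(grad @ V @ grad)
```

The point estimate of the ratio reproduces the "ratio of slopes" value of about 0.84 reported in the output below.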
Parameter   Estimate   Standard Error   DF   t Value   Pr > |t|   Alpha   Lower      Upper      Gradient
int0         0.8180    0.1675           88    4.88     <.0001     0.05     0.4852     1.1509    0.006008
slope0      -0.01429   0.004404         88   -3.24     0.0017     0.05    -0.02304   -0.00554   0.022641
int1         0.6478    0.1699           88    3.81     0.0003     0.05     0.3101     0.9855    0.010749
slope1      -0.01200   0.004318         88   -2.78     0.0067     0.05    -0.02058   -0.00342   -0.04858
sigma        1.0742    0.08556          88   12.55     <.0001     0.05     0.9042     1.2442    0.009566
Additional Estimates

Label                 Estimate   Standard Error   DF   t Value   Pr > |t|   Alpha   Lower      Upper
difference in slope   0.002287   0.006167         88   0.37      0.7116     0.05    -0.00997   0.01454
ratio of slopes       0.8399     0.3979           88   2.11      0.0376     0.05     0.04923   1.6306
variance RIs          1.1539     0.1838           88   6.28      <.0001     0.05     0.7886    1.5192

The number of quadrature points was not specified, and therefore was selected
adaptively, and set equal to only one.

In order to check whether Q = 1 is sufficient, we refitted the model, prespecifying
Q = 20. This produced essentially the same output.
522
29.4

MQL versus PQL, estimates (s.e.):

Effect                   Par.   MQL               PQL
Common intercept                1.3525 (0.1492)   0.8079 (0.1261)
Slope placebo
Slope treatment
Variance of intercepts   d11    1.9017 (0.2986)   1.2510 (0.2155)
Variance of slopes       d22    0.0084 (0.0014)   0.0024 (0.0006)
Correlation rand. eff.

Numerical integration (QUAD), estimates (s.e.):

Effect                   Par.
Common intercept                0.7740 (0.1291)   0.7739 (0.1293)
Slope placebo
Slope treatment
Variance of intercepts   d11    1.2814 (0.2220)   1.2859 (0.2231)
Variance of slopes       d22    0.0024 (0.0006)   0.0024 (0.0006)
Correlation rand. eff.
29.5
Part V
Incomplete Data
528
Chapter 30
Setting The Scene
529
30.1
Growth Data
The distance from the center of the pituitary to the maxillary fissure was recorded
at ages 8, 10, 12, and 14, for 11 girls and 16 boys
Individual profiles:

[Figure: individual profiles of the growth data]
30.2

5 post-baseline visits: 4-8

[Figure: change from baseline versus visit, standard drug and experimental drug]
30.3

Missingness:

                          Measurement occasion
               4 wks   12 wks   24 wks   52 wks   Number    %
Completers       O       O        O        O        188    78.33
Dropouts         O       O        O        M         24    10.00
                 O       O        M        M          8     3.33
                 O       M        M        M          6     2.50
                 M       M        M        M          6     2.50
Non-monotone missingness                              4     1.67
                                                      1     0.42
                                                      2     0.83
                                                      1     0.42
30.4
535
30.5
Scientific Question
In terms of
In terms of
In terms of
536
30.6

Notation

Rij = 1 if Yij is observed, 0 otherwise

Yi^o: observed components of Yi
Yi^m: missing components of Yi
30.7

Framework

f(Yi, Di|θ, ψ)

Selection models: f(Yi, Di|θ, ψ) = f(Yi|θ) f(Di|Yi^o, Yi^m, ψ)

MCAR:   f(Di|ψ)
MAR:    f(Di|Yi^o, ψ)
MNAR:   f(Di|Yi^o, Yi^m, ψ)

MCAR           MAR                               MNAR
CC ?           direct likelihood !               joint model !?
LOCF ?         expectation-maximization (EM)     sensitivity analysis ?!
imputation ?   (weighted) GEE !
...
30.8
1 g(y1 , y2) p
.
.f1(y1 , y2 )
g(y1 , y2 ) 1 p
Chapter 31
Proper Analysis of Incomplete Data
. Simple methods
. Bias for LOCF and CC
. Direct likelihood inference
. Weighted generalized estimating equations
542
31.1
543
544
Modeling Strategies
545
31.2

Simple Methods

. Typically valid only under MCAR
. Loss of information
. Artificial increase of information
. Usually bias
Setting: treatment indicator Ti = 0, 1; dropouts are observed only at tij = 0
(probability p0), completers are observed at tij = 0, 1 (probability 1 − p0 = p1);
mechanisms MCAR and MAR.

[Slide: resulting bias expressions for the CC and LOCF analyses, in terms of p0, p1, and the model parameters]
31.3

Ignorability

f(Yi^o, Di|θ, ψ) = ∫ f(Yi, Di|θ, ψ) dYi^m
                 = ∫ f(Yi^o, Yi^m|θ) f(Di|Yi^o, Yi^m, ψ) dYi^m

Under MAR, f(Di|Yi^o, Yi^m, ψ) = f(Di|Yi^o, ψ), hence

f(Yi^o, Di|θ, ψ) = ∫ f(Yi^o, Yi^m|θ) f(Di|Yi^o, ψ) dYi^m = f(Yi^o|θ) f(Di|Yi^o, ψ)

The likelihood factors into a θ part and a ψ part, so that likelihood inference for θ
can ignore the missing-data mechanism, provided θ and ψ are distinct (ignorability).

31.3.1

Ignorability: Summary

Likelihood/Bayesian + MAR
and
Frequentist + MCAR
31.4

Mechanism is MAR, with θ and ψ distinct. Interest is in θ.
Use the observed information matrix.
Outcome type
Modeling strategy
Software
Gaussian
551
31.4.1

Model   Mean          Covar         # par
1       unstructured  unstructured  18
2       ≠ slopes      unstructured  14
3       = slopes      unstructured  13
7       ≠ slopes      CS             6
31.4.2

Method               Model   Mean          Covar         # par
Complete case        7a      = slopes      CS             5
LOCF                 2a      quadratic     unstructured  16
Unconditional mean   7a      = slopes      CS             5
Conditional mean     1       unstructured  unstructured  18

distorting
31.4.3
554
31.4.4

Data
. Complete cases
. LOCF imputed data

Model
. Direct likelihood
  ML
  REML
. MANOVA
. ANOVA per time point
Principle      Method                   Boys at Age 8   Boys at Age 10
Original       Direct likelihood, ML    22.88 (0.56)    23.81 (0.49)
                                        22.88 (0.58)    23.81 (0.51)
                                        22.88 (0.61)    23.81 (0.53)
Direct Lik.    Direct likelihood, ML    22.88 (0.56)    23.17 (0.68)
                                        22.88 (0.58)    23.17 (0.71)
               MANOVA                   24.00 (0.48)    24.14 (0.66)
                                        22.88 (0.61)    24.14 (0.74)
CC             Direct likelihood, ML    24.00 (0.45)    24.14 (0.62)
                                        24.00 (0.48)    24.14 (0.66)
                                        24.00 (0.51)    24.14 (0.74)
LOCF           Direct likelihood, ML    22.88 (0.56)    22.97 (0.65)
                                        22.88 (0.58)    22.97 (0.68)
                                        22.88 (0.61)    22.97 (0.72)
31.4.5

[Figure: fitted and observed average growth curves (distance versus age, 8-14), under Original Data, CC, LOCF, and Direct Likelihood, per sex]
31.4.6

R completers and N − R incompleters, for which only Yi1 is observed. Consider the
conditional density of Yi2 given Yi1, under the bivariate normal model

(Yi1, Yi2)' ~ N( (μ1, μ2)' , (σ11 σ12; σ12 σ22) )

μ̂1 = (1/N) Σ_{i=1}^{N} yi1

frequentist (available cases):

μ̂2 = (1/R) Σ_{i=1}^{R} yi2

likelihood (MAR):

μ̂2 = (1/N) [ Σ_{i=1}^{R} yi2 + Σ_{i=R+1}^{N} { ȳ2 + β̂1 (yi1 − ȳ1) } ]

with ȳ1, ȳ2, and the regression coefficient β̂1 computed from the completers.
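A small simulation sketch of this contrast (distribution, correlation, and dropout mechanism are all illustrative assumptions): under MAR dropout depending on Yi1, the available-case mean of Yi2 is biased, while the likelihood-type estimator above is not:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 20000
rho = 0.7
y1 = rng.normal(size=N)
y2 = rho * y1 + np.sqrt(1 - rho**2) * rng.normal(size=N)   # true mu2 = 0

# MAR: probability of observing y2 depends on the observed y1 only
observed = rng.random(N) < 1.0 / (1.0 + np.exp(-2.0 * y1))

mu2_cc = y2[observed].mean()                   # available-case mean (biased)

# likelihood-type estimator: regression imputation using the completers
b1 = np.cov(y1[observed], y2[observed])[0, 1] / y1[observed].var()
y1bar, y2bar = y1[observed].mean(), y2[observed].mean()
pred = y2bar + b1 * (y1 - y1bar)
mu2_ml = (y2[observed].sum() + pred[~observed].sum()) / N
```

The available-case mean is pulled far from the true value 0, while the MAR estimator recovers it up to simulation error.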
31.4.7

Principle      Method                   Boys at Age 8   Boys at Age 10
Original       Direct likelihood, ML    22.88 (0.56)    23.81 (0.49)
Direct Lik.    Direct likelihood, ML    22.88 (0.56)    23.17 (0.68)
CC             Direct likelihood, ML    24.00 (0.45)    24.14 (0.62)
LOCF           Direct likelihood, ML    22.88 (0.56)    22.97 (0.65)

Complete data:

Mean           Covariance      Boys at Age 8   Boys at Age 10
Unstructured   Unstructured    22.88           23.81
Unstructured   CS              22.88           23.81
Unstructured   Independence    22.88           23.81

Incomplete data:

Mean           Covariance      Boys at Age 8   Boys at Age 10
Unstructured   Unstructured    22.88           23.17
Unstructured   CS              22.88           23.52
Unstructured   Independence    22.88           24.14
560
31.4.8

SAS code:

[Slide: data layout and SAS code; measurements per subject at ages 8-14, e.g., 21.0, 20.0, 21.5, 23.0 for one subject]
31.4.9

Data in long format, one record per subject per age (with age 10 missing for some
subjects):

age   measure
  8   21.0
 10   20.0
 12   21.5
 14   23.0

  8   20.5
 12   24.5
 14   26.0
...

SAS code:
data help;
set growth;
agecat = age;
run;
proc mixed data = growth method = ml;
class sex idnr agecat;
model measure = sex age*sex / s;
repeated agecat / type = un
subject = idnr;
run;
31.5

Method   Estimate (s.e.)   p-value
CC       -1.94 (1.17)      0.0995
LOCF     -1.63 (1.08)      0.1322
MAR      -2.38 (1.16)      0.0419
Chapter 32
Weighted Generalized Estimating Equations
. General Principle
. Analysis of the analgesic trial
. Analysis of the ARMD trial
. Analysis of the depression trial
565
32.1

General Principle

The score equations are weighted by the inverse probability of a subject's observed
measurement pattern:

Σ_{i=1}^{N} (1/πi) (∂μi/∂β)' Vi^{-1} (yi − μi) = 0

with πi the probability that subject i has his or her observed dropout pattern.
32.2

. At occasions other than the last, the weight is the already accumulated weight,
multiplied by 1 minus the predicted dropout probability
. At the last occasion within a sequence where dropout occurs, the weight is
multiplied by the predicted dropout probability itself
. At the end of the process, the weight is inverted
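The three steps above can be sketched directly; the function name and inputs are illustrative assumptions (the predicted probabilities would come from a fitted dropout model):

```python
def wgee_weight(dropout_probs, dropped_out):
    """Inverse probability weight for one subject.

    dropout_probs: predicted dropout probabilities at occasions 2, ..., k,
                   where k is the last occasion in the observed sequence
    dropped_out:   True if the subject dropped out right after occasion k
    """
    p = 1.0
    for h in dropout_probs[:-1]:
        p *= 1.0 - h                  # remained in the study at earlier occasions
    # at the final occasion: dropout probability itself, or its complement
    p *= dropout_probs[-1] if dropped_out else 1.0 - dropout_probs[-1]
    return 1.0 / p                    # invert at the end of the process
```

For example, hazards (0.2, 0.5) with dropout at the last occasion give p = 0.8 × 0.5 = 0.4, hence weight 2.5.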
32.3
569
Effect             Par.   Estimate (s.e.)
Intercept          ψ0     -1.80 (0.49)
Previous GSA = 1   ψ11    -1.02 (0.41)
Previous GSA = 2   ψ12    -1.04 (0.38)
Previous GSA = 3   ψ13    -1.34 (0.37)
Previous GSA = 4   ψ14    -0.26 (0.38)
Basel. PCA         ψ2      0.25 (0.10)
Phys. func.        ψ3      0.009 (0.004)
Genetic disfunc.   ψ4      0.59 (0.24)

There is some evidence for MAR: P(Di = j|Di ≥ j) depends on the previous GSA.
Furthermore: baseline PCA, physical functioning, and genetic/congenital disorder.
[Table: estimates (s.e.) of intercept, time, and time² under GEE and WGEE]

Fitted unstructured working correlations:

UN, GEE:   0.177, 0.113, ..., 0.456
UN, WGEE:  0.196, 0.113, ..., 0.409
32.4
573
574
32.5
575
Effect             Par.   Estimate (s.e.)
Intercept          ψ0      0.14 (0.49)
Previous outcome   ψ1      0.04 (0.38)
Treatment          ψ2     -0.86 (0.37)
Lesion level 1     ψ31    -1.85 (0.49)
Lesion level 2     ψ32    -1.91 (0.52)
Lesion level 3     ψ33    -2.80 (0.72)
Time 2             ψ41    -1.75 (0.49)
Time 3             ψ42    -1.38 (0.44)
GEE:
with
. Ti = 0 for placebo and Ti = 1 for interferon-
. tj (j = 1, . . . , 4) refers to the four follow-up measurements
. Classical GEE and linearization-based GEE
. Comparison between CC, LOCF, and GEE analyses
SAS code: Molenberghs and Verbeke (2005, Section 32.5)
Results:
577
Standard GEE:

[Table: estimates (s.e.) under CC, LOCF, unweighted observed-data GEE, and WGEE]

Corr.   ρ   0.39 (CC)   0.44 (LOCF)   0.39 (unweighted)   0.33 (WGEE)

Linearization-based GEE:

[Table: estimates (s.e.) under CC, LOCF, unweighted observed-data GEE, and WGEE; reported variance and association parameters include 0.62, 0.57, 0.62, 1.29; 0.39, 0.44, 0.39, 1.85; 0.39, 0.44, 0.39, 0.59]
32.6
580
SAS code:
producing:
. dropout indicates whether missingness at a given time occurs
. prev contains outcome at the previous occasion
. The logistic model for dropout:
proc genmod data=dropout descending;
class trt;
model dropout = prev trt / pred dist=b;
output out=pred p=pred;
run;
. The weights can now be included in the GENMOD program which specifies the
GEE, through the WEIGHT or SCWGT statements:
581
Results:

                         GEE                       WGEE
Effect                   est. (s.e.)   p-value     est. (s.e.)   p-value
                                       0.11        -0.24 (0.57)  0.67
                                       0.30         0.09 (0.40)  0.82
Treatment at visit 6     0.62 (0.56)   0.27         0.17 (0.34)  0.62
                                       0.12        -0.43 (0.35)  0.22
                                       0.03        -0.71 (0.38)  0.06
582
Chapter 33
Multiple Imputation
. General idea
. Estimation
. Hypothesis testing
. Use of MI in practice
. Analysis of the growth data
. Analysis of the ARMD trial
. Creating monotone missingness
33.1
General Principles
33.1.1
Informal Justification

We need to estimate θ from the data (e.g., from the complete cases), plug in
the estimate θ̂, and use

   f(y_i^m | y_i^o, θ̂)

to impute the missing data.
33.1.2
The Algorithm

1. Draw θ* from the distribution of θ (e.g., its posterior)
2. Draw Y_i^m from f(y_i^m | y_i^o, θ*)
3. Using the completed data (Y^o, Y^m), calculate the estimate of the parameter
   of interest and its estimated variance:

      β̂ = β̂(Y) = β̂(Y^o, Y^m),   U = V̂ar(β̂)

4. Repeat independently M times: β̂_m and U_m (m = 1, . . . , M)
33.1.3
Pooling Information

The M completed-data estimates are averaged:

   β̂* = (1/M) Σ_{m=1}^M β̂_m

with variance combining the within- and between-imputation variability:

   total:    V = W + ((M + 1)/M) B

   within:   W = (1/M) Σ_{m=1}^M U_m

   between:  B = (1/(M − 1)) Σ_{m=1}^M (β̂_m − β̂*)(β̂_m − β̂*)′
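In the scalar case these pooling rules reduce to a few lines; a minimal sketch
(the example numbers are purely illustrative):

```python
def pool(estimates, variances):
    """Rubin's rules for a scalar parameter: pooled estimate and total
    variance from M completed-data analyses."""
    M = len(estimates)
    theta = sum(estimates) / M                                # pooled estimate
    W = sum(variances) / M                                    # within-imputation
    B = sum((e - theta) ** 2 for e in estimates) / (M - 1)    # between-imputation
    V = W + (M + 1) / M * B                                   # total variance
    return theta, V

theta, V = pool([1.1, 0.9, 1.0], [0.04, 0.05, 0.03])
```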
33.1.4
Hypothesis Testing

Both variance components play a role in the asymptotic distribution
(Li, Raghunathan, and Rubin 1991). To test H0 : β = β0:

   p = P(F_{k,w} > F)

where

   k : length of the parameter vector β

   F = (β̂* − β0)′ W^{−1} (β̂* − β0) / [k(1 + r)]

   r = (1/k)(1 + 1/M) tr(B W^{−1})

   w = 4 + (τ − 4) [1 + (1 − 2τ^{−1}) / r]²,   τ = k(M − 1)

Limiting behavior: as M → ∞, F → F_{k,∞} = χ²/k
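For a scalar parameter (k = 1) these quantities are straightforward to compute;
a sketch with illustrative numbers (the p-value itself needs an F-distribution
routine, omitted here):

```python
def mi_wald(theta_bar, W, B, M, theta0=0.0):
    """MI Wald test for a scalar parameter (k = 1): returns the F
    statistic, the relative increase in variance r, and the denominator
    degrees of freedom w (Li, Raghunathan, and Rubin 1991).
    The p-value is P(F_{1,w} > F)."""
    k = 1
    r = (1 + 1 / M) * B / W                      # (1/k)(1 + 1/M) tr(B W^-1)
    F = (theta_bar - theta0) ** 2 / (k * (1 + r) * W)
    tau = k * (M - 1)
    w = 4 + (tau - 4) * (1 + (1 - 2 / tau) / r) ** 2
    return F, r, w

F, r, w = mi_wald(1.0, W=0.04, B=0.01, M=10)
```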
33.2
Use of MI in Practice

Imputation Task:   PROC MI
Analysis Task:     PROC MYFAVORITE
Inference Task:    PROC MIANALYZE

Introduction to Longitudinal Data Analysis
33.2.1

22.88 (0.58)   23.81 (0.51)   22.69 (0.81)
33.3

M = 10 imputations

GEE:

GLMM: random intercepts bi ∼ N(0, τ²)
Results:

Effect      Par.   GEE            GLMM
Int.4       β11    -0.84 (0.20)   -1.46 (0.36)
Int.12      β21    -1.02 (0.22)   -1.75 (0.38)
Int.24      β31    -1.07 (0.23)   -1.83 (0.38)
Int.52      β41    -1.61 (0.27)   -2.69 (0.45)
Trt.4       β12     0.21 (0.28)    0.32 (0.48)
Trt.12      β22     0.60 (0.29)    0.99 (0.49)
Trt.24      β32     0.43 (0.30)    0.67 (0.51)
Trt.52      β42     0.37 (0.35)    0.52 (0.56)
R.I. s.d.   τ                      2.20 (0.26)
R.I. var.   τ²                     4.85 (1.13)
33.4

Note that the imputation task is conducted on the continuous outcome diff,
indicating the difference in the number of letters versus baseline.

3. Data manipulation then takes place to define the binary indicators and to
create a longitudinal version of the dataset.
Chapter 34
Creating Monotone Missingness
Solution:
Part VI
Topics in Methods and Sensitivity Analysis for Incomplete
Data
Chapter 35
An MNAR Selection Model and Local Influence
35.1

Selection model:   f(Yi | θ) f(Di | Yi, ψ)

with measurement model

   Yi = Xi β + Zi bi + εi
35.2

Measurement model for the two measurements:

   (Yi1, Yi2)′ ∼ N( (μ1, μ2)′ , Σ ),   Σ = [ σ1²  σ12 ; σ12  σ2² ]
35.3

(Kenward 1998)

or ranges of inferences (Scharfstein et al. 1999)
35.4
35.5
Local Influence

Dropout model with case-specific perturbations ωi:

   ψ0 + ψ1 Yi1 + ωi Yi2
   or
   ψ0 + ψ1 Yi1 + ωi (Yi2 − Yi1)

Likelihood displacement:

   LD(ω) = 2 [ L(γ̂ | ω = 0) − L(γ̂_ω | ω = 0) ] ≥ 0

where γ̂_ω maximizes the perturbed log-likelihood.
35.5.1
Likelihood Displacement

The local influence of a direction h is the normal curvature Ch of the surface
LD(ω) at ω = 0, along h.

[Figure: geometry of the likelihood displacement surface around ω = 0]

The curvature decomposes into a measurement-model and a dropout-model part:

   Ch = Ch(θ) + Ch(ψ)
35.5.2
Computational Approaches
35.6

G² = 0.005
35.6.1

Vj involves the vector (1, Yj1):

   Vj ∝ ( 1  Yj1 )
35.7

. Removal of:
   (53,54,66,69): from local influence on Yi2
   (4,5): from Kenward's informal analysis
   (66): additional one identified from local influence on Yi2 − Yi1
   (4,5,66): from local influence on Yi2 − Yi1
Fits to the full data and to the reduced datasets, with MAR dropout
(ω = ψ2 ≡ 0):

Effect              Par.   all           (53,54,66,69)  (4,5)         (66)          (4,5,66)
Measurement model:
Intercept                  5.77 (0.09)   5.69 (0.09)    5.81 (0.08)   5.75 (0.09)   5.80 (0.09)
Time effect                0.72 (0.11)   0.70 (0.11)    0.64 (0.09)   0.68 (0.10)   0.60 (0.08)
First variance      σ1²    0.87 (0.12)   0.76 (0.11)    0.77 (0.11)   0.86 (0.12)   0.76 (0.11)
Second variance     σ2²    1.30 (0.20)   1.08 (0.17)    1.30 (0.20)   1.10 (0.17)   1.09 (0.17)
Correlation         ρ      0.58 (0.07)   0.45 (0.08)    0.72 (0.05)   0.57 (0.07)   0.73 (0.05)
Dropout model:
Intercept           ψ0    -2.65 (1.45)  -3.69 (1.63)   -2.34 (1.51)  -2.77 (1.47)  -2.48 (1.54)
First measurement   ψ1     0.27 (0.25)   0.46 (0.28)    0.22 (0.25)   0.29 (0.24)   0.24 (0.26)
Second measurement  ω=ψ2   0             0              0             0             0
-2 loglikelihood           280.02        246.64         237.94        264.73        220.23

Fits with MNAR dropout (ω = ψ2 estimated):

Effect              Par.   all           (53,54,66,69)  (4,5)         (66)          (4,5,66)
Measurement model:
Intercept                  5.77 (0.09)   5.69 (0.09)    5.81 (0.08)   5.75 (0.09)   5.80 (0.09)
Time effect                0.33 (0.14)   0.35 (0.14)    0.40 (0.18)   0.34 (0.14)   0.63 (0.29)
First variance      σ1²    0.87 (0.12)   0.76 (0.11)    0.77 (0.11)   0.86 (0.12)   0.76 (0.11)
Second variance     σ2²    1.61 (0.29)   1.29 (0.25)    1.39 (0.25)   1.34 (0.25)   1.10 (0.20)
Correlation         ρ      0.48 (0.09)   0.42 (0.10)    0.67 (0.06)   0.48 (0.09)   0.73 (0.05)
Dropout model:
Intercept           ψ0     0.37 (2.33)  -0.37 (2.65)   -0.77 (2.04)   0.45 (2.35)  -2.77 (3.52)
First measurement   ψ1     2.25 (0.77)   2.11 (0.76)    1.61 (1.13)   2.06 (0.76)   0.07 (1.82)
Second measurement  ω=ψ2  -2.54 (0.83)  -2.22 (0.86)   -1.66 (1.29)  -2.33 (0.86)   0.20 (2.09)
-2 loglikelihood           274.91        243.21         237.86        261.15        220.23
G² for MNAR                5.11          3.43           0.08          3.57          0.005

Introduction to Longitudinal Data Analysis
Chapter 36
Mechanism for Growth Data
36.1
Modeling Missingness

with j = 1, 2, 3, or 4

Mechanism   Dependence on   Statistic   p-value
MAR         Yi1             19.51       <0.0001
MAR         Yi3             7.43        0.0064
MAR         Yi4             2.51        0.1131
MNAR        Yi2             2.55        0.1105
Including covariates:

   Boys : logit[P (Ri = 0 | yi1, xi = 0)] = (22 − yi1)
   Girls: logit[P (Ri = 0 | yi1, xi = 1)] = (20.75 − yi1)

These models are interpreted as follows: non-response on the later measurement
becomes more likely the further the age-8 measurement lies below 22 for boys
and below 20.75 for girls.
Chapter 37
Interval of Ignorance
37.1

Questions:
37.2

The Slovenian Public Opinion survey: cross-classification of the secession,
attendance, and independence questions (Yes, No, or missing *):

                          Independence
Secession   Attendance    Yes     No      *
Yes         Yes           1191    8       21
Yes         No            8       0       4
Yes         *             107     3       9
No          Yes           158     68      29
No          No            7       14      3
No          *             18      43      31
*           Yes           90      2       109
*           No            1       2       25
*           *             19      8       96
37.3

Pessimistic bound: π̂ = 1439/2074 = 0.694

Resulting interval: [0.694; 0.904]

Complete cases (all who answered the 3 questions):

   π̂ = (1191 + 158)/1454 = 0.928 ?

Introduction to Longitudinal Data Analysis
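The pessimistic-optimistic bounds are simply the two extremes of allocating all
non-respondents to "no" or to "yes". A minimal sketch; the 199/436 split of
non-yes respondents into observed "no" and missing is assumed here only so that
the totals reproduce the bounds above:

```python
def ignorance_interval(n_yes, n_no, n_missing):
    """Bounds on P(yes) without assumptions on the non-respondents:
    pessimistic = all missing counted as 'no',
    optimistic  = all missing counted as 'yes'."""
    N = n_yes + n_no + n_missing
    return n_yes / N, (n_yes + n_missing) / N

# 1439 observed 'yes' out of 2074; the no/missing split (199/436) is an
# assumed illustration chosen to be consistent with the slide's interval
lo, hi = ignorance_interval(1439, 199, 436)
```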
37.4

Missing at Random: non-response is allowed to depend on observed, but not on
unobserved outcomes:

. Based on two questions:   π̂ = 0.892
. Based on three questions: π̂ = 0.883

Missing Not at Random (NI): non-response is allowed to depend on unobserved
measurements:

   π̂ = 0.782
37.5

[Figure: estimator]

37.6

[Figure: estimator]

37.7
37.7.1
Model Formulation

   E(Y11,jk) = mjk,
   E(Y10,jk) = mjk βjk,
   E(Y01,jk) = mjk αjk,
   E(Y00,jk) = mjk αjk βjk γ,

Interpretation:

. βjk : models non-response on the independence question
. αjk : models non-response on the attendance question
. γ : interaction between both non-response indicators (cannot depend on j or k)
37.7.2
Identifiable Models

Model   Structure    d.f.   loglik     π̂       C.I.
BRD1    (α, β)       6      -2495.29   0.892   [0.878;0.906]
BRD2    (α, βj)      7      -2467.43   0.884   [0.869;0.900]
BRD3    (αk, β)      7      -2463.10   0.881   [0.866;0.897]
BRD4    (α, βk)      7      -2467.43   0.765   [0.674;0.856]
BRD5    (αj, β)      7      -2463.10   0.844   [0.806;0.882]
BRD6    (αj, βj)     8      -2431.06   0.819   [0.788;0.849]
BRD7    (αk, βk)     8      -2431.06   0.764   [0.697;0.832]
BRD8    (αj, βk)     8      -2431.06   0.741   [0.657;0.826]
BRD9    (αk, βj)     8      -2431.06   0.867   [0.851;0.884]
37.7.3

[Figure: estimator]
37.8
Statistical Uncertainty

Statistical uncertainty decomposes into two components:

   statistical uncertainty = statistical imprecision + statistical ignorance
37.8.1
Monotone Patterns

Observed data:                     Hypothetical full data:

   R = 1            R = 0             R = 1            R = 0
   Y1,11  Y1,12     Y0,1              Y1,11  Y1,12     Y0,11  Y0,12
   Y1,21  Y1,22     Y0,2              Y1,21  Y1,22     Y0,21  Y0,22
37.8.2

Observed data:                     Hypothetical full data:

   R = 1            R = 0             R = 1            R = 0
   Y1,11  Y1,12     Y0,1              Y1,11  Y1,12     Y0,11  Y0,12
   Y1,21  Y1,22     Y0,2              Y1,21  Y1,22     Y0,21  Y0,22

(i, j = 1, 2; r = 0, 1)
Model        qr|ij                         Observed d.f.   Complete d.f.
1. MCAR      qr                            Non-saturated   Non-saturated
2. MAR       qr|i                          Saturated       Non-saturated
3. MNAR(0)   qr|j                          Saturated       Non-saturated
4. MNAR(1)   logit(qr|ij) = α + βi + γj    Overspecified   Non-saturated
5. MNAR(2)   qr|ij                         Overspecified   Saturated
37.8.3

Procedure:

. Given the sensitivity parameter, calculate the point estimate and C.I. for π
. Set of point estimates: region of ignorance
. Set of interval estimates: region of uncertainty
. Single-parameter case: the region becomes an interval
37.9

Parameter     Interval   Model 1/2       Model 3         Model 4         Model 5
1st margin    II         0.43            0.43            0.43            0.43
              IU         [0.37;0.48]     [0.37;0.48]     [0.37;0.48]     [0.37;0.48]
2nd margin    II         0.64            0.59            [0.49;0.74]     [0.49;0.74]
              IU         [0.58;0.70]     [0.53;0.65]     [0.43;0.79]     [0.43;0.79]
Log O.R.      II         2.06            2.06            [1.52;2.08]     [0.41;2.84]
              IU         [1.37;2.74]     [1.39;2.72]     [1.03;2.76]     [0.0013;2.84]
O.R.          II         7.81            7.81            [4.57;7.98]     [1.50;17.04]
              IU         [3.95;15.44]    [4.00;15.24]    [2.79;15.74]    [1.0013;32.89]
37.10

G², marginal probabilities (first and second margin), and odds ratios (original
and log scale) for models BRD1-BRD9 and Model 10; BRD7, BRD9, and Model 10
attain G² = 0.0. For Model 10, the interval of ignorance for the first margin
is [0.425;0.429], with intervals of uncertainty [0.37;0.49] and [0.47;0.75] for
the margins, [0.41;0.80], and odds ratio intervals [4.40;7.96] and
[2.69;15.69] (log scale: [1.48;2.07] and [0.99;2.75]).
37.11

Model      Structure     d.f.   loglik     π̂ / II          C.I. / IU
BRD1       (α, β)        6      -2495.29   0.892           [0.878;0.906]
BRD2       (α, βj)       7      -2467.43   0.884           [0.869;0.900]
BRD3       (αk, β)       7      -2463.10   0.881           [0.866;0.897]
BRD4       (α, βk)       7      -2467.43   0.765           [0.674;0.856]
BRD5       (αj, β)       7      -2463.10   0.844           [0.806;0.882]
BRD6       (αj, βj)      8      -2431.06   0.819           [0.788;0.849]
BRD7       (αk, βk)      8      -2431.06   0.764           [0.697;0.832]
BRD8       (αj, βk)      8      -2431.06   0.741           [0.657;0.826]
BRD9       (αk, βj)      8      -2431.06   0.867           [0.851;0.884]
Model 10   (αk, βjk)     9      -2431.06   [0.762;0.893]   [0.744;0.907]
Model 11   (αjk, βj)     9      -2431.06   [0.766;0.883]   [0.715;0.920]
Model 12   (αjk, βjk)    10     -2431.06   [0.694;0.904]
37.12
37.13

Model      Structure     d.f.   loglik     π̂ / II          C.I. / IU       π̂(MAR)
BRD1       (α, β)        6      -2495.29   0.892           [0.878;0.906]   0.8920
BRD2       (α, βj)       7      -2467.43   0.884           [0.869;0.900]   0.8915
BRD3       (αk, β)       7      -2463.10   0.881           [0.866;0.897]   0.8915
BRD4       (α, βk)       7      -2467.43   0.765           [0.674;0.856]   0.8915
BRD5       (αj, β)       7      -2463.10   0.844           [0.806;0.882]   0.8915
BRD6       (αj, βj)      8      -2431.06   0.819           [0.788;0.849]   0.8919
BRD7       (αk, βk)      8      -2431.06   0.764           [0.697;0.832]   0.8919
BRD8       (αj, βk)      8      -2431.06   0.741           [0.657;0.826]   0.8919
BRD9       (αk, βj)      8      -2431.06   0.867           [0.851;0.884]   0.8919
Model 10   (αk, βjk)     9      -2431.06   [0.762;0.893]   [0.744;0.907]   0.8919
Model 11   (αjk, βj)     9      -2431.06   [0.766;0.883]   [0.715;0.920]   0.8919
Model 12   (αjk, βjk)    10     -2431.06   [0.694;0.904]                   0.8919
Plebiscite outcome: 0.885

MAR estimate:    0.883
MNAR estimate:   0.782
MNAR interval:   [0.753;0.891]
Model 10:        [0.762;0.893]
Model 11:        [0.766;0.883]
Model 12:        [0.694;0.904]
Chapter 38
Pattern-mixture Models
38.1
38.2

Effect                 Parameter   Estimate (s.e.)
Fixed-effect parameters:
time                               7.78 (1.05)
time×baseline                      -0.065 (0.009)
time×treatment                     0.086 (0.157)
time²                              -0.30 (0.06)
time²×baseline                     0.0024 (0.0005)
Variance parameters:
Random intercept                   105.42
Serial variance                    77.96
Serial association                 7.22
Measurement error                  77.83
38.2.1

First (MAR) versus extended (MNAR) dropout model. The extended model involves
an intercept 0.033 (0.401), the baseline effect 0.013 (0.003), the average
(yi,j−2 + yi,j−1)/2 with coefficient 0.023 (0.005), and the increment
(yi,j−1 − yi,j−2)/2 with coefficient 0.047 (0.010): dropout is associated with
a negative trend in the outcome.
38.3
38.3.1

PMM1: includes time×treatment
PMM2: includes time×treatment×pattern

38.3.2
Pattern-membership probabilities:

   π1, . . . , πt, . . . , πT .

The marginal effect βℓ averages the pattern-specific effects βℓt over the
patterns:

   βℓ = Σ_{t=1}^n βℓt πt,   ℓ = 1, . . . , g

Its variance follows from the delta method:

   Var(β1, . . . , βg) = A V A′

where

   V = [ Var(βℓt)   0
         0          Var(πt) ]

and

   A = ∂(β1, . . . , βg) / ∂(β11, . . . , βng, π1, . . . , πn)
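A minimal sketch of this marginalization for a single effect, treating the
pattern-specific estimates and the estimated pattern probabilities as
uncorrelated (the full A V A′ computation would add the multinomial
covariances among the π̂t); all numbers are illustrative:

```python
def marginalize(betas, pis, var_betas, var_pis):
    """Marginal effect beta = sum_t beta_t * pi_t with a delta-method
    variance; the derivatives are d beta/d beta_t = pi_t and
    d beta/d pi_t = beta_t, so squared derivatives multiply the
    component variances (covariances ignored for simplicity)."""
    beta = sum(b * p for b, p in zip(betas, pis))
    var = (sum(p ** 2 * vb for p, vb in zip(pis, var_betas))
           + sum(b ** 2 * vp for b, vp in zip(betas, var_pis)))
    return beta, var

# two patterns with probabilities 0.4 / 0.6 (illustrative numbers)
beta, var = marginalize([1.0, 2.0], [0.4, 0.6], [0.1, 0.2], [0.001, 0.001])
```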
38.3.3
Considerations
38.4

(1a) Pattern-specific parameters:

   Yi = Xi β(di) + Zi bi + εi
   bi ∼ N(0, D(di))
   εi ∼ N(0, Σi(di))

(1b) Pattern as covariate:

   Yi = Xi β + Zi bi + di θ + εi

(2) Identifying restrictions:

   CCMV: Complete Case Missing Values
   ACMV: Available Case Missing Values
   NCMV: Neighbouring Case Missing Values
38.4.1
Identifying Restrictions

[Figure: identification of the unobserved parts of patterns 1 and 2 from the
more complete patterns]
Effect        Initial          CCMV             NCMV             ACMV

Pattern 1:
Time          3.40 (13.94)     13.21 (15.91)    7.56 (16.45)     4.43 (18.78)
Time×base     -0.11 (0.13)     -0.16 (0.16)     -0.14 (0.16)     -0.11 (0.17)
Time×treat    0.33 (3.91)      -2.09 (2.19)     -1.20 (1.93)     -0.41 (2.52)
Time²                          -0.84 (4.21)     -2.12 (4.24)     -0.70 (4.22)
Time²×base                     0.01 (0.04)      0.03 (0.04)      0.02 (0.04)
σ11           131.09 (31.34)   151.91 (42.34)   134.54 (32.85)   137.33 (34.18)
σ12                            59.84 (40.46)    119.76 (40.38)   97.86 (38.65)
σ22                            201.54 (65.38)   257.07 (86.05)   201.87 (80.02)
σ13                            55.12 (58.03)    49.88 (44.16)    61.87 (43.22)
σ23                            84.99 (48.54)    99.97 (57.47)    110.42 (87.95)
σ33                            245.06 (75.56)   241.99 (79.79)   286.16 (117.90)

Pattern 2:
Time          53.85 (14.12)    29.78 (10.43)    33.74 (11.11)    28.69 (11.37)
Time×base     -0.46 (0.12)     -0.29 (0.09)     -0.33 (0.10)     -0.29 (0.10)
Time×treat    -0.95 (1.86)     -1.68 (1.21)     -1.56 (2.47)     -2.12 (1.36)
Time²         -18.91 (6.36)    -4.45 (2.87)     -7.00 (3.80)     -4.22 (4.20)
Time²×base    0.15 (0.05)      0.04 (0.02)      0.07 (0.03)      0.05 (0.04)
σ11           170.77 (26.14)   175.59 (27.53)   176.49 (27.65)   177.86 (28.19)
σ12           151.84 (29.19)   147.14 (29.39)   149.05 (29.77)   146.98 (29.63)
σ22           292.32 (44.61)   297.38 (46.04)   299.40 (47.22)   297.39 (46.04)
σ13                            57.22 (37.96)    89.10 (34.07)    99.18 (35.07)
σ23                            71.58 (36.73)    107.62 (47.59)   166.64 (66.45)
σ33                            212.68 (101.31)  264.57 (76.73)   300.78 (77.97)

Pattern 3:
Time          29.91 (9.08)     29.91 (9.08)     29.91 (9.08)     29.91 (9.08)
Time×base     -0.26 (0.08)     -0.26 (0.08)     -0.26 (0.08)     -0.26 (0.08)
Time×treat    0.82 (0.95)      0.82 (0.95)      0.82 (0.95)      0.82 (0.95)
Time²         -6.42 (2.23)     -6.42 (2.23)     -6.42 (2.23)     -6.42 (2.23)
Time²×base    0.05 (0.02)      0.05 (0.02)      0.05 (0.02)      0.05 (0.02)
σ11           206.73 (35.86)   206.73 (35.86)   206.73 (35.86)   206.73 (35.86)
σ12           96.97 (26.57)    96.97 (26.57)    96.97 (26.57)    96.97 (26.57)
σ22           174.12 (31.10)   174.12 (31.10)   174.12 (31.10)   174.12 (31.10)
σ13           87.38 (30.66)    87.38 (30.66)    87.38 (30.66)    87.38 (30.66)
σ23           91.66 (28.86)    91.66 (28.86)    91.66 (28.86)    91.66 (28.86)
σ33           262.16 (44.70)   262.16 (44.70)   262.16 (44.70)   262.16 (44.70)
38.4.2
Pattern As Covariate

Effect              Pattern   Estimate (s.e.)
Time                1         7.29 (15.69)
Time                2         37.05 (7.67)
Time                3         39.40 (9.97)
Time×treat          1         5.25 (6.41)
Time×treat          2         3.48 (5.46)
Time×treat          3         3.44 (6.04)
Time×base           1         -0.21 (0.15)
Time×base           2         -0.34 (0.06)
Time×base           3         -0.36 (0.08)
Time×treat×base               -0.06 (0.04)
Time²               2         -9.18 (2.47)
Time²               3         -7.70 (2.29)
Time²×treat                   1.10 (0.74)
Time²×base                    0.07 (0.02)
σ11                           173.63 (18.01)
σ12                           117.88 (17.80)
σ22                           233.86 (26.61)
σ13                           89.59 (24.56)
σ23                           116.12 (34.27)
σ33                           273.98 (48.15)
38.4.3
38.5
Connection SEM-PMM, for the dropout process f(Di):

   SEM : MCAR                  ⇔   PMM : MCAR
   SEM : MAR                   ⇔   PMM : ACMV
   SEM : non-future dependent  ⇔   PMM : non-future dependent
   SEM : general MNAR          ⇔   PMM : general
Chapter 39
Concluding Remarks

MCAR/simple:   CC: biased
               LOCF: inefficient
               not simpler than MAR methods
MAR:           direct likelihood: easy to conduct
               weighted GEE
MNAR:          variety of methods