Survival Data
Liang Li
Department of Quantitative Health Sciences
Cleveland Clinic
Longitudinal Data
[Figure: FEV1 against time in months]
Survival Data
Time to event such as death, machine failure, disease relapse, PFS, etc.
[Figure: Kaplan-Meier curve, Survival (%) against Time (Months), with censored observations and deaths marked]
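A minimal sketch of the Kaplan-Meier estimator behind such a curve. The follow-up times and event indicators are hypothetical, and the function name `kaplan_meier` is ours, not from the talk.

```python
import numpy as np

def kaplan_meier(time, event):
    """Return the distinct event times and the Kaplan-Meier survival estimate at each."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    event_times = np.unique(time[event])
    surv = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(time >= t)              # subjects still under observation at t
        deaths = np.sum((time == t) & event)     # events occurring exactly at t
        s *= 1.0 - deaths / at_risk
        surv.append(s)
    return event_times, np.array(surv)

# Hypothetical follow-up in months; 1 = death, 0 = censored.
months = [2, 3, 3, 5, 6, 8]
died = [1, 0, 1, 1, 0, 1]
km_times, km_surv = kaplan_meier(months, died)
```

Censored subjects contribute to the risk set until they leave it, but never cause a drop in the curve.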
Joint Modeling of
Longitudinal and Survival Data
Question: how does the change in the (earlier) longitudinal profile of a
subject relate to the risk of the (later) survival event?
Example 1: Rate of change of glomerular filtration rate (GFR) & time to
end stage renal disease (ESRD) or death
Example 2: FEV1 & survival among cystic fibrosis patients
Widespread use and an active research field, e.g., surrogate endpoints
Longitudinal Profile
longitudinal profile =
signal + noise
[Figure: FEV1 against time in months]
Data Structure
longitudinal data
survival data
Stage 2
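\log \int f(\text{long} \mid \text{int}, \text{slope}) \, f(\text{surv} \mid \text{int}, \text{slope}) \, f(\text{int}, \text{slope}) \, d[\text{int}, \text{slope}]
[Figure: fitted line and true line of a longitudinal profile]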
A New Perspective
[Figure: fitted line against Time]
X = Z + U
Y = b_0 + b_1 Z + e
[Figure: Y plotted against Z (red) or X (blue)]
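A small simulation of the measurement-error view X = Z + U, Y = b_0 + b_1 Z + e. It is a sketch under assumed values (unit variances for Z and U, b_0 = 0.5, b_1 = 1), none of which come from the talk. Regressing Y on the error-prone X shrinks the slope toward zero by the factor var(Z) / (var(Z) + var(U)).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
b0, b1 = 0.5, 1.0                      # hypothetical true coefficients

Z = rng.normal(0.0, 1.0, n)            # true covariate (unobserved in practice)
U = rng.normal(0.0, 1.0, n)            # measurement error
X = Z + U                              # observed, error-prone covariate
Y = b0 + b1 * Z + rng.normal(0.0, 0.5, n)

# np.polyfit returns coefficients from highest degree down: [slope, intercept].
slope_on_Z = np.polyfit(Z, Y, 1)[0]
slope_on_X = np.polyfit(X, Y, 1)[0]
expected_attenuation = 1.0 / (1.0 + 1.0)   # var(Z) / (var(Z) + var(U))
```

With these settings the slope on Z stays near 1 while the slope on X falls to about half of it.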
The Example
HEMO study: a clinical trial coordinated at Cleveland Clinic
2 by 2 design: standard or high dose of dialysis, low or high flux dialyzer
Neither treatment was found to significantly affect time to all-cause mortality
(Rocco et al 2004)
We want to study a secondary question: whether the decline of albumin levels
is a strong predictor of mortality
Challenge: albumin measurements need to be calibrated to remove artificial
differences due to variations in total body water.
Monday-Wednesday-Friday
Tuesday-Thursday-Saturday
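Stage 1
ALB_{ij} = \text{Mon/Tues} + \theta_{i0} + \text{Time}_{ij} \, \theta_{i1} + \text{Noise}_{ij}
Stage 2
\theta_{i0} = \text{Intercept}_0 + \text{Dose}_i A_0 + \text{Flux}_i B_0 + \epsilon_{i0}
\theta_{i1} = \text{Intercept}_1 + \text{Dose}_i A_1 + \text{Flux}_i B_1 + \epsilon_{i1}
In vector form: \theta_i = \beta X_i + \epsilon_i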
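A sketch of the two-stage structure, followed by the naive two-step fit: estimate each subject's intercept and slope by least squares, then regress those estimates on dose and flux. All numeric values, the visit schedule, and the function names are hypothetical, and the Mon/Tues term is left out to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(1)
visit_times = np.linspace(0.0, 3.0, 6)            # hypothetical visit times (years)
stage2_coef = np.array([[3.7, 0.01, -0.01],       # Intercept0, A0, B0 (hypothetical)
                        [-0.06, -0.01, -0.01]])   # Intercept1, A1, B1 (hypothetical)

def simulate_subject(dose, flux):
    """Draw one subject's albumin series from the two-stage model."""
    x = np.array([1.0, dose, flux])
    theta = stage2_coef @ x + rng.normal(0.0, [0.3, 0.05])    # Stage 2
    noise = rng.normal(0.0, 0.2, visit_times.size)
    return theta[0] + visit_times * theta[1] + noise           # Stage 1

def fit_subject(t, y):
    """Per-subject least squares: estimated (intercept, slope) and their covariance."""
    V = np.column_stack([np.ones_like(t), t])
    coef, *_ = np.linalg.lstsq(V, y, rcond=None)
    resid = y - V @ coef
    sigma2 = resid @ resid / (t.size - 2)          # needs at least three measurements
    return coef, sigma2 * np.linalg.inv(V.T @ V)

n = 2000
dose = rng.integers(0, 2, n)
flux = rng.integers(0, 2, n)
W = np.array([fit_subject(visit_times, simulate_subject(d, f))[0]
              for d, f in zip(dose, flux)])        # noisy estimates of theta_i
X = np.column_stack([np.ones(n), dose, flux])
two_step, *_ = np.linalg.lstsq(X, W, rcond=None)   # column 0: intercept model, column 1: slope model
```

The per-subject estimates W equal the true intercepts and slopes plus estimation error, which is the measurement-error structure the correction later targets.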
C: time to censoring
Y = \min(T, C)
\delta = 1\{T < C\}
Stage 2
\theta_{i0} \, a_{20} + \theta_{i1} \, a_{21}
The proposed model includes as special cases the models considered by
Wang (2006), Ratcliffe et al (2004), Hsieh, Tseng, Wang (2006), Tsiatis &
Davidian (2004), among others
B-spline approximation: \sum_{k=1}^{K} a_{0k} B_k(t)
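True Likelihood
Per-subject contributions to the joint log likelihood. Here \eta_{ig} denotes the linear predictor, which combines U_{1ig}, \theta_i, X_i and the offset \log(c_{ig}).
Survival/Poisson:
LS_i(\cdot) = \sum_{g=0} \{ Y_{ig} \, \eta_{ig} - \exp(\eta_{ig}) \}
Random effects:
LM_i(\cdot) = -\frac{q}{2} \log(2\pi) - \frac{1}{2} \log|\Sigma| - \frac{1}{2} (\theta_i - \beta X_i)^T \Sigma^{-1} (\theta_i - \beta X_i)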
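One way to read the Survival/Poisson label and the \log(c_{ig}) offset, offered as our assumption rather than the authors' derivation: if the cumulative hazard is approximated by a quadrature rule with weights c_g, the survival log likelihood takes the form of a Poisson log likelihood with offset \log(c_g). The sketch below checks this numerically for the trapezoidal rule. The log hazard, the grid, and all function names are hypothetical.

```python
import numpy as np

def trapezoid_weights(grid):
    """Weights c_g such that sum(c_g * f(grid_g)) is the trapezoidal rule on the grid."""
    grid = np.asarray(grid, dtype=float)
    w = np.zeros_like(grid)
    steps = np.diff(grid)
    w[:-1] += steps / 2.0
    w[1:] += steps / 2.0
    return w

def poisson_form_loglik(log_hazard, grid, delta):
    """Poisson-type log likelihood with offset log(c_g); the response is delta at the last grid point."""
    grid = np.asarray(grid, dtype=float)
    eta = log_hazard(grid) + np.log(trapezoid_weights(grid))
    y = np.zeros_like(grid)
    y[-1] = delta
    return np.sum(y * eta - np.exp(eta))

def direct_loglik(log_hazard, grid, delta):
    """delta * log hazard at the observed time minus the trapezoidal cumulative hazard."""
    grid = np.asarray(grid, dtype=float)
    cum_hazard = np.sum(trapezoid_weights(grid) * np.exp(log_hazard(grid)))
    return delta * log_hazard(grid[-1]) - cum_hazard

def example_log_hazard(t):
    return -1.0 + 0.3 * t                  # hypothetical smooth log hazard

grid = np.linspace(0.0, 2.0, 21)           # grid ending at the observed time Y = 2
gap_event = poisson_form_loglik(example_log_hazard, grid, 1) - direct_loglik(example_log_hazard, grid, 1)
gap_censored = poisson_form_loglik(example_log_hazard, grid, 0) - direct_loglik(example_log_hazard, grid, 0)
```

The two forms differ only by delta times the log of the last weight, a constant that does not involve the model parameters, so they share the same maximizer.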
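Corrected Likelihood
If W \sim N(X, \sigma_u^2), then each term in the unobserved X_i is replaced by a function of W_i with the same expectation:
Linear:       X_i        is replaced by  W_i
Quadratic:    X_i^2      is replaced by  W_i^2 - \sigma_u^2
Exponential:  \exp(X_i)  is replaced by  \exp(W_i - \frac{1}{2} \sigma_u^2)
Corrected joint log likelihood:
LL(\cdot) + \sum_{i=1}^{n} LS_i(\cdot) + \sum_{i=1}^{n} LM_i(\cdot)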
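A quick Monte Carlo check of the three substitutions, assuming a hypothetical true value X = 1.3 and error standard deviation \sigma_u = 0.4. Each corrected quantity should average to the term it replaces, while the uncorrected exponential should not.

```python
import numpy as np

rng = np.random.default_rng(2)
X = 1.3                 # hypothetical true (unobserved) value
sigma_u = 0.4           # hypothetical measurement-error standard deviation
W = rng.normal(X, sigma_u, size=500_000)   # error-prone observations of X

linear = W.mean()                                   # targets X
quadratic = (W**2 - sigma_u**2).mean()              # targets X^2
exponential = np.exp(W - sigma_u**2 / 2).mean()     # targets exp(X)
naive_exponential = np.exp(W).mean()                # biased upward by the factor exp(sigma_u^2 / 2)
```

The naive exponential overshoots exp(X) by the factor exp(\sigma_u^2 / 2), which is exactly what the correction removes.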
A Few Remarks
The proposed estimators are maximizers of the corrected joint log likelihood
function
Variance components estimated separately in a side step.
Mis-specification allowed, like GEE
Result not sensitive to the B-spline approximation
Statistical inference based on sandwich variance estimator
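A generic sketch of a sandwich variance estimator, A^{-1} B A^{-1}, illustrated on ordinary least squares with heteroscedastic noise rather than on the joint model itself. The toy data and the function name `sandwich_variance` are assumptions for illustration only.

```python
import numpy as np

def sandwich_variance(scores, neg_hessian):
    """Return A^{-1} B A^{-1}: A is the summed negative Hessian, B the summed outer products of scores."""
    bread = np.linalg.inv(neg_hessian)
    meat = scores.T @ scores
    return bread @ meat @ bread

# Toy data (hypothetical): linear regression whose noise spread grows with |x|.
rng = np.random.default_rng(3)
n = 2000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(size=n) * (0.5 + np.abs(x))

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat
scores = X * residuals[:, None]        # per-subject score of the least-squares criterion
neg_hessian = X.T @ X                  # summed negative Hessian of the same criterion
robust_cov = sandwich_variance(scores, neg_hessian)
robust_se = np.sqrt(np.diag(robust_cov))
```

The same recipe applies to any estimator defined as the maximizer of a sum of per-subject terms, which is why it remains usable when the working model is misspecified.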
Theoretical Properties
The estimators of the unknown parameters are maximizers of the corrected
joint log likelihood
As sample size becomes large:
the estimator is consistent
the estimator is asymptotically normal
the corrected joint log likelihood is concave
These properties remain valid even when the random effects do not have
normal distribution or their variance matrix is misspecified (robust)
Simulation Results
We conducted extensive computer simulations to investigate the empirical
performance of the proposed method
Bias, variance, coverage of confidence interval: Good
Result not sensitive to number of knots of B-spline
The computation is much faster than competing methods based on
maximum likelihood
The algorithm is stable and always converges (a consequence of concavity)
Estimator expected to be less efficient than maximum likelihood based
methods, a trade-off for robustness
Simulation results, n = 250

Parameter          Bias, uncorrected (two-step)   Bias, proposed   CI coverage of proposed (%)
L1 = 1              0.00197                        0.00299         94.5
L2 = 2             -0.00370                       -0.00571         94.0
L3 = 1              0.00591                        0.00659         94.0
L4 = 0.5           -0.0104                        -0.0118          97.0
intercept = 0.5    -0.347                          0.0196          96.0
slope = 1          -0.471                          0.0552          95.5
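A sketch of how bias and confidence-interval coverage of this kind are computed from simulation replicates. The toy estimator here is a sample mean with n = 250, not the joint-model estimator, and the helper name `bias_and_coverage` is hypothetical.

```python
import numpy as np

def bias_and_coverage(estimates, std_errors, truth, z=1.96):
    """Empirical bias and percent coverage of the Wald intervals estimate +/- z * se."""
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    bias = estimates.mean() - truth
    covered = np.abs(estimates - truth) <= z * std_errors
    return bias, 100.0 * covered.mean()

rng = np.random.default_rng(4)
truth, n, n_rep = 1.0, 250, 2000                     # hypothetical simulation settings
samples = rng.normal(truth, 1.0, size=(n_rep, n))    # one row per simulated data set
est = samples.mean(axis=1)
se = samples.std(axis=1, ddof=1) / np.sqrt(n)
bias, coverage = bias_and_coverage(est, se, truth)
```

Coverage near the nominal 95% together with small bias is the pattern reported for the proposed method in the table.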
Parameter               Estimate   p-value
Longitudinal model
intercept                3.7       < 0.001
high dose                0.0012    0.94
high flux               -0.007     0.67
time (years)            -0.058     < 0.001
high dose x time        -0.014     0.311
high flux x time        -0.01      0.468
Monday / Tuesday        -0.026     0.017
Survival model
high dose               -0.061     0.5
high flux               -0.069     0.44
random intercept        -1.5       < 0.001
random slope            -3.7       < 0.001
[Figure: longitudinal profiles against Time]
Estimated baseline survival function and its 95% point-wise confidence interval
[Figure: smooth curve of Survival (%) against Years]
Summary
A new method for joint modeling
A general model that includes most published models as special cases
Theoretically appealing properties and reliable and easy computation
Robust against certain model mis-specification
Numerical integration methods other than the trapezoidal rule may be used (Poissonization is not essential)
Limitation:
Need at least three repeated measurements per subject
Trade efficiency for robustness, best for large sample size
[Figure: FEV1 profiles]
[Figure: fitted curves, with points labeled by number]
Replace subject-specific intercept or slope with time-dependent covariate
[Figure: true curve and estimated curve against Time]
Why varying coefficients: constant hazard ratios unlikely for surgical data
Use whatever method is suitable to fit each subject's longitudinal profile separately
Get estimated curve & its variation; do the measurement error correction
Deal with varying coefficients
Proposed Method
[Figure: panels plotted against Time]
Local linear method for the full likelihood of Cox model
[Figure: follow-up plotted against Time, with subjects marked as removed or artificially censored]
Our proposal differs from all previous methods in that it does not use the
partial likelihood (for exact correction)
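Full log likelihood of the Cox model:
\sum_{i=1}^{n} \left[ \delta_i X_i(Y_i)^T \beta(Y_i) - \int_0^{Y_i} \exp\{ X_i(t)^T \beta(t) \} \, dt \right]
Local version around t_0, with kernel K_h:
\sum_{i=1}^{n} \left[ K_h(Y_i - t_0) \, \delta_i X_i(Y_i)^T \beta(Y_i) - \int_0^{Y_i} K_h(t - t_0) \exp\{ X_i(t)^T \beta(t) \} \, dt \right]
With correction, the integral becomes:
\int_0^{Y_i} K_h(t - t_0) \exp\{ W_i(t)^T \beta(t) - \frac{1}{2} \beta(t)^T \Sigma(t) \beta(t) \} \, dt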
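A numerical sketch of the kernel-weighted integral with the exponential correction. It makes several assumptions that are ours, not the talk's: an Epanechnikov kernel, a coefficient vector and error covariance held constant inside the window, trapezoidal integration, and hypothetical values throughout.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel, zero outside [-1, 1]."""
    return 0.75 * np.clip(1.0 - u**2, 0.0, None)

def kernel_h(t, t0, h):
    """Scaled kernel K_h(t - t0) = K((t - t0) / h) / h."""
    return epanechnikov((t - t0) / h) / h

def trapezoid(y, x):
    """Trapezoidal rule for samples y on grid x."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def local_integral(W, beta, Sigma, t0, h, upper, n_grid=401):
    """Integrate K_h(t - t0) * exp{W(t)^T beta - beta^T Sigma beta / 2} over [0, upper]."""
    t = np.linspace(0.0, upper, n_grid)
    linear_part = W(t) @ beta                    # W(t) has one row per grid point
    correction = 0.5 * beta @ Sigma @ beta       # removes the bias caused by error in W
    return trapezoid(kernel_h(t, t0, h) * np.exp(linear_part - correction), t)

def estimated_covariates(t):
    """Hypothetical estimated covariate path: intercept column plus a linear trend."""
    return np.column_stack([np.ones_like(t), 0.1 * t])

beta = np.array([-1.0, 0.5])                     # hypothetical local coefficients
Sigma = np.diag([0.0, 0.04])                     # hypothetical error covariance of W
naive = local_integral(estimated_covariates, beta, np.zeros((2, 2)), t0=1.0, h=0.5, upper=2.0)
corrected = local_integral(estimated_covariates, beta, Sigma, t0=1.0, h=0.5, upper=2.0)
```

With a constant covariance the correction is a single multiplicative factor, exp(-\beta^T \Sigma \beta / 2); a time-varying \Sigma(t) would simply move that factor inside the integrand.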
References
Liang Li, Bo Hu, Tom Greene (2009) A semiparametric joint model for
longitudinal and survival data with application to hemodialysis study.
Biometrics, in press.
Liang Li. Semiparametric joint modeling of nonlinear time-dependent
covariate process and time to event outcome with varying coefficients.
Working paper.