
Joint Modeling of Longitudinal and Survival Data
Liang Li
Department of Quantitative Health Sciences
Cleveland Clinic

Presented at ASA North Illinois Chapter Spring Meeting, March 5 2009

Outline of the Talk


What is joint modeling of longitudinal & survival data?
The shared parameter model
The measurement error perspective
Our proposal
Why it works (theoretical properties)
How it works (empirical performance)
Extension and on-going work

Longitudinal Data

Each subject is followed over a period of time; a series of measurements is made.
e.g., after lung transplant, FEV1 is measured every week for a month, and every few months afterwards until the end of the study.

[Figure: FEV1 versus Months]

Survival Data
Time to event such as death, machine failure, disease relapse, PFS, etc.
Could be censored (partially observed)

[Figure: Kaplan-Meier curve, Survival (%) versus Time (Months), with death and censored observations marked]

Joint Modeling of Longitudinal and Survival Data
Question: how does the change in the (earlier) longitudinal profile of a
subject relate to the risk of the (later) survival event?
Example 1: Rate of change of glomerular filtration rate (GFR) & time to
end stage renal disease (ESRD) or death
Example 2: FEV1 & survival among cystic fibrosis patients
Widespread use & active research field, e.g., surrogate endpoints

Longitudinal Profile
longitudinal profile = signal + noise
Linear profile: subject-specific (random) intercept & slope
Relates intercept & slope to survival
Nonlinear profile: time-dependent covariate curve
Can we use raw data profile and avoid joint modeling?

[Figure: FEV1 versus Months]

Data Structure

Two-stage hierarchical model
Stage 1: subject-specific intercept & slope
Stage 2: longitudinal data (e.g., Linear Mixed Model) and survival data (e.g., Cox Model)
Longitudinal part and survival part are conditionally independent given
the subject-specific intercept and slope

Shared Parameter Model


Two-stage hierarchical structure suggests the shared parameter model
Review by Tsiatis & Davidian (2004), and Tseng, Hsieh, Wang (2005), Liu &
Ying (2007), among others
Almost all based on the following Fisher likelihood

$$\sum_{i=1}^{n} \log \int f(\text{long} \mid \text{int}, \text{slope})\, f(\text{surv} \mid \text{int}, \text{slope})\, f(\text{int}, \text{slope})\, d[\text{int}, \text{slope}]$$

Pros: maximum likelihood estimator
Cons: computationally intensive, distributional assumptions needed

A New Perspective

Can we use a two-step approach for the two-stage problem?
step 1: estimate the intercept and slope for each subject
step 2: relate them to survival

[Figure: true line and fitted line for one subject, plotted against Time]
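Step 1 of the two-step approach can be sketched as an ordinary least-squares fit per subject. This is only an illustration: the function name `subject_ols` and the simulated measurements are assumptions made here, not part of the talk.

```python
import numpy as np

def subject_ols(times, values):
    """Least-squares intercept and slope for one subject's measurements."""
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    # Design matrix with columns (1, time).
    design = np.column_stack([np.ones_like(times), times])
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    return coef  # array([intercept, slope])

# Hypothetical subject: true line 3.0 - 0.2 * t observed with noise.
rng = np.random.default_rng(0)
t = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5])
w = 3.0 - 0.2 * t + rng.normal(scale=0.1, size=t.size)
intercept_hat, slope_hat = subject_ols(t, w)
print(intercept_hat, slope_hat)
```

The fitted intercept and slope differ from the true ones because of the noise, which is what makes step 2 a regression on error-prone covariates.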

The Measurement Error Perspective

Do a regression of survival using true subject-specific intercepts and slopes
true intercept & slope unknown
estimated intercept & slope act as surrogates
measurement error may cause bias in regression

Measurement error causes attenuation in the regression Y ~ X, where
Y = b0 + b1 Z + e
X = Z + U

[Figure: Y plotted against Z (red) or X (blue)]
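The attenuation effect can be checked with a small simulation of the model Y = b0 + b1 Z + e, X = Z + U. The sample size, variances and seed below are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2009)
n = 20000
b0, b1 = 0.5, 1.0

z = rng.normal(0.0, 1.0, n)   # true covariate Z, variance 1
u = rng.normal(0.0, 1.0, n)   # measurement error U, variance 1
x = z + u                     # observed surrogate X = Z + U
y = b0 + b1 * z + rng.normal(0.0, 0.5, n)

def ols_slope(covariate, response):
    """Slope of the simple linear regression of response on covariate."""
    return np.cov(covariate, response)[0, 1] / np.var(covariate, ddof=1)

slope_true = ols_slope(z, y)    # regression on Z, close to b1
slope_naive = ols_slope(x, y)   # regression on X, shrunk towards zero
# Classical attenuation factor var(Z) / (var(Z) + var(U)) = 0.5 here.
attenuation = 1.0 / (1.0 + 1.0)
print(slope_true, slope_naive, b1 * attenuation)
```

With equal variances for Z and U, the naive slope is expected to be roughly half of b1.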

The Example
HEMO study: a clinical trial coordinated at Cleveland Clinic
2 by 2 design: standard or high dose of dialysis, low or high flux dialyzer
Neither treatment was found to significantly affect time to all-cause mortality
(Rocco et al 2004)
We want to study a secondary question: whether the decline of albumin levels
is a strong predictor of mortality
Challenge: albumin measurements need to be calibrated to remove artificial
differences due to variations in total body water.
Monday-Wednesday-Friday
Tuesday-Thursday-Saturday


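Model & Notation - Longitudinal Part

Longitudinal sub-model: linear mixed model

Stage 2:
$$W_{ij} = V_{ij}^T \alpha + D_{ij}^T \beta_i + \varepsilon_{ij}$$

Stage 1:
$$\beta_i = \Gamma X_i + \gamma_i$$

In the context of the application:

$$\text{ALB}_{ij} = \text{Mon/Tues}\,\alpha + \beta_{i0} + \text{Time}_{ij}\,\beta_{i1} + \text{Noise}_{ij}$$

$$\beta_{i0} = \text{Intercept}_0 + \text{Dose}_i A_0 + \text{Flux}_i B_0 + \gamma_{i0}$$

$$\beta_{i1} = \text{Intercept}_1 + \text{Dose}_i A_1 + \text{Flux}_i B_1 + \gamma_{i1}$$

We start with $\gamma_i \sim N(0, \Sigma)$ and show later that the conclusion holds even when this assumption is dropped.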

Model & Notation - Survival Part

Survival sub-model: Cox proportional hazard model
T: time to event
C: time to censoring
Y = min(T, C)
$\delta = 1\{T < C\}$

Log hazard function (Stage 2):

$$h(t; Z_i, \gamma_i) = h_0(t) + Z_i^T a_1 + \gamma_i^T a_2$$

In the context of the example:

$$h(t; Z_i, \gamma_i) = h_0(t) + \text{Dose}_i\, a_{11} + \text{Flux}_i\, a_{12} + \gamma_{i0}\, a_{20} + \gamma_{i1}\, a_{21}$$

The proposed model includes as special cases the models considered by
Wang (2006), Ratcliffe et al (2004), Hsieh, Tseng, Wang (2006), Tsiatis &
Davidian (2004), among others

Poissonization of Cox Model

Step 1: use B-spline to approximate log baseline hazard

$$h_0(t) \approx \sum_{k=1}^{K} a_{0k}\, \phi_k(t)$$

Step 2: use full likelihood of Cox model, not partial likelihood

$$\delta_i\, U_i(Y_i)^T \theta - \int_0^{Y_i} \exp\{U_i(t)^T \theta\}\, dt$$

Step 3: use trapezoidal rule for numerical integration
Finally: we can fit a Cox model using Poisson regression
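The link between the trapezoidal rule and Poisson regression can be sketched for a single subject. The grid size, the function names and the example log hazard below are assumptions; the point is only that the discretized full log likelihood equals a Poisson log likelihood with offset log(c_g), up to a constant that does not involve the hazard parameters.

```python
import numpy as np

def trapezoid_weights(grid):
    """Weights c_g such that sum(c_g * f(grid_g)) is the trapezoidal rule."""
    grid = np.asarray(grid, dtype=float)
    steps = np.diff(grid)
    weights = np.zeros_like(grid)
    weights[:-1] += steps / 2.0
    weights[1:] += steps / 2.0
    return weights

def cox_full_loglik(log_hazard, follow_up, event, n_grid=50):
    """event * log hazard at follow_up minus the approximated cumulative hazard."""
    grid = np.linspace(0.0, follow_up, n_grid)
    c = trapezoid_weights(grid)
    cumulative_hazard = np.sum(c * np.exp(log_hazard(grid)))
    return event * log_hazard(follow_up) - cumulative_hazard

def poisson_loglik(log_hazard, follow_up, event, n_grid=50):
    """Poisson log likelihood with pseudo-responses Y_g and offset log(c_g)."""
    grid = np.linspace(0.0, follow_up, n_grid)
    c = trapezoid_weights(grid)
    y = np.zeros(n_grid)
    y[-1] = event                      # only the last grid point can carry the event
    eta = log_hazard(grid) + np.log(c)
    return np.sum(y * eta - np.exp(eta))

# Hypothetical log hazard, linear in time.
log_hazard = lambda t: -1.0 + 0.3 * t
full = cox_full_loglik(log_hazard, follow_up=2.0, event=1)
pois = poisson_loglik(log_hazard, follow_up=2.0, event=1)
print(full, pois, pois - full)
```

The difference `pois - full` is `event * log(c_M)`, which depends on the grid only, so maximizing either function over the hazard parameters gives the same answer.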

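True Likelihood

The joint log likelihood (for one subject), with $\Omega$ denoting all unknown parameters

Longitudinal:

$$LL_i(\Omega) = -\frac{n_i}{2} \log(2\pi\sigma^2) - \frac{\| W_i - V_i \alpha - D_i \beta_i \|^2}{2\sigma^2}$$

Survival/Poisson:

$$LS_i(\Omega) = \sum_{g=0}^{M_i} \left[ Y_{ig} \left\{ U_{1ig}^T \theta_1 + \beta_i^T \theta_2 - \theta_2^T \Gamma X_i + \log(c_{ig}) \right\} - \exp\left\{ U_{1ig}^T \theta_1 + \beta_i^T \theta_2 - \theta_2^T \Gamma X_i + \log(c_{ig}) \right\} \right]$$

Stage 1:

$$LM_i(\Omega) = -\frac{q}{2} \log(2\pi) - \frac{1}{2} \log|\Sigma| - \frac{1}{2} (\beta_i - \Gamma X_i)^T \Sigma^{-1} (\beta_i - \Gamma X_i)$$

Key observation: $\beta_i$ appears in linear, quadratic or exponential terms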

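Corrected Likelihood

$$\hat\beta_i = (D_i^T D_i)^{-1} D_i^T (W_i - V_i \alpha)$$

$$\hat\beta_i \mid \beta_i \sim N\left(\beta_i,\ \sigma^2 (D_i^T D_i)^{-1}\right)$$

From linear model theory, $\hat\beta_i$ is a measurement of $\beta_i$

If $W \sim N(X, \sigma_u^2)$ then,

Term          In terms of X ($\beta_i$)    Corrected version in W ($\hat\beta_i$)
Linear        $X_i$                        $W_i$
Quadratic     $X_i^2$                      $W_i^2 - \sigma_u^2$
Exponential   $\exp(X_i)$                  $\exp(W_i - \frac{1}{2}\sigma_u^2)$

Do correction to the joint log likelihood (formula omitted)

$$\sum_{i=1}^{n} LL_i(\Omega) + \sum_{i=1}^{n} LS_i(\Omega) + \sum_{i=1}^{n} LM_i(\Omega)$$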
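The three substitutions for W ~ N(X, sigma_u^2) can be checked numerically: each corrected term has expectation equal to the corresponding term in X. The values of X, sigma_u, the seed and the number of draws are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(16)
x_true = 0.7        # unobserved X
sigma_u = 0.5       # standard deviation of the measurement error
w = rng.normal(x_true, sigma_u, size=200_000)   # repeated draws of W

linear = w.mean()                                 # estimates X
quadratic = (w**2 - sigma_u**2).mean()            # estimates X^2
exponential = np.exp(w - sigma_u**2 / 2).mean()   # estimates exp(X)

# Without the correction, the quadratic term is biased upward by sigma_u^2.
naive_quadratic = (w**2).mean()

print(linear, x_true)
print(quadratic, x_true**2, naive_quadratic)
print(exponential, np.exp(x_true))
```

The same idea, applied term by term to a log likelihood in which the unknown appears only linearly, quadratically or inside an exponential, yields a function of the observed W whose expectation matches the uncorrected function of X.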

A Few Remarks
The proposed estimators are maximizers of the corrected joint log likelihood
function
Variance components estimated separately in a side step.
Mis-specification allowed, like GEE
Result not sensitive to the B-spline approximation
Statistical inference based on sandwich variance estimator


Summary on Proposed Method


Key idea: find a corrected joint log likelihood that looks like the true joint
log likelihood with the unknowns eliminated
This is possible because the unknowns reside in linear, quadratic or
exponential terms (Li and Greene, Biometrics 2008)
Combine three pieces of log likelihood together, similar in spirit to the h-likelihood (1996), but different from the classical Fisher likelihood (1922)
Compared with Wang (2006, Stat Sinica), our method offers:
a more general model (unknown parameters in both sub-models), including most published models as special cases
exact correction with full likelihood instead of approximate correction with partial likelihood
concave likelihood (next page)

Theoretical Properties
The estimators of the unknown parameters are maximizers of the corrected
joint log likelihood
As sample size becomes large:
the estimator is consistent
the estimator is asymptotically normal
the corrected joint log likelihood is concave
These properties remain valid even when the random effects do not have
normal distribution or their variance matrix is misspecified (robust)


Simulation Results
We conducted extensive computer simulations to investigate the empirical
performance of the proposed method
Bias, variance, coverage of confidence interval: Good
Result not sensitive to number of knots of B-spline
The computation is much faster than competing methods based on
maximum likelihood
The algorithm is stable and always converges (concavity)
Estimator expected to be less efficient than maximum likelihood based
methods, a trade-off for robustness


Simulation results, n = 250

Parameter         Bias, uncorrected (two-step)   Bias, proposed   CI coverage of proposed (%)
L1 = 1            0.00197                        0.00299          94.5
L2 = 2            -0.00370                       -0.00571         94.0
L3 = 1            0.00591                        0.00659          94.0
L4 = 0.5          -0.0104                        -0.0118          97.0
intercept = 0.5   -0.347                         0.0196           96.0
slope = 1         -0.471                         0.0552           95.5

Application to HEMO Study Data

1628 patients with between 3 and 15 repeated measurements

Parameter            Estimator   p-value
Longitudinal sub-model
intercept            3.7         < 0.001
high dose            0.0012      0.94
high flux            -0.007      0.67
time (years)         -0.058      < 0.001
high dose by time    -0.014      0.311
high flux by time    -0.01       0.468
Monday / Tuesday     -0.026      0.017
Survival sub-model
high dose            -0.061      0.5
high flux            -0.069      0.44
random intercept     -1.5        < 0.001
random slope         -3.7        < 0.001

[Figure: two trajectories plotted against Time, one with smaller slope (-0.4) and one with larger slope (-0.2)]

Estimated baseline survival function and its 95% point-wise confidence interval

[Figure: Survival (%) versus Years, showing the smooth curve and the step function from partial likelihood]

Summary
A new method for joint modeling
A general model that includes most published models as special case
Theoretically appealing properties and reliable and easy computation
Robust against certain model mis-specification
May use other methods than Trapezoidal rule (Poissonization is not inevitable)
Limitation:
Need at least three repeated measurements per subject
Trade efficiency for robustness, best for large sample size


Nonlinear Longitudinal Data

In a lung transplant study at Cleveland Clinic, investigators want to use FEV1 profile after lung transplant to predict mortality
The profile is clearly nonlinear

[Figure: mean FEV1 trajectory, subject-clustering ignored; FEV1 versus months after transplant]

Subject-Specific Fitted Curves

[Figure: fitted curves for individual subjects, each labeled with its subject ID, versus months after transplant]

Want error correction?
Replace subject-specific intercept or slope with time-dependent covariate

[Figure: true curve and estimated curve plotted against Time]
Proposed Model & Method

Cox model with time-dependent covariate and time-dependent hazard ratios (varying coefficients)

$$h_i(t; X(t)) = \exp\{\theta_0(t) + \theta_1(t)^T X_i(t)\}$$

$$W_i(t) = X_i(t) + \varepsilon_i(t)$$

Constant-coefficient counterpart:

$$h_i(t; X(t)) = \exp\{\theta_0(t) + \theta_1^T X_i(t)\}$$

Why varying coefficients: constant hazard ratios unlikely for surgical data
Use whatever method to fit each subject's longitudinal profile separately
Get estimated curve & its variation; do the measurement error correction
Deal with varying coefficients

Proposed Method

Local linear method: estimate the curves piece by piece at local neighborhoods.

[Figure: two panels of curves plotted against Time]
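The piece-by-piece idea can be sketched with a kernel-weighted least-squares line fitted around a target point t0. This sketch smooths an ordinary regression curve, not the Cox local likelihood of the talk; the Epanechnikov kernel, the bandwidth and the function name `local_linear` are assumptions.

```python
import numpy as np

def local_linear(t0, times, values, bandwidth):
    """Kernel-weighted least-squares line around t0; returns (level, slope) at t0."""
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    scaled = (times - t0) / bandwidth
    # Epanechnikov kernel: positive weight only inside the local neighborhood.
    weights = np.where(np.abs(scaled) <= 1, 0.75 * (1 - scaled**2), 0.0)
    # Local design: intercept is the curve level at t0, second column gives the slope.
    design = np.column_stack([np.ones_like(times), times - t0])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(design * sw[:, None], values * sw, rcond=None)
    return coef[0], coef[1]

# Hypothetical nonlinear curve evaluated on a grid of target points.
t = np.linspace(0.0, 10.0, 201)
curve = np.sin(t)
fitted = [local_linear(t0, t, curve, bandwidth=0.5)[0] for t0 in (2.0, 4.0, 6.0)]
print(fitted)
```

Repeating the local fit over a grid of t0 values traces out the whole curve; a smaller bandwidth follows the curve more closely at the cost of using fewer points per fit.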

Proposed Method
Local linear method for the full likelihood of Cox model

[Figure: two panels plotted against Time; observations outside the local neighborhood are marked as removed or artificially censored]

Our proposal differs from all previous methods in that we did not use
partial likelihood (for exact correction)

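The Evolution of Likelihoods

Cox log likelihood:

$$\sum_{i=1}^{n} \left[ \delta_i X_i(Y_i)^T \theta(Y_i) - \int_0^{Y_i} \exp\{X_i(t)^T \theta(t)\}\, dt \right]$$

Cox local likelihood:

$$\sum_{i=1}^{n} \left[ K_h(Y_i - t_0)\, \delta_i X_i(Y_i)^T \theta(Y_i) - \int_0^{Y_i} K_h(t - t_0) \exp\{X_i(t)^T \theta(t)\}\, dt \right]$$

with correction:

$$\sum_{i=1}^{n} \left[ K_h(Y_i - t_0)\, \delta_i W_i(Y_i)^T \theta(Y_i) - \int_0^{Y_i} K_h(t - t_0) \exp\left\{W_i(t)^T \theta(t) - \frac{1}{2} \theta(t)^T \Sigma(t)\, \theta(t)\right\} dt \right]$$

with local linear approx.: replace $\theta(t)$ by intercept + slope t

under construction ... ...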

References

Liang Li, Bo Hu, Tom Greene (2009) A semiparametric joint model for
longitudinal and survival data with application to hemodialysis study.
Biometrics, in press.
Liang Li. Semiparametric joint modeling of nonlinear time-dependent
covariate process and time to event outcome with varying coefficients.
Working paper.
