Professional Documents
Culture Documents
Stata,
especially with continuous
predictors
di t
Patrick Royston & Willi Sauerbrei
UK Stata Users meeting,
g, London,, 13-14 September
p
2012
The simplest
p
type
yp of interaction:
Binary x binary
E
E.g.
g in the MRC RE01
trial in kidney cancer
12 month % survival
since randomisation
Substantial treatment
effect in patients with
low white cell count
Little or no treatment
effect
ff t in
i th
those with
ith
high white cell count
But really,
y, white cell
count is a continuous
variable
Treatment
group
White cell
count low
(<=10)
White cell
count high
(>10)
MPA
34% (se 4)
24% (se 4)
Interferon
49% (se 4)
21% (se 7)
Overview
Fitting linear interaction models in Stata
General case: analyzing interactions between
continuous covariates in observational studies
Focus on continuous covariates
Maximize power
People may not know how to handle them
Special case: analyzing interactions between
treatment and continuous covariates in
randomized controlled trials
Number of obs
F( 3,
343)
Prob
b > F
R-squared
Adj R-squared
Root MSE
=
=
=
=
=
=
347
7.44
0.0001
1
0.0611
0.0529
15.947
-----------------------------------------------------------------------------_t |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------1
1.trt
|
12
12.81405
81405
4
4.124167
124167
3.11
3 11
0.002
0 002
4.702208
4 702208
20
20.92589
92589
wcc | -.2867831
.2741174
-1.05
0.296
-.8259457
.2523796
|
trt#c.wcc |
1 | -1.034239
1 034239
.4327233
4327233
-2.39
2 39
0
0.017
017
-1.885365
1 885365
-.1831142
1831142
|
5
_cons |
14.45292
2.712383
5.33
0.000
9.117919
19.78791
------------------------------------------------------------------------------
-20
-10
Fitted values
F
0
10
1
20
trt 0 (MPA)
20
40
x9: white cell count (per l x 10^-9)
60
Number of obs
F( 3,
343)
Prob > F
R-squared
Adj R-squared
Root MSE
=
=
=
=
=
=
347
10.35
0.0000
0.0830
0.0750
15.76
-----------------------------------------------------------------------------_t |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------age |
.0719063
.0876542
0.82
0.413
-.1005011
.2443137
t_mt |
.0659781
.0128802
5.12
0.000
.040644
.0913122
|
c.age#c.t_mt | -.0008783
.0001861
-4.72
0.000
-.0012443
-.0005124
|
_cons |
8.055213
5.256114
1.53
0.126
-2.28306
18.39349
-----------------------------------------------------------------------------8
Continuous x continuous
interactions
10
Motivation
Many people only consider linear by linear
interactions
Not sensible if main effect of either variable is
non-linear
Mi
Mismodelling
d lli
th
the main
i effect
ff t may iintroduce
t d
spurious interactions
E
E.g. ffalse
l assumption
ti
off li
linearity
it can create
t
a spurious linear x linear interaction
Or,
O people
l categorise
i the
h continuous
i
variables
i bl
Many problems, including loss of power
11
MFPIgen
12
13
((-2
2, 2)
(-2, -2)
(-2, -1)
14
15
16
Example: Whitehall 1
Prospective cohort study of 17
17,260
260 Civil
Servants in London
Studied various standard risk factors for
common causes of death
Also studied social factors,
factors particularly job
grade
e consider
co s de 10-year
0 yea all-cause
a cause mortality
o ta ty as the
t e
We
outcome
Logistic
g
regression
g
analysis
y
17
18
-4
4
-3
-2
-1
age by wt
40
60
80
100
x7: weight (kg)
fit o
on wt a
at age 40
0
fit on wt at age 60
120
140
fit o
on wt a
at age 50
19
-2
-3
-4
Logit(pr(dea
ath))
-1
1st quartile
3rd quartile
40
60
80
100
weight
120
140
22
23
24
Whitehall 1: 7 variables,
variables any interactions?
FP2(-2 -2)
age
Linear
3.1169
0.2105
(
(remaining
i i
output
t t omitted)
itt d)
25
26
28
-5
-3
40
45
.25
10
50
age
55
60
65
.15
.1
-1
0
.05
.2
-2
Pr(death)
-4
Logit(pr(de
eath))
-5
-3
.05
.1
.15
.2
-2
Pr(death
h)
-4
Logit(pr(dea
L
ath))
.25
-1
chol
15
0
40
5
chol
45
10
50
age
55
15
60
65
29
0
-2
-4
-6
-8
-8
0
2.5
7.5
10
12.5
2.5
7.5
10
12.5
0
--2
-4
-6
-8
-6
-4
--2
-8
Logit(pr(death))
-6
-4
-2
2.5
7.5
10
12.5
chol
2.5
7.5
10
12.5
30
31
MFPI in Stata
MFPI is implemented as a user command,
command
mfpi
mfpi is available on SSC
Details are given by Royston & Sauerbrei,
Stata Journal 9(2): 230-251
230 251 (2009)
Program was updated in 2012 to support
acto variables
a ab es
factor
33
34
35
1.0
00
0.5
50
0.2
25
0.0
00
Prop
portion a
alive
0.7
75
(1) MPA
(2) Interferon
At risk 1:
175
55
22
11
At risk 2:
172
73
36
20
12
24
36
48
60
72
Follow-up (months)
36
. mfpi,
p , linear(wcc) fp1(wcc)
p
fp2(wcc)
p
with(trt)
gendiff(d): stcox
[treating trt as a factor variable, i.trt]
Interactions with i.trt (347 observations). Flex-1 model (least flexible)
------------------------------------------------------------------------------Var
Main
Interact
idf
Chi2
P
Deviance tdf
AIC
------------------------------------------------------------------------------wcc
Linear
Linear
1
8.13
0.0043 3186.561 3 3192.561
wcc
FP1(2)
FP1(2)
1
5.62
0.0178 3187.954 4 3195.954
wcc
FP2(-.5 1) FP2(-.5 1)
2
8.19
0.0166 3185.237 7 3199.237
------------------------------------------------------------------------------idf = interaction degrees of freedom; tdf = total model degrees of freedom
. mfpi_plot
mfpi plot wcc, vn(3)
[using variables created by gendiff(d)]
37
-2
Treatment effect
-1
0
1
10
15
x9: white cell count (per l x 10^-9)
20
25
38
S(t)
trt = MPA
0.00 0
0.25 0.50 0.75 1
1.00
0.00 0
0.25 0.50 0.75 1
1.00
trt = Interferon-alpha
39
Concluding remarks
mfpigen and mfpi should help researchers
detect, model and visualize interactions with
continuous covariates
Usually, we are searching for interactions, so
small P-values are required
q
Other methods not considered
S
STEPP mainly
a y graphical
g ap ca
40
Thank you
you.
41