
Chilean Journal of Statistics (ChJS)
www.soche.cl/chjs
ISSN 0718-7912 (print) / 0718-7920 (online version)


Dear author:
Please find attached the proofs of your article.
You can submit your corrections via e-mail to chjs.editor@uv.cl,
Cc to victor.leiva@uv.cl, or by fax to (56)(32)2508322.
Please always indicate the line number to which the correction refers.
You can insert your corrections in the proof PDF and email the annotated
PDF to us.
For fax submission or scanned corrections sent by email, please ensure that they are
clearly legible. Use a fine black pen and write the correction in the margin, not too
close to the edge of the page.
Remember to indicate the article title, article number, and your name when
sending your response via e-mail or fax.
Check the questions that may have arisen during copy editing and insert your
answers/corrections.
Check that the text is complete and that all figures, tables, and their legends are
included. Also check the accuracy of special characters, equations, and electronic
supplementary material if applicable. If necessary, refer to the edited manuscript.
Please take particular care that all such details are correct.
Please do not make changes that involve only matters of style. We have followed
ChJS style. Substantial changes in content, e.g., new results, corrected values, title
and authorship, are not allowed without the approval of the responsible editor. In
such a case, please contact the Editorial Office and return his/her consent together
with the proof.
Your article will be published Online First (iFirst) approximately one week after
receipt of your corrected proofs. This is the official first publication. Further
changes are, therefore, not possible.
The printed version will follow in a forthcoming issue.
Please note that after online publication, subscribers (personal/institutional) to this journal
will have access to the complete article using the URL: http://www.soche.cl/chjs.
Kind regards,
Editorial Office
Chilean Journal of Statistics
Journal: Chilean Journal of Statistics
Article: ChJS Vol. 03 No. 01 Art. 05
Author Query Form
Please ensure you fill out your response to the queries raised below and
return this form along with your corrections.
Dear author:
During the typesetting process of your article, the following queries have arisen. Please check
your typeset proof carefully against the queries listed below and mark the necessary changes
either directly on the proof/online grid or in the Author's response area provided below.

Query | Details required | Author's response
1 | Please send the corresponding author's mailing address. |
2 | Line 142: Cook (1982) is not in the reference list. Please add or remove. |
3 | Line 444: Reference Henderson (1975) is not cited. Please cite or remove. |
4 | Line 484: Reference Wei et al. (1998) is not cited. Please cite or remove. |
Chilean Journal of Statistics
Vol. 3, No. 1, April 2012, 1-18
UNCORRECTED PROOFS
Statistical Modeling
Research Paper
On linear mixed models and their influence diagnostics applied to an actuarial problem
Luis Gustavo Bastos Pinho, Juvêncio Santos Nobre* and Sílvia Maria de Freitas
Department of Applied Mathematics and Statistics, Federal University of Ceará, Fortaleza, Brazil
(Received: 10 June 2011 · Accepted in final form: 23 September 2011)
Abstract
In this paper, we motivate the use of linear mixed models and diagnostic analysis in practical actuarial problems. Linear mixed models are an alternative to traditional credibility models. Frees et al. (1999) showed that some mixed models are equivalent to some widely used credibility models. The main advantage of linear mixed models is the use of diagnostic methods. These methods may help to improve the model choice and to identify outliers or influential subjects which deserve better attention by the insurer. As an application example, the well known data set in Hachemeister (1975) is modeled by a linear mixed model. We conclude that this approach is superior to the traditional credibility one, since the former is more flexible and allows the use of diagnostic methods.
Keywords: Credibility models · Hachemeister model · Linear mixed models · Diagnostics · Local influence · Residual analysis.
Mathematics Subject Classification: 62J05 · 62J20.
1. Introduction

One of the main concerns in actuarial science is to predict the future behavior of the aggregate amount of claims of a certain contract based on its past experience. By accurately predicting the severity of the claims, the insurer is able to provide a fairer and thus more competitive premium.
Statistical analysis in actuarial science generally belongs to the class of repeated measures studies, where each subject may be observed more than once. By subject we mean each element of the observed set which we want to investigate. Workers of a company, classes of employees, and different states are possible examples of subjects in actuarial science. To model actuarial data, a large variety of statistical models can be used, but it is usually difficult to choose a model due to the data structure, in which within-subject correlation is often seen. Correlation misspecification may lead to erroneous analysis; in some cases this error is very severe. A clear example may be seen in Demidenko (2004, pp. 2-3) and a similar artificial situation is reproduced in Figure 1, which shows the relation between the number of claims and the number of policy holders of an insurer for nine different regions within a country. In each region the two variables are measured once a year on the same day for three consecutive years.
*Corresponding author. Email: juvencio@ufc.br
ISSN: 0718-7912 (print) / ISSN: 0718-7920 (online)
© Chilean Statistical Society - Sociedad Chilena de Estadística
http://www.soche.cl/chjs
In Figure 1(a) we do not consider the within-region (within-subject) correlation. The dashed line is a simple linear regression and suggests that the more policy holders there are, the fewer claims occur. In Figure 1(b) we join the observations for each region by a solid line. It is now clear that the number of claims increases with the number of policy holders.
[Figure 1 about here: two scatter plots of the number of claims against the number of policy holders (thousands) for the nine regions.]
Figure 1. (a) Not considering the within-subject correlation, (b) considering the within-subject correlation.
It is necessary to take into consideration that each region may have a particular behavior which should be modeled, but this alone is usually not enough. Techniques summarized under the name of diagnostic procedures may help to identify issues of concern, such as highly influential observations, which may distort the analysis. For linear homoskedastic models, a well known diagnostic procedure is the residual plot. For linear mixed models, better types of residuals are defined. Besides residual techniques, which are useful, there is a less used class of diagnostic procedures, which includes case deletion and measuring changes in the likelihood of the adjusted model under minor perturbations. Several important issues may not be noticed without the aid of these last diagnostic methods.
For introductory information regarding regression models and the respective diagnostic analysis, see Cook and Weisberg (1982) or Draper and Smith (1998). For a comprehensive introduction to linear mixed models, see Verbeke and Molenberghs (2000), McCulloch and Searle (2001) and Demidenko (2004). Diagnostic analyses of linear mixed models were presented and discussed in Beckman et al. (1987), Christensen and Pearson (1992), Hilden-Minton (1995), Lesaffre and Verbeke (1998), Banerjee and Frees (1997), Tan et al. (2001), Fung et al. (2002), Demidenko (2004), Demidenko and Stukel (2005), Zewotir and Galpin (2005), Gumedze et al. (2010) and Nobre and Singer (2007, 2011).
The seminal work of Frees et al. (1999) showed some similarities and equivalences between mixed models and some well known credibility models. Applications to data sets in an actuarial context may be seen in Antonio and Beirlant (2006). Our contribution is to show how to use diagnostic methods for linear mixed models applied to actuarial science. We illustrate how to identify outliers and influential observations and subjects. We also show how to use diagnostics as a tool for model selection. These methods are very important and usually overlooked by most actuaries.
This paper is divided as follows. In Section 2 we present a motivational example using a well known data set. In Section 3 we briefly present linear mixed models. Section 4 contains a short introduction to the diagnostic methods used in the example. In Section 5 we present an application based on the motivational example. Section 6 shows some conclusions. Finally, in an Appendix, we present mathematical details of some formulas and expressions used in the text.
2. Motivational Example

For a practical example, consider the Hachemeister (1975) data on private passenger bodily injury insurance. The data were collected from five states (subjects) in the US, over twelve trimesters between July 1970 and June 1973, and show the mean claim amount and the total number of claims in each trimester. The data may be found in the actuar package (see Dutang et al., 2008) for R (R Development Core Team, 2009) and are partially shown in Table 1; a short R sketch for loading them follows the table.
Table 1. Hachemeister's data.
Trimester | State | Mean claim amount | Number of claims
1  | 1 | 1738 | 7861
1  | 2 | 1364 | 1622
1  | 3 | 1759 | 1147
1  | 4 | 1223 | 407
1  | 5 | 1456 | 2902
2  | 1 | 1642 | 9251
...
12 | 1 | 2517 | 9077
12 | 2 | 1471 | 1861
12 | 3 | 2059 | 1121
12 | 4 | 1306 | 342
12 | 5 | 1690 | 3425
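The sketch below shows one way to load these data in R and reshape them to the long layout of Table 1. The column names ratio.1-ratio.12 and weight.1-weight.12 are an assumption about the wide format shipped with the actuar package; adjust them if your version differs.

```r
## Load Hachemeister's data from actuar and reshape to the layout of Table 1.
library(actuar)
data(hachemeister)                       # 5 states in wide format (assumed layout)
wide <- as.data.frame(hachemeister)

long <- data.frame(
  trimester = rep(1:12, each = nrow(wide)),
  state     = factor(rep(wide$state, times = 12)),
  claim     = as.vector(as.matrix(wide[, paste0("ratio.",  1:12)])),   # mean claim amount
  nclaims   = as.vector(as.matrix(wide[, paste0("weight.", 1:12)]))    # number of claims
)
head(long)
```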
In Figure 2 we plot the individual profiles for each state and the mean profile. The figure suggests that the claims behave differently along the trimesters for each state. One may notice that the claims from state 1 are greater than those from the other states for almost every observation, and the claims from states 2 and 3 seem to grow more slowly than those from state 1. If the insurer wants to accurately predict the severity, the subjects' individual behavior must also be modeled. Traditionally this is possible with the aid of credibility models; see, e.g., Bühlmann (1967), Hachemeister (1975) and Dannenburg et al. (1996). These models assign weights, known as credibility factors, to a pair of different estimates of severity.
[Figure 2 about here: average claim amount per trimester for states 1 to 5 and the overall average.]
Figure 2. Individual profiles and mean profile for Hachemeister (1975) data.
Credibility models may be functionally defined as
$$A = ZB + (1 - Z)C,$$
where $A$ represents the severity in a given state, $Z$ is a credibility factor restricted to $[0, 1]$, $B$ is an a priori estimate of the expected severity for the same state and $C$ is an a posteriori estimate, also of the expected severity. Considering a particular state, $B$ may be equal to the sample mean of the severity of its observations and $C$ equal to the overall sample mean of the data in the same period.
Frees et al. (1999) showed that it is possible to find linear mixed models equivalent to some known credibility models, such as the Bühlmann (1967) and Hachemeister (1975) models. Information about linear mixed models is provided in the next section.
3. Linear Mixed Models

Linear mixed models are a popular alternative for the analysis of repeated measures. Such models may be functionally expressed as
$$y_i = X_i\beta + Z_i b_i + e_i, \quad i = 1, \ldots, k, \qquad (1)$$
where $y_i = (y_{i1}, y_{i2}, \ldots, y_{in_i})^\top$ is an $n_i \times 1$ vector of the observed values of the response variable for the $i$th subject, $X_i$ is an $n_i \times p$ known full rank matrix, $\beta$ is a $p \times 1$ vector of unknown parameters, also known as fixed effects, which are used to model $\mathrm{E}[y_i]$, $Z_i$ is an $n_i \times q$ known full rank matrix, $b_i$ is a $q \times 1$ vector of latent variables, also known as random effects, used to model the within-subject correlation structure, and $e_i = (e_{i1}, e_{i2}, \ldots, e_{in_i})^\top$ is the $n_i \times 1$ random vector of (within-subject) measurement errors. It is usually also assumed that $e_i \overset{\mathrm{ind}}{\sim} N_{n_i}(0, \sigma^2 I_{n_i})$, where $I_{n_i}$ denotes the identity matrix of order $n_i$, for $i = 1, \ldots, k$, that $b_i \overset{\mathrm{iid}}{\sim} N_q(0, \sigma^2 G)$, for $i = 1, \ldots, k$, in which $G$ is a $q \times q$ positive definite matrix, and that $e_i$ and $b_j$ are independent for all $i, j$. Under these assumptions, this is called a homoskedastic conditional independence model. It is possible to rewrite the model given in Equation (1) in a more concise way as
$$y = X\beta + Zb + e, \qquad (2)$$
where $y = (y_1^\top, \ldots, y_k^\top)^\top$, $X = (X_1^\top, \ldots, X_k^\top)^\top$, $Z = \bigoplus_{i=1}^{k} Z_i$, $b = (b_1^\top, \ldots, b_k^\top)^\top$ and $e = (e_1^\top, \ldots, e_k^\top)^\top$, with $\bigoplus$ representing the direct sum.
It can be shown that, conditionally on the known covariance parameters of the model, that is, conditionally on the elements of $G$ and $\sigma^2$, the best linear unbiased estimator (BLUE) for $\beta$ and the best linear unbiased predictor (BLUP) for $b$ are given by
$$\hat{\beta} = (X^\top V^{-1} X)^{-1} X^\top V^{-1} y, \qquad (3)$$
and
$$\hat{b} = D Z^\top V^{-1} (y - X\hat{\beta}),$$
respectively, where $D = \sigma^2 G$ and $V = \sigma^2(I_n + Z G Z^\top)$, with $n = \sum_{i=1}^{k} n_i$; see Henderson (1975).
Maximum likelihood (ML) and restricted maximum likelihood (RML) methods can be used to estimate the variance components of the model. The latter, proposed in Patterson and Thompson (1971), is usually chosen since it often generates less biased estimators of the variance structure. When estimates for $V$ are used in Equation (3) to obtain $\hat{\beta}$ and $\hat{b}$, these are called the empirical BLUE (EBLUE) and empirical BLUP (EBLUP), respectively. Usually the estimation of the parameters involves the use of iterative methods for maximizing the likelihood function.
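As a sanity check, the EBLUE and EBLUP of Equation (3) can be reproduced by hand from an lme4 fit. The sketch below assumes the long data frame built in the Section 2 sketch and the random intercept model used later in Section 5; only the getME(), sigma() and fixef() extractors are lme4 functionality, the rest is Equation (3) written out.

```r
## Hand-computed EBLUE/EBLUP versus lme4, for a random-intercept fit (assumed data 'long').
library(lme4)
library(Matrix)

fit <- lmer(claim ~ trimester + (1 | state), data = long, REML = TRUE)

X  <- getME(fit, "X"); Z <- getME(fit, "Z"); y <- getME(fit, "y")
s2 <- sigma(fit)^2
D  <- s2 * crossprod(getME(fit, "Lambdat"))           # Var(b) = sigma^2 G
V  <- Z %*% D %*% t(Z) + s2 * Diagonal(length(y))     # V = sigma^2 (I + Z G Z')
Vi <- solve(V)

beta_hat <- solve(t(X) %*% Vi %*% X, t(X) %*% Vi %*% y)    # EBLUE, Equation (3)
b_hat    <- D %*% t(Z) %*% Vi %*% (y - X %*% beta_hat)     # EBLUP
cbind(lme4 = fixef(fit), by.hand = as.vector(beta_hat))    # the two should agree
```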
Linear mixed models are not the only way to deal with repeated measures studies. Other popular alternatives are generalized estimating equations (see Liang and Zeger, 1986; Diggle et al., 2002) and multivariate models as seen in Johnson and Wichern (1982) and Vonesh and Chinchilli (1997). However, these alternatives are usually more restrictive than linear mixed models, and they only model the marginal expected value of the response variable.
4. Diagnostic Methods

Diagnostic methods comprise techniques whose purpose is to investigate the plausibility and robustness of the assumptions made when choosing a model. It is possible to divide the techniques shown here into two classes: residual analysis, which investigates the assumptions on the distribution of the errors and the presence of outliers; and sensitivity analysis, which analyzes the sensitivity of a statistical model when subjected to minor perturbations. Usually, it would be far more difficult, or even impossible, to observe these aspects in a traditional credibility model.
In the context of traditional linear models (homoskedastic and independent), examples of diagnostic methods may be seen in Hoaglin and Welsch (1978), Belsley et al. (1980) and Cook (1982). Linear mixed models, extensions and generalizations are briefly discussed here and may be seen in Beckman et al. (1987), Christensen and Pearson (1992), Hilden-Minton (1995), Lesaffre and Verbeke (1998), Banerjee and Frees (1997), Tan et al. (2001), Fung et al. (2002), Demidenko (2004), Demidenko and Stukel (2005), Zewotir and Galpin (2005), Nobre and Singer (2007, 2011) and Gumedze et al. (2010).
4.1 Residual analysis

In the linear mixed models class, three different kinds of residuals may be considered: the conditional residuals, $\hat{e} = y - X\hat{\beta} - Z\hat{b}$; the EBLUP, $Z\hat{b}$; and the marginal residuals, $\hat{\xi} = y - X\hat{\beta}$. These predict, respectively, the conditional error $e = y - \mathrm{E}[y \mid b] = y - X\beta - Zb$, the random effects $Zb = \mathrm{E}[y \mid b] - \mathrm{E}[y]$, and the marginal error $\xi = y - \mathrm{E}[y] = y - X\beta$. Each of the mentioned residuals is useful to verify some assumption of the model, as seen in Nobre and Singer (2007) and briefly presented next.
4.1.1 Conditional residuals

To identify cases with a possible high influence on $\sigma^2$ in linear mixed models, Nobre and Singer (2007) suggested the standardization of the conditional residual given by
$$\hat{e}_i^{*} = \frac{\hat{e}_i}{\hat{\sigma}\sqrt{\hat{q}_{ii}}},$$
where $q_{ii}$ represents the $i$th element in the main diagonal of $Q$, defined as
$$Q = \sigma^2\left(V^{-1} - V^{-1}X(X^\top V^{-1}X)^{-1}X^\top V^{-1}\right).$$
Under normality assumptions on $e$, this standardization identifies outlier observations and subjects; see Nobre and Singer (2007). To do so, the same authors consider the quadratic form $M_I = y^\top Q U_I (U_I^\top Q U_I)^{-1} U_I^\top Q y$, where $U_I = (u_{ij})_{(n \times k)} = (U_{i_1}, \ldots, U_{i_k})$, with $U_i$ representing the $i$th column of the identity matrix of order $n$. To identify an outlier subject, let $I$ be the index set of the subject's observations and evaluate $M_I$ for this subset.
Table 2. Diagnostic techniques involving residuals.
Diagnostic | Graph
Linearity of fixed effects | $\hat{\xi}$ vs. explanatory variables (fitted values)
Presence of outliers | $\hat{e}$ vs. observation index
Homoskedasticity of the conditional errors | $\hat{e}$ vs. fitted values
Normality of the conditional errors | QQ plot for the least confounded residuals
Presence of outlier subjects | Mahalanobis distance vs. observation index
Normality of the random effects | weighted QQ plot for $\hat{b}_i$
4.1.2 Confounded residuals

It can be shown that, under the assumptions made by the model given in Equation (1), we have
$$\hat{e} = RQe + RQZb \quad\text{and}\quad Z\hat{b} = ZGZ^\top Q Zb + ZGZ^\top Q e,$$
where $R = \sigma^2 I_n$. These identities tell us that $\hat{e}$ and $Z\hat{b}$ depend on both $b$ and $e$ and are thus called confounded residuals; see Hilden-Minton (1995). Verifying the normality of the conditional errors using only $\hat{e}$ may therefore be misleading because of the presence of $b$ in the above formulas. Hilden-Minton (1995) defined the confounding fraction as the proportion of variability in $\hat{e}$ due to the presence of $b$. The same work suggested the use of a linear transformation $L$ such that $L^\top \hat{e}$ has the least confounding fraction possible. The suggested transformation also generates uncorrelated homoskedastic residuals. It is more appropriate to analyze the assumption of normality of the conditional errors using $L^\top \hat{e}$ instead of $\hat{e}$, as suggested by Hilden-Minton (1995) and verified by simulation in Nobre and Singer (2007).
4.1.3 EBLUP

The EBLUP is useful to identify outlier subjects, given that it represents the distance between the population mean value and the value predicted for the $i$th subject. A way of using the EBLUP to search for outlier subjects is to use the Mahalanobis distance (see Waternaux et al., 1989), $\zeta_i = \hat{b}_i^\top \big(\widehat{\mathrm{Var}}[\hat{b}_i - b_i]\big)^{-1} \hat{b}_i$. It is also possible to use the EBLUP to verify the random effects normality assumption. For more information, see Nobre and Singer (2007). In Table 2 we summarize the diagnostic techniques involving residuals discussed in Nobre and Singer (2007).
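This Mahalanobis-type distance can be sketched for a random intercept fit (where $q = 1$) by reusing D, Z, Q, s2 and b_hat from the previous sketches; the expression Var[b-hat - b] = D - D Z'(Q/sigma^2) Z D used below is my own algebra under the assumptions of those sketches, not a quoted formula.

```r
## Mahalanobis distance for the EBLUP (Section 4.1.3), one value per state (q = 1).
Vb   <- D - D %*% t(Z) %*% (Q / s2) %*% Z %*% D     # Var(b_hat - b), assumed algebra
zeta <- as.vector(b_hat)^2 / diag(as.matrix(Vb))    # b_i_hat^2 / Var(b_i_hat - b_i)
zeta / max(zeta)                                    # normalized to [0, 1], as in Figure 4(c)
```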
4.2 Sensitivity analysis

Influence diagnostic techniques are used to detect observations that may produce excessive influence on the parameter estimates. There are two main approaches for such techniques: global influence, which is usually based on case deletion; and local influence, which introduces small perturbations in different components of the model.
In normal homoskedastic linear regression, examples of sensitivity measures are the Cook distance, DFFITS and COVRATIO; see Cook (1977), Belsley et al. (1980) and Chatterjee and Hadi (1986, 1988).
4.2.1 Global influence

A simple way to verify the influence of a group of observations on the parameter estimates is to remove the group and observe the changes in the estimates. The group of observations is influential if the changes are considerably large. However, in linear mixed models, it may not be practical to reestimate the parameters every time a set of observations is removed. To avoid doing so, Hilden-Minton (1995) presented an update formula for the BLUE and BLUP. Let $I = \{i_1, \ldots, i_k\}$ be the index set of the removed observations and $U_I = (U_{i_1}, \ldots, U_{i_k})$. Hilden-Minton (1995) showed that
$$\hat{\beta} - \hat{\beta}_{(I)} = (X^\top M X)^{-1} X^\top M U_I (U_I^\top Q U_I)^{-1} U_I^\top Q y \quad\text{and}\quad \hat{b} - \hat{b}_{(I)} = D Z^\top Q U_I (U_I^\top Q U_I)^{-1} U_I^\top Q y,$$
where the subscript $(I)$ indicates that the estimates were obtained without the observations indexed by $I$.
A suggestion to measure the influence on the parameter estimates in linear mixed models is to use the Cook distance (see Cook, 1977), given by
$$D_I = \frac{(\hat{\beta} - \hat{\beta}_{(I)})^\top (X^\top V^{-1} X)(\hat{\beta} - \hat{\beta}_{(I)})}{c} = \frac{(\hat{y} - \hat{y}_{(I)})^\top V^{-1} (\hat{y} - \hat{y}_{(I)})}{c},$$
as seen in Christensen and Pearson (1992) and Banerjee and Frees (1997), where $c$ is a scale factor. However, Tan et al. (2001) pointed out that $D_I$ is not always able to properly measure the influence on the estimation in the mixed models class. The same authors suggest the use of a measure similar to the Cook distance, but conditional on the BLUP, $\hat{b}$. The conditional Cook distance is defined for the $i$th observation as
$$D_i^{\mathrm{cond}} = \sum_{j=1}^{k} \frac{P_{j(i)}^\top \,\mathrm{Var}[y \mid b]^{-1}\, P_{j(i)}}{(n-1)k + p}, \quad i = 1, \ldots, k,$$
where $P_{j(i)} = \hat{y}_j - \hat{y}_{j(i)} = (X_j\hat{\beta} + Z_j\hat{b}_j) - (X_j\hat{\beta}_{(i)} + Z_j\hat{b}_{j(i)})$. The same authors decomposed $D_i^{\mathrm{cond}} = D_{i1}^{\mathrm{cond}} + D_{i2}^{\mathrm{cond}} + D_{i3}^{\mathrm{cond}}$ and commented on the interpretation of each part of the decomposition: $D_{i1}^{\mathrm{cond}}$ is related to the influence on the fixed effects, $D_{i2}^{\mathrm{cond}}$ is related to the influence on the predicted values, and $D_{i3}^{\mathrm{cond}}$ to the covariance between the BLUE and the BLUP, which should be close to zero if the model is valid.
When all the observations from a subject are deleted, it is not possible to obtain the BLUP for the random effects of that subject, making it impossible to obtain $D_I^{\mathrm{cond}}$ as stated above. For this purpose, Nobre (2004) suggested using $D_I^{\mathrm{cond}} = (n_i)^{-1}\sum_{j \in I} D_j^{\mathrm{cond}}$, where $I$ indexes the observations from a subject, as a way to measure the influence of a subject on the parameter estimates when its observations are deleted.
There are natural extensions of leverage measures for linear mixed models. These can be seen in Banerjee and Frees (1997), Fung et al. (2002), Demidenko (2004) and Nobre (2004). However, they only provide information about leverage regarding fitted marginal values. This has two main limitations, as commented in Nobre and Singer (2011). First, we may be interested in detecting high-leverage within-subject observations. Second, in some cases the presence of high-leverage within-subject observations does not imply that the subject itself is detected as a high-leverage subject. Suggestions on how to evaluate the within-subject leverage may be seen in Demidenko and Stukel (2005) and Nobre and Singer (2011).
4.2.2 Local influence

The concept of local influence was proposed by Cook (1986) and consists in analyzing the sensitivity of a statistical model when subjected to small perturbations. It is suggested to use an influence measure called the likelihood displacement. Considering the model described in Equation (2), the log-likelihood function may be written as
$$L(\theta) = \sum_{i=1}^{k} L_i(\theta) = -\frac{1}{2}\sum_{i=1}^{k}\left\{\ln|V_i| + (y_i - X_i\beta)^\top V_i^{-1}(y_i - X_i\beta)\right\}.$$
The likelihood displacement is defined as $\mathrm{LD}(\omega) = 2\{L(\hat{\theta}) - L(\hat{\theta}_\omega)\}$, where $\omega$ is an $l \times 1$ perturbation vector in an open set $\Omega \subset \mathbb{R}^l$; $\theta$ is the parameter vector of the model, including the covariance parameters; $\hat{\theta}$ is the ML estimate of $\theta$ and $\hat{\theta}_\omega$ is the ML estimate when the model is perturbed. It is necessary to assume that $\omega_0$ exists such that $L(\theta) = L(\theta \mid \omega_0)$ and such that $\mathrm{LD}$ has its first and second derivatives in a neighborhood of $\omega_0$. Cook (1986) considered the $\mathbb{R}^{l+1}$ surface formed by the influence function $\alpha(\omega) = (\omega^\top, \mathrm{LD}(\omega))^\top$ and the normal curvature in the vicinity of $\omega_0$ in the direction of a vector $d$, denoted by $C_d$. In this case, the normal curvature is given by
$$C_d = 2\,\big|d^\top H^\top \ddot{L}^{-1} H d\big|,$$
where $\ddot{L} = \partial^2 L(\theta)/\partial\theta\,\partial\theta^\top$ and $H = \partial^2 L(\theta \mid \omega)/\partial\theta\,\partial\omega^\top$, both evaluated at $\theta = \hat{\theta}$; see Cook (1986). It can be shown that $C_d$ always lies between the minimum and maximum eigenvalues of the matrix $F = H^\top \ddot{L}^{-1} H$, so $d_{\max}$, the eigenvector associated with the largest eigenvalue, gives information about the direction that exhibits the greatest sensitivity of $\mathrm{LD}(\omega)$ in a neighborhood of $\omega_0$. Beckman et al. (1987) made some comments on the effectiveness of the local influence approach. Lesaffre and Verbeke (1998) and Nobre (2004) showed some examples of perturbation schemes in the linear mixed models context.
Perturbation scheme for the covariance matrix of the conditional errors. To verify the sensitivity of the model to the conditional homoskedasticity assumption, perturbations are inserted in the covariance matrix of the conditional errors. This can be done by considering $\mathrm{Var}[e] = \sigma^2\Delta(\omega)$, where $\Delta(\omega) = \mathrm{diag}(\omega)$, with $\omega = (\omega_1, \ldots, \omega_N)^\top$ the perturbation vector. For this case we have $\omega_0 = 1_N$. The log-likelihood function in this case is given by
$$L(\theta \mid \omega) = -\frac{1}{2}\left\{\ln|V(\omega)| + (y - X\beta)^\top V(\omega)^{-1}(y - X\beta)\right\},$$
where $V(\omega) = ZDZ^\top + \sigma^2\Delta(\omega)$.
Perturbation scheme for the response. For the local influence approach, Beckman et al. (1987) proposed the perturbation scheme
$$y(\omega) = y + s\,\omega,$$
where $s$ represents a scale factor and $\omega$ is an $n \times 1$ perturbation vector. For this scheme we have $\omega_0 = 0$, with $0$ representing the $n \times 1$ null vector. In this case, the perturbed log-likelihood function is proportional to
$$L(\theta \mid \omega) = -\frac{1}{2}(y + s\omega - X\beta)^\top V^{-1}(y + s\omega - X\beta).$$
Perturbation scheme for the random effects covariance matrix. It is possible to assess the sensitivity of the model in relation to the random effects homoskedasticity assumption by perturbing the matrix $G$. Nobre (2004) suggested the use of $\mathrm{Var}[b_i] = \omega_i G$ as a perturbation scheme. In this case $\omega$ is a $q \times 1$ vector and $\omega_0 = 1_q$. The perturbed log-likelihood function is proportional to
$$L(\theta \mid \omega) = -\frac{1}{2}\sum_{i=1}^{k}\left\{\ln|V_i(\omega)| + (y_i - X_i\beta)^\top V_i(\omega)^{-1}(y_i - X_i\beta)\right\}.$$
Perturbation scheme for case weights. Verbeke (1995) and Lesaffre and Verbeke (1998) suggested perturbing the log-likelihood function as
$$L(\theta \mid \omega) = \sum_{i=1}^{k} \omega_i L_i(\theta).$$
Such a perturbation scheme is appropriate for measuring the influence of the $i$th subject using the normal curvature in its direction, which is given by
$$C_i = 2\,\big|d_i^\top H^\top \ddot{L}^{-1} H d_i\big|,$$
where $d_i$ is a vector whose entries are 1 in the $i$th coordinate and zero everywhere else. Verbeke (1995) showed that if $C_i$ has a high value, then the $i$th subject has great influence on the value of $\hat{\theta}$. A threshold of twice the mean value of all the $C_j$'s helps to decide whether or not the observation is influential.
Lesaffre and Verbeke (1998) extracted from $C_i$ some interpretable measures. In particular, they propose using $\|\mathcal{X}_i\mathcal{X}_i^\top\|^2$, $\|\mathcal{R}_i\|^2\,\|\mathcal{Z}_i\mathcal{Z}_i^\top\|^2$, $\|I_{n_i} - \mathcal{R}_i\mathcal{R}_i^\top\|^2$ and $\|\hat{V}_i^{-1}\|^2$, where $\mathcal{X}_i = \hat{V}_i^{-1/2}X_i$, $\mathcal{Z}_i = \hat{V}_i^{-1/2}Z_i$ and $\mathcal{R}_i = \hat{V}_i^{-1/2}\hat{e}_i$, to evaluate the influence of the $i$th subject on the model parameter estimates. The actual interpretation of each of these terms can be seen in the original paper.
4.2.3 Conformal local influence

The $C_d$ measure proposed by Cook (1986) is not invariant to scale re-parametrization. To obtain a similar standardized measure and make it more comparable, Poon and Poon (1999) used the conformal normal curvature instead of the normal curvature, given by
$$B_d(\theta) = \frac{2\,\big|d^\top H^\top \ddot{L}^{-1} H d\big|}{2\,\|H^\top \ddot{L}^{-1} H\|}.$$
It can be shown that $0 \le B_d(\theta) \le 1$ for every direction $d$ and that $B_d$ is invariant under conformal re-parametrization. A re-parametrization is said to be conformal if its Jacobian $J$ is such that $J^\top J = t I_s$, for some real $t$ and integer $s$. Poon and Poon (1999) showed that if $\lambda_1, \ldots, \lambda_l$ are the eigenvalues of the matrix $F = H^\top\ddot{L}^{-1}H$, with $v_1, \ldots, v_l$ representing the respective normalized eigenvectors, then the value of the conformal normal curvature in the direction $v_i$ is equal to $\lambda_i/\sqrt{\sum_{i=1}^{l}\lambda_i^2}$ and $\sum_{i=1}^{l} B_{v_i}^2(\theta) = 1$. If every eigenvector has the same conformal normal curvature, its value is equal to $1/\sqrt{l}$. Poon and Poon (1999) proposed to use this value as a reference for measuring the intensity of the local influence of an eigenvector. It can also be shown that when $d$ has the direction of $d_{\max}$ the conformal normal curvature also attains its maximum. In this sense, the normal curvature and the conformal normal curvature are equivalent methods.
5. Application

According to Frees et al. (1999), random coefficient models are equivalent to the Hachemeister linear regression model, which is used for the example data in Hachemeister (1975). The random coefficient model for the data in Table 1 may be described as
$$y_{ij} = \alpha_i + j\,\beta_i + e_{ij}, \quad i = 1, \ldots, 5, \; j = 1, \ldots, 12,$$
where $y_{ij}$ represents the average claim amount for state $i$ in the $j$th trimester, $\alpha_i = \alpha + a_i$ and $\beta_i = \beta + b_i$, with $\alpha$ and $\beta$ fixed, and $(a_i, b_i)^\top \sim N_2(0, D)$, in which $D$ is a $2 \times 2$ covariance matrix. Before adjusting the model to the data, we used R to apply the asymptotic likelihood ratio test described in Giampaoli and Singer (2009) to compare the suggested random coefficient model with a random intercept model. The p-value obtained from the test was 0.0514. It indicates that it may be enough to consider the random effect for the intercept only. This decision is also supported by the Bayesian information criterion (BIC), which is equal to 808.3 for the single random effect model and 811.6 for the model with two random effects. We could also use another set of tests, involving bootstrap, Monte Carlo and permutational methods, to investigate whether or not we should prefer the random intercept model. These tests may be seen in Crainiceanu and Ruppert (2004), Greven et al. (2008) and Fitzmaurice et al. (2007). However, this is very distant from our goals and is not discussed here. For the sake of simplicity and based on the reasons presented, we shall use the random intercept model, which differs a little from the model proposed by Frees et al. (1999). Thus, the model to be adjusted for the data in this example is
$$y_{ij} = \alpha_i + \beta j + \varepsilon_{ij}, \quad i = 1, \ldots, 5, \; j = 1, \ldots, 12, \qquad (4)$$
where $\alpha_i = \alpha + a_i$, and $\alpha$ and $\beta$ are the same as defined before. Assume also that $\mathrm{Var}[\varepsilon_{ij}] = \sigma^2_{\varepsilon}$ and $\mathrm{Var}[a_i] = \sigma^2_a$.
The model parameter estimates were obtained by the RML method using the lmer() function from the lme4 package in R. The standard errors were obtained from SAS (SAS Institute Inc., 2004) using proc MIXED. The estimates are shown in Table 3, and a short R sketch of the fit follows the table.
Table 3. Model parameter estimates.
Parameter | $\alpha$ | $\beta$ | $\sigma^2_{\varepsilon}$ | $\sigma^2_a$
Estimate | 1460.32 | 32.41 | 32981.53 | 73398.25
SE | 131.07 | 6.79 | 6347.17 | 24088.00
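A minimal sketch of the fits behind this section, assuming the long data frame from the Section 2 sketch; lme4 reports standard errors for the fixed effects only, so the variance component standard errors in Table 3 (taken from SAS) are not reproduced here.

```r
## Random-intercept model of Equation (4) and the random-coefficient alternative.
library(lme4)
fit_ri <- lmer(claim ~ trimester + (1 | state), data = long, REML = TRUE)   # Equation (4)
fit_rc <- lmer(claim ~ trimester + (trimester | state), data = long)        # random coefficients
BIC(fit_ri, fit_rc)   # BIC comparison discussed above (values depend on the data assumptions)
summary(fit_ri)       # fixed effects with SEs, plus the two variance components
```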
Figure 3 shows the five conditional regression lines obtained from the linear mixed model given in Equation (4). The adjusted model clearly suggests that the claim amount is higher in state 1. It also suggests a similarity between the claim amounts from states 2 and 4. Besides that, we can expect a smaller risk from policies in state 5, since they are much closer to the respective adjusted conditional line. Further information is explored in the diagnostic analysis commented on next.
[Figure 3 about here: conditional regression lines of aggregate claim amount for states 1 to 5.]
Figure 3. Conditional regression lines.
5.1 Diagnostic analysis

The standardized residuals proposed by Nobre and Singer (2007) suggest that observation 4.7 (obtained from state 4 in the seventh trimester) may be considered an outlier, as shown in Figure 4(a). According to the QQ plot in Figure 4(b), it is reasonable to assume that the conditional errors are normally distributed. The Mahalanobis distance in Figure 4(c) was normalized to fit the interval [0, 1] and suggests that the first state may be an outlier. The measure $M_I$ proposed by Nobre and Singer (2007), shown in Figure 4(d) and also normalized, suggests that none of the states have outlier observations. The Mahalanobis distance should not be confused with $M_I$: the first is based on the EBLUP and the second on the conditional errors, so they have different meanings. For both analyses, an observation is highlighted if its measure is greater than twice the mean of the measures.
[Figure 4 about here: (a) standardized conditional residuals by state, highlighting observation 4.7; (b) QQ plot of the standardized least confounded residuals against N(0,1) quantiles; (c) normalized Mahalanobis distance by state; (d) normalized values of $M_I$ by state.]
Figure 4. Residual analysis: (a) standardized residuals, (b) least confounded residuals, (c) EBLUP, (d) values for $M_I$.
The conditional Cook distance is shown in Figure 5. The distances were normalized for comparison. Figure 5(a) suggests that observation 4.7 is influential in the model estimates. The first term of the distance decomposition suggests that no observations were influential in the estimation of $\beta$, as shown in Figure 5(b). The second term of the decomposition suggests that observation 4.7 is potentially influential in the prediction of $b$, as seen in Figure 5(c). The last term, $D_{i3}$, is as close to zero as expected and is omitted.
[Figure 5 about here: normalized conditional Cook distances by state; panels (a) and (c) highlight observation 4.7.]
Figure 5. (a) Conditional Cook distance, (b) $D_{i1}$, (c) $D_{i2}$.
Figure 6 shows the local influence analysis using three different perturbation schemes. The first, in Figure 6(a), is related to the conditional errors covariance matrix, as suggested in Beckman et al. (1987), and indicates that the observations from the fourth state, especially 4.7, are possibly influential on the homoskedasticity and independence assumption for the conditional errors. Notice that it is possible to explain the influence of observation 4.7 by analyzing Figure 2: this observation has a value considerably higher than the others from the same state. Figure 6(b) shows the perturbation scheme for the covariance matrix associated with the random effects, as presented in Nobre (2004); alternative perturbation schemes for this case can be seen in Beckman et al. (1987). These schemes suggest that all states are equally influential on the random effects covariance matrix estimate. Finally, there is evidence that the observations in the fourth state may not be well predicted by the model; see Figure 6(c).
After the diagnostics we proceed to a confirmatory analysis by removing the observations from states 1 and 4, one at a time and then both at the same time. The new estimates are shown in Table 4. For each parameter, we calculate the relative change in the estimated values, defined for a parameter $\theta$ as
$$\mathrm{RC}(\theta) = \left|\frac{\hat{\theta} - \hat{\theta}_{(i)}}{\hat{\theta}}\right| \times 100\%.$$
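A short sketch of the RC computation for the fixed effects, refitting Equation (4) without state 1; fit_ri and long come from the earlier sketches and are assumptions here.

```r
## Relative change (RC) in the fixed effect estimates after deleting state 1.
fit_no1 <- update(fit_ri, data = subset(long, state != 1))
RC <- 100 * abs((fixef(fit_ri) - fixef(fit_no1)) / fixef(fit_ri))
round(RC, 2)   # compare with the corresponding entries of Table 4
```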
[Figure 6 about here: absolute values of the $d_{\max}$ components by state for each perturbation scheme; panel (a) highlights observations 4.7, 4.1, 4.11, 4.12 and 1.2.]
Figure 6. Perturbation schemes: (a) conditional covariance matrix, (b) random effects covariance matrix, (c) values for $\|I_{n_i} - \mathcal{R}_i\mathcal{R}_i^\top\|^2$.
Table 4. Estimates and relative changes (%) for the parameter estimates of the model given in Equation (4), with and without states 1 and 4.
Situation | $\alpha$ | $\beta$ | $\sigma^2_{\varepsilon}$ | $\sigma^2_a$
Complete data | 1460.32 | 32.41 | 32981.53 | 73398.25
Without state 1 | 1408.63 (3.67) | 25.26 (22.06) | 34666.31 (5.11) | 34335.64 (53.22)
Without state 4 | 1530.94 (4.61) | 33.50 (3.36) | 24940.12 (24.38) | 59214.50 (19.32)
Without states 1 and 4 | 1485.56 (1.70) | 24.32 (24.96) | 24497.48 (25.72) | 23707.07 (67.70)
If all five states were equally influential, we would expect the value of RC to lie around 1/5 = 20% after removing a state. If RC($\theta$) exceeds twice this value, that is, 40%, for some parameter, we consider that the state was potentially influential. It is possible to conclude that the observations from state 1 were influential in the random effect variance estimate. From Figure 2, one can explain this influence by noticing that all the observations from state 1 had higher values compared to the other states. Notice that such influence was not detected in Figure 5(b), but was pointed out by the Mahalanobis distance in Figure 4(c). Removing state 1 from our analysis and running every diagnostic procedure again, we detect no excessive influence and the only remaining issue is observation 4.7, which is still an outlier. From this result the model is validated and it is assumed to be robust and ready for use.
6. Conclusions

The use of linear mixed models in actuarial science should be encouraged, given their capability to model the within-subject correlation, their flexibility and the availability of diagnostic tools. Insurers should not use a model without validating it first. For the specific example seen here, the decision makers may consider a different approach for state 1. After removing the observations from state 1 there was a relative change of more than 50% in the random effect variance estimate, which reflects significantly on the premium estimate. Such an analysis would not be possible in the traditional credibility models approach. This illustrates how the model can be used to identify different sources of risk and can be used in portfolio management. Linear mixed models are also usually easier to understand and to present, when compared to standard actuarial methods, such as credibility models and the Bayesian approach for determining the fair premium. The natural extension of this work is to repeat the estimation and diagnostic procedures, adapting what is necessary, for generalized linear mixed models, which are also useful in actuarial science. Some work has already been done in this area; see, e.g., Antonio and Beirlant (2006). It is also interesting to continue with a further analysis of the example in Hachemeister (1975), using the diagnostic procedures again when weights are introduced into the covariance matrix of the conditional residuals in the random coefficient model, and to evaluate the robustness of the linear mixed models equivalent to the other classic credibility models. Again, this care is justified because the fairest premium is more competitive in the market.
Appendix

We present here expressions for the matrix $H$ and the derivatives appearing in the different perturbation schemes presented in Section 4.2.2. These calculations are taken from Nobre (2004) and are presented here to make this text more self-contained.
Appendix A. Perturbation Scheme for the Covariance Matrix of the Conditional Errors

Let $H^{(k)}$ be the $k$th column of $H$ and $f$ be the number of distinct components of the matrix $D$. Then
$$H^{(k)} = \left(\frac{\partial^2 L(\omega)}{\partial\beta\,\partial\omega_k},\; \frac{\partial^2 L(\omega)}{\partial\sigma^2\,\partial\omega_k},\; \frac{\partial^2 L(\omega)}{\partial\tau_1\,\partial\omega_k},\; \ldots,\; \frac{\partial^2 L(\omega)}{\partial\tau_f\,\partial\omega_k}\right)^{\!\top},$$
where
$$\frac{\partial^2 L(\omega)}{\partial\beta\,\partial\omega_k}\bigg|_{\omega=\omega_0} = X^\top D_k \hat{e},$$
$$\frac{\partial^2 L(\omega)}{\partial\sigma^2\,\partial\omega_k}\bigg|_{\omega=\omega_0} = -\frac{1}{2}\left\{\hat{\sigma}^{-2}\,\mathrm{tr}\!\left(D_k Z\hat{D}Z^\top\right) - 2\,\hat{e}^\top D_k \hat{V}^{-1}\hat{e} + \hat{\sigma}^{-2}\,\hat{e}^\top D_k\hat{e}\right\},$$
$$\frac{\partial^2 L(\omega)}{\partial\tau_i\,\partial\omega_k}\bigg|_{\omega=\omega_0} = -\frac{1}{2}\left\{\mathrm{tr}\!\left(D_k Z\dot{D}_i Z^\top\right) - 2\,\hat{e}^\top D_k Z\dot{D}_i Z^\top\hat{e}\right\},$$
with
$$D_k = \frac{\partial V^{-1}(\omega)}{\partial\omega_k}\bigg|_{\omega=\omega_0} = -\hat{\sigma}^2\,\hat{V}^{(k)}(\hat{V}^{(k)})^\top, \qquad \dot{D}_i = \frac{\partial D}{\partial\tau_i}\bigg|_{\omega=\omega_0},$$
and $\hat{V}^{(k)}$ representing the $k$th column of $\hat{V}^{-1}$.
Appendix B. Perturbation Scheme for the Response

It can be shown that
$$\frac{\partial^2 L(\omega)}{\partial\omega\,\partial\beta^\top}\bigg|_{\omega=\omega_0} = s\,\hat{V}^{-1}X,$$
$$\frac{\partial^2 L(\omega)}{\partial\omega\,\partial\sigma^2}\bigg|_{\omega=\omega_0} = s\,\hat{V}^{-1}\hat{V}^{-1}\hat{e},$$
$$\frac{\partial^2 L(\omega)}{\partial\omega\,\partial\tau_i}\bigg|_{\omega=\omega_0} = s\,\hat{V}^{-1}Z\dot{D}_i Z^\top\hat{V}^{-1}\hat{e},$$
implying
$$H^\top = s\,\hat{V}^{-1}\left(X,\; \hat{V}^{-1}\hat{e},\; Z\dot{D}_1 Z^\top\hat{V}^{-1}\hat{e},\; \ldots,\; Z\dot{D}_f Z^\top\hat{V}^{-1}\hat{e}\right).$$
Appendix C. Perturbation Scheme for the Random Effects Covariance Matrix

For this scheme we have
$$H^{(k)} = \left(\frac{\partial^2 L(\omega)}{\partial\beta\,\partial\omega_k},\; \frac{\partial^2 L(\omega)}{\partial\sigma^2\,\partial\omega_k},\; \frac{\partial^2 L(\omega)}{\partial\tau_1\,\partial\omega_k},\; \ldots,\; \frac{\partial^2 L(\omega)}{\partial\tau_f\,\partial\omega_k}\right)^{\!\top}.$$
It can be shown that
$$\frac{\partial^2 L(\omega_k)}{\partial\beta\,\partial\omega_k}\bigg|_{\omega=\omega_0} = X_k^\top\hat{V}_k^{-1}Z_k\hat{G}Z_k^\top\hat{V}_k^{-1}\hat{e}_k,$$
$$\frac{\partial^2 L(\omega_k)}{\partial\sigma^2\,\partial\omega_k}\bigg|_{\omega=\omega_0} = \mathrm{tr}\!\left(\hat{V}_k^{-1}Z_k\hat{G}_k Z_k^\top\right) - 2\,\hat{e}_k^\top\hat{V}_k^{-1}Z_k\hat{G}Z_k^\top\hat{V}_k^{-1}\hat{V}_k^{-1}\hat{e}_k,$$
$$\frac{\partial^2 L(\omega_k)}{\partial\tau_i\,\partial\omega_k}\bigg|_{\omega=\omega_0} = \mathrm{tr}\!\left(\hat{V}_k^{-1}Z_k\hat{G}_k Z_k^\top\hat{V}_k^{-1}Z_k\hat{G}_i Z_k^\top\right) - \hat{e}_k^\top\hat{V}_k^{-1}Z_k\hat{G}Z_k^\top\hat{V}_k^{-1}Z_k\hat{G}_i Z_k^\top\hat{V}_k^{-1}\hat{e}_k.$$
Acknowledgements

We are grateful to Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq, project # 564334/2008-1) and Fundação Cearense de Apoio ao Desenvolvimento Científico e Tecnológico (FUNCAP), Brazil, for partial financial support. We also thank an anonymous referee and the executive editor for their careful and constructive review.
References

Antonio, K., Beirlant, J., 2006. Actuarial statistics with generalized linear mixed models. Insurance: Mathematics and Economics, 75, 643-676.
Banerjee, M., Frees, E.W., 1997. Influence diagnostics for linear longitudinal models. Journal of the American Statistical Association, 92, 999-1005.
Beckman, R.J., Nachtsheim, C.J., Cook, R.D., 1987. Diagnostics for mixed-model analysis of variance. Technometrics, 29, 413-426.
Belsley, D.A., Kuh, E., Welsch, R.E., 1980. Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. John Wiley & Sons, New York.
Bühlmann, H., 1967. Experience rating and credibility. ASTIN Bulletin, 4, 199-207.
Chatterjee, S., Hadi, A.S., 1986. Influential observations, high leverage points, and outliers in linear regression (with discussion). Statistical Science, 1, 379-393.
Chatterjee, S., Hadi, A.S., 1988. Sensitivity Analysis in Linear Regression. John Wiley & Sons, New York.
Christensen, R., Pearson, L.M., 1992. Case-deletion diagnostics for mixed models. Technometrics, 34, 38-45.
Cook, R.D., 1977. Detection of influential observation in linear regression. Technometrics, 19, 15-28.
Cook, R.D., 1986. Assessment of local influence (with discussion). Journal of The Royal Statistical Society Series B - Statistical Methodology, 48, 117-131.
Cook, R.D., Weisberg, S., 1982. Residuals and Influence in Regression. Chapman and Hall, London.
Crainiceanu, C.M., Ruppert, D., 2004. Likelihood ratio tests in linear mixed models with one variance component. Journal of The Royal Statistical Society Series B - Statistical Methodology, 66, 165-185.
Dannenburg, D.R., Kaas, R., Goovaerts, M.J., 1996. Practical actuarial credibility models. Institute of Actuarial Science and Economics, University of Amsterdam, Amsterdam.
Demidenko, E., 2004. Mixed Models - Theory and Applications. Wiley, New York.
Demidenko, E., Stukel, T.A., 2005. Influence analysis for linear mixed-effects models. Statistics in Medicine, 24, 893-909.
Diggle, P.J., Heagerty, P., Liang, K.Y., Zeger, S.L., 2002. Analysis of Longitudinal Data. Oxford Statistical Science Series.
Draper, N.R., Smith, H., 1998. Applied Regression Analysis. Wiley, New York.
Dutang, C., Goulet, V., Pigeon, M., 2008. actuar: an R package for actuarial science. Journal of Statistical Software, 25, 1-37.
Fitzmaurice, G.M., Lipsitz, S.R., Ibrahim, J.G., 2007. A note on permutation tests for variance components in multilevel generalized linear mixed models. Biometrics, 63, 942-946.
Frees, E.W., Young, V.R., Luo, Y., 1999. A longitudinal data analysis interpretation of credibility models. Insurance: Mathematics and Economics, 24, 229-247.
Fung, W.K., Zhu, Z.Y., Wei, B.C., He, X., 2002. Influence diagnostics and outlier tests for semiparametric mixed models. Journal of The Royal Statistical Society Series B - Statistical Methodology, 64, 565-579.
Giampaoli, V., Singer, J., 2009. Restricted likelihood ratio testing for zero variance components in linear mixed models. Journal of Statistical Planning and Inference, 139, 1435-1448.
Greven, S., Crainiceanu, C.M., Kuchenhoff, H., Peters, A., 2008. Likelihood ratio tests for variance components in linear mixed models. Journal of Computational and Graphical Statistics, 17, 870-891.
Gumedze, F.N., Welham, S.J., Gogel, B.J., Thompson, R., 2010. A variance shift model for detection of outliers in the linear mixed model. Computational Statistics and Data Analysis, 54, 2128-2144.
Hachemeister, C.A., 1975. Credibility for regression models with application to trend. Proceedings of the Berkeley Actuarial Research Conference on Credibility, pp. 129-163.
Henderson, C.R., 1975. Best linear unbiased estimation and prediction under a selection model. Biometrics, 31, 423-447.
Hilden-Minton, J.A., 1995. Multilevel diagnostics for mixed and hierarchical linear models. Ph.D. Thesis. University of California, Los Angeles.
Hoaglin, D.C., Welsch, R.E., 1978. The hat matrix in regression and ANOVA. The American Statistician, 32, 17-22.
Johnson, R.A., Wichern, D.W., 1982. Applied Multivariate Statistical Analysis. Sixth edition. Prentice Hall, pp. 273-332.
Lesaffre, E., Verbeke, G., 1998. Local influence in linear mixed models. Biometrics, 54, 570-582.
Liang, K.Y., Zeger, S.L., 1986. Longitudinal data analysis using generalized linear models. Biometrika, 73, 13-22.
McCulloch, C.E., Searle, S.R., 2001. Generalized, Linear, and Mixed Models. Wiley, New York.
Nobre, J.S., 2004. Métodos de diagnóstico para modelos lineares mistos. Unpublished Master's Thesis (in Portuguese). IME/USP, São Paulo.
Nobre, J.S., Singer, J.M., 2007. Residual analysis for linear mixed models. Biometrical Journal, 49, 863-875.
Nobre, J.S., Singer, J.M., 2011. Leverage analysis for linear mixed models. Journal of Applied Statistics, 38, 1063-1072.
Patterson, H.D., Thompson, R., 1971. Recovery of inter-block information when block sizes are unequal. Biometrika, 58, 545-554.
Poon, W.Y., Poon, Y.S., 1999. Conformal normal curvature and assessment of local influence. Journal of The Royal Statistical Society Series B - Statistical Methodology, 61, 51-61.
R Development Core Team, 2009. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
SAS Institute Inc., 2004. SAS 9.1.3 Help and Documentation. SAS Institute Inc., Cary, North Carolina.
Tan, F.E.S., Ouwens, M.J.N., Berger, M.P.F., 2001. Detection of influential observations in longitudinal mixed effects regression models. The Statistician, 50, 271-284.
Verbeke, G., 1995. The linear mixed model. A critical investigation in the context of longitudinal data analysis. Ph.D. Thesis. Catholic University of Leuven, Faculty of Science, Department of Mathematics, Leuven, Belgium.
Verbeke, G., Molenberghs, G., 2000. Linear Mixed Models for Longitudinal Data. Springer.
Vonesh, E.F., Chinchilli, V.M., 1997. Linear and Nonlinear Models for the Analysis of Repeated Measurements. Marcel Dekker, New York.
Waternaux, C., Laird, N.M., Ware, J.H., 1989. Methods for analysis of longitudinal data: blood-lead concentrations and cognitive development. Journal of the American Statistical Association, 84, 33-41.
Wei, B.C., Hu, Y.Q., Fung, W.K., 1998. Generalized leverage and its applications. Scandinavian Journal of Statistics, 25, 25-37.
Zewotir, T., Galpin, J.S., 2005. Influence diagnostics for linear mixed models. Journal of Data Science, 3, 153-177.