
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered
office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Journal of Applied Statistics


Publication details, including instructions for authors and
subscription information:
http://www.tandfonline.com/loi/cjas20

Vector-borne infectious disease mapping with stochastic difference equations: an analysis of dengue disease in Malaysia

N. A. Samat (a) & D. F. Percy (b)

(a) Department of Mathematics, Faculty of Science and Mathematics, Universiti Pendidikan Sultan Idris, 35900 Tanjong Malim, Perak, Malaysia
(b) Salford Business School, University of Salford, Greater Manchester, M5 4WT, UK
Version of record first published: 27 Jun 2012

To cite this article: N. A. Samat & D. F. Percy (2012): Vector-borne infectious disease mapping
with stochastic difference equations: an analysis of dengue disease in Malaysia, Journal of Applied
Statistics, DOI:10.1080/02664763.2012.700450
To link to this article: http://dx.doi.org/10.1080/02664763.2012.700450


Journal of Applied Statistics, 2012, iFirst article

Vector-borne infectious disease mapping with stochastic difference equations: an analysis of dengue disease in Malaysia

N.A. Samat (a) and D.F. Percy (b)

(a) Department of Mathematics, Faculty of Science and Mathematics, Universiti Pendidikan Sultan Idris, 35900 Tanjong Malim, Perak, Malaysia; (b) Salford Business School, University of Salford, Greater Manchester, M5 4WT, UK
(Received 3 May 2011; final version received 2 June 2012)

Few publications consider the estimation of relative risk for vector-borne infectious diseases. Most of
these articles involve exploratory analysis that includes the study of covariates and their effects on disease
distribution and the study of geographic information systems to integrate patient-related information. The
aim of this paper is to introduce an alternative method of relative risk estimation based on discrete
time–space stochastic SIR-SI models (susceptible–infective–recovered for human populations;
susceptible–infective for vector populations) for the transmission of vector-borne infectious diseases,
particularly dengue disease. First, we describe deterministic compartmental SIR-SI models that are
suitable for dengue disease transmission. We then adapt these to develop corresponding discrete
time–space stochastic SIR-SI models. Finally, we develop an alternative method of estimating the relative risk for dengue disease
mapping based on these models and apply them to analyse dengue data from Malaysia. This new approach
offers a better model for estimating the relative risk for dengue disease mapping compared with the other
common approaches, because it takes into account the transmission process of the disease while allowing
for covariates and spatial correlation between risks in adjacent regions.
Keywords: relative risk; disease mapping; dengue disease; tract-count data; SIR-SI models

1. Introduction

Dengue is a common, serious, infectious, mosquito-borne, viral disease in tropical and subtropical
regions of the world. Dengue viruses are transmitted to humans through the bites of infective
female Aedes mosquitoes, which breed in clear, stagnant water that is mostly generated by
human activity and rainfall. There is currently no vaccine available for the prevention or treatment
of dengue disease. However, dengue can be prevented and controlled if detected early. Therefore,
the use of statistical models for studying the transmission of dengue disease and the estimation
of relative risk for disease mapping are important contributions to the prevention and control
strategies for dengue.

Corresponding author. Email: norazah@fsmt.upsi.edu.my
ISSN 0266-4763 print/ISSN 1360-0532 online. © 2012 Taylor & Francis.

This paper investigates geographical distribution and disease mapping particularly for dengue
disease. Relative risk estimation is one of the most important issues when studying geographical
distributions of disease occurrence. Many studies of disease mapping use regression-type models
in which observable (fixed effects) and unobservable (random effects) variables are included to
give a clean map and so depict the true excess risk [24,13,16,19,32]. In spite of this, published
studies that use structural disease transmission models for disease mapping are scarce [9].
Specifically for the case of dengue disease, few researchers use stochastic processes to estimate
the relative risk for disease mapping. Rather, most dengue studies are based on exploratory
data analysis accompanied by pictorial maps, which includes the study of covariates and their
effects on dengue disease distribution. See, for example, [10,27]. Furthermore, some authors use
a geographic information system to integrate the patient-related information [31].
In attempting to develop an improved model and a complementary analysis, our research
introduces an alternative method to estimate the relative risk of dengue disease transmission
based initially on discrete-time, discrete-space, stochastic SIR-SI models (susceptible–infective–recovered
for human populations; susceptible–infective for vector populations). This method is
designed to overcome the drawbacks of relative risk estimation in disease mapping using the
classical approach based on standardized morbidity ratios (SMRs). It involves extending the
fundamental Poisson-gamma model and developing a Bayesian analytic approach.
In the remainder of this paper, we first describe existing deterministic compartmental SIR-SI
models for dengue disease transmission. Then, we derive a discrete time–space stochastic SIR-SI
model for dengue disease transmission, which adapts and extends the stochastic SIR models
described by Lawson [14]. We then continue with explanations about an alternative method of
relative risk estimation for dengue disease mapping, which we develop based on this new stochastic
SIR-SI model. This method is then applied to dengue data from Malaysia to demonstrate the models
in practice.

2. Compartmental SIR-SI models for dengue disease transmission

The compartmental model displayed in Figure 1 is the most common model used in the study of
dengue disease transmission and is adapted from [7,22].

Figure 1. Compartmental SIR-SI model for dengue disease transmission.

In this study, for $i = 1, 2, \ldots, M$ study regions and $j = 1, 2, \ldots, T$ time periods,
$S_{i,j}^{(h)}$ represents the total number of susceptible humans at time $j$, $I_{i,j}^{(h)}$
represents the total number of infective humans at time $j$, and $R_{i,j}^{(h)}$ represents the total
number of recovered humans at time $j$. We use the superscript $(h)$ to distinguish the variables and
parameters as representing the human population rather than the vector population, for which we
use the superscript $(v)$. Furthermore, in Figure 1, $S_{i,j}^{(v)}$ represents the total number of
susceptible mosquitoes at time $j$, $I_{i,j}^{(v)}$ represents the total number of infective mosquitoes
at time $j$, $\mu^{(h)}$ and $\mu^{(v)}$ represent the (assumed equal) birth and death rates of humans
per week and the (assumed equal) birth and death rates of mosquitoes per week, respectively,
$\gamma^{(h)}$ represents the rate at which humans recover per week, $b$ represents the biting rate
per week, $m$ represents the number of alternative hosts available as the blood source, $A$ represents
the constant recruitment rate for the mosquito vector, $\beta^{(h)}$ represents the transmission
probability from mosquitoes to humans, $\beta^{(v)}$ represents the transmission probability from
humans to mosquitoes, $N_i^{(h)}$ represents the human population size for the study region $i$ and
$N_i^{(v)}$ represents the mosquito population size for the study region $i$. These definitions and
notations hold throughout this paper.
For the case of dengue, susceptible people can become infective and then recover or die due
to the infection. However, susceptible Aedes mosquitoes can become infective but they will not
recover or die due to the infection because infective mosquitoes stay infective for the remainder
of their lifetimes.
For discrete-time intervals, the compartmental model in Figure 1 can also be written mathematically
as a system of difference equations. Therefore, the deterministic SIR-SI model for dengue
disease transmission in human populations is given by

$$S_{i,j}^{(h)} = \mu^{(h)} N_i^{(h)} + \left(1 - \mu^{(h)} - \frac{\beta^{(h)} b}{N_i^{(h)} + m}\, I_{i,j-1}^{(v)}\right) S_{i,j-1}^{(h)}, \tag{1}$$

$$I_{i,j}^{(h)} = \left(1 - \mu^{(h)} - \gamma^{(h)}\right) I_{i,j-1}^{(h)} + \frac{\beta^{(h)} b}{N_i^{(h)} + m}\, I_{i,j-1}^{(v)} S_{i,j-1}^{(h)}, \tag{2}$$

$$R_{i,j}^{(h)} = \left(1 - \mu^{(h)}\right) R_{i,j-1}^{(h)} + \gamma^{(h)} I_{i,j-1}^{(h)}. \tag{3}$$

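Similarly, the deterministic SIR-SI model for dengue disease transmission in vector populations
is given by

$$S_{i,j}^{(v)} = \mu^{(v)} N_i^{(v)} + \left(1 - \mu^{(v)} - \frac{\beta^{(v)} b}{N_i^{(h)} + m}\, I_{i,j-1}^{(h)}\right) S_{i,j-1}^{(v)}, \tag{4}$$

$$I_{i,j}^{(v)} = \left(1 - \mu^{(v)}\right) I_{i,j-1}^{(v)} + \frac{\beta^{(v)} b}{N_i^{(h)} + m}\, I_{i,j-1}^{(h)} S_{i,j-1}^{(v)}. \tag{5}$$
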
The combined model derived above has the same form as the deterministic SIR-SI model used
by Esteva and Vargas [7]. Here, $N_i^{(h)}$ and $N_i^{(v)}$ are assumed to be constant, such that
$N_i^{(h)} = S_{i,j}^{(h)} + I_{i,j}^{(h)} + R_{i,j}^{(h)}$ and $N_i^{(v)} = S_{i,j}^{(v)} + I_{i,j}^{(v)}$.
This formulation can then be used to provide a link to
stochastic means, which will be explained in the next section.
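
As an illustration of how Equations (1)–(5) can be iterated, the following Python sketch advances the five compartments by one week at a time for a single region. The parameter values are the weekly values quoted later in Section 5.1; the function name, the dictionary keys and the starting state are hypothetical choices made only for this example and are not taken from the paper.

```python
def sir_si_step(state, p):
    """One weekly update of the deterministic SIR-SI system, Equations (1)-(5)."""
    S_h, I_h, R_h, S_v, I_v = state
    N_h = S_h + I_h + R_h              # human population size (constant over time)
    N_v = S_v + I_v                    # mosquito population size (constant over time)
    contact = p["b"] / (N_h + p["m"])  # biting rate spread over the available hosts
    new_h = p["beta_h"] * contact * I_v * S_h  # new human infections
    new_v = p["beta_v"] * contact * I_h * S_v  # new mosquito infections
    return (
        p["mu_h"] * N_h + (1 - p["mu_h"]) * S_h - new_h,   # Equation (1)
        (1 - p["mu_h"] - p["gamma_h"]) * I_h + new_h,      # Equation (2)
        (1 - p["mu_h"]) * R_h + p["gamma_h"] * I_h,        # Equation (3)
        p["mu_v"] * N_v + (1 - p["mu_v"]) * S_v - new_v,   # Equation (4)
        (1 - p["mu_v"]) * I_v + new_v,                     # Equation (5)
    )


# Weekly parameter values as quoted in Section 5.1 of the paper.
PARAMS = {
    "beta_h": 0.50, "beta_v": 0.75,
    "mu_h": 0.0002736, "mu_v": 0.4028, "gamma_h": 0.7903,
    "b": 2.33, "m": 0,
}

# Hypothetical starting state (S_h, I_h, R_h, S_v, I_v), for illustration only.
state = (9990.0, 10.0, 0.0, 1000.0, 5.0)
trajectory = [state]
for _ in range(5):
    state = sir_si_step(state, PARAMS)
    trajectory.append(state)
```

Summing the three human compartments, or the two vector compartments, after any step returns the starting population size, which is the constancy of $N_i^{(h)}$ and $N_i^{(v)}$ stated above. The sketch does not guard against negative compartments, which a discrete-time update of this kind can produce when the infection terms are large.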
3. Stochastic SIR-SI model for dengue disease transmission

A deterministic analysis provides a good approximation to the stochastic means for a major outbreak when the sample size is large [12]. Therefore, in the following analysis we use a formulation
of the deterministic model to provide an approximation to the stochastic means.


Lawson [14] developed a stochastic SIR model for direct transmission of infectious diseases.
Although it only considered discrete time and discrete space, this model proved very effective for
analysing the spread of influenza. We now extend this model to enable the analysis of indirectly
transmitted infectious diseases, similarly taking account of correlations among neighbouring
regions, using a spatial prior as described later in this section. However, in this study we include
the terms $\mathfrak{I}_{i,j}^{(h)}$, $\mathfrak{R}_{i,j}^{(h)}$ and $\mathfrak{I}_{i,j}^{(v)}$ to
represent the numbers of newly infective humans, newly recovered
humans and newly infective mosquitoes, respectively, all in the interval or time period $(j - 1, j]$,
and study region i. This is because the dengue data that we observe are weekly new infective cases
in human populations, and we are interested in finding the posterior mean of the new infective
dengue cases each week.
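For $i = 1, 2, \ldots, M$ study regions and $j = 1, 2, \ldots, T$ time periods, our discrete time–space
stochastic SIR-SI model for dengue disease transmission in human populations follows by adapting
Equations (1)–(5) and including a probability distribution to reflect the randomness inherent
in the data as shown:

$$S_{i,j}^{(h)} = \mu^{(h)} N_i^{(h)} + \left(1 - \mu^{(h)}\right) S_{i,j-1}^{(h)} - \mathfrak{I}_{i,j}^{(h)}, \tag{6}$$

$$\mathfrak{I}_{i,j}^{(h)} \sim \text{Poisson}\left(\lambda_{i,j}^{(h)}\right), \tag{7}$$

$$\lambda_{i,j}^{(h)} = \exp\left(\beta_0^{(h)} + c_i^{(h)}\right) \frac{\beta^{(h)} b}{N_i^{(h)} + m}\, I_{i,j-1}^{(v)} S_{i,j-1}^{(h)}, \tag{8}$$

$$I_{i,j}^{(h)} = \left(1 - \mu^{(h)}\right) I_{i,j-1}^{(h)} + \mathfrak{I}_{i,j}^{(h)} - \mathfrak{R}_{i,j}^{(h)}, \tag{9}$$

$$R_{i,j}^{(h)} = \left(1 - \mu^{(h)}\right) R_{i,j-1}^{(h)} + \mathfrak{R}_{i,j}^{(h)}, \tag{10}$$

$$\mathfrak{R}_{i,j}^{(h)} = \gamma^{(h)} I_{i,j-1}^{(h)}. \tag{11}$$
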

Furthermore, in this study and due to the general unavailability of sufficient data for vectors, the
discrete-time discrete-space SIR-SI models for dengue disease transmission in vector populations
are assumed non-stochastic and are as follows:

$$S_{i,j}^{(v)} = \mu^{(v)} N_i^{(v)} + \left(1 - \mu^{(v)}\right) S_{i,j-1}^{(v)} - \mathfrak{I}_{i,j}^{(v)}, \tag{12}$$

$$\mathfrak{I}_{i,j}^{(v)} = \frac{\beta^{(v)} b}{N_i^{(h)} + m}\, I_{i,j-1}^{(h)} S_{i,j-1}^{(v)}, \tag{13}$$

$$I_{i,j}^{(v)} = \left(1 - \mu^{(v)}\right) I_{i,j-1}^{(v)} + \mathfrak{I}_{i,j}^{(v)}. \tag{14}$$


We use the Poisson distribution to model the number of new infectives, as this is the fundamental
model for count data. Its mean $\lambda_{i,j}^{(h)}$ is chosen to match the deterministic form in
Equation (2) with a positive multiplicative factor to represent spatial correlation as explained below.
The formulations above show that the counts of new infective humans are assumed to follow
independent Poisson distributions, where the expected numbers of new infectives include elements
of the transmission, which are the simple direct dependence of current infective counts on previous
counts in the same spatial unit and a linear predictor term that can include covariates or random
effects.
As these counts are conditional upon other variables, the Poisson assumption cannot be tested in
isolation, but rather by trying other candidate distributions and comparing overall goodness-of-fit
as described in Section 5.3. However, the Poisson assumption is the default for log linear models
such as this, and we leave the testing of other distributions to future investigations.
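
To make the structure of Equations (6)–(11) concrete, the following Python sketch computes the Poisson mean of Equation (8) and applies the human bookkeeping updates for a given realised count of new infectives. It is only a sketch under stated assumptions: the function and argument names are hypothetical, the intercept and random effect are passed in as plain numbers rather than sampled, and no Poisson draw is made (in the paper the full model is fitted in WinBUGS).

```python
import math


def poisson_mean(beta0, c_i, beta_h, b, N_h, m, I_v_prev, S_h_prev):
    """Mean number of new infective humans, Equation (8).

    exp(beta0 + c_i) is the positive multiplicative factor applied to the
    deterministic infection term of Equation (2).
    """
    return math.exp(beta0 + c_i) * beta_h * b / (N_h + m) * I_v_prev * S_h_prev


def update_humans(S_h, I_h, R_h, new_infective, mu_h, gamma_h):
    """Update the human compartments for one week, Equations (6) and (9)-(11)."""
    N_h = S_h + I_h + R_h
    new_recovered = gamma_h * I_h                                  # Equation (11)
    S_next = mu_h * N_h + (1 - mu_h) * S_h - new_infective         # Equation (6)
    I_next = (1 - mu_h) * I_h + new_infective - new_recovered      # Equation (9)
    R_next = (1 - mu_h) * R_h + new_recovered                      # Equation (10)
    return S_next, I_next, R_next


# With beta0 = c_i = 0 the mean reduces to the deterministic infection term.
lam = poisson_mean(0.0, 0.0, 0.5, 2.33, 10000, 0, 5, 9990)

# Hypothetical observed count of 6 new cases in the week.
S1, I1, R1 = update_humans(9990.0, 10.0, 0.0, 6, 0.0002736, 0.7903)
```

In a full implementation the count passed to `update_humans` would be either the observed weekly cases or a draw from a Poisson distribution with mean `lam`.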
In Equation (8), $\beta_0^{(h)}$ is a constant term to describe the overall rates of the process for human
populations, and $c_i^{(h)}$ is a random effect that is designed to absorb residual spatial variation for
human populations. In this study, a conditional autoregressive (CAR) prior is used as a family of
prior distributions for the random effect. This CAR model was proposed by Besag et al. [3], where
the probability densities of values at any given location are conditional on the neighbouring areas.
The advantage of this intrinsic CAR model is that the conditional moments are defined as simple
functions of the neighbouring values and the number of neighbours $m_i$ by means of a conditional
distribution defined by

$$c_i^{(h)} \mid c_j^{(h)}\ (j \neq i) \sim \text{Normal}\left(\bar{c}_i^{(h)}, \frac{r}{m_i}\right).$$

In other words, under the CAR prior, the random effect $c_i^{(h)}$ at site $i$, conditional upon the random
effects at all other sites, is normally distributed with mean equal to the average of the neighbouring
$c_j^{(h)}$ and variance equal to $r/m_i$, where $r$ is an unknown variance parameter. This intrinsic Gaussian
CAR model allows for over-dispersion and spatial correlation among neighbouring areas. However, Lawson [15] points out that this intrinsic CAR model is not the only available specification
of a Gaussian Markov random field model. In fact, a proper CAR model formulation can also be
used. The application, comparison and discussion of a proper CAR prior to the analysis of our
stochastic SIR-SI dengue disease transmission model will be included in future investigations to
improve this methodology.
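
The conditional moments of the intrinsic CAR prior described above are simple enough to compute directly. The following Python sketch returns the conditional mean and variance for one site; the function name, the adjacency dictionary and the numerical values are hypothetical and serve only to illustrate the definition.

```python
def car_conditional(c, neighbours, i, r):
    """Conditional mean and variance of the random effect at site i.

    c          : list of current random-effect values, one per site
    neighbours : dict mapping each site index to the indices of adjacent sites
    r          : variance parameter of the CAR prior
    """
    adjacent = neighbours[i]
    m_i = len(adjacent)                          # number of neighbours of site i
    mean = sum(c[j] for j in adjacent) / m_i     # average of neighbouring effects
    variance = r / m_i                           # shrinks as neighbours increase
    return mean, variance


# Three hypothetical sites: site 0 borders sites 1 and 2.
effects = [0.2, -0.1, 0.4]
adjacency = {0: [1, 2], 1: [0], 2: [0]}
mean0, var0 = car_conditional(effects, adjacency, 0, 0.5)
```

A site with many neighbours therefore has a tighter conditional distribution than a site with few, which is how the prior borrows strength across adjacent regions.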
The discrete time–space stochastic SIR-SI model for dengue disease transmission that we
propose here will be used in the estimation of relative risk for dengue disease mapping. However,
the methods extend readily to apply more generally to other vector-borne infectious diseases. A
discussion about this is presented and explained in the next section.
4. Relative risk estimation for disease mapping

Many studies on disease mapping use regression-type models to estimate the risk. Here, we
introduce an alternative method of relative risk estimation of disease mapping based on the disease
transmission model adapted specially for dengue disease. Our computational analysis is performed
using WinBUGS software, which is a package designed to carry out Markov chain Monte Carlo
computations for a wide variety of Bayesian models [29]. A discussion and application of Bayesian
analysis of disease mapping using this software can be found in Lawson and Clark [17].
In general, for $i = 1, 2, \ldots, M$ study regions and $j = 1, 2, \ldots, T$ time periods, a pseudo-random
sample of observations $\lambda_{ijk}^{(h)}$ for $k = 1, 2, \ldots, n$ is generated from the posterior
distribution for the mean number of infectives $\lambda_{ij}^{(h)}$. From this sample, the posterior
expected mean number of infectives can be approximated using the unbiased sample mean

$$\hat{\lambda}_{ij}^{(h)} = \frac{1}{n} \sum_{k=1}^{n} \lambda_{ijk}^{(h)}. \tag{15}$$

Next, the relative risk parameter $\theta_{ij}^{(h)}$ is defined by

$$\theta_{ij}^{(h)} = \frac{\lambda_{ij}^{(h)}}{e_{ij}^{(h)}}. \tag{16}$$

Therefore, the posterior expected relative risk can also be approximated using an unbiased sample
mean

$$\hat{\theta}_{ij}^{(h)} = \frac{1}{n} \sum_{k=1}^{n} \theta_{ijk}^{(h)} = \frac{1}{n} \sum_{k=1}^{n} \frac{\lambda_{ijk}^{(h)}}{e_{ij}^{(h)}} = \frac{\hat{\lambda}_{ij}^{(h)}}{e_{ij}^{(h)}}. \tag{17}$$

In other words, the posterior expected relative risk is equal to the posterior expected mean number
of infectives, $\hat{\lambda}_{ij}^{(h)}$, divided by the corresponding naive mean number of infectives
based on the human population across all study regions, $e_{ij}^{(h)}$.
We then use this formulation in the estimation of relative risk for disease mapping, based on
the discrete time–space stochastic SIR-SI model for disease transmission using data in the form
of counts of cases for all tracts under consideration.

5. Application of relative risk estimation for dengue disease in Malaysia

This section demonstrates and displays the results of relative risk estimation based on an
application of the preceding discrete time–space stochastic SIR-SI models for dengue disease
transmission with five alternative assumptions about the mosquito population. The results are
compared and presented in tables and a map, and a powerful model for relative risk estimation
and dengue disease mapping is revealed.

5.1 Data set

Data used in this study were provided by the Ministry of Health, the Institute for Medical Research
and the Department of Statistics, all in Malaysia. All methods presented here are applied to dengue
data in the form of counts of cases within the states of Malaysia for epidemiology weeks 1–53
during a 1-year period spanning 2008–2009. Figure 2 displays the available data, which refer
to observed new infective dengue cases of humans in time periods or intervals $(j - 1, j]$ for
$j = 1, 2, \ldots, 53$.
The values for $\beta^{(h)}$ and $\beta^{(v)}$ are chosen to be 0.50 and 0.75, respectively, and the number of
alternative hosts available as the blood source $m$ is assumed to be zero. Furthermore, the weekly
rate values for $\mu^{(h)}$, $\mu^{(v)}$ and $\gamma^{(h)}$ are 0.0002736, 0.4028 and 0.7903, respectively, and $b$ is 2.33. All
of these rates are converted from daily rates that we derived from the literature [22,25]. Moreover,
since there are no routine data available for dengue mosquitoes, we impute suitable values based
on studies conducted by other researchers. This process is explained in the next section.

Figure 2. Time series plot for numbers of new infective dengue cases from epidemiology weeks 1–53 during
1-year period spanning 2008–2009 for all 16 states in Malaysia. (Vertical axis: numbers of new infective dengue cases; horizontal axis: epidemiology week; series: Perlis, Kedah, P.Pinang, Perak, Kelantan, Terengganu, Pahang, Selangor, K.Lumpur, Putrajaya, N.Sembilan, Melaka, Johor, Sarawak, Labuan, Sabah.)


5.2 Estimation of vector mosquito populations


Implementation of the SIR-SI models requires dengue mosquito vector data. Since there are
no available routine data for vector mosquito populations, specifically data for newly infective
mosquitoes that are difficult to collect, we propose three simple methods to impute values in
order to generate better results for relative risk estimation than would otherwise be possible. First,
the estimation is based on seasonal averages reported in relevant journal publications, which
specifically study dengue in Malaysia, written by Rohani et al. [26] and Lee and Inder Singh [18].
Second, the estimation is based on the SIR-SI model for dengue disease transmission where the
starting values are set and the estimation propagates from the SIR-SI equations. Here, some of the
estimation is based on information taken from an article by Nishiura [22]. Third, the estimation
is based on an assumption that the infective mosquito data follow the pattern of weekly data for
new infective humans.
5.2.1 Estimation of vector mosquito populations based on seasonal averages

Rohani et al. [26] identified about 40 infective adult mosquitoes in a sample of 5508. In order to
progress, it is feasible to interpret this information as

$$\frac{S_{i,0}^{(v)}}{I_{i,0}^{(v)}} \approx \frac{5508 - 40}{40} = \frac{1367}{10}, \tag{18}$$

$$I_{i,0}^{(v)} \approx 0.00732\, S_{i,0}^{(v)}. \tag{19}$$


The calculation above clearly assumes that the ratio of susceptibles to infectives for mosquitoes
is approximately constant, which is a reasonable first-order assumption.
Now consider a study by Lee and Inder Singh [18], who conducted monthly surveillance of
adult mosquitoes in Kuala Lumpur, Malaysia, continuously from January to December 1990 to
monitor their population. Results of the study give the distribution and numbers of adult Aedes
collected in sentinel traps and the total number of adult mosquitoes for each month. Sentinel
traps are typically huts or rooms or houses, which are used to collect mosquitoes. Normally, two
humans stay inside the hut as a bait to attract mosquitoes. Therefore, the numbers of mosquitoes
collected here refer to the numbers corresponding to the population of susceptible humans at risk.
In this investigation, Lee and Inder Singh [18] observed a total of 8518 mosquitoes among 556
susceptible humans in the year 1990. Since we plan to use the number of susceptible vectors as
the starting point at time $j = 0$ for each region in our analysis, we have

$$\frac{S_{i,0}^{(v)} + I_{i,0}^{(v)}}{S_{i,0}^{(h)}} \approx \frac{8518}{556} = \frac{4259}{278}. \tag{20}$$

Rearranging Approximation (20) gives

$$I_{i,0}^{(v)} \approx \frac{4259}{278}\, S_{i,0}^{(h)} - S_{i,0}^{(v)}. \tag{21}$$

Substituting Approximation (21) into Approximation (19) now gives

$$S_{i,0}^{(v)} \approx \frac{1367}{10} \left(\frac{4259}{278}\, S_{i,0}^{(h)} - S_{i,0}^{(v)}\right),$$

$$S_{i,0}^{(v)} \approx 15.21\, S_{i,0}^{(h)}. \tag{22}$$


However, the data for adult mosquitoes in Lee and Inder Singh's paper represent monthly periods, and the lifespan
of Aedes mosquitoes in nature typically ranges from 2 weeks to a month depending on environmental
conditions [21]. Consequently, we need to redefine Equation (22) by transforming to a
single generation of Aedes mosquito. Under this redefinition, Approximation (20) changes so that
the appropriate revised form of Equation (22) becomes

$$S_{i,0}^{(v)} \approx \frac{15.21\, S_{i,0}^{(h)}}{2} = 7.605\, S_{i,0}^{(h)}. \tag{23}$$

Hence, there are seven or eight susceptible mosquitoes for every susceptible human, on average.
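
The arithmetic behind Approximations (18)–(23) can be checked with a short Python sketch using exact fractions. The variable names are hypothetical; the only inputs are the counts quoted from Rohani et al. [26] and Lee and Inder Singh [18].

```python
from fractions import Fraction

# Approximation (18): susceptible-to-infective ratio among sampled mosquitoes.
s_to_i = Fraction(5508 - 40, 40)                 # equals 1367/10

# Approximation (19): infective mosquitoes per susceptible mosquito.
i_per_s = 1 / s_to_i                             # about 0.00732

# Approximation (20): mosquitoes per susceptible human.
vectors_per_human = Fraction(8518, 556)          # equals 4259/278

# Approximation (22): solve S_v = s_to_i * (vectors_per_human * S_h - S_v)
# for the coefficient S_v / S_h.
s_v_per_s_h = s_to_i * vectors_per_human / (1 + s_to_i)   # about 15.21

# Approximation (23): halve the rounded coefficient to reflect one
# mosquito generation, as done in the text.
s_v_per_s_h_generation = round(float(s_v_per_s_h), 2) / 2  # 7.605
```

Combining the last value with Approximation (19) gives roughly 0.00732 × 7.605 ≈ 0.056 infective mosquitoes per susceptible human as a starting value.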
Relations (19)–(23) give some idea of what the average values are for the infective mosquito
population and susceptible mosquito population, which we assume as initial values for our
investigation. In this analysis, the value for the infective mosquito count $I_{i,0}^{(v)}$
is used as the average value
over the first time period, which we then propagate using one of three alternative assumptions.
These values are then imputed in Equation (8), giving three similar sets of results arising from
our relative risk estimation.
First, we assume that the data for infective mosquitoes are constant over time for all the states
in Malaysia (Assumption 1). Figure 3 shows a graph of the estimated infective mosquito data for
each state in Malaysia from epidemiology weeks 1–53 during the course of 2008–2009. That is,
from Equations (19) and (23) we estimate the number of infective mosquitoes for the start of the
time period for each state, which we then assume constant for all time periods.
Second, we assume that the data for infective mosquitoes follow a cyclical seasonal pattern
(Assumption 2). This is because many researchers have reported in their studies that the seasonal
patterns of outbreak of dengue coincide with the rainy season [8,23,28,30].

Figure 3. Imputed infective mosquitoes without seasonality.


Figure 4. Imputed infective mosquitoes with piecewise constant seasonality.

According to Okogun et al. [23], rainfall is an important factor which regulates the abundance
of outdoor breeding mosquito populations and consequently is directly associated with higher
prevalence levels of mosquito-borne diseases. This view is supported by Foo et al. [8], who found that the
monthly incidence of dengue is associated with the monthly rainfall, which provides the breeding
sites for mosquito populations. In Malaysia, the northeast monsoon is the major rainy season in
the country, which brings heavy rainfall from mid-November to early March [20]. Therefore, it
is expected that the number of infective mosquitoes will increase during this monsoon season. In
this study, we assume that the number of infective mosquitoes is piecewise constant over time,
where the value is in a range between 10% above and 10% below the estimated average number of
infective mosquitoes in each state (Figure 4). Here, the number of infective mosquitoes is assumed
to be large during epidemiology weeks 1–11 and 46–53, corresponding to the rainy season in
Malaysia, and small during the other epidemiology weeks.
Third, we again assume that the data for infective mosquitoes follow a cyclical seasonal pattern,
but that this seasonality is now represented by a sinusoidal function ranging from 20% below the
estimated average value to 20% above the estimated average value in each state (Assumption 3).
The idea of using a sinusoidal function is to model the seasonal variation continuously throughout
the year, as a better representation of the true cyclical behaviour than in Assumption 2. Figure 5
shows the imputed infective mosquito data based on this assumption.
In any particular state $i$, we fit the sinusoidal function for infective mosquitoes by considering
the continuous-time equivalent to $I_{i,j}^{(v)}$, which is

$$I_i^{(v)}(t) = a_i + b_i \sin(c + dt),$$

where $a_i$ is the mean response, $b_i$ is the amplitude, $c$ is the phase, $2\pi/d$ is the period and $t$
represents time. In this research, $a_i$ represents the estimated average value used in Assumptions
1 and 2, $b_i$ reflects the amplitude of 20% about the average and $t$ interpolates epidemiology
weeks $j = 1, 2, \ldots, 53$. The parameters $c = 39\pi/53$ and $d = 2\pi/53$ are assumed constant across
all states.

Figure 5. Imputed infective mosquitoes with sinusoidal seasonality.

Therefore, $dt$ measures annual cycles, taking the values $[0, 2\pi)$ for year 1, $[2\pi, 4\pi)$ for year
2, $[4\pi, 6\pi)$ for year 3 and so on. Here, we choose $c$ in the interval $[0, 2\pi)$, but any value equal
to this plus a multiple of $2\pi$ will give the same imputed values for $I_i^{(v)}(t)$. As for Assumption 2,
the rainy season falls during epidemiology weeks 1–11 and 46–53. Therefore, it is assumed that
the number of infected mosquitoes is high in this duration compared with the other epidemiology
weeks.
These three alternative assumptions for mosquito data are then imputed in the discrete time–space
stochastic SIR-SI model for dengue disease transmission for all states in Malaysia, to obtain
comparable posterior expected relative risks.
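
The three imputation patterns (Assumptions 1–3) can be sketched in Python as follows. The function names and the example average level are hypothetical. For Assumption 2 the sketch assumes exactly two levels, 10% above the average in the rainy weeks and 10% below otherwise, which is one reading of the description above rather than a detail confirmed by the paper.

```python
import math

WEEKS = range(1, 54)                                       # epidemiology weeks 1-53
RAINY_WEEKS = set(range(1, 12)) | set(range(46, 54))       # weeks 1-11 and 46-53


def constant_series(mean_level):
    """Assumption 1: the same imputed value in every week."""
    return [mean_level for _ in WEEKS]


def piecewise_series(mean_level, swing=0.10):
    """Assumption 2: higher in rainy weeks, lower otherwise (assumed two levels)."""
    return [
        mean_level * (1 + swing) if j in RAINY_WEEKS else mean_level * (1 - swing)
        for j in WEEKS
    ]


def sinusoidal_series(mean_level, swing=0.20,
                      c=39 * math.pi / 53, d=2 * math.pi / 53):
    """Assumption 3: a_i + b_i * sin(c + d * t) with b_i = swing * a_i."""
    return [mean_level + swing * mean_level * math.sin(c + d * j) for j in WEEKS]


# Hypothetical average number of infective mosquitoes for one state.
a_i = 100.0
flat = constant_series(a_i)
steps = piecewise_series(a_i)
wave = sinusoidal_series(a_i)
```

Each function returns 53 weekly values that can be used in place of the infective mosquito counts in Equation (8).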
5.2.2 Estimation of vector mosquito populations based on propagation

Several articles used the same information as Nishiura [22] in their studies of dengue disease
transmission [7,25]. Here, we use information from Nishiura [22] in order to estimate the total
mosquito population $N_i^{(v)}$ in state $i = 1, 2, \ldots, M$. In his analysis, Nishiura assumed the total
human population $N_i^{(h)}$ to be 10,000 and the recruitment rate of mosquitoes to be 5000 per day.
Converting this daily rate to a weekly rate gives a recruitment rate of 35,000 mosquitoes per
week. We know that the recruitment rate of the mosquito population is $\mu^{(v)} N_i^{(v)}$, and in this study
the birth and death rates for the mosquito population are both $\mu^{(v)} \approx 0.4028$ per week. Therefore,

$$N_i^{(v)} \approx \frac{35{,}000}{0.4028} \approx 86{,}892,$$

and this leads to

$$N_i^{(v)} \approx 8.6892\, N_i^{(h)}. \tag{24}$$

Based on Approximation (24), we can now estimate the total mosquito population for each state.
These data are then imputed in Equation (12), which is then substituted in Equations (13) and (14).
This subsequently gives estimated values for the numbers of infective mosquitoes $I_{i,j}^{(v)}$ which we
propagate from Equation (14). We refer to this approach as Assumption 4.
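
A minimal Python sketch of Assumption 4 follows. It reproduces the arithmetic leading to Approximation (24) and then propagates the vector equations (12)–(14) forward from a series of infective human counts. The function names, the starting values and the short human series are hypothetical; the parameter defaults are the weekly values quoted in Section 5.1.

```python
def mosquito_population(N_h, recruitment_per_day=5000, mu_v=0.4028,
                        reference_N_h=10_000):
    """Total mosquito population implied by Approximation (24)."""
    N_v_reference = recruitment_per_day * 7 / mu_v   # weekly recruitment / death rate
    return N_v_reference / reference_N_h * N_h       # scale to the state's population


def propagate_infective_mosquitoes(I_h_series, N_h, S_v0, I_v0,
                                   mu_v=0.4028, beta_v=0.75, b=2.33, m=0):
    """Iterate Equations (12)-(14) given infective human counts per week."""
    N_v = S_v0 + I_v0
    S_v, I_v = S_v0, I_v0
    series = [I_v]
    for I_h_prev in I_h_series:
        new_v = beta_v * b / (N_h + m) * I_h_prev * S_v        # Equation (13)
        S_v, I_v = (
            mu_v * N_v + (1 - mu_v) * S_v - new_v,             # Equation (12)
            (1 - mu_v) * I_v + new_v,                          # Equation (14)
        )
        series.append(I_v)
    return series


N_v_example = mosquito_population(10_000)
# Hypothetical infective human counts for three consecutive weeks.
I_v_series = propagate_infective_mosquitoes([10, 12, 15], 10_000, 86_000.0, 0.0)
```

The first function returns about 86,892 mosquitoes for a human population of 10,000, matching the value derived above.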

Figure 6. Imputed infective mosquitoes based on propagation. (Vertical axis: imputed infective mosquitoes; horizontal axis: epidemiology week; one series per state.)

Figure 6 shows the corresponding values of $I_{i,j}^{(v)}$ for each state in Malaysia for epidemiology
weeks 1–53 corresponding to the 12 months from 1 January 2008 to 3 January 2009. These values
are finally imputed in Equation (8) to give the posterior expected means of new infective humans,
which subsequently give the posterior expected relative risks of dengue disease.
5.2.3 Estimation of vector mosquito populations from human populations
Here, we assume that the infective mosquito population counts follow the cyclical pattern of infective human population counts, with a constant ratio between infective mosquitoes and infective
humans (Assumption 5).
This assumption is based on our belief that there is a positive correlation between the numbers
of infective mosquitoes and the numbers of new infective humans. We assume that when there
is an increase in the number of new infective humans, there will also be an increase in the
number of infective mosquitoes. Figure 7 shows the pattern of $I_{i,j}^{(v)}$ for each state in Malaysia
from epidemiology weeks 1–53 for the same year during 2008–2009, based on Assumption 5.
These data are then imputed in Equation (8) to give the posterior expected means of new infective
humans, which subsequently give the posterior expected relative risks of dengue disease.

5.3 Analysis and results: comparison of posterior expected relative risks


The aim of this research is to improve the accuracy and reliability of the existing methods for
mapping vector-borne infectious diseases. In this paper, the estimation of relative risk is based
on our stochastic SIR-SI model for disease transmission. Many published studies of general
infectious diseases, including [1,5,6,24], use stochastic terms in their models as probabilistic
statements about the progression of the disease. These studies generally agree that stochastic
models are more realistic than deterministic models, the latter being a special case of the former.
To demonstrate the possible benefits of our approach, we focus on the spread of dengue disease
in Malaysia. We adopt Bayesian methods of analysis for improved robustness in estimation and
decision-making. However, this paper is primarily concerned with the models and methods, so we
choose reference (uniform) priors for illustration, except for the CAR prior for spatial variability.
Future work will investigate the impact of more informative priors by means of sensitivity analyses.

Figure 7. Imputed infective mosquitoes based on human populations.

We now present the results of relative risk estimation based on our discrete time–space stochastic
SIR-SI model for dengue disease transmission, using the five alternative methods for imputing
vector mosquito populations described in the previous section. The posterior distributions are
sampled using WinBUGS software, with the chains run to convergence. Figures 8–12 show time
series plots of the posterior expected relative risks across all states, based on our discrete time–space
stochastic SIR-SI models for dengue disease transmission in epidemiology weeks 1–53 during
2008–2009 under Assumptions 1–5, respectively.
Figures 8–12 suggest that all states have similar patterns of posterior expected
relative risk for all epidemiology weeks, though different methods give different values of posterior
expected relative risk. Based on the posterior expected relative risks for epidemiology week 53
in Table 1, all methods lead to the same conclusion that the state with the highest risk is Putrajaya
and the state with the lowest risk is Sabah, except for Assumption 5 which concludes that the
state of Labuan has the lowest risk. The risks for the other 14 states seem to be quite similar
for all five assumptions. This consistency is most encouraging and suggests that the disease
maps are not overly sensitive to the accuracy of the assumption made for imputing mosquito
counts. Consequently, there appears to be little to gain from expensive efforts to collect actual
data on mosquito populations, so long as reference values are available, such as those used in
our analysis. Mathematical considerations lead towards Assumption 3 as best representing the
physical process, but we now evaluate model goodness-of-fit measures to help us determine
which mosquito population assumption is most appropriate.
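
The highest and lowest risks quoted above can be checked directly against the week 53 values reported in Table 1. The following Python sketch does this; the risk values are copied from Table 1, while the dictionary layout and the helper name `extremes` are illustrative choices only.

```python
# Posterior expected relative risks for epidemiology week 53 (Table 1),
# listed per state in the order Assumption 1, 2, 3, 4, 5.
TABLE1 = {
    "Perlis": (0.3471, 0.3955, 0.4171, 0.6454, 0.2960),
    "Kedah": (0.3881, 0.4422, 0.4663, 0.8397, 0.8963),
    "Pulau Pinang": (0.6753, 0.7695, 0.8115, 0.9692, 1.0710),
    "Perak": (0.7790, 0.8876, 0.9361, 0.9843, 0.8168),
    "Kelantan": (0.6972, 0.7944, 0.8377, 0.5154, 0.5351),
    "Terengganu": (0.7176, 0.8177, 0.8623, 0.6239, 0.5518),
    "Pahang": (0.3891, 0.4433, 0.4675, 0.5375, 0.7151),
    "Selangor": (1.9350, 2.2040, 2.3250, 3.0420, 3.1830),
    "Kuala Lumpur": (1.4450, 1.6470, 1.7370, 1.6210, 1.5570),
    "Putrajaya": (2.2420, 2.5550, 2.6950, 5.7400, 5.2370),
    "Negeri Sembilan": (0.6184, 0.7046, 0.7430, 0.6123, 0.8455),
    "Melaka": (0.4911, 0.5595, 0.5900, 0.3274, 0.2817),
    "Johor": (0.5444, 0.6204, 0.6542, 0.6684, 0.5983),
    "Sarawak": (0.2766, 0.3151, 0.3323, 0.4213, 0.2829),
    "Labuan": (0.5723, 0.6522, 0.6878, 0.1551, 0.000000112),
    "Sabah": (0.1532, 0.1745, 0.1841, 0.1251, 0.0926),
}


def extremes(table, column):
    """Return (highest-risk state, lowest-risk state) for one assumption column."""
    highest = max(table, key=lambda state: table[state][column])
    lowest = min(table, key=lambda state: table[state][column])
    return highest, lowest


for column in range(5):
    highest, lowest = extremes(TABLE1, column)
    print(f"Assumption {column + 1}: highest = {highest}, lowest = {lowest}")
```
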
The use of goodness-of-fit measures is common in statistics for comparing fitted models. Lawson [15] discusses several methods that can be used to assess goodness-of-fit, including chi-square
statistics, Akaike information criterion, Bayesian information criterion, deviance information criterion (DIC) and posterior predictive loss. In this study, we use the DIC because it is readily
available in WinBUGS software and because Lawson [15] identifies weaknesses with the other
measures, particularly for models that involve several random effects. The DIC is defined by
Spiegelhalter et al. [29] as

$$\mathrm{DIC} = 2E_{\theta|x}\{D\} - D\{E_{\theta|x}(\theta)\},$$

where $D(\cdot)$ is the deviance of the model and $x$ represents the observed data. It uses the average
of the posterior samples of $\theta$ to produce an expected value of $\theta$. This value can also be computed
from a sample output from a chain. According to Spiegelhalter et al. [29], the model with the
smallest DIC is the model that would best predict a replicate data set of the same structure as that
currently observed. While Lawson and Clark [17] point out that the other overall goodness-of-fit
measures are useful for helping model selection, they give little help in assessing how well the
model fits the data.

Figure 8. Posterior expected relative risks under Assumption 1.

Figure 9. Posterior expected relative risks under Assumption 2.

Figure 10. Posterior expected relative risks under Assumption 3.

Figure 11. Posterior expected relative risks under Assumption 4.

Table 2 shows the DIC values for the new infective humans for epidemiology weeks 1–53 for
all states in Malaysia based on our five different assumptions for the mosquito populations. From
the DIC values in Table 2, we can say that the model with Assumption 5 fits best because it gives
the smallest DIC, compared with the other models. We conclude that the discrete time–space
stochastic SIR-SI model that assumes that infective mosquito counts are proportional to infective
human counts is the best model to be used in the analysis specifically for estimating relative risk.
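
To make the DIC calculation concrete, the Python sketch below evaluates it from posterior samples as twice the posterior mean deviance minus the deviance at the posterior mean. It is a toy illustration only: it assumes a single-parameter Poisson model with mean equal to expected count times relative risk, not the SIR-SI model of this paper, and the counts, expected counts and posterior draws are hypothetical. WinBUGS reports the DIC directly, so no such code is needed in practice. The final lines simply select the smallest of the DIC values reported in Table 2.

```python
import math


def poisson_deviance(theta, counts, expected):
    """Deviance D(theta) = -2 * log-likelihood for y_i ~ Poisson(expected_i * theta)."""
    log_lik = 0.0
    for y, e in zip(counts, expected):
        mu = e * theta
        log_lik += y * math.log(mu) - mu - math.lgamma(y + 1)
    return -2.0 * log_lik


def dic_from_samples(theta_samples, counts, expected):
    """Return (DIC, p_D) computed from posterior samples of theta."""
    # Posterior mean of the deviance.
    mean_deviance = sum(
        poisson_deviance(t, counts, expected) for t in theta_samples
    ) / len(theta_samples)
    # Deviance evaluated at the posterior mean of theta.
    theta_bar = sum(theta_samples) / len(theta_samples)
    deviance_at_mean = poisson_deviance(theta_bar, counts, expected)
    dic = 2.0 * mean_deviance - deviance_at_mean
    p_d = mean_deviance - deviance_at_mean  # effective number of parameters
    return dic, p_d


# Hypothetical counts, expected counts and posterior draws (not from the paper).
counts = [4, 7, 3, 9]
expected = [5.0, 6.0, 4.0, 8.0]
theta_samples = [0.85, 0.95, 1.00, 1.05, 1.15]
dic, p_d = dic_from_samples(theta_samples, counts, expected)

# DIC values reported in Table 2; the smallest value identifies the preferred model.
table2_dic = {
    "Assumption 1": 8993.57,
    "Assumption 2": 9515.41,
    "Assumption 3": 10087.4,
    "Assumption 4": 10137.5,
    "Assumption 5": 7982.23,
}
best_model = min(table2_dic, key=table2_dic.get)
print(best_model)
```

The same quantity can be written as the deviance at the posterior mean plus twice the effective number of parameters `p_d`, which is how the penalty for model complexity enters the criterion.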


Figure 12. Posterior expected relative risks under Assumption 5.

Table 1. Posterior expected relative risks for epidemiology week 53.

State               Assumption 1    Assumption 2    Assumption 3    Assumption 4    Assumption 5
                    $I^{(v)}$       $I^{(v)}$       $I^{(v)}$       $I^{(v)}$       $I^{(v)}$
                    constant        piecewise       sinusoidal      propagated      estimated from
                                    constant        seasonality     from SIR-SI     human infectives
                                    seasonality                     equations       $I^{(h)}$

1. Perlis           0.3471          0.3955          0.4171          0.6454          0.2960
2. Kedah            0.3881          0.4422          0.4663          0.8397          0.8963
3. Pulau Pinang     0.6753          0.7695          0.8115          0.9692          1.0710
4. Perak            0.7790          0.8876          0.9361          0.9843          0.8168
5. Kelantan         0.6972          0.7944          0.8377          0.5154          0.5351
6. Terengganu       0.7176          0.8177          0.8623          0.6239          0.5518
7. Pahang           0.3891          0.4433          0.4675          0.5375          0.7151
8. Selangor         1.9350          2.2040          2.3250          3.0420          3.1830
9. Kuala Lumpur     1.4450          1.6470          1.7370          1.6210          1.5570
10. Putrajaya       2.2420          2.5550          2.6950          5.7400          5.2370
11. Negeri Sembilan 0.6184          0.7046          0.7430          0.6123          0.8455
12. Melaka          0.4911          0.5595          0.5900          0.3274          0.2817
13. Johor           0.5444          0.6204          0.6542          0.6684          0.5983
14. Sarawak         0.2766          0.3151          0.3323          0.4213          0.2829
15. Labuan          0.5723          0.6522          0.6878          0.1551          0.000000112
16. Sabah           0.1532          0.1745          0.1841          0.1251          0.0926

Table 2. DIC evaluated for Assumptions 1–5.

                             Assumption 1    Assumption 2    Assumption 3    Assumption 4    Assumption 5
New infective humans, (h)    8993.57         9515.41         10087.4         10137.5         7982.23


Figure 13. Discrete time–space stochastic SIR-SI model disease map for epidemiology week 53.

This conclusion gives some insight into the possibility of collecting other mosquito data in future
and how modelling of the mosquito populations could be improved. It would be helpful to collect
more mosquito data at various stages in the future to generate better estimates of relative risk.
Figure 13 shows a relative risk disease map for infective dengue cases based on the discrete time–space
stochastic SIR-SI model for dengue disease transmission (Assumption 5) for epidemiology
week 53 for the 16 states in Malaysia. Each state is categorized into five different classes of risk,
corresponding to very low, low, medium, high and very high risks. This figure shows that the
risk for infective dengue cases is very high in the state of Putrajaya. This is followed by the state
of Selangor with a high risk. In this map, no states have medium risk, but five have low risk,
including Kedah, Pulau Pinang, Perak, Kuala Lumpur and Negeri Sembilan. The other nine states
have very low risk and include Perlis, Kelantan, Terengganu, Pahang, Melaka, Johor, Sarawak,
Labuan and Sabah. From this map, interested parties can easily see which states have a very high
risk and need closer scrutiny or further attention.
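
The five-class categorization used in the map can be illustrated with a short Python sketch. The class boundaries used by the authors are not stated in this section, so the cut points below (0.75, 2.0, 3.0 and 4.0) are hypothetical values, chosen only because they reproduce the classification described above when applied to the Assumption 5 risks in Table 1. The function and variable names are likewise illustrative.

```python
from bisect import bisect_right
from collections import Counter

RISK_LABELS = ["very low", "low", "medium", "high", "very high"]

# Hypothetical cut points between the five classes (not taken from the paper).
HYPOTHETICAL_CUTS = [0.75, 2.0, 3.0, 4.0]

# Week 53 posterior expected relative risks under Assumption 5 (Table 1).
ASSUMPTION5_RISKS = {
    "Perlis": 0.2960,
    "Kedah": 0.8963,
    "Pulau Pinang": 1.0710,
    "Perak": 0.8168,
    "Kelantan": 0.5351,
    "Terengganu": 0.5518,
    "Pahang": 0.7151,
    "Selangor": 3.1830,
    "Kuala Lumpur": 1.5570,
    "Putrajaya": 5.2370,
    "Negeri Sembilan": 0.8455,
    "Melaka": 0.2817,
    "Johor": 0.5983,
    "Sarawak": 0.2829,
    "Labuan": 0.000000112,
    "Sabah": 0.0926,
}


def classify_risk(relative_risk, cuts=HYPOTHETICAL_CUTS):
    """Map a relative risk to one of the five risk classes."""
    # bisect_right counts how many cut points lie at or below the value,
    # which is the index of the class the value falls into.
    return RISK_LABELS[bisect_right(cuts, relative_risk)]


risk_classes = {state: classify_risk(rr) for state, rr in ASSUMPTION5_RISKS.items()}
class_counts = Counter(risk_classes.values())
for state, label in risk_classes.items():
    print(f"{state}: {label}")
```

With these assumed cut points, the sketch yields one very high risk state, one high risk state, no medium risk states, five low risk states and nine very low risk states, matching the description of the map.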

6. Conclusion

Dengue is not only a persistent public health problem in tropical countries, but also an economic
and social problem that burdens nations globally. The social and economic impact of dengue
disease is discussed by Gubler [11]. Many control strategies have been adopted to eradicate the
disease, but very few have proved effective, and they incur very high costs in money and time.
Disease mapping has been recognized as an important tool in prevention and control strategies
for a disease. An accurate disease map relies on the modelling used to estimate the relative risk
for most of the map.
The statistical modelling introduced in this paper, which estimates relative risk from a dengue disease
transmission model with tract-count data, offers a better alternative to the common models
used in disease mapping, such as the classical model based on the SMR and the earliest example
of Bayesian mapping, which involves a Poisson-gamma model. This is because our model gives a more
detailed description of the biological process, which takes into account the transmission of the disease.
The model also enables covariate adjustments and allows for spatial correlation between risks in nearby
areas. These characteristics can overcome the problems of the SMR, especially when there are no
observed count data in certain regions, and the problems of the Poisson-gamma model, where covariate
adjustments are impossible and spatial correlation between risks in adjacent areas cannot be
accommodated.
Possible extensions to this work include the development of a model for dengue disease mapping
with continuous time and discrete space, in order to improve the accuracy of disease mapping
further and for particular applicability to vector-borne infectious diseases that are rare or in their
early stages. We anticipate that the results of such an analysis will further strengthen the conclusions
drawn above from tract-count data. The techniques presented in this paper offer an
alternative method for estimating the relative risk in the study of disease mapping, particularly for
diseases with indirect transmission.
Acknowledgements
The authors acknowledge Universiti Pendidikan Sultan Idris and the Ministry of Higher Education in Malaysia for their
financial support in respect of this study.

References
[1] C.L. Addy, I.M. Longini, Jr., and M. Haber, A generalized stochastic model for the analysis of infectious disease final size data, Biometrics 47 (1991), pp. 961–974.
[2] L. Bernardinelli, D.G. Clayton, C. Pascutto, C. Montomoli, M. Ghislandi, and M. Songini, Bayesian analysis of space–time variation in disease risk, Stat. Med. 14 (1995), pp. 2433–2443.
[3] J. Besag, J. York, and A. Mollie, Bayesian image restoration with two applications in spatial statistics, Ann. Inst. Stat. Math. 43 (1991), pp. 1–59.
[4] D. Boehning, E. Dietz, and P. Schlattmann, Space–time mixture modelling of public health data, Stat. Med. 19 (2000), pp. 2333–2344.
[5] D. Clancy, A stochastic SIS infection model incorporating indirect transmission, J. Appl. Probab. 42 (2005), pp. 726–737.
[6] D. Clancy and P.D. O'Neill, Bayesian estimation of the basic reproduction number in stochastic epidemic models, Bayesian Anal. 3 (2008), pp. 737–758.
[7] L. Esteva and C. Vargas, Analysis of a dengue disease transmission model, Math. Biosci. 150 (1998), pp. 131–151.
[8] L.C. Foo, T.W. Tim, H.L. Lee, and R. Fang, Rainfall, abundance of Aedes aegypti and dengue infection in Selangor, Malaysia, Southeast Asian J. Trop. Med. Public Health 16 (1985), pp. 560–568.
[9] A. Gemperli, P. Vounatsou, N. Sogoba, and T. Smith, Malaria mapping using transmission models: An application to survey data from Mali, Am. J. Epidemiol. 163 (2006), pp. 289–297.
[10] D.J. Gubler, Dengue and dengue haemorrhagic fever, Clin. Microbiol. Rev. 11 (1998), pp. 480–496.
[11] D.J. Gubler, Epidemic dengue/dengue haemorrhagic fever as a public health, social and economic problem in the 21st century, Trends Microbiol. 10 (2002), pp. 100–103.
[12] V. Isham, Stochastic models for epidemics: Current issues and development, in Celebrating Statistics, A.C. Davison, Y. Dodge and N. Wermuth, eds., Oxford University Press, Oxford, 2005, pp. 27–54.
[13] L. Knorr-Held and J. Besag, Modelling risk from a disease in time and space, Stat. Med. 17 (1998), pp. 2045–2060.
[14] A.B. Lawson, Statistical Methods in Spatial Epidemiology, 2nd ed., John Wiley & Sons, Chichester, UK, 2006.
[15] A.B. Lawson, Bayesian Disease Mapping, CRC Press, Boca Raton, FL, 2009.
[16] A.B. Lawson, W.J. Browne, and C.L. Vidal Rodeiro, Disease Mapping with WinBUGS and MLwiN, John Wiley & Sons, Chichester, UK, 2003.
[17] A.B. Lawson and A. Clark, Spatial mixture relative risk models applied to disease mapping, Stat. Med. 21 (2002), pp. 359–370.
[18] H.L. Lee and K. Inder Singh, Sequential sampling for Aedes aegypti and Aedes albopictus (Skuse) adults: Its use in estimation of vector density threshold in dengue transmission and control, J. Biosci. 2 (1991), pp. 9–14.
[19] Y.C. MacNab and C.B. Dean, Spatio-temporal modelling of rates for the construction of disease maps, Stat. Med. 21 (2002), pp. 347–358.
[20] Malaysian Meteorological Department, Monsoon season in Malaysia. Available at http://www.met.gov.my (2 April 2010).
[21] Maricopa County Environmental Services, Lifecycle and information on Aedes aegypti mosquitoes, Maricopa County. Available at http://www.maricopa.gov/EnvSvc/VectorControl/Mosquitos/MosqInfo.aspx (20 July 2009).
[22] H. Nishiura, Mathematical and statistical analysis of the spread of dengue, Dengue Bull. 30 (2006), pp. 51–67.


[23] R.A.G. Okogun, E.B.N. Bethran, N.O. Anthony, C.A. Jude, and C.E. Anegbe, Epidemiological implication of preferences of breeding sites of mosquito species in Midwestern Nigeria, Ann. Agric. Environ. Med. 10 (2003), pp. 217–222.
[24] P.D. O'Neill, A tutorial introduction to Bayesian inference for stochastic epidemic models using Markov chain Monte Carlo methods, Math. Biosci. 180 (2002), pp. 103–114.
[25] P. Pongsumpun, K. Patanarapelert, M. Sripom, S. Varamit, and I.M. Tang, Infection risk to travellers going to dengue fever endemic regions, Southeast Asian J. Trop. Med. Public Health 35 (2004), pp. 155–159.
[26] A. Rohani, I. Asmaliza, S. Zainah, and H.L. Lee, Detection of dengue from field Aedes aegypti and Aedes albopictus adults and larvae, Southeast Asian J. Trop. Med. Public Health 28 (1997), pp. 138–142.
[27] M.G. Rosa-Freitas, P. Tsouris, A. Sibajev, E.T. Weimann, A.U. Marques, R.L. Ferreire, and F.C.L. Gards-Moura, Exploratory temporal and spatial distribution analysis of dengue notifications in Boa Vista, Roraima, Brazilian Amazon, 1999–2001, Dengue Bull. 27 (2003), pp. 63–80.
[28] H. Rozilawati, J. Zairi, and C.R. Adanan, Seasonal abundance of Aedes albopictus in selected urban and suburban areas in Penang, Malaysia, Trop. Biomed. 24 (2007), pp. 83–94.
[29] D. Spiegelhalter, A. Thomas, N. Best, and D. Lunn, WinBUGS User Manual Version 1.4, MRC Biostatistics Unit, Cambridge, UK, 2003.
[30] S. Sulaiman, Z.A. Pawanchee, J. Jeffery, I. Ghauth, and V. Buspavani, Studies on the distribution and abundance of Aedes aegypti (L.) and Aedes albopictus (Skuse) (Diptera: Culicidae) in an endemic area of dengue/dengue haemorrhagic fever in Kuala Lumpur, Mosq.-Borne Dis. Bull. 8 (1991), pp. 35–39.
[31] A. Tran, X. Deparis, P. Dussart, J. Morran, P. Rabarison, F. Remy, L. Polidori, and J. Gardon, Dengue spatial and temporal patterns, French Guiana, 2001, Emerg. Infect. Dis. 10 (2004), pp. 615–621.
[32] L.A. Waller, B.P. Carlin, H. Xia, and A.E. Gelfand, Hierarchical spatio-temporal mapping of disease rates, J. Am. Stat. Assoc. 92 (1997), pp. 607–617.
