WATER RESOURCES RESEARCH, VOL. 38, NO. 11, 1239, doi:10.1029/2001WR001125, 2002

Regional estimation of rainfall intensity-duration-frequency curves
using generalized least squares regression of partial duration series statistics

Henrik Madsen
DHI Water and Environment, Hørsholm, Denmark

Peter Steen Mikkelsen, Dan Rosbjerg, and Poul Harremoes
Environment and Resources DTU, Technical University of Denmark, Lyngby, Denmark
Received 5 December 2001; revised 3 April 2002; accepted 11 April 2002; published 15 November 2002.

[1] A general framework for regional analysis and modeling of extreme rainfall
characteristics is presented. The model is based on the partial duration series (PDS)
method that includes in the analysis all events above a threshold level. In the PDS model
the average annual number of exceedances, the mean value of the exceedance magnitudes,
and the coefficient of L variation (LCV) are considered as regional variables. A
generalized least squares (GLS) regression model that explicitly accounts for intersite
correlation and sampling uncertainties is applied for evaluating the regional heterogeneity
of the PDS parameters. For the parameters that show a significant regional variability the
GLS model is subsequently adopted for describing the variability from physiographic and
climatic characteristics. For determination of a proper regional parent distribution, L
moment analysis is applied for discriminating between the exponential distribution and
various two-parameter distributions in the PDS model. The resulting model can be used
for estimation of rainfall intensity-duration-frequency curves at an arbitrary location in a
region. For illustration, the regional model is applied to rainfall data from a rain gauge
network in Denmark.

INDEX TERMS: 1854 Hydrology: Precipitation (3354); 1869 Hydrology:
Stochastic processes; 1833 Hydrology: Hydroclimatology; 1821 Hydrology: Floods; KEYWORDS: rainfall
extremes, regional estimation, partial duration series, generalized least squares regression, L moments
Citation: Madsen, H., P. S. Mikkelsen, D. Rosbjerg, and P. Harremoes, Regional estimation of rainfall intensity-duration-frequency
curves using generalized least squares regression of partial duration series statistics, Water Resour. Res., 38(11), 1239,
doi:10.1029/2001WR001125, 2002.

1. Introduction
[2] The rainfall intensity-duration-frequency (IDF) relationship is widely applied in hydraulic and hydrological
engineering for design of structures that control storm
runoff and flooding. For a site where rainfall measurements
are available, frequency analysis can be performed for
development of the IDF relationship. For design purposes,
national maps of IDF relationships have been constructed,
e.g., in the United States [Hershfield, 1961], the UK and
Ireland [Natural Environment Research Council (NERC),
1975], and Denmark [Danish Water Pollution Control
Committee (DWPCC), 1974]. These maps are based on
pooling information from regional stations and simple
interpolation between the sites.
[3] The construction of rainfall IDF maps in the traditional manner has two major deficiencies. First, if the
length of the at-site rainfall record is short compared to
the design return period, the estimated IDF relationship is
very uncertain. Secondly, since the spatial variability of
extreme rainfall characteristics may be large even within
small areas [e.g., Harremoes and Mikkelsen, 1995], simple
pooling may provide unreliable IDF estimates. Presently,
the guidelines for estimation of IDF curves are being
revised in a number of countries, e.g., Canada [Adamowski et al., 1996], United States [Fennessey, 1998; Schaefer,
1998], and Denmark [Mikkelsen et al., 1998]. More and
longer rainfall time series are now available and recent
developments in regional frequency analysis procedures
provide the basis for more comprehensive statistical
analyses of these data.

Copyright 2002 by the American Geophysical Union.
0043-1397/02/2001WR001125

[4] In regional rainfall frequency analysis, rainfall
measurements from several sites in a region are combined
for estimation of regional IDF curves. By utilizing
regional data the estimation uncertainty can be reduced
significantly. Furthermore, by relating the extreme rainfall
characteristics to relevant climatic and physiographic variables, IDF curves can be estimated at an arbitrary site in
the region. The regional modeling includes three basic
elements: (1) delineation of regions, (2) estimation of
regional parameters, and (3) determination of a regional
distribution.
[5] Regions used in precipitation analyses are often
defined as geographically coherent areas with similar climatic and physical characteristics. Such regions, however,
may not be homogenous with respect to the extreme rainfall
statistics (e.g., mean, coefficient of variation and higher
order moments). For instance, considerable geographical
variability was observed in extreme rainfall characteristics
in Denmark even within small areas having a relatively flat
topography [Madsen et al., 1994; Harremoes and Mikkelsen, 1995; Mikkelsen et al., 1996]. Schaefer [1990] found
that the initial division of Washington State into 5 regions
according to geography and topography provided statistically heterogeneous regions. Moreover, he found that the
extreme rainfall characteristics varied systematically with
the mean annual precipitation (MAP). In the analysis of
rainfall extremes in Denmark [Mikkelsen and Harremoes,
1993] and Canada [Adamowski et al., 1996] it was also
found that certain statistical characteristics varied systematically with MAP.
[6] For estimation of regional parameters and determination of a regional parent distribution, procedures based on L
moments [Hosking, 1990] have been shown to be efficient.
Compared to product moment estimates, L moment estimates
are unbiased and relatively insensitive to outliers, properties
that are extremely important in regional studies. Regional
parameter estimation procedures based on L moments, or
equivalently probability weighted moments [Greenwood et
al., 1979], were introduced by Wallis [1980] and Greis and
Wood [1981]. Hosking and Wallis [1993] developed a
heterogeneity measure for evaluating the significance of
the regional variability of second and higher order L
moments. Hosking and Wallis [1993] also developed a
goodness of fit measure based on L moments for determining the regional distribution. The L moment based procedures have been widely applied in regional precipitation
analyses [e.g., Schaefer, 1990; Cong et al., 1993; Naghavi
and Yu, 1995; Adamowski et al., 1996; Fennessey, 1998;
Schaefer, 1998].
[7] Most rainfall frequency studies have been based on
analyzing records of annual maxima. As an alternative to
the annual maximum series (AMS) approach, the partial
duration series (PDS) method that includes all events above
a threshold level has been advocated. Recent studies by
Madsen et al. [1997a, 1997b] revealed that the widely
applied PDS model with generalized Pareto distributed
exceedances, in general, is more efficient than the corresponding AMS model based on the generalized extreme
value distribution for both at-site and regional quantile
estimation. The PDS method has been applied in regional
rainfall frequency analyses by Van Montfort and Witter
[1986], Fitzgerald [1989], Madsen et al. [1994], and Mikkelsen et al. [1995, 1996].
[8] In this paper a general framework for regional
analysis and modeling of extreme rainfall characteristics
is presented. The model is based on an extension of the
regional PDS index-flood method developed by Madsen
and Rosbjerg [1997a, 1997b]. The regional rainfall frequency model includes a generalized least squares regression methodology for analyzing regional homogeneity and
describing any regional variability of the PDS parameters
from climatic and physiographic characteristics. For determination of a regional parent distribution, L moment
analysis is applied. The resulting model provides an estimate of the IDF curve and the associated uncertainty at an
arbitrary site in the region. For illustration, the developed
model is applied to a regional data set of Danish rainfall
extremes.

2. Modeling Framework
[9] The rainfall intensity for a given duration is described
by a stochastic variable $Z$ with observations $\{z_i, i = 1, 2, \ldots, m\}$
where $m$ is the total number of rain events in the historical
time series. The extreme value model is based on the PDS
method where the population includes all events above a
threshold level. Basically, two different methods are available for choosing the threshold level. In type I sampling, a
threshold level $z_0$ is explicitly defined and the events
$\{z_i > z_0, i = 1, 2, \ldots, n\}$ are included in the PDS. In type II
sampling, the $n$ largest events are included in the PDS
$\{z_{(1)} \ge z_{(2)} \ge \ldots \ge z_{(n)}\}$.
[10] In the present study type I sampling is applied,
implying that the average annual number of threshold
exceedances $\lambda$ becomes a regional variable. In the case of
type II sampling using a constant $\lambda$ in the region to define
the PDS, the threshold level becomes a regional variable.
An example of this approach is given by Mikkelsen et al.
[1995, 1996]. The regional modeling procedure described in
the following can be applied also in the case of type II
sampling, simply by considering $z_0$ as a regional variable
instead of $\lambda$. Furthermore, if neither $z_0$ nor $\lambda$ is considered
constant in the region, the regional modeling procedure is
easily extended by including an additional parameter to be
regionalized.
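The two sampling schemes can be illustrated with a minimal sketch. The function names and the event values below are hypothetical and only serve to show that type I sampling fixes the threshold and lets the number of exceedances vary, whereas type II sampling fixes the number of events and lets the implied threshold vary.

```python
def pds_type1(events, z0):
    """Type I sampling: keep every event strictly above the threshold z0."""
    return sorted((z for z in events if z > z0), reverse=True)


def pds_type2(events, n):
    """Type II sampling: keep the n largest events, whatever their level."""
    return sorted(events, reverse=True)[:n]


def poisson_rate(pds, record_years):
    """Average annual number of exceedances (estimate of the Poisson parameter)."""
    return len(pds) / record_years


# Made-up event intensities from a hypothetical 2-year record.
events = [4.2, 11.0, 7.5, 15.3, 9.1, 6.8, 12.4, 5.0]
years = 2.0

type1 = pds_type1(events, z0=9.0)   # threshold fixed, rate is an outcome
type2 = pds_type2(events, n=4)      # rate fixed (n / years), threshold is an outcome

print("type I :", type1, "rate =", poisson_rate(type1, years))
print("type II:", type2, "implied threshold =", min(type2))
```

With these particular numbers both schemes select the same four events, which shows how a fixed threshold and a fixed rate are two views of the same series at a single site; across a region only one of the two can be held constant.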
[11] The exceedances in the PDS are described by a
stochastic variable $X = Z - z_0$, $Z > z_0$ with observations
$\{x_i, i = 1, 2, \ldots, n\}$ where $n$ is the PDS sample size. It is generally
assumed that the occurrence of exceedances can be described
by a Poisson process with constant or one-year periodic
intensity, implying that the number of exceedances is Poisson-distributed with intensity $\lambda$. In the basic PDS model
the one-parameter exponential distribution was applied for
modeling the exceedance magnitudes [Shane and Lynn,
1964; Todorovic and Zelenhasic, 1970]. Several alternative
two-parameter distributions have been proposed, including
the generalized Pareto [e.g., Hosking and Wallis, 1987;
Rosbjerg et al., 1992a; Madsen and Rosbjerg, 1997a,
1997b; Madsen et al., 1997a, 1997b], gamma [Zelenhasic,
1970], Weibull [Ekanayake and Cruise, 1993], and lognormal [Rosbjerg et al., 1991] distributions.
[12] In a PDS context the T-year event is usually defined
as the $(1 - 1/(\lambda T))$-quantile in the distribution of the exceedances [e.g., Rosbjerg, 1985]. Denoting by $F(x; A)$ the cumulative distribution for the exceedance magnitudes, the T-year
event can be written

$$z_T = z_0 + F^{-1}\left(1 - \frac{1}{\lambda T}; A\right) \tag{1}$$

where $A$ are the parameters of the exceedance distribution.
An estimate of the T-year event is obtained from (1) by
inserting estimates of the PDS parameters. An estimate of
the Poisson parameter $\lambda$ is given as the average number of
observed exceedances per year, i.e., $\hat{\lambda} = n/t$ where $t$ is the
record length. For estimation of the parameters of $F(x)$ the
method of L moments [Hosking, 1990] is adopted. For a
two-parameter exceedance distribution the T-year event
estimate is then given by

$$\hat{z}_T = z_0 + F^{-1}\left(1 - \frac{1}{\hat{\lambda} T}; \hat{\mu}, \hat{\tau}_2\right) = z_0 + g\left(\hat{\lambda}, \hat{\mu}, \hat{\tau}_2\right) \tag{2}$$

where $\hat{\mu}$ is the sample mean (equal to the first L moment
estimate), and $\hat{\tau}_2$ is an estimate of the coefficient of L
variation (LCV). For estimation of L moments unbiased
probability weighted moment estimators are used [Landwehr et al., 1979; Hosking and Wallis, 1995].
[13] The regional analysis of extreme rainfalls is based on
the parameterization defined above. That is, the Poisson
parameter, the mean exceedance, and the LCV are considered as regional variables. The aim of the regional modeling
is to provide estimates of the three parameters and their
associated uncertainties at an arbitrary site in the region.
The regional T-year event estimate can be written

$$\hat{z}_T^R(s) = z_0 + g\left(\hat{\lambda}^R(s), \hat{\mu}^R(s), \hat{\tau}_2^R(s)\right) \tag{3}$$

where $s$ refers to the location of the site, and $R$ denotes a
regional estimate. The uncertainty of the regional T-year
event estimate is approximately

$$\mathrm{Var}\{\hat{z}_T^R(s)\} \approx \left(\frac{\partial g}{\partial \lambda}\right)^2 \mathrm{Var}\{\hat{\lambda}^R(s)\} + \left(\frac{\partial g}{\partial \mu}\right)^2 \mathrm{Var}\{\hat{\mu}^R(s)\} + \left(\frac{\partial g}{\partial \tau_2}\right)^2 \mathrm{Var}\{\hat{\tau}_2^R(s)\} \tag{4}$$

where the partial derivatives are evaluated at
$(\lambda, \mu, \tau_2) = \left(\hat{\lambda}^R(s), \hat{\mu}^R(s), \hat{\tau}_2^R(s)\right)$. In (4) it has been utilized that the
mutual dependence between the regional parameter estimates can be neglected [Madsen and Rosbjerg, 1997a].
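As a minimal sketch of the T-year event estimate and its first-order variance, the following assumes that the exceedances follow a generalized Pareto distribution, which is only one of the candidate distributions listed above. Under that assumption the shape parameter follows from the LCV as kappa = 1/tau2 - 2 and the scale from the mean as alpha = mu(1 + kappa), with tau2 = 1/2 giving the exponential special case. The function names, the numerical differentiation, and all input values are illustrative assumptions.

```python
import math


def sample_mean_lcv(x):
    """Sample mean and LCV from the unbiased PWM estimators b0 and b1."""
    xs = sorted(x)
    n = len(xs)
    b0 = sum(xs) / n
    b1 = sum(j * v for j, v in enumerate(xs)) / (n * (n - 1))
    return b0, (2.0 * b1 - b0) / b0


def g(lam, mu, tau2, T):
    """Exceedance quantile g(lam, mu, tau2), assuming generalized Pareto exceedances."""
    kappa = 1.0 / tau2 - 2.0
    if abs(kappa) < 1e-8:                      # exponential special case
        return mu * math.log(lam * T)
    alpha = mu * (1.0 + kappa)
    # (alpha / kappa) * (1 - (lam*T)**(-kappa)), written with expm1 for accuracy
    return -alpha * math.expm1(-kappa * math.log(lam * T)) / kappa


def t_year_event(z0, lam, mu, tau2, T):
    """T-year event: threshold plus exceedance quantile."""
    return z0 + g(lam, mu, tau2, T)


def t_year_variance(lam, mu, tau2, T, var_lam, var_mu, var_tau2, h=1e-5):
    """First-order variance with central-difference partial derivatives."""
    d_lam = (g(lam + h, mu, tau2, T) - g(lam - h, mu, tau2, T)) / (2 * h)
    d_mu = (g(lam, mu + h, tau2, T) - g(lam, mu - h, tau2, T)) / (2 * h)
    d_tau = (g(lam, mu, tau2 + h, T) - g(lam, mu, tau2 - h, T)) / (2 * h)
    return d_lam ** 2 * var_lam + d_mu ** 2 * var_mu + d_tau ** 2 * var_tau2


# Made-up parameter values for a hypothetical site.
z_10 = t_year_event(z0=5.0, lam=3.0, mu=2.0, tau2=0.45, T=10.0)
v_10 = t_year_variance(3.0, 2.0, 0.45, 10.0, var_lam=0.09, var_mu=0.04, var_tau2=0.001)
print("10-year event:", round(z_10, 3), "std. dev.:", round(math.sqrt(v_10), 3))
```

Numerical derivatives are used here only to keep the sketch independent of the chosen distribution; analytical derivatives of `g` could be substituted where they are available.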
[14] For estimation of the regional IDF relationships using
(3) and (4), the modeling framework presented herein includes
the following elements: (1) evaluation of regional homogeneity for each of the three PDS parameters, (2) for the
parameters that show a significant regional variability, evaluation of the potential of describing the variability from
relevant physiographic and climatic characteristics, and (3)
determination of a regional distribution for the exceedances.
In the following sections the different modeling elements are
described.

3. Regional Modeling of PDS Parameters


[15] To investigate the regional variability of the PDS
parameters, a regression analysis is carried out. This analysis has a twofold purpose. First, the regression model is
applied for assessment of the regional heterogeneity of the
PDS parameters. Secondly, for parameters that show a
significant regional variability, the regression model is
subsequently adopted to evaluate the potential of describing
the variability from physiographic and climatic characteristics. The regression model is based on a generalized least
squares (GLS) estimation method that explicitly accounts
for sampling uncertainty and intersite dependence. The GLS
model is briefly described below. A more detailed description of the model is given by Stedinger and Tasker [1985,
1986] and Madsen and Rosbjerg [1997b].
3.1. GLS Regression Model
[16] Denote by $\hat{\theta}_i$ an estimate of a PDS parameter at
station no. $i$, $i = 1, 2, \ldots, M$, where $M$ is the number of sites
in the region. The following model is considered

$$\hat{\theta}_i = \beta_0 + \sum_{k=1}^{p} \beta_k A_{ik} + \varepsilon_i + \delta_i \tag{5}$$

where $A_{ik}$ are the considered physiographic and climatic
characteristics, $\beta_k$ are the regression parameters, $\varepsilon_i$ is a
random sampling error, and $\delta_i$ is the residual model error.
The sampling error and the residual model error are
assumed to have zero mean and covariance structures

$$\mathrm{Cov}\{\varepsilon_i, \varepsilon_j\} = \begin{cases} \sigma_{\varepsilon i}^2, & i = j \\ \sigma_{\varepsilon i}\,\sigma_{\varepsilon j}\,\rho_{\varepsilon ij}, & i \ne j \end{cases} \qquad \mathrm{Cov}\{\delta_i, \delta_j\} = \begin{cases} \sigma_\delta^2, & i = j \\ 0, & i \ne j \end{cases} \tag{6}$$

where $\sigma_{\varepsilon i}^2$ is the sampling error variance, $\rho_{\varepsilon ij}$ is the
correlation coefficient due to concurrent observations at
stations $i$ and $j$ (intersite correlation coefficient), and $\sigma_\delta^2$ is
the residual model error variance. It should be noted that the
intersite correlation defined above is due to the sampling of
concurrent extreme rainfall events at the different locations
in the region. Thus it does not consider any physical or
causal dependence in extreme rainfall patterns in the region.
For instance, if two records do not overlap in time, the
intersite correlation due to sampling extreme events from
the two records is equal to zero.
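[17] If the sampling error covariance matrix is known, the
GLS estimates of the regression parameters and the residual
model error variance are obtained from the following set of
equations

$$\left(A^T \Lambda^{-1} A\right) B = A^T \Lambda^{-1} \hat{\Theta}, \qquad \left(\hat{\Theta} - AB\right)^T \Lambda^{-1} \left(\hat{\Theta} - AB\right) = M - p - 1 \tag{7}$$

where

$$\hat{\Theta} = \left(\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_M\right)^T, \qquad B = \left(\beta_0, \beta_1, \ldots, \beta_p\right)^T, \qquad A = \begin{pmatrix} 1 & A_{11} & \cdots & A_{1p} \\ \vdots & \vdots & & \vdots \\ 1 & A_{M1} & \cdots & A_{Mp} \end{pmatrix} \tag{8}$$

and $\Lambda$ is the error covariance matrix of the total errors

$$\Lambda_{ij} = \mathrm{Cov}\{\varepsilon_i + \delta_i, \varepsilon_j + \delta_j\} = \begin{cases} \sigma_{\varepsilon i}^2 + \sigma_\delta^2, & i = j \\ \sigma_{\varepsilon i}\,\sigma_{\varepsilon j}\,\rho_{\varepsilon ij}, & i \ne j \end{cases} \tag{9}$$

The solution of (7) requires an iterative scheme. In some
cases, no positive value of $\sigma_\delta^2$ can satisfy (7). In these
instances, the sampling errors more than account for the
difference between $\hat{\Theta}$ and $AB$, and $\sigma_\delta^2$ is set equal to zero
[Madsen and Rosbjerg, 1997b]. The GLS (regional)
estimate of the PDS parameter and the associated variance
at an arbitrary site are given by

$$\hat{\theta}^R(s) = A(s)^T B, \qquad \mathrm{Var}\{\hat{\theta}^R(s)\} = A(s)^T \Sigma_B A(s) + \hat{\sigma}_\delta^2 \tag{10}$$

where $A(s)^T = (1, A_1(s), A_2(s), \ldots, A_p(s))$, and
$\Sigma_B = (A^T \Lambda^{-1} A)^{-1}$ is the covariance matrix of the estimated
regression parameters. The variance of the regional estimate
accounts for sampling errors, corrected for intersite dependence, and residual model errors.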
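The iterative solution of the GLS equations can be sketched as follows. The text only states that an iterative scheme is required; the bisection on the model error variance used here is an assumption, as are the function names and the toy data. The sketch relies on the quadratic form in (7) decreasing as the model error variance increases, and it sets the variance to zero when no positive value satisfies (7).

```python
def solve(mat, vec):
    """Solve mat x = vec by Gauss-Jordan elimination with partial pivoting."""
    n = len(vec)
    aug = [list(row) + [v] for row, v in zip(mat, vec)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(n):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def gls_step(theta, A, S, var_delta):
    """Regression parameters and quadratic form of (7) at a fixed model error variance."""
    M, q = len(theta), len(A[0])
    # Total error covariance: sampling covariance S plus var_delta on the diagonal.
    lam = [[S[i][j] + (var_delta if i == j else 0.0) for j in range(M)] for i in range(M)]
    cols = [[A[i][k] for i in range(M)] for k in range(q)]
    w_cols = [solve(lam, c) for c in cols]        # columns of Lambda^-1 A
    w_theta = solve(lam, theta)                   # Lambda^-1 Theta
    ata = [[sum(x * y for x, y in zip(cols[a], w_cols[b])) for b in range(q)] for a in range(q)]
    atb = [sum(x * y for x, y in zip(cols[a], w_theta)) for a in range(q)]
    B = solve(ata, atb)
    res = [theta[i] - sum(A[i][k] * B[k] for k in range(q)) for i in range(M)]
    quad = sum(r * w for r, w in zip(res, solve(lam, res)))
    return B, ata, quad


def gls(theta, A, S, iterations=100):
    """Return (B, A^T Lambda^-1 A, model error variance) satisfying (7)."""
    target = len(theta) - len(A[0])               # M - p - 1
    B, ata, quad = gls_step(theta, A, S, 0.0)
    if quad <= target:                            # sampling errors explain the scatter
        return B, ata, 0.0
    lo, hi = 0.0, 1.0
    while gls_step(theta, A, S, hi)[2] > target:  # bracket the root
        lo, hi = hi, 2.0 * hi
    for _ in range(iterations):                   # bisection
        mid = 0.5 * (lo + hi)
        if gls_step(theta, A, S, mid)[2] > target:
            lo = mid
        else:
            hi = mid
    B, ata, _ = gls_step(theta, A, S, hi)
    return B, ata, hi


def regional_estimate(a_s, B, ata, var_delta):
    """Regional estimate and its variance at a site with characteristics a_s."""
    est = sum(a * b for a, b in zip(a_s, B))
    var = sum(a * w for a, w in zip(a_s, solve(ata, a_s))) + var_delta
    return est, var


# Toy regional mean model: three sites, intercept only, unit sampling variances.
theta = [0.0, 2.0, 4.0]
A = [[1.0], [1.0], [1.0]]
S = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
B, ata, var_delta = gls(theta, A, S)
print(B, var_delta, regional_estimate([1.0], B, ata, var_delta))
```

In this toy case the scatter of the three estimates is larger than the sampling variances can explain, so a positive model error variance is returned; replacing `theta` by nearly equal values returns zero.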
[18] When only the intercept $\beta_0$ is included in the
regression equation (referred to as the regional mean
model), the GLS regression model provides an estimate of
the regional average PDS parameter and the associated
uncertainty (note that, in general, the regional estimate is
different from the arithmetic regional average because the
GLS algorithm weights the PDS parameter estimates
according to the covariance matrix of the errors). In this
case, the GLS estimate of the residual model error variance
can be interpreted as a measure of regional heterogeneity;
that is, if $\hat{\sigma}_\delta^2 = 0$, the region can be considered homogeneous. Madsen and Rosbjerg [1997b] showed that the
regional average GLS estimator is a general extension of
the record-length-weighted average commonly applied in
the index-flood procedure [e.g., Stedinger et al., 1993;
Stedinger and Lu, 1995; Madsen and Rosbjerg, 1997a].
The record-length-weighted average estimator considers
neither intersite correlation nor regional heterogeneity.
[19] If $\hat{\sigma}_\delta^2 > 0$, the region is heterogeneous, and the GLS
regression model in (5) can subsequently be applied to
evaluate the potential of describing this variability. Alternatively, the region can be divided into distinct subregions
according to similarities in station characteristics. To evaluate the homogeneity of the defined subregions and for
estimation of the subregional average parameters the
regional mean model is adopted. It should be noted that
the subregional approach provides discontinuities in
extreme value characteristics across subregional boundaries.
3.2. Estimation of the Error Covariance Matrix
[20] The solution of the GLS regression equations, cf. (7),
requires an estimate of the sampling error covariance
matrix. In general, approximate expressions of the sampling
error variances of the PDS parameters can be formulated in
terms of the population parameters (see, e.g., Madsen and
Rosbjerg [1997a] for expressions for the PDS model with
generalized Pareto distributed exceedances). Estimates of
the sampling error variances can then be obtained by
substituting the population parameters by the sample estimates. However, for solving the GLS regression equations,
the error covariance estimator should be independent, or
nearly so, of the PDS parameter estimator $\hat{\theta}_i$ [Stedinger and
Tasker, 1985]. Following the approach by Tasker [1980],
estimates of the sampling error variances for the three PDS
parameters that fulfill this independence criterion are presented below.
[21] For the Poisson parameter, the sampling error variance is given by $\sigma_{\varepsilon i}^2 = \lambda_i/t_i$. An estimate of the sampling
error variance that is virtually independent of the parameter
estimate $\hat{\lambda}_i$ can be obtained by substituting the population
parameter $\lambda_i$ by the regional average of the parameter
estimates, i.e.

$$\hat{\sigma}_{\varepsilon i}^2 = \frac{c}{t_i}, \qquad c = \frac{1}{M}\sum_{i=1}^{M} \hat{\lambda}_i \tag{11}$$

For the mean exceedance, the sampling error variance is
$\sigma_{\varepsilon i}^2 = \sigma_i^2/n_i$ where $\sigma_i^2$ is the population variance. A reasonable
estimate of $\sigma_{\varepsilon i}^2$ is then given by

$$\hat{\sigma}_{\varepsilon i}^2 = \frac{c}{n_i}, \qquad c = \frac{1}{M}\sum_{i=1}^{M} \hat{\sigma}_i^2 \tag{12}$$

For estimation of the sampling error variance of the LCV
estimator a Monte Carlo simulation procedure is applied,
following the lines of calculation of different regional
statistics by Hosking and Wallis [1993]. In this case, a large
number of regional samples are generated from a kappa
distribution based on the regional average L moment
statistics (regional average LCV, L skewness and L kurtosis)
and regional average number of threshold exceedances $\bar{n}$.
From these simulations the variance of the LCV estimates
$V^2$ is calculated. The estimate of $\sigma_{\varepsilon i}^2$ for the LCV estimate
can then be determined by

$$\hat{\sigma}_{\varepsilon i}^2 = \frac{\bar{n} V^2}{n_i} \tag{13}$$

The flexible four-parameter kappa distribution is used in the
simulations to avoid choosing a particular distribution (such
as the exponential distribution and various two-parameter
distributions) as a regional parent at this stage of the
analysis.
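The three sampling error variance estimators (11) to (13) reduce to a few lines once the at-site statistics are available. In this sketch the function names and the input values are hypothetical, and the simulated variance `V2` of the LCV estimates is taken as a given input because the kappa-distribution Monte Carlo step is not reproduced here.

```python
def poisson_sampling_var(lam_hat, record_years):
    """Eq. (11): regional average rate divided by the at-site record length."""
    c = sum(lam_hat) / len(lam_hat)
    return [c / t for t in record_years]


def mean_sampling_var(sample_vars, n_exc):
    """Eq. (12): regional average exceedance variance divided by at-site sample size."""
    c = sum(sample_vars) / len(sample_vars)
    return [c / n for n in n_exc]


def lcv_sampling_var(V2, n_exc):
    """Eq. (13): simulated LCV variance rescaled from the average to the at-site sample size."""
    n_bar = sum(n_exc) / len(n_exc)
    return [n_bar * V2 / n for n in n_exc]


# Made-up statistics for three hypothetical stations.
print(poisson_sampling_var([2.0, 3.0, 4.0], [10.0, 15.0, 30.0]))
print(mean_sampling_var([4.0, 5.0, 6.0], [20, 45, 120]))
print(lcv_sampling_var(0.001, [20, 40]))
```

Because each estimator uses a regional average in the numerator, the resulting variance at a station depends on that station only through its record length or sample size, which is what makes it nearly independent of the at-site parameter estimate.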
[22] For estimation of the intersite correlation between
parameter estimates two types of correlation are considered:
(1) correlation between the annual number of exceedances,
and (2) correlation between concurrent exceedance magnitudes. The correlation coefficient between the estimated
Poisson parameters, $\rho_{\lambda ij}$, is equal to the correlation coefficient between the annual number of threshold exceedances
[Mikkelsen et al., 1996]. The correlation coefficient between
the sample mean values, $\rho_{\mu ij}$, is equal to the correlation
coefficient between concurrent exceedances $\rho_{ij}$. The correlation between higher order sample moments depends on
the order of the moment [Stedinger, 1983]. For the LCV
estimates the intersite correlation coefficient is given by
$\rho_{\tau ij} = \rho_{ij}^2$ [Madsen and Rosbjerg, 1997a]. Thus the effect of
intersite dependence is less severe when estimating higher
order moments.
[23] Calculation of the correlation between the annual
numbers of exceedances is based on the concurrent observation years at the two stations. For estimation of the
correlation between the exceedances a series of concurrent
exceedances at the two stations is defined based on the time
of onset and termination of the extreme events. From this
series a conditional correlation coefficient can be calculated
(conditional upon extreme events occurring at the two
stations at the same time). To account for extreme events
that do not overlap temporally, an unconditional correlation
coefficient is calculated (see Mikkelsen et al. [1996] for
details). It should be noted that due to moving rain cells and
frontal systems two extreme events may be concurrent in
meteorological terms without overlapping in time. However, to take this kind of correlation into account detailed
meteorological information of each extreme rain event in
the region is needed, and this information was not available
in the present project. In the work of Mikkelsen et al. [1996]
the movement of rain cells was conceptually addressed and
it was shown that this did not have a significant impact on
the estimation of the intersite correlation.
Figure 1. Empirical probability distributions of the 10-min rain intensity for the 41 stations.

[24] In general, the estimated intersite correlation coefficients have relatively large sampling uncertainties. A
direct use of the sample estimates may result in an error
covariance matrix $\Lambda$ that cannot be inverted, and hence
provides an ill-posed solution of (7) [Tasker and Stedinger,
1989]. To overcome this problem the correlation coefficients are smoothed by relating the sample estimates to the
distance between stations. In this case an exponential
correlation function is used

$$\rho_{ij} = \varphi^{\,d_{ij}/(\omega d_{ij} + 1)} = \exp\left[\frac{d_{ij} \ln \varphi}{\omega d_{ij} + 1}\right] \tag{14}$$

where $d_{ij}$ is the distance between stations.
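A minimal sketch of the smoothed correlation function and of the sampling error covariance matrix assembled from it is given below. The function names, the `power` argument (1 for the mean exceedance, 2 for the LCV, following the relation between the two correlations stated earlier), and the parameter values are illustrative assumptions rather than fitted values.

```python
import math


def rho(d, phi, omega):
    """Exponential correlation function of the station distance d."""
    return math.exp(d * math.log(phi) / (omega * d + 1.0))


def sampling_cov(sigma, dist, phi, omega, power=1):
    """Sampling error covariance matrix from at-site standard errors and distances."""
    M = len(sigma)
    return [
        [
            sigma[i] ** 2 if i == j
            else sigma[i] * sigma[j] * rho(dist[i][j], phi, omega) ** power
            for j in range(M)
        ]
        for i in range(M)
    ]


# Made-up parameters: correlation decays from 1 at zero distance
# towards phi ** (1 / omega) at very large distances.
for d in (0.0, 10.0, 50.0, 300.0):
    print(d, round(rho(d, phi=0.9, omega=0.05), 3))

cov = sampling_cov([1.0, 2.0], [[0.0, 10.0], [10.0, 0.0]], phi=0.9, omega=0.05)
print(cov)
```

Using a smooth function of distance instead of the raw pairwise estimates keeps the off-diagonal terms mutually consistent, which is the property needed for the covariance matrix to remain invertible.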

4. L Moment Analysis
[25] For determination of a regional parent distribution L
moment analysis is applied. The goodness of fit measure
proposed by Hosking and Wallis [1993] was formulated for
discriminating between various three-parameter distributions. For specific use in PDS modeling where the candidate
distributions are the one-parameter exponential distribution
and different two-parameter distributions, the test statistic
has been reformulated. First, consider a two-parameter
distribution. Since the L skewness of a two-parameter
distribution is determined by the LCV estimate, the distance
between the regional estimate of the L skewness and the
theoretical L skewness for the considered distribution can
be used as a measure of the goodness of fit. To test the
significance of this distance, it is related to the sampling
uncertainty of the regional L skewness estimate. Thus the
goodness of fit measure can be formulated

$$G_3 = \frac{\tau_3 - \tau_3^{DIST}}{\sigma_3} \tag{15}$$

where $\tau_3$ is the regional (record-length-weighted) average L
skewness, $\tau_3^{DIST}$ is the theoretical L skewness for the
considered distribution, and $\sigma_3$ is the standard deviation of
the regional L skewness estimate. The goodness of fit
measure for the one-parameter exponential distribution can
be formulated in a similar way. In this case the goodness of
fit is based on the distance between the regional LCV
estimate and the theoretical LCV, i.e.

$$G_2 = \frac{\tau_2 - \tau_2^{EXP}}{\sigma_2} \tag{16}$$

where $\tau_2$ is the regional (record-length-weighted) average
LCV, $\tau_2^{EXP}$ is the theoretical LCV for the exponential
distribution (equal to 1/2), and $\sigma_2$ is the standard deviation
of the regional LCV estimate.
[26] The standard deviations of the regional LCV and L
skewness estimates are determined by simulating a large
number of homogeneous regions from a kappa distribution by
using the regional average L moment statistics and number of
observations corresponding to the observed series. A significance test for the goodness of fit measure can be formulated,
assuming that the L moment estimates are independent,
homogeneous and normally distributed. In this case, the G
statistic in (15) and (16) is approximately normally distributed.
The goodness of fit measure of Hosking and Wallis [1993]
based on the L kurtosis includes a bias correction. However,
since LCV and L skewness have negligible biases, bias
corrections have not been included in (15) and (16).
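The goodness of fit statistics and the associated normal significance test can be sketched as follows. The generalized Pareto distribution is used as an example of a two-parameter candidate; for it the theoretical L skewness follows from the LCV as (3 tau2 - 1)/(tau2 + 1), which equals 1/3 in the exponential case tau2 = 1/2. The function names, the two-sided form of the test, and the regional statistics in the example are assumptions made for illustration.

```python
import math


def gpd_l_skewness(tau2):
    """Theoretical L skewness of a generalized Pareto distribution with LCV tau2."""
    return (3.0 * tau2 - 1.0) / (tau2 + 1.0)


def g_statistic(regional, theoretical, sd):
    """Distance between regional and theoretical value in units of the standard deviation."""
    return (regional - theoretical) / sd


def two_sided_p(g):
    """Two-sided p-value for an approximately standard normal statistic."""
    return math.erfc(abs(g) / math.sqrt(2.0))


# Made-up regional averages and simulated standard deviations.
tau2_reg, tau3_reg = 0.47, 0.31
sd2, sd3 = 0.012, 0.020

g2 = g_statistic(tau2_reg, 0.5, sd2)                       # exponential candidate
g3 = g_statistic(tau3_reg, gpd_l_skewness(tau2_reg), sd3)  # generalized Pareto candidate
print("exponential       : G2 =", round(g2, 2), "p =", round(two_sided_p(g2), 3))
print("generalized Pareto: G3 =", round(g3, 2), "p =", round(two_sided_p(g3), 3))
```

A candidate distribution would be rejected at a chosen significance level when its p-value falls below that level; in practice the standard deviations would come from the kappa-distribution simulations described above.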
[27] Hosking and Wallis [1993] also proposed a test
statistic based on L moment estimates for assessing regional
homogeneity. It should be noted, however, that the L
moment statistic does not account for intersite dependence.
The lack of power of the test in the case of significant
intersite dependence may lead to erroneous conclusions
with respect to regional homogeneity [Madsen and Rosbjerg, 1997b]. The GLS model used here explicitly accounts
for intersite correlation, and, in addition, it provides an
estimate of the uncertainty of the regional mean.

5. Application Example
5.1. Rainfall Data
[28] In 1979 a new system of high-resolution automatic
rain gauges was introduced in Denmark. The measuring
network covers a total area of 43,000 km², and the distances
between gauging stations range between 1 and 300 km. All
stations are equipped with tipping bucket gauges with a
bucket size of 0.2 mm. The raw data that consist of the
number of tips per min are transformed into one-min
intensities for individual rain events. The preliminary separation of rain events is defined as periods exceeding one
hour without precipitation. At present, 90 stations have been
connected to the measurement system, and the longest
records consist of more than 18 years of data. In the regional
study, 41 stations with more than 10 years of data have been
included, corresponding to a total of about 650 station
years. Details of the measurement system and the quality
control of the data are given by Jørgensen et al. [1998].

Figure 2. Spatial structure of the intersite correlation due to concurrent exceedances: (top) 10-min
intensity and (bottom) 24-hour intensity.

[29] From the original data of one-min intensities, maximum intensities averaged over different durations were
extracted. Denote by $i(t)$ the one-min rain intensity at time
$t$. The mean intensity at time $t$, $y(t)$, with duration $\tau$ is
defined as

$$y(t) = \frac{1}{\tau}\int_{t-\tau/2}^{t+\tau/2} i(x)\,dx \tag{17}$$

A rain event corresponding to the considered duration $\tau$ is
defined as an uninterrupted sequence of positive $y(t)$.
Denoting by $t_{0i}$ and $t_{1i}$ the onset and the termination of the
defined rain event, the maximum mean intensity of the
event is then given by

$$z_i = \max\{y(t),\ t_{0i} \le t \le t_{1i}\} \tag{18}$$

With the above definition two rain events are independent if
the time between two consecutive tips of the rain gauge is
larger than $\tau$. However, due to the preliminary separation of
rain events, for $\tau < 60$ min independent rain events are
separated by at least one hour without precipitation.
Maximum mean intensities (for simplicity denoted intensities in the following) were extracted for 8 different
durations ranging between 10 min and 48 hours.
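The extraction of maximum mean intensities can be sketched for a discrete one-min series as follows. The sketch assumes a sliding window of `tau` consecutive minutes with zero padding at both ends, which shifts the time axis relative to the centered window of the definition but yields the same event maxima. The preliminary one-hour separation of rain events is not reproduced, and the function name and the data are hypothetical.

```python
def max_mean_intensities(i, tau):
    """Maximum mean intensity of each event for a duration of tau minutes."""
    padded = [0.0] * (tau - 1) + list(i) + [0.0] * (tau - 1)
    # Mean intensity over every window of tau consecutive minutes.
    y = [sum(padded[k:k + tau]) / tau for k in range(len(padded) - tau + 1)]
    events, current = [], 0.0
    for v in y:
        if v > 0:                    # inside an uninterrupted run of positive y
            current = max(current, v)
        elif current > 0:            # run ended: store the event maximum
            events.append(current)
            current = 0.0
    if current > 0:
        events.append(current)
    return events


# Made-up one-min intensities: two bursts separated by four dry minutes.
series = [0.0, 0.0, 2.0, 4.0, 0.0, 0.0, 0.0, 0.0, 6.0, 0.0]
print(max_mean_intensities(series, tau=2))   # dry gap longer than tau: two events
print(max_mean_intensities(series, tau=5))   # dry gap shorter than tau: one event
```

The two calls illustrate the independence criterion stated above: bursts separated by a dry period shorter than the duration are merged into a single event, so the number of events depends on the duration considered.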
[30] For a preliminary visual assessment of the extreme
rainfall characteristics in the region, the empirical probability distributions from the 41 stations have been plotted.
As an example, the 10-min intensities are shown in Figure 1
where the exceedance probabilities, or equivalently the
return periods, of the observations are calculated using the
median plotting position formula [e.g., Rosbjerg et al.,
1992b]. At first glance, the extreme intensities show a
pronounced regional variability. The first stage in the
regional modeling is to analyze whether this variability is
significant (i.e. if it reflects true regional differences) or can
be explained by sampling uncertainties. If the variability is
significant, in the second stage the potential for describing
the variability from physiographic and climatic characteristics is analyzed.

Table 1. Explanatory Variables Used in the Regression Analysis

Characteristic                         Units
Mean annual precipitation              mm
Altitude                               m
Geographical location                  eastings, northings
Distance to the sea or a large lake    km
Shelter index                          degrees

[31] The PDS for the 41 stations were defined by using
the same threshold level at all stations (see Figure 1). The
threshold level was chosen on the basis of a preliminary
sensitivity analysis of regional average extreme value characteristics as a function of the threshold level, implying a
regional average number of exceedances per year in the
range 2.5 to 3.2 for the analyzed rainfall variables.
5.2. Intersite Correlation Analysis
[32] In Figure 2 the intersite correlation structure due to
concurrent exceedances is shown for two of the analyzed
rainfall variables, the 10-min and the 24-hour intensity. In
Figure 2 the sample estimates, a moving average curve
based on a moving window including 10 data points, and
the fitted exponential correlation function are shown. The
two parameters $\varphi$ and $\omega$ of the correlation function were
determined by a visual adjustment of the function to the
moving average curve. In general, the intersite correlation is
a decreasing function of the distance. Furthermore, the
correlation structure depends strongly on the considered
duration, the correlation being larger for larger durations.
This structure reflects the fact that extreme intensities for
large durations are mainly caused by moving frontal rain
systems with a large spatial coverage, whereas extreme
intensities for small durations are caused by convective rain
cells with a limited spatial extent.
[33] The intersite correlation between the annual number
of threshold exceedances has no apparent spatial structure,
and both large positive and negative correlations are observed. In this case a constant function equal to the regional
average correlation coefficient was fitted to the data.
5.3. GLS Regression Analysis
[34] The characteristics used as explanatory variables in
the regression analysis are shown in Table 1. The mean
annual precipitation (MAP) is determined by interpolation
of MAP for the standard normal period 1961-1990 from
300 locations in Denmark [Frich et al., 1997] based on
daily measurements. The shelter index is calculated as the
average of the angle between the gauge orifice and the
horizon for 8 directions. This parameter is included to
analyze the bias of the measurements due to different shelter
conditions for the 41 gauges. Any significant correlation
between a PDS parameter and the shelter index has to be
treated as an additional source of uncertainty (measurement
error).
[35] In the regression analysis the Cook's D statistic has
been calculated [Tasker and Stedinger, 1989]. For the GLS
regional mean model large values of Cook's D indicate
stations that diverge significantly from the group as a whole,
and hence the statistic can be used to identify possible
outliers. Stations identified as possible outliers should
be carefully examined for gross errors in their data. If the
data seem acceptable, the possibility of moving the station
to another region should be considered.
[36] Results of the regression analysis for the Poisson
parameter are shown in Table 2. For all analyzed rainfall
variables the Poisson parameter has a pronounced regional
variability. A large part of this variability can be explained
by the mean annual precipitation. In this case the Poisson
parameter is an increasing function of MAP, i.e., the larger
the MAP, the more events above the threshold are observed (see
Figure 3). The correlation between the Poisson parameter
and MAP is more pronounced for intensities with large
durations. Inclusion of other characteristics in the regression
equation did not improve the regional model.
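As a rough illustration of how much of the regional variability MAP accounts for, the residual variances reported in Table 2 can be compared directly. The following Python sketch computes, for each duration, the relative reduction in residual variance when moving from the regional mean model to the regression on MAP. This ratio is only an illustrative summary and is not a statistic used in the paper; the function name `variance_reduction` is a hypothetical helper.

```python
# Residual variances of the Poisson parameter, copied from Table 2,
# in (years^-1)^2: regional mean model vs. regression model based on MAP.
durations = ["10 min", "30 min", "60 min", "3 h", "6 h", "12 h", "24 h", "48 h"]
var_regional_mean = [0.195, 0.278, 0.256, 0.211, 0.166, 0.215, 0.460, 0.664]
var_map_regression = [0.120, 0.181, 0.184, 0.172, 0.071, 0.043, 0.056, 0.075]


def variance_reduction(var_mean_model, var_regression):
    """Fraction of the regional-mean residual variance removed by the regression."""
    return 1.0 - var_regression / var_mean_model


reductions = {
    d: variance_reduction(v0, v1)
    for d, v0, v1 in zip(durations, var_regional_mean, var_map_regression)
}

for d, r in reductions.items():
    print(f"{d:>6}: {100 * r:5.1f}% of the residual variance removed by MAP")
```

The reduction is largest for the 12-, 24-, and 48-hour intensities, in line with the statement that the dependence on MAP is more pronounced for large durations.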
[37] The results for the mean exceedance are shown in
Table 3. For small durations (t ≤ 1 hour) the region can be
considered homogeneous (note that for the 10-min rain
intensity one station is identified as an outlier and is
excluded from the analysis). For larger durations a significant metropolitan effect was observed. The average extreme
intensities for durations between 1 and 12 hours are significantly larger in the western part of the Copenhagen area
than in the rest of the country. For durations between 12 and
48 hours the eastern part of the Copenhagen area also has
significantly larger average extreme intensities than the
rest of the country. A regional model is defined that divides
the country into three subregions: (1) the western Copenhagen area, (2) the eastern Copenhagen area, and (3) the rest
of the country. For durations between 1 and 12 hours
regions (2) and (3) are pooled into one region, whereas
for durations larger than 12 hours regions (1) and (2) are
pooled. The mean exceedance in subregion (3) can be
considered homogeneous. In the Copenhagen area, however, the mean exceedance has a significant variability, but
none of the considered climatic and physiographic characteristics provide any significant explanation of this variability. Thus in all the subregions the mean exceedance is
modeled by a regional mean model.

Table 2. Results of the GLS Regression Analysis for the Poisson Parameter

            Regional Mean Model                    Regression Model Based on MAP
Duration    Mean,        Residual Variance,        Residual Variance,
            years^-1     (years^-1)^2              (years^-1)^2
10 min      3.22         0.195                     0.120
30 min      3.10         0.278                     0.181
60 min      3.10         0.256                     0.184
3 h         3.01         0.211                     0.172
6 h         2.81         0.166                     0.071
12 h        2.52         0.215                     0.043
24 h        2.63         0.460                     0.056
48 h        3.02         0.664                     0.075

Figure 3. GLS estimate of the Poisson parameter and corresponding 95% confidence interval compared
with observed values: (top) 10-min intensity and (bottom) 24-hour intensity.

Table 3. Results of the GLS Regression Analysis for the Mean Exceedance^a
Columns give the regional mean model and the subregional mean models for subregions 1 to 3.
Mean in μm/s; residual variance (Var.) in (μm/s)^2.

          Regional Mean Model           Subregion 1                   Subregion 2                   Subregion 3
Duration  Mean           Var.           Mean           Var.           Mean           Var.           Mean           Var.
10 min^b  3.33           0              3.33           0              3.33           0              3.33           0
30 min    1.61           0              1.61           0              1.61           0              1.61           0
60 min    0.948          0              0.948          0              0.948          0              0.948          0
3 h       0.436          1.13 × 10^-3   0.517          2.41 × 10^-3   0.432          0              0.432          0
6 h       0.260          3.37 × 10^-4   0.340          6.83 × 10^-4   0.257          0              0.257          0
12 h      0.166          3.31 × 10^-4   0.234          1.04 × 10^-4   0.162          0              0.162          0
24 h      9.42 × 10^-2   9.64 × 10^-5   0.131          1.76 × 10^-4   0.131          1.76 × 10^-4   9.40 × 10^-2   0
48 h      5.71 × 10^-2   3.21 × 10^-5   7.56 × 10^-2   6.44 × 10^-4   7.56 × 10^-2   6.44 × 10^-4   5.81 × 10^-2   0

^a Subregions: 1, western Copenhagen area; 2, eastern Copenhagen area; 3, the rest of the country.
^b One outlier station excluded from the analysis (large Cook's D statistic).

Table 4. Results of the GLS Regression Analysis for the LCV

Duration    Mean      Residual Variance
10 min^a    0.516     0
30 min^a    0.545     0
60 min^a    0.536     0
3 h         0.521     0
6 h         0.542     0
12 h        0.536     0
24 h        0.543     0
48 h        0.528     7.0 × 10^-4

^a One outlier station excluded from the analysis (large Cook's D statistic).
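The pooling of subregions described for the mean exceedance can be written as a small lookup. The Python sketch below encodes the subregional means tabulated in Table 3 (same units as the table) and returns the value that applies to a given duration and subregion. The dictionary layout and the function name `mean_exceedance` are illustrative assumptions, not part of the paper.

```python
# Regional mean of the exceedances as tabulated in Table 3, keyed by duration.
# Each entry maps a pooled group of subregions to its mean value.
# Subregions: 1 = western Copenhagen area, 2 = eastern Copenhagen area,
# 3 = the rest of the country.
MEAN_EXCEEDANCE = {
    "10 min": {(1, 2, 3): 3.33},
    "30 min": {(1, 2, 3): 1.61},
    "60 min": {(1, 2, 3): 0.948},
    "3 h": {(1,): 0.517, (2, 3): 0.432},
    "6 h": {(1,): 0.340, (2, 3): 0.257},
    "12 h": {(1,): 0.234, (2, 3): 0.162},
    "24 h": {(1, 2): 0.131, (3,): 9.40e-2},
    "48 h": {(1, 2): 7.56e-2, (3,): 5.81e-2},
}


def mean_exceedance(duration, subregion):
    """Return the pooled regional mean exceedance for a duration and subregion."""
    if subregion not in (1, 2, 3):
        raise ValueError("subregion must be 1, 2, or 3")
    for group, value in MEAN_EXCEEDANCE[duration].items():
        if subregion in group:
            return value
    # Not reachable while every duration covers subregions 1, 2, and 3.
    raise AssertionError("table groups must cover all subregions")


print(mean_exceedance("6 h", 1), mean_exceedance("6 h", 2))
print(mean_exceedance("24 h", 2), mean_exceedance("24 h", 3))
```

Storing one value per pooled group, rather than one per subregion, keeps the pooling rule visible in the data: subregions 2 and 3 share a value for 3 to 12 hours, and subregions 1 and 2 share a value for 24 and 48 hours.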
[38] For some of the rainfall variables the mean exceedance and the shelter index were found to be slightly
correlated, indicating a tendency that gauges with better
shelter conditions measure larger intensities. As mentioned
above, the variability due to different shelter conditions has
to be considered as an additional error that is included in the
resultant uncertainty measure in the regional model (i.e. this
portion of the regional variability cannot be modeled from
climatic and physiographic characteristics). Alternatively,
the measurements should be corrected to eliminate the
effect; see, e.g., the precipitation correction formulae of Allerup et al. [1997].
[39] Results of the regression analysis for the LCV are
shown in Table 4. For the 10-, 30-, and 60-min intensities one
station diverges significantly from the group as a whole
(large Cook's D statistic). If this station is excluded from the
analysis, LCV can be considered homogeneous for all
analyzed rainfall variables, except for the 48-hour intensity.
None of the considered climatic and physiographic characteristics were able to explain the regional variability of LCV for the
48-hour intensity. Thus for all rainfall variables a regional
mean model was adopted for modeling LCV.
5.4. L Moment Analysis
[40] For the analyzed rainfall variables L moment diagrams were constructed. In the L moment diagram LCV and
L skewness estimates are compared to the theoretical
relationships for a number of candidate distributions,
including the generalized Pareto (GP), lognormal (LN),
gamma (GAM), Weibull (WEI), and exponential (EXP)
distributions. As an example the L moment diagram for
the 10-min intensity is shown in Figure 4. The theoretical L
moment relationships for the considered distributions are
given by Hosking and Wallis [1997]. Note that the GP,
GAM, and WEI distributions contain the exponential distribution as a special case.

Table 5. Goodness of Fit Measures for the Gamma (GAM),
Weibull (WEI), Lognormal (LN), Generalized Pareto (GP), and
Exponential (EXP) Distributions

Duration    GAM       WEI       LN        GP        EXP
10 min      2.8       1.7^a     7.6       0.7^a     2.7
30 min      1.1^a     0.5^a     9.2       2.6       6.0
60 min      1.2^a     0.2^a     8.7       1.9^a     4.7
3 h         2.8       1.6^a     6.4       0.5^a     3.2
6 h         3.1       1.6^a     6.4       0.6^a     6.2
12 h        2.5       1.2^a     6.2       0.4^a     4.5
24 h        2.5       1.0^a     6.6       0.9^a     5.6
48 h        2.1       0.7^a     7.8       1.1^a     4.9

^a Distribution cannot be rejected at a 5% level of significance (|G| < 1.96).
[41] The goodness of fit statistics for the 5 candidate
distributions are compared to the quantiles of a standard
normal distribution in Table 5. For all rainfall variables the
LN and EXP distributions are rejected at a 5% level of
significance. The GAM distribution is rejected for 6 of the
variables, whereas the GP and the WEI distributions are
generally accepted. Analysis of the L moment diagrams
reveals that the cloud of points in the diagram is better
described by the theoretical GP line than the WEI line, and
hence the GP distribution should be preferred (see Figure 4
as an example). Thus it is concluded that the GP distribution
can be adopted as a regional parent for all rainfall variables.
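To make the comparison in the L moment diagram concrete, the sketch below computes sample LCV and L skewness from a series of exceedances using the unbiased probability weighted moment estimators, and evaluates the theoretical L skewness of a GP distribution with the same LCV. It assumes the GP parameterization of Hosking and Wallis [1997] with zero lower bound (exceedances measured from the threshold), for which the shape parameter is 1/LCV - 2. The function names are hypothetical, and the regional goodness of fit statistic of the paper is not reproduced here.

```python
def sample_l_moment_ratios(data):
    """Return (l1, LCV, L-skewness) from unbiased probability weighted moments."""
    x = sorted(data)
    n = len(x)
    if n < 3:
        raise ValueError("need at least 3 observations")
    # Unbiased PWM estimators b0, b1, b2 (i = 1..n is the rank in the ordered sample).
    b0 = sum(x) / n
    b1 = sum((i - 1) * xi for i, xi in enumerate(x, start=1)) / (n * (n - 1))
    b2 = sum((i - 1) * (i - 2) * xi for i, xi in enumerate(x, start=1)) / (
        n * (n - 1) * (n - 2)
    )
    l1 = b0                      # first L moment (mean)
    l2 = 2 * b1 - b0             # second L moment (L scale)
    l3 = 6 * b2 - 6 * b1 + b0    # third L moment
    return l1, l2 / l1, l3 / l2


def gp_l_skewness(lcv):
    """Theoretical L-skewness of a GP distribution with zero lower bound."""
    kappa = 1.0 / lcv - 2.0      # GP shape parameter implied by the LCV
    return (1.0 - kappa) / (3.0 + kappa)


# An LCV of 0.5 corresponds to the exponential distribution (L-skewness 1/3).
print(gp_l_skewness(0.5))
# Regional LCV of the 10-min intensity from Table 4.
print(gp_l_skewness(0.516))
```

A sample point (LCV, L skewness) that lies close to the value returned by `gp_l_skewness` for its LCV falls near the theoretical GP line in the diagram.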
5.5. Regional Estimation Model
[42] Based on the above findings a regional model can be
formulated for estimation of T-year intensities and associated
uncertainties at an arbitrary site in the region. For a GP-distributed parent, the regional T-year event estimate, cf. (3),
reads

\[
\hat{z}_T^R(s) = z_0 + \hat{\mu}^R(s)\,\frac{1+\hat{\kappa}^R(s)}{\hat{\kappa}^R(s)}
\left[ 1 - \left( \frac{1}{\hat{\lambda}^R(s)\,T} \right)^{\hat{\kappa}^R(s)} \right],
\qquad
\hat{\kappa}^R(s) = \frac{1}{\hat{\tau}_2^R(s)} - 2.
\tag{19}
\]

The uncertainty of this estimate can be found from (4).

Figure 4. L moment ratio estimates for the 10-min intensity compared to the theoretical relationships
for a number of candidate distributions.

Figure 5. IDF curve and corresponding 68% confidence interval for a location in the Copenhagen area
(region 1) with a mean annual rainfall of 600 mm. Return period T is given in years.


[43] The estimation procedure can be summarized as
follows.
1. Based on an estimate of MAP at the site in question,
the mean annual number of exceedances and the associated
uncertainty are estimated from the GLS regression equation.
2. For intensities with small durations, t ≤ 1 hour, the
mean value of the exceedances and the associated uncertainty
are obtained from the regional mean model covering the
whole region. For larger durations, the regional mean model
for the relevant subregion is adopted.
3. The regional estimate of LCV and associated uncertainty is obtained from the regional mean model covering the
whole region.
4. The regional PDS parameter estimates and estimates of
the uncertainties are finally inserted in (19) and (4). An
example of a regional IDF curve is shown in Figure 5.
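The final step of the procedure can be sketched in Python as follows. The function evaluates the T-year event of a PDS model with Poisson-distributed exceedance counts and GP-distributed exceedances, which is the form of (19), with the GP shape parameter obtained from the LCV. The inputs in the usage lines are illustrative only: the Poisson parameter is taken as the regional mean from Table 2 rather than from the MAP regression (whose coefficients are not listed in this section), and the threshold `z0` is a hypothetical placeholder. The uncertainty calculation of (4) is not included.

```python
import math


def gp_shape_from_lcv(lcv):
    """GP shape parameter kappa = 1/LCV - 2 (kappa = 0 is the exponential case)."""
    return 1.0 / lcv - 2.0


def t_year_intensity(z0, lam, mu, lcv, T):
    """T-year event for a PDS with Poisson arrivals and GP exceedances, cf. (19).

    z0  : threshold level (same units as mu)
    lam : mean annual number of exceedances (years^-1)
    mu  : mean exceedance above the threshold
    lcv : coefficient of L variation of the exceedances
    T   : return period in years
    """
    kappa = gp_shape_from_lcv(lcv)
    if abs(kappa) < 1e-9:
        # Exponential limit of the GP quantile expression.
        return z0 + mu * math.log(lam * T)
    return z0 + mu * (1.0 + kappa) / kappa * (1.0 - (1.0 / (lam * T)) ** kappa)


# Illustrative inputs for the 10-min intensity (regional means, Tables 2 to 4).
# z0 = 0.0 is a HYPOTHETICAL placeholder; the threshold is not given here.
z0, lam, mu, lcv = 0.0, 3.22, 3.33, 0.516
for T in (1, 10, 100):
    print(T, round(t_year_intensity(z0, lam, mu, lcv, T), 2))
```

Repeating the calculation over all durations for a fixed T, with the site-specific parameter estimates from steps 1 to 3, gives the points of one IDF curve.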

6. Conclusions
[44] A general framework for regional analysis and modeling of extreme rainfall characteristics has been developed.
The model is based on a PDS parameterization, which
includes the average annual number of exceedances, the
mean value of the exceedance magnitudes, and the LCV to
be assessed from regional data. A GLS regression model that
explicitly accounts for intersite correlation and sampling
uncertainties has been introduced for evaluating the regional
heterogeneity of the PDS parameters. For the parameters that
show a significant regional variability, the GLS model is
subsequently applied for describing the variability from
physiographic and climatic characteristics. The resulting
GLS models can then be applied for estimation of the PDS
parameters and associated variances at an arbitrary location
in the considered region.
[45] For determination of a proper regional parent distribution L moment analysis is applied. In this respect,
regional goodness of fit measures have been formulated
that specifically apply for discriminating between the exponential distribution and various two-parameter distributions
in the PDS model.
[46] The regional model was applied to rainfall data from
a high-resolution rain gauge network in Denmark. The GLS
regression analysis revealed that a large part of the regional
variability of the mean annual number of exceedances in the
PDS can be explained by MAP; that is, the larger the MAP,
the more extreme events are observed. The mean value of the
exceedance magnitudes can be assumed constant in the
region for small durations (less than one hour). For larger
durations a significant metropolitan effect was observed, the
mean intensities in the Copenhagen area being significantly
larger than in the rest of the country. For LCV a regional
mean model was adopted. The L moment analysis revealed
that the GP distribution is an appropriate regional distribution. The analysis led to IDF-curve estimates and associated
uncertainties at arbitrary locations in Denmark.
[47] Acknowledgments. The research was supported by the Den
Kommunale Momsfond and the Water Pollution Control Committee of the
Danish Society of Engineers. The rainfall data was provided by the Water
Pollution Control Committee and the Danish Meteorological Institute.

References
Adamowski, K., Y. Alila, and P. J. Pilon, Regional rainfall distribution for Canada, Atmos. Res., 42, 75–88, 1996.
Allerup, P., H. Madsen, and F. Vejen, A comprehensive model for correcting point precipitation, Nord. Hydrol., 28, 1–20, 1997.
Cong, S., Y. Li, J. L. Vogel, and J. C. Schaake, Identification of the underlying distribution form of precipitation by using regional data, Water Resour. Res., 29(4), 1103–1111, 1993.
Danish Water Pollution Control Committee (DWPCC), Estimation of IDF curves (in Danish), Publ. 16, Danish Soc. of Eng., Teknisk Forlag, Denmark, 1974.
Ekanayake, S. T., and J. F. Cruise, Comparisons of Weibull- and exponential-based partial duration stochastic flood models, Stochastic Hydrol. Hydraul., 7(4), 283–297, 1993.
Fennessey, N. M., Development of a regional model of extreme precipitation for the northeast United States, Eos Trans. AGU, 79(17), Spring Meet. Suppl., S90, 1998.
Fitzgerald, D. L., Single station and regional analysis of daily rainfall extremes, Stochastic Hydrol. Hydraul., 3, 281–292, 1989.
Frich, P., S. Rosenørn, H. Madsen, and J. J. Jensen, Observed precipitation in Denmark 1961–1990, Tech. Rep. 97-8, Danish Meteorol. Inst., Copenhagen, 1997.
Greenwood, J. A., J. M. Landwehr, N. C. Matalas, and J. R. Wallis, Probability weighted moments: Definition and relation to parameters of several distributions expressible in inverse form, Water Resour. Res., 15(5), 1049–1054, 1979.
Greis, N. P., and E. F. Wood, Regional flood frequency estimation and network design, Water Resour. Res., 17(4), 1167–1177, 1981. (Correction, Water Resour. Res., 19(2), 589–590, 1983.)
Harremoës, P., and P. S. Mikkelsen, Properties of extreme point rainfall, I, Results from a rain gauge system in Denmark, Atmos. Res., 37, 277–286, 1995.
Hershfield, D. M., Rainfall frequency atlas of the United States for durations from 30 minutes to 24 hours and return periods from 1 to 100 years, Tech. Pap. 40, U.S. Weather Bur., Washington, D. C., 1961.
Hosking, J. R. M., L-moments: Analysis and estimation of distributions using linear combinations of order statistics, J. R. Stat. Soc., Ser. B, 52(1), 105–124, 1990.
Hosking, J. R. M., and J. R. Wallis, Parameter and quantile estimation for the generalized Pareto distribution, Technometrics, 29(3), 339–349, 1987.
Hosking, J. R. M., and J. R. Wallis, Some statistics useful in regional frequency analysis, Water Resour. Res., 29(2), 271–281, 1993. (Correction, Water Resour. Res., 31(1), 251, 1995.)
Hosking, J. R. M., and J. R. Wallis, A comparison of unbiased and plotting-position estimators of L moments, Water Resour. Res., 31(8), 2019–2025, 1995.
Hosking, J. R. M., and J. R. Wallis, Regional Frequency Analysis: An Approach Based on L-moments, Cambridge Univ. Press, New York, 1997.
Jørgensen, H. K., S. Rosenørn, H. Madsen, and P. S. Mikkelsen, Quality control of rain data used for urban runoff systems, Water Sci. Technol., 37(11), 113–120, 1998.
Landwehr, J. M., N. C. Matalas, and J. R. Wallis, Probability weighted moments compared with some traditional techniques in estimating Gumbel parameters and quantiles, Water Resour. Res., 15(5), 1055–1064, 1979.
Madsen, H., and D. Rosbjerg, The partial duration series method in regional index-flood modeling, Water Resour. Res., 33(4), 737–746, 1997a.
Madsen, H., and D. Rosbjerg, Generalized least squares and empirical Bayes estimation in regional partial duration series index-flood modeling, Water Resour. Res., 33(4), 771–781, 1997b.
Madsen, H., D. Rosbjerg, and P. Harremoës, PDS-modelling and regional Bayesian estimation of extreme rainfalls, Nord. Hydrol., 25(4), 279–300, 1994.
Madsen, H., P. F. Rasmussen, and D. Rosbjerg, Comparison of annual maximum series and partial duration series for modeling extreme hydrologic events, 1, At-site modeling, Water Resour. Res., 33(4), 747–757, 1997a.
Madsen, H., C. P. Pearson, and D. Rosbjerg, Comparison of annual maximum series and partial duration series for modeling extreme hydrologic events, 2, Regional modeling, Water Resour. Res., 33(4), 759–769, 1997b.
Mikkelsen, P. S., and P. Harremoës, Uncertainties in urban runoff extreme value calculations caused by statistical/geographical variation in rainfall data, in Proceedings of Sixth International Conference on Urban Storm Drainage, vol. 1, edited by J. Marsalak and H. Tomo, pp. 471–476, Seapoint, Victoria, British Columbia, Canada, 1993.
Mikkelsen, P. S., P. Harremoës, and D. Rosbjerg, Properties of extreme point rainfall, II, Parametric data interpretation and regional uncertainty assessment, Atmos. Res., 37, 287–304, 1995.
Mikkelsen, P. S., H. Madsen, D. Rosbjerg, and P. Harremoës, Properties of extreme point rainfall, III, Identification of spatial inter-site correlation structure, Atmos. Res., 40, 77–98, 1996.
Mikkelsen, P. S., H. Madsen, K. Arnbjerg-Nielsen, H. K. Jørgensen, D. Rosbjerg, and P. Harremoës, A rationale for using local and regional point rainfall data for design and analysis of urban storm drainage systems, Water Sci. Technol., 37(11), 7–14, 1998.
Naghavi, B., and F. X. Yu, Regional frequency analysis of extreme precipitation in Louisiana, J. Hydraul. Eng., 121(11), 819–827, 1995.
Natural Environment Research Council (NERC), Flood Studies Report, London, 1975.
Rosbjerg, D., Estimation in partial duration series with independent and dependent peak values, J. Hydrol., 76, 183–195, 1985.
Rosbjerg, D., P. F. Rasmussen, and H. Madsen, Modelling of exceedances in partial duration series, paper presented at the International Hydrology and Water Resources Symposium, Inst. of Eng., Perth, UK, 1991.
Rosbjerg, D., H. Madsen, and P. F. Rasmussen, Prediction in partial duration series with generalized Pareto-distributed exceedances, Water Resour. Res., 28(11), 3001–3010, 1992a.
Rosbjerg, D., J. Correa, and P. F. Rasmussen, Justification des formules de probabilité empirique basées sur la médiane de la statistique d'ordre, Rev. Sci. Eau, 5, 529–540, 1992b.
Schaefer, M. G., Regional analyses of precipitation annual maxima in Washington State, Water Resour. Res., 26(1), 119–131, 1990.
Schaefer, M. G., Magnitude-frequency characteristics of precipitation annual maxima in southern British Columbia, Eos Trans. AGU, 79(17), Spring Meet. Suppl., S90, 1998.
Shane, R. M., and W. R. Lynn, Mathematical model for flood risk evaluation, J. Hydraul. Div. Am. Soc. Civ. Eng., 90(HY6), 1–20, 1964.
Stedinger, J. R., Estimating a regional flood frequency distribution, Water Resour. Res., 19(2), 503–510, 1983.
Stedinger, J. R., and L.-H. Lu, Appraisal of regional and index flood quantile estimators, Stochastic Hydrol. Hydraul., 9(1), 49–75, 1995.
Stedinger, J. R., and G. D. Tasker, Regional hydrologic analysis, 1, Ordinary, weighted and generalized least squares compared, Water Resour. Res., 21(9), 1421–1432, 1985. (Correction, Water Resour. Res., 22(5), 844, 1986.)
Stedinger, J. R., and G. D. Tasker, Regional hydrologic analysis, 2, Model-error estimators, estimation of sigma and log-Pearson type 3 distributions, Water Resour. Res., 22(10), 1487–1499, 1986.
Stedinger, J. R., R. M. Vogel, and E. Foufoula-Georgiou, Frequency analysis of extreme events, in Handbook of Hydrology, edited by D. R. Maidment, chap. 18, McGraw-Hill, New York, 1993.
Tasker, G. D., Hydrologic regression with weighted least squares, Water Resour. Res., 16(6), 1107–1113, 1980.
Tasker, G. D., and J. R. Stedinger, An operational GLS model for hydrologic regression, J. Hydrol., 111, 361–375, 1989.
Todorovic, P., and E. Zelenhasic, A stochastic model for flood analysis, Water Resour. Res., 6(6), 1641–1648, 1970.
Van Montfort, M. A. J., and J. V. Witter, The generalized Pareto distribution applied to rainfall depths, Hydrol. Sci. J., 31(2), 151–162, 1986.
Wallis, J. R., Risk and uncertainties in the evaluation of flood events for the design of hydraulic structures, in Piene e Siccità, edited by E. Guggino, G. Rossi, and E. Todini, pp. 3–36, Fond. Politec. del Mediterraneo, Catania, Italy, 1980.
Zelenhasic, E., Theoretical probability distributions for flood peaks, Hydrol. Pap. 42, Colorado State Univ., Fort Collins, Colo., 1970.




P. Harremoës, D. Rosbjerg, and P. Steen Mikkelsen, Environment and Resources DTU, Technical University of Denmark, Bygningstorvet, Building 115, DK-2800 Kgs. Lyngby, Denmark.
H. Madsen, DHI Water and Environment, Agern Allé 11, DK-2970 Hørsholm, Denmark. (hem@dhi.dk)
