
Journal of Hydrology (2008) 356, 56–69

available at www.sciencedirect.com

journal homepage: www.elsevier.com/locate/jhydrol

A Bayesian approach to decision-making under uncertainty: An application to real-time forecasting in the river Rhine

P. Reggiani a,b,*, A.H. Weerts b

a Section of Hydraulic Structures and Probabilistic Design, Delft University of Technology, P.O. Box 5048, 2600GA Delft, The Netherlands
b Deltares-Delft Hydraulics, P.O. Box 177, 2600MH Delft, The Netherlands

Received 31 May 2007; received in revised form 13 March 2008; accepted 31 March 2008

KEYWORDS
Uncertainty; Decision support; Operational flood forecasting; Bayesian revision; River Rhine

Summary Enhanced ability to forecast peak discharges remains the most relevant non-structural measure for flood protection. Extended forecasting lead times are desirable as they facilitate mitigating action and response in case of extreme discharges. Forecasts remain however affected by uncertainty as an exact prognosis of water levels is inherently impossible. Here, we implement a dedicated uncertainty processor, that can be used within operational flood forecasting systems.
The processor is designed to support decision-making under conditions of uncertainty. The scientific approach at the basis of the uncertainty processor is general and independent of the deterministic models used. It is based on Bayesian revision of prior knowledge on the basis of past evidence on model performance against observations. The revision of the prior distributions on water levels and/or flow rates leads to posterior probability distributions that are translated into an effective decision support under uncertainty. The processor is validated on the operational real-time river Rhine flood forecasting system.
© 2008 Elsevier B.V. All rights reserved.

* Corresponding author. Address: Deltares-Delft Hydraulics, P.O. Box 177, 2600MH Delft, The Netherlands. Tel.: +31 15 285 8882; fax: +31 15 285 8582.
E-mail address: paolo.reggiani@deltares.nl (P. Reggiani).

0022-1694/$ - see front matter © 2008 Elsevier B.V. All rights reserved.
doi:10.1016/j.jhydrol.2008.03.027

Introduction

Extreme river runoff events, which include both high and low flows, have had large social and economic impact worldwide and continue to pose a regular concern to society. Recent large floods in Europe, such as those that have occurred in the Meuse and Rhine basins in 1995, over large areas of the United Kingdom in 1998 and 2000 and in the Elbe basin in summer 2002, have led to increased interest in research and development on upgrading existing operational flood forecasting systems in Europe.

Following recent conclusions of the fourth IPCC climate assessment report (IPCC, 2007), enhanced meteorological extremes are to be expected during the 21st century. A possible acceleration of the hydrological cycle may lead to large fluctuations in discharges in European river systems. Extreme discharges may become more frequent, calling for structural and non-structural interventions. Significant investments for the installation and upgrading of operational flood forecasting systems are already on the agenda of national hydro-meteorological services. The World Meteorological Organization (WMO, 2006) acknowledges that in many parts of the world forecasting remains the only effective measure which can be realistically implemented to protect life and property in the face of extreme meteorological events.

Real-time forecasting constitutes a non-structural measure by providing warning and issuing alerts ahead of an emergency. Extending the forecasting horizon allows time allocation for mitigating action. A reliable assessment of certainty of event occurrence in a real-time context safeguards operational users from issuing false alarms and institutional decision-makers from calling for unwarranted action. Real-time flood forecasting systems are currently operational in many parts of the world, including the Netherlands, where forecasting systems for the Rhine and Meuse have been installed. Potential flooding caused by those two river systems raises security issues for the low-lying territory in the country.

A single deterministic water level or flow rate forecast several hours ahead via a model simulation is of little value to decision-makers, as there remains inherent uncertainty associated with model output. The sources for the uncertainty in the forecast are manifold (Krzysztofowicz, 1999) and will be addressed in this paper. If an exceptional event is forecasted, evacuation is a last resort intervention to save lives. It is considered in situations in which safety of the population in risk areas can no longer be guaranteed. It remains a costly action, which causes disruption to a range of societal activities. Institutional bodies generally regard the conclusion to initiate evacuation as a last resort option, which should be based on carefully weighed and objective decision-making. The uncertainty in the forecast at the basis of the decision needs to be fully incorporated in the decisional process and combined with a cost function (Raiffa and Schlaifer, 1961; De Groot, 1970; Todini, 2007). By means of such a cost function, weighted by the probability of event occurrence, the decision-maker can soundly determine if the total expected damage in issuing an alert is higher than in the case of taking no action (see also Section Discussion). Providing an objective measure of the uncertainty associated with a forecast thus constitutes an essential prerequisite for sound decision support.

Research in the past 30 years has addressed many of these aspects, including the development of forecasting systems that output some (partial or approximate) measure of predictive uncertainty. Forecasts specifying the statistics of discharge were produced for short lead times via the Kalman filter applied either to a deterministic hydrological model (Kitanidis and Bras, 1980a,b; Szollosi-Nagy and Mekis, 1988; Georgakakos and Smith, 1990) or to an integrated hydro-meteorological model (Georgakakos, 1987). Forecasts specifying an ensemble of hydrographs were produced via a deterministic hydrological model for short lead times (Lardet and Obled, 1994). While these systems pointed paths to probabilistic forecasting, they have limitations: the first type of system did not output the entire predictive probability distribution function, whereas the second type of system did not account for all sources of uncertainty. The introduction of ensemble weather predictions in recent years made it possible to produce multiple stream-flow forecasts (Schaake and Larsson, 1998). The currently ongoing large-scale international hydrological ensemble prediction experiment (HEPEX, Schaake et al. (2006)) explicitly addresses uncertainty assessment in the context of ensemble stream-flow predictions.

An adequate assessment of the uncertainty has the additional benefit of clearly separating the responsibilities and tasks of the organization operating the forecasting system from those of the decision-maker (Krzysztofowicz, 2001).

Here, we present an implementation of a probabilistic uncertainty processor which elaborates a forecast in probabilistic terms and offers a measure of the uncertainty of the forecast. The theory of the processor has been developed in the Bayesian forecasting system (BFS) by Krzysztofowicz (1999) and Krzysztofowicz and Kelly (2000). The processor is executed off-line, after a forecast, and evaluates the probability of occurrence of the predicted flow rate or water level, conditional on all information available in the forecasting process. The statistics performed by the processor take historical model performance against observations into account. Consequently, decisions under conditions of uncertainty can be taken objectively once an acceptable damage level has been defined at the institutional level. In this perspective quantifying the uncertainty contributes to establishing an effective decision support framework for end-users of flood forecasting products.

The general framework for a Bayesian forecasting system (BFS) presented by Krzysztofowicz (1999) constitutes a significant effort in formalizing the quantification of uncertainty in the flood forecasting process, given arbitrary deterministic models. The BFS consists of three components: (i) an input uncertainty processor, (ii) a hydrologic uncertainty processor and (iii) an integrator of uncertainty. The BFS is based on a complete and consistent theory, which identifies and separates relevant sources of uncertainty. It provides a general reference framework for building the uncertainty processors and the integrator, components which can be readily implemented in a real-time flood forecasting system. Sequel papers have implemented various BFS components as a one-branch processor (precipitation-independent model, Kelly and Krzysztofowicz (2000) and Krzysztofowicz and Kelly (2000)) or two-branch processor (precipitation-dependent model, Krzysztofowicz and Herr (2001)), analyzed their statistical properties and carried out preliminary verifications. One of the most salient features of the BFS theory is the provision of a formal basis for the development and successive revision of its components, whilst preserving internal consistency and statistical properties, which make it appealing for operational use.
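The cost-function argument made earlier in the Introduction (comparing the total expected damage of issuing an alert with that of taking no action, each weighted by the probability of event occurrence) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two-action setup, the function names and the cost figures do not come from the paper, and a real decision framework would be considerably richer.

```python
def expected_costs(p_event, cost_of_alert, damage_if_unprepared, damage_if_prepared):
    """Expected cost of the two actions, given the event probability.

    Hypothetical two-action model (not from the paper):
      - alert:     the alert cost is always paid; if the event occurs,
                   only the mitigated damage is suffered.
      - no action: nothing is paid up front; if the event occurs,
                   the full damage is suffered.
    """
    cost_alert = cost_of_alert + p_event * damage_if_prepared
    cost_no_action = p_event * damage_if_unprepared
    return cost_alert, cost_no_action


def issue_alert(p_event, cost_of_alert, damage_if_unprepared, damage_if_prepared):
    """Return True when alerting has the lower expected cost."""
    cost_alert, cost_no_action = expected_costs(
        p_event, cost_of_alert, damage_if_unprepared, damage_if_prepared
    )
    return cost_alert < cost_no_action


if __name__ == "__main__":
    # Illustrative numbers in arbitrary monetary units.
    for p in (0.05, 0.30):
        print(p, expected_costs(p, 10.0, 100.0, 20.0), issue_alert(p, 10.0, 100.0, 20.0))
```

Under these assumptions the alert becomes the cheaper option once the event probability exceeds the alert cost divided by the damage avoided, which is why a calibrated probability, rather than a single deterministic forecast, is the quantity the decision-maker needs.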

In a recent paper (Reggiani and Weerts, 2008), we have presented an application of the input uncertainty processor (IUP) proposed by Krzysztofowicz (2004). The aim of the present paper is an implementation of the hydrological uncertainty processor (HUP) for the operational flood forecasting system of the river Rhine. Given the large river basin size (160,000 km²) and a basin contraction time between 4 and 5 days, flood propagation and travel times in the principal river system are predictable with reasonable accuracy. We present an application of the HUP, whereby we emphasize the importance of an optimal specification of the prior density in the particular Bayesian revision process. In contrast to Krzysztofowicz and Kelly (2000), who assume the river system to behave as a Markov chain stochastic process, we propose and parameterize a prior density on river stages as a linear regression, in which multiple observations at upstream observing stations are taken into consideration.

The paper is structured as follows: Section Principles describes the principles underlying the Bayesian processor, Section Uncertainty processor describes the theory, Section Application introduces the operational Rhine flood forecasting system and describes the data elaboration steps for the Bayesian processor, Section Experiments discusses the numerical experiments and Section Discussion draws the conclusions.

Principles

Variates

Adopting the notation by Krzysztofowicz (1999), we introduce the following random variates that describe the forecasting process. The ensemble

$$H_{n,X} = (H_{1,X}, \ldots, H_{n,X})' \qquad (1)$$

called the predictand, are water levels recorded at forecasting times $1, \ldots, n$ at location $X$. These quantities lie in the future with respect to those observed at the same locations, at an arbitrary number of historical times $1, \ldots, k$ up to the time $t_0 = 0$ at the onset of the forecast:

$$H_{0-k,X} = (H_{0,X}, \ldots, H_{0-k,X})' \qquad (2)$$

Last,

$$S_{n,X} = (S_{1,X}, \ldots, S_{n,X})' \qquad (3)$$

is an ensemble of sets of modeled water levels at location $X$ and at the same times of the observations $H_{n,X}$. The realizations of the variates $H_{n,X}$, $H_{0-k,X}$ and $S_{n,X}$ are denoted with the lowercase letters $h_{n,X}$, $h_{0-k,X}$ and $s_{n,X}$, respectively.

Predictive uncertainty

The predictive uncertainty (Krzysztofowicz, 2001) can be defined as a measure of the degree of certitude of the occurrence of an event, conditional on all the information available in the forecasting process. In operational river flow forecasting, an event consists in the exceedance of a critical water level at a forecasting location $X_1$. The total predictive uncertainty on the predicted water level with lead time $n$ can be expressed in terms of the following conditional probability density function:

$$\phi(h_{n,X_1} \mid s_{n,X_1}, h_{0,X_1}, \ldots, h_{0-k,X_m}, u, \theta, \ldots) \qquad (4)$$

expressing the probability density on the water level $h_{n,X_1}$ at the forecasting location, conditional on the corresponding modeled water level $s_{n,X_1}$ and the level observations at a series of observing stations $X_1, X_2, \ldots, X_m$ at times $0 - k$ ahead of $t_0$. The conditional probability distribution can in principle be considered conditional on additional information such as the internal state vector $u$ of the models and the model parameter vector $\theta$. However, for the present application, we assume the models to be well calibrated, such that the parameters can be considered fixed and certain. The internal model states are considered optimized at the beginning of the forecast (e.g. by means of data assimilation techniques), such that they can be considered certain as well. On the basis of these assumptions (4) reduces to the following conditional probability density function:

$$\phi(h_{n,X_1} \mid s_{n,X_1}, h_{0,X_1}, \ldots, h_{0-k,X_m}) \qquad (5)$$

We note the absence of any dependence on the meteorological input, like precipitation and temperature, in (4) and (5). The reason for this choice is the fact that the modeled water level $s_{n,X_1}$ at the forecasting location is a result of uncertain meteorological input that has already been processed through the hydrological and hydraulic models. The uncertainty in the input propagates through the model chains and translates into uncertain water level predictions. The assessment and quantification of this uncertainty is the aim of the remaining part of this paper.

In this context, we note that ensemble weather forecasts are frequently used in operational streamflow forecasting with the goal of introducing additional randomization into the meteorological forcing. These products are generally obtained by perturbing initial condition vectors of the meteorological model. Ensemble weather forecasts essentially represent the total uncertainty on weather model output (the forecasted precipitation), rather than conditional uncertainty on the actual precipitation to be expected. Additional Bayesian processing of meteorological model output ensembles, along lines proposed by Krzysztofowicz (2004) and implemented in Reggiani and Weerts (2008), is required to get probabilities on actual precipitation, which are conditional on meteorological forecasts and the physical state of the atmosphere. This precipitation can be subsequently used as an additional conditioning variable in Eq. (4). However, the combination of precipitation with additional conditioning variables in the expression for predictive uncertainty in Eq. (4) remains a matter of further investigation.

Uncertainty processor

Operational requirements

The uncertainty processor for a water level forecast is provided by an estimator for the conditional probability density function $\phi$ in (5). We seek a parameterization of the conditional density $\phi$ in a series of procedural steps that have been laid out in Krzysztofowicz and Kelly (2000). The estimator will be formulated as a Bayesian processor, which revises a prior density on $h_{n,X_1}$ by means of a likelihood function. In the context of a real-time forecasting system, the

operational requirements impose limitations on the methods to be used in estimating the marginal densities. Because of time constraints one should avoid carrying out a large number of simulations on-line, as for instance required when processing Monte Carlo sampled input data through a deterministic model. Modeling the densities with analytical-numerical expressions prepared off-line and only evaluated on-line, as proposed by Krzysztofowicz and Kelly (2000) and Kelly and Krzysztofowicz (2000), is a powerful approach to increasing computing efficiency.

Bayesian formulation

The proposed uncertainty processor assesses the predictive uncertainty by means of Bayesian revision of prior information. Through inference from a prior probability distribution function, the processor learns from historical knowledge on the model performance given uncertain meteorological input. A proper choice of the prior distribution function is crucial for the efficiency of the revision, as it should contain as much information on the behavior of the system as known from observations only, in absence of a model. Krzysztofowicz and Kelly (2000) propose a prior density, assuming that the river can be modeled with a first-order Markov chain. Their choice of the first-order Markov process facilitates manipulations involved in the Bayesian formulation and was applied to a relatively small system of 1450 km². However, larger systems, like the river Rhine, are unlikely to behave as a first-order Markov process, as rising and falling limb cannot be discerned due to one single supporting observation $h_{0,X_1}$. Moreover, the performance of the first-order Markov model deteriorates rapidly with increasing lead-time of the forecast. For these reasons, we propose to derive a prior distribution function $g(h_{n,X_1} \mid h_{0,X_1}, \ldots, h_{0-k,X_m})$ from a linear regression model, which involves level observations at multiple upstream stations. Increasing the number of observation points makes it possible to model the rising and falling limbs of the flood wave at the forecasting point more accurately by including $m$ supporting observations $k$ hours ahead of $t_0$.

The effects of various uncertainty sources enter the revision process via the marginal density $f(\cdot \mid \cdot)$ of the model output $S$, conditional on the observations $H$. The family of densities $f(s_{n,X_1} \mid \cdot, h_{0,X_1}, h_{0-k,X_2}, \ldots, h_{0-k,X_m})$ represents the likelihood of the predictand $h_{n,X_1}$, given all information (upstream observations and model predictions) available at the onset of the forecast. A dependence of the likelihood on additional variates, such as the internal model states $u$, is in principle possible. However, here we assume that the internal model states are fixed and known and exclude these from the likelihood formulation. The residuals vector $\nu = s_{n,X_1} - h_{n,X_1}$ represents the model error that is attributable to the presence of uncertain meteorological input, model conceptualization errors and possible implicit errors due to sub-optimal initial conditions and internal model states, which we do not keep explicitly in the formulation. The likelihood function can therefore be seen as equivalent to a full stochastic characterization of the model error. The likelihood function $f$ imports a probabilistic description of the predictive capability of the model into the Bayesian revision process. The total predictive uncertainty is given by the revised posterior density:

$$\phi(h_{n,X_1} \mid s_{n,X_1}, h_{0,X_1}, \ldots, h_{0-k,X_m}) = \frac{f(s_{n,X_1} \mid h_{n,X_1}, h_{0,X_1}, \ldots, h_{0-k,X_m})\, g(h_{n,X_1} \mid h_{0,X_1}, \ldots, h_{0-k,X_m})}{\kappa(s_{n,X_1} \mid h_{0,X_1}, \ldots, h_{0-k,X_m})} \qquad (6)$$

where

$$\kappa(s_{n,X_1} \mid h_{0,X_1}, \ldots, h_{0-k,X_m}) = \int_{-\infty}^{\infty} f(s_{n,X_1} \mid h_{n,X_1}, h_{0,X_1}, \ldots, h_{0-k,X_m})\, g(h_{n,X_1} \mid h_{0,X_1}, \ldots, h_{0-k,X_m})\, dh_{n,X_1} \qquad (7)$$

is the expected density on model output, conditional on all information involved in the forecasting process.

Application

The river Rhine forecasting system

The HUP is applied to the operational flood forecasting system for the river Rhine with a total surface area of 160,000 km². The system is embedded in the forecasting platform Delft-FEWS (Flood Early Warning System), an open-architecture data management environment, which facilitates interfacing generic hydrological and hydraulic models with online data streams and a central data base (Werner et al., 2004; Werner and Heynert, 2006). The online data include water levels and precipitation and temperature observations from a network of about 200 meteo-stations spread across the entire basin. The hydrologic response is simulated with the HBV model (Bergstrom, 1995). The model discretizes the Rhine basin into 134 districts or subbasins. The hydrological model calculates the runoff from the tributaries towards the main river Rhine channel. The propagation of the flood wave along the main channel is modeled with the dynamic wave solver SOBEK (Stelling and Verwey, 2005). The hydrodynamic model extends from the location Maxau in the central Rhine basin, 670 km upstream, to the river mouth. Fig. 1 depicts the Rhine basin with its principal tributaries, selected observing stations and the travel time isochrones. Here we focus on the basin control section Lobith situated at the Dutch-German border, 170 km upstream from the principal river mouth (Port of Rotterdam). The hydrological and the hydrodynamic models are run hourly in historical and in forecast mode. In historical mode, the models are forced by real-time precipitation and temperature observed over a historical period of 196 h prior to the start time of the forecast $t_0$. Over the historical simulation period the hydrological model output is updated with stream flow observations by means of an automatic error-correction method (Broersen and Weerts, 2005). The error-corrected hydrological model output is subsequently assigned as lateral inflow to the hydrodynamic model for the main river channel.

In forecast mode, the hydrological model is forced by deterministic weather forecasts providing gridded precipitation and temperature data, which are transformed into sub-basin averaged timeseries. The weather forecasts with a 72 h lead time and 7 × 7 km grid size are provided by the HIRLAM model operated at the Royal Dutch Meteorological Service. Water levels simulated at Lobith at $t_0$ are denoted with $s_{0,L}$, while those at the forecasting day $n$ with $s_{n,L}$. The

Figure 1 The river Rhine basin with travel time isochrones.

corresponding observed levels are indicated with $h_{0,L}$ and $h_{n,L}$. As additional supporting information we use water level observations at the gauging station Cologne. Cologne is situated about 150 km upstream of Lobith. The travel time of a flood wave between the two stations is about 30 h. The water levels observed at Cologne $k$ days ahead of $t_0$ are denoted with $h_{0-k,C}$. We examine hourly forecasts over a continuous three-year period starting on 1 July 2002 and ending on 30 June 2005.

Data elaboration

The data for all three years are grouped into months to account for non-stationarity of the underlying processes. The

[Figure 2: curves for January, April, July and October (empirical and modelled). Horizontal axis: water level H0,L (m), 6 to 16. Vertical axis: empirical and 2-parameter Weibull fit.]

Figure 2 Empirical and modelled cdfs, modelled pdfs, 2003–2005, Lobith.

river Rhine system behaves differently in winter than in summer, which is characterized by predominantly low flows. High flow rates generally occur during the winter months, especially December and January. Fig. 2 shows the cumulative probability distributions on water levels observed in January (winter), April (spring), July (summer) and October (autumn) at Lobith for the three-year observing period. The pronounced cusps in the January curve indicate inundation of the winter bed (flood plain) and the resulting change in dynamics due to the double-trapezoidal shape of the cross sections at Lobith. It is clear that water levels in spring and summer are significantly lower than in autumn or winter. The dashed and dotted curves indicate the 2-parameter Weibull fits $\Gamma(h)$ to the data and the derived pdf $\gamma(h)$. We have tried to fit alternative models such as the Gamma, Log-Weibull and Log-Logistic distribution functions, which all delivered sub-optimal fits.

Normal quantile transform

The next procedural step is to map the respective variates into the normal space through the application of the normal quantile transform (NQT) (Van der Waerden, 1952, 1953a,b). The application of the NQT to non-Gaussian random variables yields Gaussian surrogate variates in the normal space, where expressions for moments of distributions are obtained analytically. Relationships between statistical variables can be fitted via linear regression. Parametric expressions for conditional marginal densities such as the likelihood, the prior, the expected or the posterior density are obtained in the normal space. The subsequent transformation back into the original space yields parametric expressions for the conditional densities and distributions in the original variable space.

Gaussian variates are obtained by matching an empirical probability distribution $\Gamma$ with a Gaussian distribution $Q(\cdot)$ and then performing an inversion $Q^{-1}$. The NQT, indicated with $Q^{-1}$, of a non-Gaussian distribution thus yields a series of corresponding transformed standard normal variables:

$$\eta = Q^{-1}(\Gamma(h)) \qquad (8)$$

$$\varsigma = Q^{-1}(\Delta(s)) \qquad (9)$$

where $\Gamma$ and $\Delta$ are the empirical distributions or the corresponding modeled distributions. The inverses $\Gamma^{-1}(Q(\cdot))$ and $\Delta^{-1}(Q(\cdot))$ yield again the non-Gaussian variate in the original space. In Fig. 3a, we see a plot of the stage $\varsigma_{n,L}$ forecasted at Lobith against the stage $\eta_{n,L}$ observed at the same time, both transformed into the normal space, for lead times of 1, 2 and 3 days, respectively. The continuous line is the linear regression, while the dashed-dotted lines indicate the 80% confidence interval in the normal space. It is evident that the correlations between the forecasted and the observed water levels are significant. As we would expect, the variance becomes larger with increasing lead-time. Fig. 3b shows the forecasted versus the observed water level in the original space. The continuous curve is the transformed linear regression in the original space, while the dashed-dotted lines indicate the transformed confidence

[Figure 3a: three panels titled "Normal space: likelihood, Lobith, July", one per lead time. Horizontal axes: w1, w2, w3. Vertical axes as labelled: s1, x2, x3.]

Figure 3a Normal space: likelihood, Lobith, July.

envelope. More on the analytical expressions underlying the mapping of the regression and the envelope curves can be found in Krzysztofowicz and Kelly (2000).

Parameterizing the processor

The Bayesian processor (6) is specified by deriving parametric expressions for the family of the prior density $g(h_{n,L} \mid h_{0,L}, h_{0-k,C})$ and the family of likelihood functions $f(s_{n,L} \mid h_{n,L}, h_{0,L}, h_{0-k,C})$. The evaluation of the prior density and the likelihood is carried out following the steps laid out in Krzysztofowicz and Kelly (2000). Details of the manipulations are omitted and only the results are synthesized hereunder.

Prior density

The prior density is modeled assuming a linear relationship between transformed normal variables. The transform $\eta_{n,L}$ of the level forecasted at $t_n$ for Lobith is related via a linear regression to the normal transforms $\eta_{0,L}$ and $\eta_{0-k,C}$ of observations at Lobith and Cologne at times $t_0$ and $t_{0-k}$, respectively:

$$\eta_{n,L} = a_n \eta_{0,L} + b_n \eta_{0-k,C} + c_n + \Xi_n \qquad (10)$$

where the parameters $a_n$, $b_n$ and $c_n$ are regression constants, while the residual $\Xi_n$ is statistically independent of $(\eta_{0,L}, \eta_{0-k,C})$ and normally distributed with zero mean and variance $s_n^2$. The parametric expression of the conditional prior density becomes (Krzysztofowicz and Kelly, 2000):

$$g_{Qn}(\eta_{n,L} \mid \eta_{0,L}, \eta_{0-k,C}) = \frac{1}{s_n}\, q\!\left(\frac{\eta_{n,L} - a_n \eta_{0,L} - b_n \eta_{0-k,C} - c_n}{s_n}\right) \qquad (11)$$

where $q$ is the normal density operator and $s_n^2 = \mathrm{VAR}(\eta_{n,L} \mid \eta_{0,L}, \eta_{0-k,C})$. The values for the regression constants $a_n$, $b_n$, $c_n$ and $s_n$ for lead times $n = 24, 48, 72$ hours, level observations $h_{0-k,C}$ at Cologne for $k = 24$ hours, and selected months, are listed in Table 1. We see that with increasing lead time the standard deviation increases consistently for all seasons, as we would expect. One can moreover see that correlations between $h_{n,L}$ and $h_{0,L}$ decrease for increasing $n$, while the (negative) correlation with the level $h_{0-24,C}$ observed at Cologne is on average strongest at lead-time $n = 48$ h. For $n = 72$ the correlation with $h_{0-24,C}$ is strongest in April and October, while it is weaker in January. This is explained by the faster flow velocity in winter due to higher discharge, thus shorter travel times for the peak to reach Lobith from Cologne.

In July the correlation significantly increases between $n = 48$ and $n = 72$ due to the slow velocity during the summer low flow period. It thus takes longer for a flow peak observed at Cologne to reach Lobith. The regression shift constant $c_n$ is very small and is thus approximated with zero, implying that the regression line goes through the origin. Given the fact that the marginal densities of the transformed variables are standard normal, one could therefore also make the direct assumption of linear stochastic dependence between variables. Under such a hypothesis one could avoid performing a regression because the joint bi-normal distribution of the variables, e.g. between $\{\eta_{n,L}$ and $\eta_{0,L}\}$, will only be a function of their cross-correlation coefficient. However, for the mathematical developments in this paper we decided to continue using linear regressions.
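The two steps described so far, the normal quantile transform of Eq. (8) and the normal-linear prior density of Eq. (11), can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, a two-parameter Weibull cdf stands in for $\Gamma$ (the paper reports Weibull fits but not their parameters), and the numerical values in the usage section are placeholders rather than values from Table 1.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # plays the role of Q (cdf) and q (pdf) in the text


def weibull_cdf(h, scale, shape):
    """Two-parameter Weibull cdf, used here as a stand-in for Gamma(h)."""
    return 1.0 - math.exp(-((h / scale) ** shape))


def nqt(h, scale, shape):
    """Normal quantile transform, Eq. (8): eta = Q^{-1}(Gamma(h))."""
    p = weibull_cdf(h, scale, shape)
    # Clip so that inv_cdf stays finite for very low or very high levels.
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    return STD_NORMAL.inv_cdf(p)


def prior_density_normal(eta_n, eta_0, eta_c, a, b, c, s):
    """Prior density in the normal space, Eq. (11).

    a, b, c are the regression constants of Eq. (10), s the residual
    standard deviation.
    """
    z = (eta_n - a * eta_0 - b * eta_c - c) / s
    return STD_NORMAL.pdf(z) / s


if __name__ == "__main__":
    # Placeholder Weibull parameters and regression constants (assumptions).
    lobith = (9.0, 6.0)    # (scale, shape) for levels at Lobith
    cologne = (5.0, 4.0)   # (scale, shape) for levels at Cologne
    eta_0 = nqt(9.5, *lobith)
    eta_c = nqt(5.2, *cologne)
    eta_n = nqt(9.8, *lobith)
    print(prior_density_normal(eta_n, eta_0, eta_c, a=0.6, b=0.3, c=0.0, s=0.4))
```

Keeping the fitted distribution parameters and the regression constants as plain arguments mirrors the paper's point that these expressions are prepared off-line and only evaluated on-line.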

[Figure 3b: three panels titled "Original space: likelihood, Lobith, July", one per lead time. Horizontal axes: h1, h2, h3 [m], 7.5 to 10.5. Vertical axes: s1, s2, s3 [m], 7 to 11.]

Figure 3b Original space: likelihood, Lobith, July.
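The curves of Fig. 3b are described as the regression line and the 80% envelope of Fig. 3a mapped back into the original space. A hedged sketch of that mapping follows. It assumes a simple regression of the transformed forecast on the transformed observation with a constant residual standard deviation, and Weibull distributions standing in for $\Gamma$ and $\Delta$; the function names and all numbers are hypothetical, and the exact analytical expressions are those of Krzysztofowicz and Kelly (2000), which are not reproduced in the paper.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def weibull_cdf(x, scale, shape):
    """Two-parameter Weibull cdf (stand-in for Gamma or Delta)."""
    return 1.0 - math.exp(-((x / scale) ** shape))


def weibull_quantile(p, scale, shape):
    """Inverse of the Weibull cdf (stand-in for Gamma^{-1} or Delta^{-1})."""
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)


def _clip(p, eps=1e-12):
    """Keep probabilities strictly inside (0, 1)."""
    return min(max(p, eps), 1.0 - eps)


def regression_band_original_space(h, slope, intercept, resid_sd,
                                   obs_weibull, mod_weibull, coverage=0.80):
    """Map a normal-space regression line and its central band back to levels.

    h            observed level in the original space
    slope, intercept, resid_sd
                 assumed regression of the transformed forecast on the
                 transformed observation, with residual standard deviation
    obs_weibull  (scale, shape) of the observed-level distribution
    mod_weibull  (scale, shape) of the modeled-level distribution
    Returns (lower, centre, upper) modeled levels.
    """
    eta = STD_NORMAL.inv_cdf(_clip(weibull_cdf(h, *obs_weibull)))
    centre = slope * eta + intercept
    # Half-width of the central interval with the requested coverage.
    half_width = STD_NORMAL.inv_cdf(0.5 + coverage / 2.0) * resid_sd

    def back(x):
        # Delta^{-1}(Q(x)): from the normal space back to water levels.
        return weibull_quantile(_clip(STD_NORMAL.cdf(x)), *mod_weibull)

    return back(centre - half_width), back(centre), back(centre + half_width)


if __name__ == "__main__":
    # Placeholder values only.
    print(regression_band_original_space(9.0, 0.95, 0.0, 0.2, (9.0, 6.0), (9.1, 6.5)))
```

Because the back-transformation is nonlinear, a straight line and a band of constant width in the normal space generally become a curve and an envelope of varying width in the original space.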

Table 1 Coefficients and standard deviation in the prior distribution for $h_{0-k,C}$, $k = 24$ h, lead times 24, 48 and 72 h, selected months

Lead time   Month      a_n      b_n      c_n      s_n
n = 24 h    January    0.6613   0.2652   0.0000   0.1537
            April      0.6252   0.2819   0.0000   0.1097
            July       0.5542   0.3519   0.0000   0.2393
            October    0.1483   0.1873   0.0000   0.1274
n = 48 h    January    0.7097   0.0732   0.0000   0.3857
            April      0.4853   0.2574   0.0000   0.4509
            July       0.2996   0.6056   0.0000   0.4861
            October    0.1502   0.2279   0.0000   0.3302
n = 72 h    January    0.7596   0.1504   0.0000   0.6106
            April      0.4411   0.1264   0.0000   0.6720
            July       0.1542   0.6690   0.0000   0.6894
            October    0.2141   0.3648   0.0000   0.5288

Likelihood tions at Lobith and Cologne at times t0 and t0k ,


The likelihood is modeled in analogy to the prior den- respectively:
sity, assuming again a linear relationship between the 1n;L an gn;L bn g0;L cn g0k;C d n Hn 12
transformed normal variables. The transform 1n;L of the
forecasted level at Lobith is related via a linear regres- where the parameters an , bn , cn and d n are regression con-
sion to the normal transform gn;L of the level observed stants, while the residual Hn is stochastically independent
at time tn , and the transforms g0;L and g0k;C of observa- from (gn;L ; g0;L ; g0k;C ) and normally distributed with zero
64 P. Reggiani, A.H. Weerts

mean and variance $\sigma_n^2$ (not shown here). The parametric expression of the likelihood becomes:

$$f_{Q_n}(\varsigma_{n,L} \mid \eta_{n,L}, \eta_{0,L}, \eta_{0-k,C}) = \frac{1}{\sigma_n}\, q\!\left(\frac{\varsigma_{n,L} - a_n \eta_{n,L} - b_n \eta_{0,L} - c_n \eta_{0-k,C} - d_n}{\sigma_n}\right) \qquad (13)$$

where $q$ is the normal density operator and $\sigma_n^2 = \mathrm{VAR}(\varsigma_n \mid \eta_{n,L}, \eta_{0,L}, \eta_{0-k,C})$. Table 2 summarizes the regression coefficients and standard deviations for lead times n equal to 24, 48 and 72 h and level observations at Cologne at $t_{0-24}$. The trend in the numerical values for the regression coefficients can be explained in analogy to Table 1. Here, too, the regression constant $d_n$ is very small and can be approximated by zero.

Posterior density

The posterior density in the normal space is obtained by combining the prior (11) and the likelihood (13), both normal linear, with the transformed conditional expected density $\kappa_{Q_n}(\varsigma_{n,L} \mid \eta_{0,L}, \eta_{n,L}, \eta_{0-k,C})$, as stated by Bayes theorem. The underlying manipulations are omitted and can be found in Krzysztofowicz and Kelly (2000). The parameterized density in the normal space results in the expression:

$$\phi_{Q_n}(\eta_{n,L} \mid \varsigma_{n,L}, \eta_{0,L}, \eta_{0-k,C}) = \frac{1}{T_n}\, q\!\left(\frac{\eta_{n,L} - A_n \varsigma_{n,L} - B_n \eta_{0,L} - C_n \eta_{0-k,C} - D_n}{T_n}\right) \qquad (14)$$

The parameters $A_n$, $B_n$, $C_n$, $D_n$ and $T_n$ evaluated for lead times n = 24, 48, 72 hours and level observations at Cologne at $t_{0-24}$ are summarized in Table 3. We note that the coefficient $D_n$ remains very small and is thus approximated by zero. The coefficient $C_n$ is negative in most cases, which indicates that if the level in Cologne decreases, the one in Lobith increases and vice-versa.

Transformation into the original space

In the original variable space, the prior density of levels at forecasting time $t_n$, conditional on those observed at forecast start time $t_0$ at Lobith and at time $t_{0-k}$ in Cologne, takes on the form:

$$g_n(h_{n,L} \mid h_{0,L}, h_{0-k,C}) = \frac{\gamma(h_{n,L})}{\tau_n\, q\!\left(Q^{-1}(\Gamma(h_{n,L}))\right)}\, q\!\left(\frac{Q^{-1}(\Gamma(h_{n,L})) - a_n Q^{-1}(\Gamma(h_{0,L})) - b_n Q^{-1}(\Gamma(h_{0-k,C})) - c_n}{\tau_n}\right) \qquad (15)$$

where $\gamma(h_{n,L})$ is the marginal density of $h_{n,L}$ at forecast time $t_n$, Lobith. The posterior density in the original space of actual water levels to be expected at time $t_n$ at Lobith, conditional on levels $s_{n,L}$ forecasted for day n and levels $h_{0,L}$ and $h_{0-k,C}$ observed at days $t_0$ and $t_{0-k}$ at Lobith and Cologne respectively, is given by the following parametric expression:

$$\phi_n(h_{n,L} \mid s_{n,L}, h_{0,L}, h_{0-k,C}) = \frac{\gamma(h_{n,L})}{T_n\, q\!\left(Q^{-1}(\Gamma(h_{n,L}))\right)}\, q\!\left(\frac{Q^{-1}(\Gamma(h_{n,L})) - A_n Q^{-1}(\Delta(s_{n,L})) - B_n Q^{-1}(\Gamma(h_{0,L})) - C_n Q^{-1}(\Gamma(h_{0-k,C})) - D_n}{T_n}\right) \qquad (16)$$

The posterior cumulative distribution of $h_{n,L}$ finally reads as follows:

$$\Phi_n(h_{n,L} \mid s_{n,L}, h_{0,L}, h_{0-k,C}) = Q\!\left(\frac{Q^{-1}(\Gamma(h_{n,L})) - A_n Q^{-1}(\Delta(s_{n,L})) - B_n Q^{-1}(\Gamma(h_{0,L})) - C_n Q^{-1}(\Gamma(h_{0-k,C})) - D_n}{T_n}\right) \qquad (17)$$
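As an illustration of how (16) and (17) can be evaluated, the following Python sketch implements the posterior density, the posterior distribution and a closed-form quantile. It is a minimal sketch under stated assumptions, not the authors' implementation: the two-parameter Weibull marginals, their parameters (`LOBITH`, `COLOGNE`, `MODEL`) and the coefficient values in `COEF` are hypothetical placeholders, and the use of a separate marginal for Cologne is an assumption. The quantile follows from (17) because the normal transform of $h_{n,L}$ is Gaussian with standard deviation $T_n$ under the posterior.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # Q = STD.cdf, q = STD.pdf, Q^{-1} = STD.inv_cdf


def weibull_cdf(x, scale, shape):
    """Hypothetical Weibull marginal distribution (stand-in for Gamma or Delta)."""
    return 1.0 - math.exp(-((x / scale) ** shape)) if x > 0 else 0.0


def weibull_pdf(x, scale, shape):
    """Density belonging to weibull_cdf (stand-in for gamma)."""
    if x <= 0:
        return 0.0
    z = x / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))


def weibull_inv(p, scale, shape):
    """Inverse of weibull_cdf for 0 < p < 1."""
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)


def nqt(x, scale, shape):
    """Normal quantile transform Q^{-1}(F(x)); probabilities are clipped away from 0 and 1."""
    p = min(max(weibull_cdf(x, scale, shape), 1e-12), 1.0 - 1e-12)
    return STD.inv_cdf(p)


# Hypothetical (scale, shape) pairs and coefficients, for illustration only.
LOBITH = (11.0, 6.0)    # marginal of observed levels at Lobith
COLOGNE = (5.0, 4.0)    # marginal of observed levels at Cologne (assumed separate)
MODEL = (11.5, 6.0)     # marginal of model-forecast levels at Lobith
COEF = {"A": 0.8, "B": 0.1, "C": 0.05, "D": 0.0, "T": 0.3}


def posterior_mean_eta(s, h0, hc, k=COEF):
    """Posterior mean in the normal space, i.e. the linear part of (14)."""
    return (k["A"] * nqt(s, *MODEL) + k["B"] * nqt(h0, *LOBITH)
            + k["C"] * nqt(hc, *COLOGNE) + k["D"])


def posterior_cdf(h, s, h0, hc, k=COEF):
    """Posterior distribution of the level h, equation (17)."""
    z = (nqt(h, *LOBITH) - posterior_mean_eta(s, h0, hc, k)) / k["T"]
    return STD.cdf(z)


def posterior_pdf(h, s, h0, hc, k=COEF):
    """Posterior density of the level h, equation (16)."""
    eta = nqt(h, *LOBITH)
    z = (eta - posterior_mean_eta(s, h0, hc, k)) / k["T"]
    return weibull_pdf(h, *LOBITH) / (k["T"] * STD.pdf(eta)) * STD.pdf(z)


def posterior_quantile(p, s, h0, hc, k=COEF):
    """Level h with posterior_cdf(h) = p, obtained by inverting (17)."""
    eta_p = posterior_mean_eta(s, h0, hc, k) + k["T"] * STD.inv_cdf(p)
    return weibull_inv(STD.cdf(eta_p), *LOBITH)


if __name__ == "__main__":
    s, h0, hc = 14.1, 12.7, 6.5  # forecast and observations in metres
    for p in (0.05, 0.5, 0.95):
        print(f"{p:.2f} quantile: {posterior_quantile(p, s, h0, hc):.2f} m")
```

Because all parameters are fitted off-line, an on-line evaluation of this kind reduces to a handful of function calls per forecast.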

Table 2 Coefficients and standard deviation in the likelihood for $h_{0-k,C}$, k = 24 h, lead times 24, 48 and 72 h, selected months

            a_n      b_n      c_n      d_n      σ_n
n = 24 h
January     0.9719   0.0338   0.0210   0.0000   0.0312
April       0.7618   0.2314   0.1250   0.0000   0.0858
July        0.6765   0.0209   0.3762   0.0000   0.1878
October     0.9936   0.0743   0.0569   0.0000   0.0405

n = 48 h
January     0.5147   0.2261   0.2964   0.0000   0.0550
April       0.3810   0.4601   0.1605   0.0000   0.1278
July        0.3814   0.1277   0.4534   0.0000   0.2211
October     0.5540   0.1599   0.3038   0.0000   0.0685

n = 72 h
January     0.1908   0.0542   0.7908   0.0000   0.0651
April       0.7908   0.2788   0.5130   0.0000   0.1534
July        0.1904   0.0108   0.7605   0.0000   0.2406
October     0.2857   0.0053   0.7244   0.0000   0.1020

Table 3 Definition of coefficients and numerical values for observations at Cologne at $t_{0-24}$, lead times 24, 48 and 72 h, selected months
Coefficient: A_n B_n C_n D_n T_n^2
Definition: an s2n an rn an d n s2n bn rn an cn s2n cn r2n an bn s2n s2n r2n
a2n s2n r2n a2n s2n r2n a2n s2n r2n a2n s2n r2n a2n s2n r2n

n = 24 h
January     0.8467   0.0885   0.0632   0.0000   0.0272
April       0.7338   0.1060   0.1189   0.0000   0.0825
July        0.4837   0.4289   0.0473   0.0000   0.1609
October     0.7603   0.0928   0.1490   0.0000   0.0311

n = 48 h
January     1.2093   0.0055   0.2085   0.0000   0.1455
April       0.8806   0.0827   0.0192   0.0000   0.2993
July        0.5355   0.0270   0.1796   0.0000   0.3864
October     1.0390   0.2299   0.0626   0.0000   0.1400

n = 72 h
January     0.8480   0.5909   0.4724   0.0000   0.5114
April       0.6962   0.1826   0.1607   0.0000   0.5735
July        0.3134   0.1330   0.3116   0.0000   0.6477
October     0.8366   0.1673   0.0423   0.0000   0.4021
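For readers who wish to check how posterior coefficients of the kind listed in Table 3 arise, the sketch below combines a normal-linear prior with a normal-linear likelihood using the standard conjugate result, and verifies it against a brute-force evaluation of Bayes theorem on a grid. It is an illustration under assumptions, not a transcription of the table's definition row: the prior coefficients are named `alpha`, `beta`, `gamma` and `tau` here purely to keep them apart from the likelihood coefficients `a`, `b`, `c`, `d` and `sigma` of (12), and all numerical values are hypothetical.

```python
import math


def posterior_coefficients(alpha, beta, gamma, tau, a, b, c, d, sigma):
    """Conjugate update for the (assumed) prior
         eta_n = alpha*eta_0 + beta*eta_C + gamma + noise(sd = tau)
       and the likelihood of the form (12)
         zeta = a*eta_n + b*eta_0 + c*eta_C + d + noise(sd = sigma).
       Returns A, B, C, D, T such that the posterior of eta_n is
       Normal(A*zeta + B*eta_0 + C*eta_C + D, T^2), as in (14)."""
    den = a * a * tau * tau + sigma * sigma
    A = a * tau * tau / den
    B = (alpha * sigma * sigma - a * b * tau * tau) / den
    C = (beta * sigma * sigma - a * c * tau * tau) / den
    D = (gamma * sigma * sigma - a * d * tau * tau) / den
    T = math.sqrt(tau * tau * sigma * sigma / den)
    return A, B, C, D, T


def brute_force_posterior(zeta, eta0, eta_c, alpha, beta, gamma, tau,
                          a, b, c, d, sigma, n=20001):
    """Posterior mean and standard deviation of eta_n from prior x likelihood on a grid."""
    m = alpha * eta0 + beta * eta_c + gamma          # prior mean
    lo, hi = m - 10.0 * tau, m + 10.0 * tau          # grid covering the prior support
    step = (hi - lo) / (n - 1)
    w_sum = m1 = m2 = 0.0
    for i in range(n):
        eta = lo + i * step
        resid = zeta - a * eta - b * eta0 - c * eta_c - d
        w = math.exp(-0.5 * ((eta - m) / tau) ** 2 - 0.5 * (resid / sigma) ** 2)
        w_sum += w
        m1 += w * eta
        m2 += w * eta * eta
    mean = m1 / w_sum
    return mean, math.sqrt(m2 / w_sum - mean * mean)


if __name__ == "__main__":
    prior = dict(alpha=0.6, beta=0.3, gamma=0.0, tau=0.5)        # hypothetical
    lik = dict(a=0.9, b=0.05, c=0.02, d=0.0, sigma=0.2)          # hypothetical
    zeta, eta0, eta_c = 1.5, 1.2, 1.0
    A, B, C, D, T = posterior_coefficients(**prior, **lik)
    print("closed form:", A * zeta + B * eta0 + C * eta_c + D, T)
    print("grid check: ", *brute_force_posterior(zeta, eta0, eta_c, **prior, **lik))
```

The closed form shows why $T_n$ can never exceed the prior standard deviation: the denominator $a^2\tau^2 + \sigma^2$ is at least $\sigma^2$, so the model forecast can only sharpen the prior.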

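[Figure 4a plot: prior density $g_n(h_{n,L} \mid h_{0,L}, h_{0-24,C})$ (vertical axis, 0 to 1.4) against water level (m) (horizontal axis, 7 to 16). Curves for n = 24, 48 and 72 h are shown for each of four observation pairs: $h_{0,L}$ = 11.94, $h_{0-24,C}$ = 5.45; $h_{0,L}$ = 10.6, $h_{0-24,C}$ = 4.29; $h_{0,L}$ = 9.1, $h_{0-24,C}$ = 3.08; $h_{0,L}$ = 10.01, $h_{0-24,C}$ = 3.80.]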

Figure 4a A priori pdf on measured discharge, 24, 48 and 72 h lead time, Lobith.
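The widening of the prior with lead time can be made concrete with a small numerical sketch. Under (15) the normal transform of $h_{n,L}$ is Gaussian with standard deviation $\tau_n$ around a linear combination of the transformed observations, so prior quantiles have a closed form. The code below is a hedged illustration only: the Weibull marginals, the coefficient sets in `PRIOR_COEF` and the helper names are hypothetical, chosen merely so that the regression weakens and $\tau_n$ grows with lead time; the observed levels are one of the pairs from the legend of Fig. 4a.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal: cdf = Q, inv_cdf = Q^{-1}


def weibull_cdf(x, scale, shape):
    """Hypothetical Weibull marginal distribution of water levels."""
    return 1.0 - math.exp(-((x / scale) ** shape)) if x > 0 else 0.0


def weibull_inv(p, scale, shape):
    """Inverse of weibull_cdf for 0 < p < 1."""
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)


def nqt(x, scale, shape):
    """Normal quantile transform Q^{-1}(F(x)), with clipped probabilities."""
    p = min(max(weibull_cdf(x, scale, shape), 1e-12), 1.0 - 1e-12)
    return STD.inv_cdf(p)


LOBITH = (11.0, 6.0)   # hypothetical (scale, shape) at Lobith
COLOGNE = (5.0, 4.0)   # hypothetical (scale, shape) at Cologne

# Hypothetical prior coefficients (a_n, b_n, c_n, tau_n) per lead time in hours.
PRIOR_COEF = {
    24: (0.85, 0.10, 0.0, 0.25),
    48: (0.65, 0.20, 0.0, 0.45),
    72: (0.45, 0.30, 0.0, 0.65),
}


def prior_quantile(p, n, h0, hc):
    """Level h at which the prior distribution implied by (15) equals p."""
    a, b, c, tau = PRIOR_COEF[n]
    mean_eta = a * nqt(h0, *LOBITH) + b * nqt(hc, *COLOGNE) + c
    return weibull_inv(STD.cdf(mean_eta + tau * STD.inv_cdf(p)), *LOBITH)


def prior_spread(n, h0, hc, lower=0.05, upper=0.95):
    """Width in metres of the central 90% prior interval for lead time n."""
    return prior_quantile(upper, n, h0, hc) - prior_quantile(lower, n, h0, hc)


if __name__ == "__main__":
    h0, hc = 10.6, 4.29  # levels at Lobith (t0) and Cologne (24 h before), metres
    for n in (24, 48, 72):
        print(f"n = {n} h: 90% prior interval width = {prior_spread(n, h0, hc):.2f} m")
```

With these placeholder values the interval widens as n grows, which mirrors the flatter curves for longer lead times described for Fig. 4a.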

Experiments

Equations (15)-(17) constitute the working equations for the uncertainty processor. The prior density parameterized with respect to water level observations at Lobith at $t_0$ and at Cologne at $t_{0-k}$ is given by (15). Fig. 4a depicts the prior densities for lead times 24 (continuous), 48 (dash-dotted) and 72 (dotted) hours and water level observations at Cologne 24 h before $t_0$. The water levels have been observed in winter (3-March, black), spring (17-April, magenta),

summer (18-Aug, green) and autumn (14-Dec, blue) at 0:00 hours, while the levels at Cologne have been recorded 24 h earlier, respectively. We note that the respective coefficient sets $a_n, b_n, c_n, d_n$ in Table 1, corresponding to each season, have been used for deriving the priors. We also observe that with increasing lead time, the information content of the priors diminishes, which is evident from a wider and less peaked shape of the curve. This trend can be explained in light of the linear regression model (10) at the basis of the prior distribution function and is reflected in the values of the coefficients in Table 1. The model is characterized by a weakening correlation between earlier observations at Lobith and Cologne and future levels at Lobith as lead time increases. We see that the uncertainty on the prior distribution increases more significantly in the winter season (black), in which the river Rhine exhibits higher volatility in water level excursions.

For comparison we reproduce in Fig. 4b the prior densities for the same observed values, whereby no supporting observations at Lobith were used. The prior probability density remains exclusively conditional on the observations at Cologne at time $t_{0-24}$. The events are the same as those chosen for Fig. 4a. The absence of the Lobith observations corresponds to a zero value of the coefficient of the Lobith observation $h_{0,L}$ in (15). We note that by using two support stations (Lobith and Cologne, Fig. 4a) the informative content of the prior distributions increases for the 24 h forecast. The peaks become slightly sharper, with more probability density concentrated around characteristic water levels for each of the selected months. With increasing lead time the probability density becomes more spread out, which is synonymous with higher uncertainty on the distribution of water levels. It is thus evident that introducing multiple observing stations upstream improves the specification of the prior densities for the 1-day lead time. We emphasize the importance of including additional supporting information for the revision of the prior distribution function in deriving the posterior.

Next we show the revised posterior densities. In the following figures, we report the prior and the revised posterior distribution function given by (16) for various lead times. The water level that has been actually observed is indicated as a vertical line. The base time for forecasts and observations is always $t_0$ = 00:00 h. Fig. 5a shows the revised posterior evaluated for the conditioning water level observations at Lobith and Cologne and the level predicted by the model for lead times n ∈ {24, 48, 72}. The level $h_{0,L}$ has been observed on the 4th of March at Lobith (12.7 m) and the level $h_{0-24,C}$ 24 h earlier at Cologne (6.5 m). Based on the hydrological-hydraulic model chain and the deterministic precipitation forecast, the SOBEK model predicts water levels at Lobith n hours ahead of $t_0$ equal to 14.1, 14.0 and 13.8 m, respectively. The prior probability distributions on water levels evaluated from (15) for different values $h_{0,L}$ and $h_{0-24,C}$ and respective forecasting lead times n are represented by thin dashed and dotted lines. The corresponding revised posterior probability distributions, conditional on model forecasts $s_{n,L}$, are represented by the thicker lines. We note that

[Figure 4b plot: prior density, labelled $g_n(h_{n,L} \mid h_{0,L}, h_{0-24,C})$ on the vertical axis (0 to 1.4), against water level (m) (horizontal axis, 7 to 16). Curves for n = 24, 48 and 72 h are shown for each of four Cologne observations: $h_{0-24,C}$ = 5.45, 4.29, 3.08 and 3.80.]

Figure 4b A priori pdf on measured discharge, 24, 48 and 72 h lead time, Lobith.

[Figure 5a plot: probability densities (vertical axis, 0 to 1) against water level [m] (horizontal axis, 8 to 15). Legend: priors $g(h_{n,L} \mid h_{0,L} = 12.7, h_{0-24,C} = 6.5)$ for n = 24, 48 and 72 h; posteriors conditional on $s_{24,L}$ = 14.1, $s_{48,L}$ = 14.0 and $s_{72,L}$ = 13.8 with $h_{0,L}$ = 12.7 and $h_{0-24,C}$ = 6.5.]

Figure 5a Prior and posterior probability densities of water levels, lead time 24, 48 and 72 h, Lobith, 04/03/07.

the Bayesian revision accounts through the likelihood function for the systematic bias of the model, by shifting the posterior distribution with respect to the prior. The bias correction is noticeable in the 48 h forecast. It is worth mentioning that by using a sophisticated model updating method during the historical simulation period, e.g. the Ensemble Kalman filter (Evensen, 1994), a significant bias correction could have been achieved already at the outset of the forecast.

The Bayesian revision also corrects for the shape of the distribution, by changing the distribution of the probability concentration. This correction is visible for all three lead times. The revision moves the peak of the posterior distributions towards the vertical lines (observations) and increases the concentration of the probability distribution (more peaked curve), especially for the 24 hour forecast. The posterior probability density function indicates to the forecaster which water level $h_{n,L}$ is most likely to occur at Lobith n hours ahead, if $h_{0,L}$ and $h_{0-24,C}$ have been observed at Lobith and Cologne and the model predicts $s_{n,L}$ at Lobith.

Fig. 5b depicts the prior and posterior probability density functions for the same event on the 4th of March 2007, whereby no supporting observations at Cologne were considered. We observe a marked difference in the revised posterior densities, whose peaks are significantly less pronounced and further away from the observations compared to Fig. 5a, highlighting a deterioration of the performance of the processor. This is especially evident for the 72 h lead time. Moreover the probability density is less concentrated when the observations at Cologne are omitted. This emphasizes once more the impact of the prior distributions on the Bayesian revision and the necessity for including multiple upstream stations into the formulation of the processor.

Discussion

One of the principal strengths of the uncertainty processor lies in the fact that it can be executed with minimal computational effort on-line, while the required parameters are evaluated off-line. This characteristic is due to the NQT technique transforming the process variables into the normal space. In the normal space we perform regressions between process variables in the prior and the likelihood distributions. The resulting expressions are then mapped back into the original space, yielding the parameterized prior and posterior densities (15) and (16). These equations can be evaluated at forecasting time, whereby the parameter values need to be updated only once in a while, as more historical observations and model results become available. From the experiments carried out in the previous section, we see that the Bayesian processor performs reasonably in the revision of the conditional prior densities. However, improvements can still be achieved on different fronts.

Firstly, the Weibull fits used to model the frequency distribution $\Gamma(h)$ of water levels at Lobith (see Fig. 2) are suboptimal for the high-flow period (winter), attributable to the fact that the cross section profile at Lobith includes flood plains. The model of the probability distribution

[Figure 5b plot: probability densities (vertical axis, 0 to 1.5) against water level [m] (horizontal axis, 8 to 15). Legend: priors $g(h_{n,L} \mid h_{0,L} = 12.7)$ for n = 24, 48 and 72 h; posteriors conditional on $s_{24,L}$ = 14.1, $s_{48,L}$ = 14.0 and $s_{72,L}$ = 13.8 with $h_{0,L}$ = 12.7.]

Figure 5b Prior and posterior probability densities of water levels, lead time 24, 48 and 72 h, Lobith, 04/03/07.

can be significantly improved by approximating the empirical distribution piecewise with multiple Weibull fits. A combination of different fitting models can also be considered.

Secondly, the assumption of relating downstream level forecasts to upstream observations through linear regression between multiple Gaussian variables is practical as it facilitates the necessary algebraic manipulations. However, the linear model proves to be rather restrictive as it leads to a deterioration of processor performance with increasing lead time (Figs. 5a and 5b). The explanation for this is evident from Fig. 3a reporting the linear regression between the transformed model stage $\varsigma_{n,L}$ and the observed stages $\eta_{n,L}$ at Lobith in January. We note that the figure represents the projection onto the ($\varsigma_{n,L}$, $\eta_{n,L}$) plane of the straight line parameterized by (12) in the 4-dimensional variable space ($\varsigma_{n,L}$, $\eta_{n,L}$, $\eta_{0,L}$, $\eta_{0-k,C}$). The partial inadequacy of the linear model shows up at the highest (and for the same reason also the lowest) water levels. The extreme high and low levels remain the principal outliers, while the relation between $\varsigma_{n,L}$ and $\eta_{n,L}$ performs well in the middle range, where most water level values are concentrated. The resulting effect is even clearer in the back-transformed non-linear relationship between the original variables ($s_{n,L}$, $h_{n,L}$) in Fig. 3b. This shortcoming gets worse with increasing lead time n. As it is exactly the peaks in which one is interested, the linear regression can be considered too simplistic a model. To overcome this, multiple polynomial regression can be used, in which second-order terms of the process variables are considered. In this way, the prior and likelihood functions can be modeled more accurately in the tail regions. This step however requires the re-elaboration of expressions for the constants $A_n$, $B_n$, $C_n$ and $D_n$. A further generalization of the processor, by allowing the inclusion of m multiple upstream observing stations with different time lags, would also be required.

The revised conditional density $\phi_n(h_{n,L} \mid s_{n,L}, h_{0,L}, h_{0-k,C})$ in (16) provides the probability distribution on the water level forecast, conditional on observations and a model forecast. In a comprehensive decision support instrument, the expected damage D relative to a forecasted water level can be estimated by combining $\phi_n$ with a suitable cost function $\chi(h_{n,L})$. The cost function expresses the economic value of (not) issuing an alert in terms of the forecasted water level at Lobith:

$$D = \int_0^{\infty} \chi(h_{n,L})\, \phi_n(h_{n,L} \mid s_{n,L}, h_{0,L}, h_{0-k,C})\, dh \qquad (18)$$

The determination of the cost function needs to be explored on a case-by-case basis. Next to economic considerations, it should also incorporate institutional and policy issues. The cost function needs to be established in co-operation between forecasters and institutional decision makers and is left for future research.

Summary and conclusions

We have presented an application of a Bayesian processor to assess the predictive uncertainty on water level predictions in the river Rhine flood forecasting system. The processor

is applied after the sequential execution of a hydrological and a hydrodynamical model. The meteorological input is provided by deterministic weather predictions in forecast mode and by ground observations. The uncertainty processor is based on Bayesian revision. Prior knowledge on flood propagation is processed in probabilistic terms by the prior density and subsequently combined with a likelihood function, which can be envisaged as a probabilistic quantification of the model error. The prior density is revised in the Bayesian framework, yielding a posterior distribution on water levels, conditional on all information available in the forecasting process. The conditioning variables include water level observations at the forecasting location Lobith at the onset of the forecast, as well as water level observations at Cologne 150 km further upstream and 24, 48 or 72 h earlier. Additional conditioning variables, such as internal model states, could have been included in the formulation. We have shown that the inclusion of water level observations from upstream observing stations into the forecasting process actually improves the prior distribution and consequently also the revised posterior. However, some assumptions, like a linear relation between forecasted levels and upstream observations, are restrictive and can compromise the performance of the processor. The revised posterior probability distribution function constitutes an effective measure of the certainty of occurrence of an event. Combined with a cost function, which quantifies the economic value of issuing an alert, the expected damage relative to a forecasted water level can be estimated. Once an acceptable level of risk, expressed in terms of a damage level, has been agreed upon, objective decision-making under uncertainty can be facilitated for end-users of river flow forecasts.

Acknowledgements

We would like to thank the forecasting office of the Dutch Ministry of Transport and Waterways and the German Federal Office of Hydrology for granting permission for the use of the operational river Rhine Forecasting system in this research.

References

Bergstrom, S., 1995. The HBV model. In: Singh, V.P. (Ed.), Computer Models of Watershed Hydrology. Water Resources Publications, Highlands Park, CO, pp. 443-476.
Broersen, P.M.T., Weerts, A.H., 2005. Automatic error correction of rainfall-runoff models in flood forecasting systems. In: Instrumentation and Measurement Technology Conference (IMTC), 2005, Proceedings of the IEEE, 16-19 May 2005, vol. 2, pp. 963-968. ISBN: 0-7803-8879-8.
De Groot, M.H., 1970. Optimal Statistical Decisions. McGraw-Hill, New York.
Evensen, G., 1994. Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte-Carlo methods to forecast error statistics. J. Geophys. Res. 99 (C5), 10143-10162.
Georgakakos, K.P., 1987. Real-time flash flood prediction. J. Geophys. Res. 92 (D8), 9615-9629.
Georgakakos, K.P., Smith, G.F., 1990. On improved hydrologic forecasting results from a WMO real-time forecasting experiment. J. Hydrol. 114, 17-45.
IPCC: Fourth Climate Assessment Report, IPCC, Geneva 2007.
Kelly, K.S., Krzysztofowicz, R., 2000. Precipitation uncertainty processor for probabilistic river stage forecasting. Water Resour. Res. 36 (9), 2643-2653.
Kitanidis, P., Bras, R.L., 1980a. Real time forecasting with a conceptual Hydrologic model, 1: analysis of uncertainty. Water Resour. Res. 16 (6), 1025-1033.
Kitanidis, P.K., Bras, R.L., 1980b. Real-time forecasting with a conceptual hydrological model: 2: application of results. Water Resour. Res. 126 (6), 1025-1033.
Krzysztofowicz, R., 1999. Bayesian theory of probabilistic forecasting via deterministic hydrologic model. Water Resour. Res. 35 (9), 2739-2750.
Krzysztofowicz, R., Kelly, K.S., 2000. Hydrologic uncertainty processor for probabilistic river stage forecasting. Water Resour. Res. 36 (11), 3265-3277.
Krzysztofowicz, R., Herr, H.D., 2001. Hydrologic uncertainty processor for probabilistic river stage forecasting: precipitation-dependent model. J. Hydrol. 249 (1-4), 46-68.
Krzysztofowicz, R., 2001. The case for probabilistic forecasting in hydrology. J. Hydrol. 249, 2-9.
Krzysztofowicz, R., 2004. Bayesian processor of output: a new technique for probabilistic weather forecasting. In: Preprints of the 17th Conference on Probability and Statistics in the Atmospheric Sciences, Seattle, Washington, American Meteorological Society; Paper no. 4.2.
Lardet, P., Obled, C., 1994. Real-time flood forecasting using a stochastic rainfall generator. J. Hydrol. 162, 391-408.
Raiffa, H., Schlaifer, R., 1961. Applied Statistical Decision Theory. The MIT Press, Cambridge, MA.
Reggiani, P., Weerts, A.H., 2008. Probabilistic quantitative precipitation forecast for flood prediction: an application. J. Hydrometeor. 9 (1), 76-95. doi:10.1175/2007JHM858.1.
Schaake, J., Larsson, L., 1998. Ensemble streamflow prediction (ESP): Progress and research needs. In Preprints, Special Symposium on Hydrology, Am. Meteorol. Soc., Boston, MA, pp. 19-24.
Schaake, J., Franz, K., Bradley, A., Buizza, R., 2006. The hydrologic ensemble prediction experiment (HEPEX). Hydrol. Earth Syst. Sci. Discuss. 3, 3321-3332.
Stelling, G.S., Verwey, A., 2005. Numerical Flood Simulation. Encyclopaedia of Hydrological Sciences. John Wiley & Sons Ltd., UK.
Szollosi-Nagy, A., Mekis, E., 1988. Comparative analysis of three recursive real-time river flow forecasting models: deterministic, stochastic and coupled deterministic-stochastic. Stoch. Hydrol. Hydraul. 2, 17-33.
Todini, E., 2007. Hydrological modeling: past, present and future. Hydrol. Earth Sys. Sci. 11 (1), 468-482.
Van der Waerden, B.L., 1952. Order tests for two-sample problem and their power 1. Indagat. Math. 14, 453-458.
Van der Waerden, B.L., 1953a. Order tests for two-sample problem and their power 2. Indagat. Math. 15, 303-310.
Van der Waerden, B.L., 1953b. Order tests for two-sample problem and their power 3. Indagat. Math. 15, 311-316.
Werner, M.G.F., van Dijk, M., Schellekens, J., 2004. DELFT-FEWS: an open shell flood forecasting system. In: Liong, Phoon Babovic (Eds.), 6th International Conference on Hydroinformatics, World Scientific Publishing Company, pp. 1205-1212. ISBN 981-238-787-0.
Werner, M.G.F., Heynert, K., 2006. Open model integration - a review of practical examples in operational flood forecasting. In: Gourbesville, Cunge, Guinot Liong (Eds.), 7th International Conference on Hydroinformatics, Research Publishing, vol. I., pp. 155-162. ISBN 819-031-702-4.
WMO, Flood Forecasting Initiative, Report on the technical conference on improved meteorological and hydrological forecasting, Geneva, Switzerland, 20-23 November 2006.
