
Proceedings of the 19th IAHR-APD Congress 2014, Hanoi, Vietnam

ISBN 978604821338-1

STATISTICAL MODELING OF PRECIPITATION PROCESS FOR AN UNGAUGED SITE IN THE CONTEXT OF CLIMATE CHANGE

MYEONG-HO YEO(1) & VAN-THANH-VAN NGUYEN(1)

(1) McGill University, Montreal, Canada,
van.tv.nguyen@mcgill.ca

ABSTRACT
The overall objective of the present paper is to propose a statistical approach to downscaling of the precipitation process at
an ungaged location in the context of climate change. More specifically, the proposed approach consists of a combination
of three components: (i) a regionalization approach for identifying the homogeneous groups of observed daily
precipitation series available at different raingages; (ii) a stochastic model for constructing daily rainfall events at an
ungaged location within a homogeneous group; and (iii) a statistical downscaling model (SDRain) for describing the
linkage between the constructed daily precipitation series and the large-scale climatic predictors given by GCM
simulation outputs. The feasibility of the proposed stochastic approach has been assessed using the available daily
precipitation data for the 1973-2001 period from a network of 62 raingage stations in South Korea and the NCEP re-
analysis climate predictors. Results of the numerical application have indicated that it is feasible to estimate the missing
precipitation data at an ungauged site based on the data available at other sites within the same homogeneous region.
Furthermore, the proposed SDRain was able to generate daily precipitation sequences for an ungaged site with
statistical characteristics comparable to those given by the application of SDRain to a gaged site with available observed
precipitation data.

Keywords: Precipitation, Regionalization, Statistical downscaling, Missing data, Climate change.

1. INTRODUCTION

Information on the variability of rainfall in time and space is critical for the planning, design, and management of a large number of water-resource projects. However, in most practical applications, rainfall records at the location of interest are often limited (a partially-gauged site) or unavailable (an ungauged site). Regionalization methods are hence frequently used to estimate rainfall information from other locations where the data are available and sufficient, or to improve the accuracy of the rainfall estimates where available records are too short (González and Valdés, 2008). In addition, according to the IPCC Synthesis Report (IPCC, 2007), annual precipitation in many regions of the world, especially in America, northern Europe, and northern and central Asia, shows an increasing trend. Global Climate Models (GCMs) have been extensively used in many studies for assessing this impact. However, outputs from these models are usually at resolutions that are too coarse (generally greater than 200 km) and not suitable for hydrological impact assessment at a regional or local scale. Hence, downscaling methods have been proposed for linking GCM predictions of climate change to the historical observations of the precipitation processes at a local site or over a given watershed (Nguyen et al., 2006). Moreover, prediction in ungauged basins (PUB) under climate change conditions remains a crucial challenge in research and engineering practice (Sivapalan, 2003).

More specifically, in the context of regional impact studies, regionalization methods have been developed and employed according to two main objectives: considering spatial dependency (homogeneity) and reducing uncertainty. Hence, these techniques are frequently used to transfer hydrologic information from one location to another site where the data are needed but not available, or to improve the accuracy of the hydrologic variable estimates at locations where available records are too short (Nguyen, 2007). Consequently, for rainfall estimation at an ungauged site, the homogeneity of precipitation processes at different sites is a necessary condition for obtaining an accurate rainfall estimate with less uncertainty. Nevertheless, traditional regionalization techniques are often criticized for their obvious subjectivity, in particular in the definition of hydrologically similar sites (or hydrologically homogeneous regions), and for their lack of physical justification (Samuel et al., 1999; Bárdossy, 2007; Goswami et al., 2007; Besaw et al., 2010; Li et al., 2010). Hence, in the present study an improved regionalization technique is proposed based on the similarity of rainfall occurrences at different locations using the Ordinal Factor Analysis (OFA) (Jöreskog and Moustaki, 2001).

As mentioned above, with observations of climate change and its impacts on water resources systems, a number of studies have been conducted to establish the linkages between the large-scale climate variables given by GCMs and the observed characteristics of the daily precipitation process at a local site using different downscaling methods (Yarnal et al., 2001; Nguyen et al., 2006). These downscaling methods, however, are not suitable for dealing with cases where precipitation data at the location of interest are limited or not available. Hence, the estimation and prediction of hydrological variables such as precipitation and flow under climate change conditions for these ungauged or partially-gauged sites remains a crucial challenge for managing and planning water resources (Sivapalan, 2003). Therefore, this study proposes
a statistical downscaling (SD) procedure for accurately describing the linkages between the large-scale climate variables given by GCM simulation outputs and the estimated daily precipitation characteristics at a location of interest where the precipitation data are limited or unavailable.

In brief, the suggested SD procedure is based on a combination of three components: (i) firstly, a regionalization approach was proposed to identify the homogeneous groups of observed daily precipitation series available at different raingages based on the similarity of rainfall occurrences at different locations determined by the Ordinal Factor Analysis (OFA); (ii) secondly, a stochastic model was developed for constructing daily rainfall events at an ungaged site within a homogeneous group based on the application of the eigen-decomposition technique to data available at those raingages within the same homogeneous group; and (iii) thirdly, a statistical downscaling model (SDRain) was developed to describe the linkage between the constructed daily precipitation series and the large-scale climatic predictors given by GCM simulation outputs. The feasibility of the proposed stochastic approach has been assessed using the available daily precipitation data for the 1973-2001 period from a network of 62 raingage stations in South Korea and the NCEP re-analysis climate predictor data over this study area. The jackknife method was used to simulate the ungaged condition. Results of the numerical application have indicated that it is feasible to estimate the missing precipitation data at an ungauged site based on the data available at other sites within the same homogeneous region. Furthermore, it was found that the OFA could provide more physically meaningful homogeneous rainfall groups than those given by the commonly used Principal Component Analysis (PCA). Finally, the proposed SDRain downscaling model was able to generate daily precipitation sequences for an ungaged site with statistical characteristics comparable to those given by the application of the statistical downscaling to a gaged site with available observed precipitation data.

2. A STATISTICAL APPROACH TO DOWNSCALING OF DAILY PRECIPITATION PROCESS FOR AN UNGAGED SITE

As mentioned in the previous section, the proposed SD approach consists of three components, described in the following.

2.1 A Regionalization Analysis of Rainfall Occurrences

Factor analysis (FA) is a well-known statistical method for describing the variability among correlated observed variables using a lower number of latent (unobserved) variables called factors. Although the number of latent variables (factors) is smaller than the number of original observed variables, they can account for the same information as the original data set. In this study, to identify the hydrologically homogeneous rainfall regions, the OFA was employed to describe the similarity of rainfall occurrences at different raingages. The use of the OFA is appropriate in this case since the daily precipitation occurrence series are made up of values 0 and 1 only (i.e., binary variables). A detailed description of the OFA can be found in the publication by Jöreskog & Moustaki (2001).

2.2 A Stochastic Rainfall Model for Estimating Precipitation Series at an Ungaged Site

The stochastic modeling of the precipitation process is based on the combination of two different components: the modeling of the rainfall occurrences, and the modeling of the precipitation amounts. As mentioned above, the modeling of the rainfall occurrences is based on the homogeneous grouping given by the application of the OFA to rainfall occurrences. Let $F_j$ be the factor score for a given day $j$, defined as follows:

$$F_j = \sum_{i=1}^{s} \lambda_i \, x_{i,j}$$   [1]

if $r_j \le F_j$, then day $j$ is wet
if $r_j > F_j$, then day $j$ is dry

where $\lambda_i$ is the factor loading associated with raingage station $i$ in a homogeneous region of $s$ stations, $x_{i,j}$ is the precipitation occurrence at station $i$ for day $j$, and $\mu_j$ is the average vector for day $j$. With the homogeneous region identified by the OFA, the factor score $F_j$ hence represents at how many stations in a given region rainfall occurs on the given day $j$. A uniform random number $r_j$ ($0 \le r_j \le 1$) was then used to determine whether the day is wet or dry based on the value of $F_j$.

Regarding the modeling of rainfall amounts, since the distribution of precipitation on wet days is strongly skewed, a log-transformation can be used to reduce this skewness:

$$\ln P_{ij} = Y_{ij}$$   [2]

where $P_{ij}$ denotes the precipitation amount at station $i$ for a given day $j$, and $Y_{ij}$ represents the corresponding log-transformed precipitation amount. Thus, $Y$ is more nearly normally distributed. Hence, the relationship between the mean ($\mu_P$) of $P_{ij}$ and the mean ($\mu_Y$) and variance ($\sigma_Y^2$) of $Y_{ij}$ can be written as follows:

$$E(P) = \mu_P = \exp\left(\mu_Y + \frac{\sigma_Y^2}{2}\right)$$   [3]

Thus, the regional expected value of the daily precipitation amount can be estimated by the following equation:

$$E(P_R^j) = \mu_{P_R}^j = W_j \exp\left(\mu_{RY}^j\right)$$   [4]

in which $\mu_{P_R}^j$ is the regional mean of the daily precipitation for a given day $j$; $\mu_{RY}^j$ is the regional mean of the log-transformed daily precipitation for day $j$; and $W_j = \exp(\sigma_{RY}^2/2)$ is the correction factor for matching the expected values of the original and log-transformed daily precipitation ($\sigma_{RY}^2$ is the regional variance of the log-transformed daily precipitation). Equations [1] and [4] can be used to generate missing rainfall series at a given site within a homogeneous region.
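As a rough illustration of how Equations [1] and [4] might be combined for a single day, the following Python sketch draws a wet/dry state from the factor score and, on a wet day, returns the bias-corrected regional amount. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the factor loadings are assumed to be normalised so that they sum to 1 (which keeps $F_j$ between 0 and 1), and the regional mean and variance of the log-amounts are assumed to be taken across the wet neighbouring stations on the same day.

```python
import math
import random


def factor_score(occurrences, loadings):
    """Equation [1]: F_j = sum_i lambda_i * x_ij over the s gauged stations."""
    return sum(lam * x for lam, x in zip(loadings, occurrences))


def is_wet(score, rng):
    """Draw r_j from U(0, 1); the day is wet when r_j <= F_j, dry otherwise."""
    return rng.random() <= score


def regional_amount(log_amounts):
    """Equation [4]: W_j * exp(mu_RY), with W_j = exp(sigma_RY^2 / 2)."""
    n = len(log_amounts)
    mean_y = sum(log_amounts) / n
    # Population variance across stations (an assumption of this sketch).
    var_y = sum((y - mean_y) ** 2 for y in log_amounts) / n
    return math.exp(var_y / 2.0) * math.exp(mean_y)


def simulate_day(occurrences, amounts_mm, loadings, rng):
    """Return an estimated amount (mm) for the ungauged site, 0.0 on a dry day."""
    if not is_wet(factor_score(occurrences, loadings), rng):
        return 0.0
    # Equation [2]: log-transform the amounts observed at the wet neighbours.
    logs = [math.log(p) for p, x in zip(amounts_mm, occurrences) if x == 1 and p > 0]
    return regional_amount(logs) if logs else 0.0


# Hypothetical example: four neighbouring gauges, three of them wet on day j.
rng = random.Random(42)
loadings = [0.25, 0.25, 0.25, 0.25]   # assumed normalised to sum to 1
occurrences = [1, 0, 1, 1]
amounts_mm = [4.0, 0.0, 6.5, 9.0]
print(factor_score(occurrences, loadings))   # 0.75
print(simulate_day(occurrences, amounts_mm, loadings, rng))
```

Calling `simulate_day` repeatedly with independent random draws would give one ensemble member per call for that day.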
Using Equations [1] and [4], a set of 100 daily precipitation occurrences and amounts was generated for each site. Only one representative daily precipitation series at the location of interest is necessary because the statistical downscaling model for a single raingage station requires only one data sequence. Hence, the daily percentages of wet days are computed from the 100 ensembles of precipitation occurrences generated by Equation [1]. A representative wet/dry series for an ungauged station is then decided using a critical value of 0.5 (to obtain a median value) as follows:

$$PO_j = \frac{1}{m} \sum_{k=1}^{m} Occ_{j,k}$$   [5]

where $PO_j$ is the daily percentage of wet days for a given day $j$, $m$ is the number of generated ensembles, and $Occ_{j,k}$ is the generated precipitation occurrence for day $j$ in ensemble $k$. Moreover, the representative daily precipitation amount at an ungauged station can be expressed as follows:

$$Amo_R^j = W_j \exp\left(\overline{\mu_Y^j}\right)$$   [6]

where $Amo_R^j$ is the representative daily precipitation amount for a given day $j$, $W_j$ is a daily weight, and $\mu_Y^j$ is the mean of the log-transformed precipitation amounts in a homogeneous region delineated by the OFA regionalization method.

2.3 A Statistical Downscaling Model for Estimated Ungaged Daily Precipitation

The proposed Statistical Downscaling for Rainfall process (SDRain) was developed to describe the linkage between the constructed daily precipitation series and the large-scale climatic predictors given by GCM simulation outputs. The SDRain models can be expressed as follows (Nguyen and Yeo, 2011):

$$\mathrm{Prob}(\text{wet at day } j) = \pi_j = \frac{e^{a_0 + a_1 X_1 + a_2 X_2 + \cdots + a_m X_m}}{1 + e^{a_0 + a_1 X_1 + a_2 X_2 + \cdots + a_m X_m}}$$   [7]

$$R_j = f \exp\left(b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_m X_m + \varepsilon_j\right)$$   [8]

in which $\pi_j$ is the probability of a wet day at day $j$, the $X_i$ are the large-scale atmospheric predictors given by GCM simulations, the $a_i$ and $b_i$ are regression parameters, $f$ is a bias correction coefficient, and $R_j$ is the modeled daily precipitation amount.

3. NUMERICAL APPLICATION

To test the feasibility of the proposed approach, the available daily precipitation data from a network of 62 raingages in South Korea for the 1973-2001 period were used. South Korea is located in the lower portion of the Korean Peninsula, and lies between latitudes 33° and 39° N and longitudes 124° and 130° E. Its total area is around 100,032 km², consisting mostly of mountainous area (70% of the total area), as shown in Figure 1. The topographic and East Asian Monsoon conditions play an important role in the identification of homogeneous rainfall regimes. As mentioned above, the proposed approach consists of three steps: (i) firstly, the OFA statistical regionalization was carried out for defining hydrologically homogeneous regions of daily precipitation; (ii) secondly, the stochastic precipitation model was used to generate the occurrences and amounts of the daily precipitation time series for an ungauged station; and (iii) thirdly, the proposed SDRain statistical downscaling model was employed to compare the performance of the proposed SD procedure for an ungauged condition with the performance for a gauged condition at the same location.

For comparison purposes, Figure 1 shows the hydrologically homogeneous rainfall regions delineated by the common Principal Component Analysis (PCA) technique and by the proposed OFA. It can be seen that the OFA could provide more physically meaningful homogeneous regions than those given by the PCA, since the identified regions correspond more closely to the particular topographic features of the study area. Hence, the homogeneous regions determined by the OFA were used for generating the rainfall series at an ungauged location.

[Figure 1 image: two maps of South Korea with numbered regions; panel (A) "Regionalization by PCA (Korea)", panel (B) "Regionalization by OFA (Korea)".]

Figure 1: Homogeneous regions delineated by PCA (A) and by OFA (B) using precipitation data available from a network of 62 raingage stations in South Korea.

In this application, the jackknife method was used to simulate the ungaged condition; that is, one station is removed from a group of homogeneous stations and its daily precipitation series is estimated based on the data available at the remaining raingage stations in the group. This process was repeated until each station had been removed once and the rainfall series had been generated at each removed location (Pandey and Nguyen, 1999). The rainfall series produced at ungauged sites were then statistically analyzed and compared to the observed data to evaluate the performance of the proposed approach using the Proportion Correct (PC) index and the Success of Critical Index (SCI), defined from the following contingency table (Wilks, 2006):

                      Simulated
                    Wet     Dry     Total
    Observed  Wet    a       b      a+b
              Dry    c       d      c+d
              Total a+c     b+d      N

$$PC = \frac{a + d}{N}, \qquad SCI = \frac{a}{a + b + c}$$
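The jackknife evaluation and the two verification scores can be sketched in a few lines of Python. This is an illustrative sketch only: the scores use the usual contingency-table definitions, PC = (a + d)/N and SCI = a/(a + b + c), where a counts days that are wet in both series, b days observed wet but simulated dry, c days observed dry but simulated wet, and d days dry in both (the latter score is commonly known as the critical success index or threat score). The `majority_vote` estimator is a hypothetical placeholder standing in for the stochastic occurrence model, and the station names and series are invented for the example.

```python
def contingency(observed, simulated):
    """Count the 2x2 wet/dry table: a, b (observed wet), c, d (observed dry)."""
    a = b = c = d = 0
    for obs, sim in zip(observed, simulated):
        if obs and sim:
            a += 1      # wet in both series
        elif obs:
            b += 1      # observed wet, simulated dry
        elif sim:
            c += 1      # observed dry, simulated wet
        else:
            d += 1      # dry in both series
    return a, b, c, d


def proportion_correct(a, b, c, d):
    """PC: fraction of days whose wet/dry state is reproduced."""
    return (a + d) / (a + b + c + d)


def success_critical_index(a, b, c, d):
    """SCI: correctly simulated wet days over all days wet in either series."""
    wet_cases = a + b + c
    return a / wet_cases if wet_cases else float("nan")


def jackknife(stations, estimate):
    """Remove each station once, estimate it from the others, and score it."""
    scores = {}
    for name, observed in stations.items():
        others = {k: v for k, v in stations.items() if k != name}
        table = contingency(observed, estimate(others))
        scores[name] = (proportion_correct(*table), success_critical_index(*table))
    return scores


def majority_vote(others):
    """Placeholder estimator (assumption): wet when at least half the neighbours are wet."""
    return [1 if 2 * sum(day) >= len(day) else 0 for day in zip(*others.values())]


# Hypothetical wet (1) / dry (0) series for three stations in one region.
stations = {
    "S1": [1, 1, 0, 0, 1, 0],
    "S2": [1, 0, 0, 0, 1, 1],
    "S3": [1, 1, 0, 1, 1, 0],
}
print(jackknife(stations, majority_vote))
```

Any estimator with the same signature (a mapping of neighbouring series in, one simulated series out) could be passed to `jackknife` in place of `majority_vote`.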
For purposes of illustration, Figure 2 shows the evaluation results for the annual number of wet days (NWD), PC, and SCI indices for Seoul station (K1) located in Region 1 and for Pusan station (K7) in Region 8 (see Figure 1B). It was found that the suggested approach was able to provide an accurate description of the rainfall occurrences for these two stations, as indicated by the good agreement of the NWD index and the high values of the PC (>0.90) and SCI (>0.70). Figures 3 and 4 present the comparison between the observed and estimated annual means of precipitation amounts and the annual number of wet days at four representative stations (K1-Seoul, K7-Pusan, K10-Jeonju, and K32-Jecheon). The ranges shown in these figures denote the maximum and minimum values of these two parameters, and the red circles and blue circles represent the observed values. It can be seen that the proposed stochastic rainfall generator can provide accurate annual and monthly statistics of the observed daily rainfall series at these stations.

[Figure 2 image: panels "Annual NWD", "PC", and "SCI" for Seoul and for Pusan; x-axis: year (1970-2005); y-axes: annual NWD (days), Percentage of Correct (%), Success of Critical Index; legend: observed, estimated.]

Figure 2. Accuracy of the rainfall occurrence model for Seoul and Pusan stations based on the annual NWD, Percentage of Correct (PC), and Success of Critical Index (SCI)

[Figure 3 image: panels "Annual Mean of Precipitation" for K1, K7, K10, and K32; x-axis: year (1975-2000); y-axis: Mean Precipitation (mm/year).]

Figure 3. Observed and estimated annual means of precipitation for four selected stations: K1, K7, K10, and K32. The range denotes the maximum and minimum values of these two parameters, and the red circles represent the observed values.

[Figure 4 image: panels "Annual Wet Days" for K1, K7, K10, and K32; x-axis: year (1975-2000); y-axis: Wet days (days).]

Figure 4. Observed and estimated annual wet days of precipitation for four selected stations: K1, K7, K10, and K32. The range denotes the maximum and minimum values of these two parameters, and the blue circles represent the observed values.

Figures 5 and 6 present the range plots of the annual and monthly downscaled averages of precipitation amounts and of the wet-day process for the two selected stations, respectively. The range indicates the maximum and minimum values of the simulated precipitation sequences, and red/blue circles represent the statistics computed from observed data and from estimated precipitation series for an ungauged station, respectively. These graphical results show that the proposed SD procedure for ungauged sites was able to describe accurately the annual and monthly statistics of observed precipitation events.

4. CONCLUSIONS

In this study, a statistical downscaling procedure was proposed for downscaling the daily precipitation process at a location without data. More specifically, the suggested approach consists of three basic steps: (i) identifying hydrologically homogeneous regions based on the similarity of daily precipitation occurrences; (ii) constructing the daily precipitation series at an ungaged site using a stochastic precipitation model; and (iii) establishing the linkage between large-scale climate predictors given by GCMs and the constructed precipitation series at the ungauged location using the SDRain downscaling model. Results of an illustrative application using data from a network of 62 raingage stations in South Korea have indicated the feasibility and accuracy of the proposed method. In addition, it was found that the OFA could identify more physically meaningful homogeneous rainfall regions than those given by the commonly used PCA. Furthermore, the very good agreement between observed and estimated statistical properties of the daily rainfall series generated using the jackknife procedure has indicated the accuracy of the suggested approach. In summary, results of this illustrative application have indicated that the proposed SD procedure for an ungaged site could provide results comparable to those given by the downscaling method for a gaged location with available observed precipitation data.
[Figure 5 image: panels (A) "Annual Mean of Precipitation (K1)", (B) "Annual Wet-Days (K1)", (C) "Annual Mean of Precipitation (K7)", (D) "Annual Wet-Days (K7)"; x-axis: year (1975-2000); y-axes: Mean Precipitation (mm/year), Annual wet-days (days).]

Figure 5. Observed and estimated annual means of precipitation and the number of wet-days (NWD) for Seoul (K1) and Busan (K7). Blue/green ranges denote maximum and minimum values of estimated means and NWD. The red triangles represent statistics computed from the observed data at a gaged site, and the blue circles represent the estimated statistics at an ungaged site.

[Figure 6 image: panels (A) "Monthly Mean of Precipitation (K1)", (B) "Monthly Wet-Days (K1)", (C) "Monthly Mean of Precipitation (K7)", (D) "Monthly Wet-Days (K7)"; x-axis: month (Jan-Dec); y-axes: Mean Precipitation (mm/MON), Monthly wet-days (days).]

Figure 6. Observed and estimated monthly means of precipitation and the number of wet-days (NWD) for Seoul (K1) and Busan (K7). Blue/green ranges denote maximum and minimum values of estimated means and NWD. The red triangles represent statistics computed from the observed data at a gaged site, and the blue circles represent the estimated statistics at an ungaged site.

REFERENCES
Bárdossy, A. (2007). Calibration of hydrological model parameters for ungauged catchments. Hydrology and Earth System Sciences, 11(2), 703-710. doi:10.5194/hess-11-703-2007.
Besaw, L. E., Rizzo, D. M., Bierman, P. R., & Hackett, W. R. (2010). Advances in ungauged streamflow prediction using artificial neural networks. Journal of Hydrology, 386(1-4), 27-37.
González, J., & Valdés, J. (2008). A regional monthly precipitation simulation model based on an L-moment smoothed statistical regionalization approach. Journal of Hydrology, 348, 27-39.
Goswami, M., O'Connor, K. M., & Bhattarai, K. P. (2007). Development of regionalisation procedures using a multi-model approach for flow simulation in an ungauged catchment. Journal of Hydrology, 333(2-4), 517-531.
IPCC (Intergovernmental Panel on Climate Change) (2007). Climate Change 2007: Synthesis Report, 52 pages.
Jöreskog, K., & Moustaki, I. (2001). Factor analysis of ordinal variables: A comparison of three approaches. Multivariate Behavioral Research, 36(3), 347-387.
Li, M., Shao, Q., Zhang, L., & Chiew, F. H. S. (2010). A new regionalization approach and its application to predict flow duration curve in ungauged basins. Journal of Hydrology, 389(1-2), 137-145.
Nguyen, V-T-V. (2007). On regional estimation of floods for ungaged sites, in Advances in Geosciences, Vol. 6: Hydrological Science, N. Park et al. (Eds.), World Scientific Publishing Company, pp. 55-66.
Nguyen, V-T-V., Nguyen, T-D., & Gachon, P. (2006). On the linkage of large-scale climate variability with local characteristics of daily precipitation and temperature extremes: an evaluation of statistical downscaling methods, in Advances in Geosciences, Vol. 4: Hydrological Science, N. Park et al. (Eds.), World Scientific Publishing Company, pp. 1-9.
Nguyen, V-T-V., & Yeo, M.-H. (2011). Statistical Downscaling of Daily Rainfall Processes for Climate-Related Impact Assessment Studies. World Environmental and Water Resources Congress 2011, Palm Springs, USA, American Society of Civil Engineers, pp. 4477-4482.
Pandey, G. R., & Nguyen, V.-T.-V. (1999). A comparative study of regression based methods in regional flood frequency analysis. Journal of Hydrology, 225(1-2), 92-101.
Samuel, J., Coulibaly, P., & Metcalfe, R. (2011). Estimation of continuous streamflow in Ontario ungauged basins: comparison of regionalization methods. Journal of Hydrologic Engineering, 16(5), 447-459.
Sivapalan, M. (2003). Prediction in ungauged basins: a grand challenge for theoretical hydrology. Hydrological Processes, 17(15), 3163-3170.
Wilks, D. S. (2006). Statistical Methods in the Atmospheric Sciences, 2nd Edition, Burlington, MA, Academic Press, 627 pages.
Yarnal, B., Comrie, A. C., Frakes, B., & Brown, D. P. (2001). Review developments and prospects in synoptic climatology. International Journal of Climatology, 21, 1923-1950.
