
Statistical Modelling of Extreme Values

Stuart Coles and Anthony Davison


© 2008
Based on An Introduction to Statistical Modeling of Extreme Values,
by Stuart Coles, Springer, 2001
http://stat.epfl.ch

Contents

Overview

Introduction
    Basic problem
    Applications
    Examples
    Brief history
    Resources

Basic Notions
    Principles of Stability
    Probability framework for block maxima distribution
    Classical limit laws
    Extremal types theorem
    Using the limit law
    Generalized extreme value distribution
    Quantiles and return levels
    Max-stability
    Outline proof of GEV as limit law
    Domains of attraction
    Convergence and approximation
    Minima
    Inference
    Likelihood
    Interest and nuisance parameters
    Modelling with the GEV
    Port Pirie Annual Maxima data
    Bayesian Extremes
    MCMC
    Metropolis-Hastings
    Port Pirie: MCMC analysis
    Limitations of the GEV limit law

Point Process Approach
    Improved Inferences
    Example: Eskdalemuir rainfall
    Poisson process
    Statistical application
    Likelihood
    Example: Eskdalemuir rainfall
    Threshold methods
    Alternative inference
    Mean residual life plot
    Threshold selection
    r-largest order statistics method
    Venice Sea Levels

Modelling Issues
    D(u_n) condition
    Short-term dependence: Example
    Moving maxima
    Extremal index
    D
    Modelling
    Wooster example
    Return levels
    Rain example
    Non-stationarity
    Estimation
    Model reduction
    Other extreme value models
    Examples
    Semiparametric
    Example: Swiss winter temperatures

Multivariate Extremes
    Multivariate extremes
    Componentwise maxima
    Special cases
    Parametric models
    Alternatives
    Inference
    Structure variables
    Sea levels
    Point process
    Comments
    Consistency of results
    Poisson likelihood
    Exchange rate data
    Oceanographic example
    Higher-dimensional models
    Asymptotic dependence
    Bivariate normal variables

Spatial Extremes
    Spatiotemporal extremes

Geostatistics
    Geostatistics
    Temperature data
    Approaches
    Latent variable approach
    Heffernan-Tawn (2004)
    Max-stable processes
    Extremal coefficient
    Madogram
    Spectral representations I
    Spectral representations II
    Pairwise likelihood
    Example: Temperature data
    Swiss summer temperature data
    Fit of marginal model
    Estimated correlations
    Fit of correlation curve
    Pairwise fits
    Simulated fields
    Discussion

Overview
1. Basic notions: maximum analysis, and threshold models
2. Alternative characterizations: improved inferences
3. Modelling issues
4. Multivariate extremes
5. Spatial models for extremes
Statistics of Extremes

January 2008 slide 2

Introduction


Basic problem
Simplest case: $X_1, \ldots, X_n \overset{\mathrm{iid}}{\sim} F$. We require accurate inferences on the tail of $F$.


Key issues:


there are very few observations in the tail of the distribution;

estimates are often required beyond the largest observed data value;

standard density estimation techniques fit well where the data have greatest density, but can be
severely biased in estimating tail probabilities.

Usual lack of physical or empirical basis for extrapolation leads to the extreme value paradigm:
Base tail models on asymptotically-motivated distributions.

Applications
Historically there have been two main application areas of extreme value theory:


Environmental:

sea levels

wind speeds

pollution concentrations

river flow

...

Reliability Modelling:

weakest-link type models

Growing areas of application include: finance, insurance, telecommunications, athletic records, microarrays, ...

Examples: Annual maximum sea levels

[Figure: annual maximum sea level (metres) against year, 1930 to 1980.]

Annual maximum sea levels at Port Pirie, South Australia.

Examples: Breaking strengths of glass fibres

[Figure: histogram of breaking strength (frequency against breaking strength).]

Here minima are of interest.

Examples: Dependence on covariates

[Figure: annual maximum sea level (metres) against year, 1900 to 1980.]

Fremantle annual maximum sea levels: apparent trend in time.

[Figure: annual maximum sea level (metres) against SOI.]

Apparent dependence on the Southern Oscillation Index (an indicator of El Niño).

Examples: Fastest annual times for women's 1500m

[Figure: race time (seconds) against year, 1975 to 1990.]

There is an obvious time trend.

Examples: Largest order statistics

[Figure: sea level (cm) against year, 1930 to 1980.]

Largest ten (available) annual sea levels at Venice.

Examples: Threshold exceedances

[Figure: daily rainfall (mm) against year, 1920 to 1960.]

Daily rainfall accumulation (location in south-west England).

Examples: Nonstationarity

[Figure: daily minimum temperature (degrees below 0 F) against year, 1983 to 1988.]

Daily minimum temperatures (Wooster, Ohio).

Examples: Financial time series

[Figure: two panels of the index against year, 1996 to 2000.]

Dow Jones Index. (Left panel: raw. Right panel: after transformation to induce approximate stationarity.)

Examples: Multivariate extremes

[Figure: annual maximum wind speed (knots) at Hartford (CT) against annual maximum wind speed (knots) at Albany (NY).]

Annual maximum wind speeds at two different locations (north-east U.S.).

Brief history


1920s: Foundations of asymptotic argument developed by Fisher and Tippett

1940s: Asymptotic theory unified and extended by Gnedenko and von Mises

1950s: Use of asymptotic distributions for statistical modelling by Gumbel and Jenkinson

1970s: Classic limit laws generalized by Pickands

1980s: Leadbetter (and others) extend theory to stationary processes

1990s: Multivariate and other techniques explored as a means to improve inference

2000s: Interest in spatial and spatio-temporal applications, and in finance


Resources
Books

  Galambos (1987) The Asymptotic Theory of Extreme Order Statistics, Krieger
  Resnick (1987) Extreme Values, Regular Variation, and Point Processes, SV
  Leadbetter, Lindgren and Rootzén (1983) Extremes and Related Properties of Random Sequences and Processes, SV
  de Haan and Ferreira (2006) Extreme Value Theory: An Introduction, SV
  Gumbel (1958) Statistics of Extremes, Columbia University Press
  Embrechts, Klüppelberg and Mikosch (1997) Modelling Extremal Events for Insurance and Finance, SV
  Kotz and Nadarajah (2000) Extreme Value Distributions, Imperial College Press
  Coles (2001) An Introduction to Statistical Modeling of Extreme Values, SV
  Beirlant, Goegebeur, Segers, and Teugels (2004) Statistics of Extremes: Theory and Applications, Wiley
  Finkenstädt and Rootzén (2004) Extreme Values in Finance, Telecommunications and the Environment, CRC
  Reiss and Thomas (2007) Statistical Analysis of Extreme Values, Birkhäuser

R packages: evd, evdbayes, evir, extRemes, fExtremes, POT

Journal: Extremes (published by Springer)

Basic Notions


Principles of Stability
Fundamental to all characterizations of extreme value processes is the concept of stability.


For example, we might propose one model for the annual maximum of a process, and another for
the 5-year maximum. Since the 5-year maximum will be the maximum of 5 annual maxima, the
models should be mutually consistent.

Similarly, a model for exceedances over a high threshold should remain valid (in a precise sense)
for exceedances of a higher threshold.

The expression of such stability requirements as mathematical statements leads to asymptotic models.

Probability framework for block maxima distribution

Let $X_1, \ldots, X_n \overset{\mathrm{iid}}{\sim} F$ and define
\[ M_n = \max\{X_1, \ldots, X_n\}. \]
Then the distribution function of $M_n$ is
\[ \Pr\{M_n \le x\} = \Pr\{X_1 \le x, \ldots, X_n \le x\} = \Pr\{X_1 \le x\} \times \cdots \times \Pr\{X_n \le x\} = F(x)^n. \]

But $F$ is unknown, so approximate $F^n$ by limit distributions as $n \to \infty$.

What distributions can arise?

Obviously, if $F(x) < 1$, then $F(x)^n \to 0$ as $n \to \infty$.
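
As a quick numerical check of the identity $\Pr\{M_n \le x\} = F(x)^n$, the sketch below simulates block maxima and compares the empirical frequency with the exact value. The standard exponential choice for $F$, the block size, the number of replications and all function names are illustrative assumptions, not part of the slides.

```python
import math
import random

def cdf_of_maximum(F, x, n):
    """Exact distribution function of M_n = max(X_1, ..., X_n) for iid X_i ~ F."""
    return F(x) ** n

def simulated_cdf_of_maximum(sampler, x, n, reps, rng):
    """Monte Carlo estimate of Pr{M_n <= x} from `reps` simulated blocks of size n."""
    hits = sum(max(sampler(rng) for _ in range(n)) <= x for _ in range(reps))
    return hits / reps

def exp_cdf(x):
    """Standard exponential distribution function (assumed example for F)."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def exp_sampler(rng):
    return rng.expovariate(1.0)

rng = random.Random(1)
exact = cdf_of_maximum(exp_cdf, 3.0, 10)
approx = simulated_cdf_of_maximum(exp_sampler, 3.0, 10, 20000, rng)

# Without rescaling the limit is degenerate: F(x)^n shrinks to 0 for fixed x.
degenerate = cdf_of_maximum(exp_cdf, 3.0, 1000)
```

The last line illustrates why rescaling is needed: at a fixed $x$ the probability collapses to zero as the block size grows.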

Classical limit laws

The question, as stated, is trivial: $M_n$ will converge to the upper endpoint of the distribution of $X$ as $n \to \infty$.

But recall the Central Limit Theorem: with $\mu_n = \mu$ and $\sigma_n = \sigma/\sqrt{n}$,
\[ \frac{\bar X_n - \mu_n}{\sigma_n} \overset{D}{\longrightarrow} N(0, 1), \]
so rescaling is needed to obtain a non-degenerate limit.

The same applies here: we seek limits of
\[ \frac{M_n - b_n}{a_n} \]
for suitable sequences $\{a_n\} > 0$ and $\{b_n\}$.

Extremal types theorem

Definition 1 The distributions $F$ and $F^*$ are of the same type if there are constants $a > 0$ and $b$ such that $F^*(ax + b) = F(x)$ for all $x$.

Theorem 2 (Extremal types theorem) If there exist sequences of constants $a_n > 0$ and $b_n$ such that, as $n \to \infty$,
\[ \Pr\{(M_n - b_n)/a_n \le x\} \to G(x) \]
for some nondegenerate distribution $G$, then $G$ has the same type as one of the following distributions:
\[ \text{I}: \quad G(x) = \exp\{-\exp(-x)\}, \quad -\infty < x < \infty; \]
\[ \text{II}: \quad G(x) = \begin{cases} 0, & x \le 0, \\ \exp(-x^{-\alpha}), & x > 0, \ \alpha > 0; \end{cases} \]
\[ \text{III}: \quad G(x) = \begin{cases} \exp\{-(-x)^{\alpha}\}, & x < 0, \ \alpha > 0, \\ 1, & x \ge 0. \end{cases} \]

Conversely, each of these $G$s may appear as a limit for the distribution of $(M_n - b_n)/a_n$, and does so when $G$ itself is the distribution of $X$.

Three limiting distributions

[Figure: probability density functions (PDF against x) in three panels: Gumbel; Fréchet, alpha = 0.5; negative Weibull, alpha = 0.5.]

The three types are known as the Gumbel, Fréchet and Weibull (strictly, negative Weibull), respectively.

The Fréchet (Type II) is bounded below, and the negative Weibull (Type III) is bounded above.

The standard Weibull is a distribution for minima.

Example: Exponential maxima

If $F$ is the standard exponential distribution, then
\[ F(x + \log n)^n = (1 - e^{-x - \log n})^n = \left(1 - \frac{1}{n} e^{-x}\right)^n \to \exp\{-\exp(-x)\}. \]

[Figure: two panels of CDF against x.]

Distributions of maxima and renormalised maxima of n = 1, 7, 30, 365, 3650 standard exponential variables (from left to right), with Gumbel distribution (heavy).
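
The convergence in this example can be checked numerically. The sketch below evaluates the exact distribution of $M_n - \log n$ and measures its largest distance from the Gumbel limit over a grid; the grid and the function names are illustrative assumptions.

```python
import math

def renormalised_exp_max_cdf(x, n):
    """Pr{M_n - log n <= x} for n iid standard exponential variables."""
    t = x + math.log(n)
    return (1.0 - math.exp(-t)) ** n if t > 0 else 0.0

def gumbel_cdf(x):
    return math.exp(-math.exp(-x))

def sup_gap(n):
    """Largest absolute difference from the Gumbel limit on a grid over [-5, 10]."""
    grid = [x / 10 for x in range(-50, 101)]
    return max(abs(renormalised_exp_max_cdf(x, n) - gumbel_cdf(x)) for x in grid)

for n in (1, 7, 30, 365, 3650):
    print(n, sup_gap(n))
```

The printed gaps shrink roughly in proportion to $1/n$, which is why the renormalised curves in the figure settle onto the Gumbel curve so quickly.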

Example: Normal maxima

Maxima of standard normal variables also converge to a Gumbel limit, with
\[ b_n = (2 \log n)^{0.5} - 0.5 (2 \log n)^{-0.5} (\log\log n + \log 4\pi), \qquad a_n = (2 \log n)^{-0.5}, \]
but convergence is extremely slow.

[Figure: two panels of CDF against x.]

Distributions of maxima and renormalised maxima of n = 1, 7, 30, 365, 3650 standard normal variables (from left to right), with Gumbel distribution (heavy).
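
To see how slow the convergence is, the following sketch computes $\Phi(a_n x + b_n)^n$ with the norming constants quoted above and reports its largest distance from the Gumbel limit on a grid. The grid and helper names are assumptions made for illustration.

```python
import math

def normal_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def normal_norming_constants(n):
    """a_n and b_n for standard normal maxima (needs n >= 2 so log n > 0)."""
    c = math.sqrt(2.0 * math.log(n))
    b_n = c - 0.5 * (math.log(math.log(n)) + math.log(4.0 * math.pi)) / c
    a_n = 1.0 / c
    return a_n, b_n

def sup_gap_normal(n):
    """Largest |Phi(a_n x + b_n)^n - exp(-exp(-x))| on a grid over [-4, 8]."""
    a_n, b_n = normal_norming_constants(n)
    grid = [x / 10 for x in range(-40, 81)]
    return max(abs(normal_cdf(a_n * x + b_n) ** n - math.exp(-math.exp(-x)))
               for x in grid)

for n in (7, 30, 365, 3650):
    print(n, sup_gap_normal(n))
```

Even with ten years of daily values (n = 3650) the gap remains at the level of a few hundredths, far larger than in the exponential case.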

Using the limit law

We assume that for some $a > 0$ and $b$,
\[ \Pr\{(M_n - b)/a \le x\} \approx G(x), \]
or equivalently,
\[ \Pr\{M_n \le x\} \approx G\{(x - b)/a\} = G^*(x), \]
where $G^*$ is of the same type as $G$.

That is, the family of extreme value distributions may be fitted directly to a series of observations of $M_n$.

However, it is inconvenient to have to work with three possible limiting families.

Generalized extreme value distribution

This family encompasses all three of the previous extreme value limit families:
\[ G(x) = \exp\left[ -\left\{ 1 + \xi \left( \frac{x - \mu}{\sigma} \right) \right\}_+^{-1/\xi} \right], \]
defined on $\{x : 1 + \xi(x - \mu)/\sigma > 0\}$.

From now on let $x_+ = \max(x, 0)$.

$\mu$ and $\sigma$ are location and scale parameters.

$\xi$ is a shape parameter determining the rate of tail decay, with

  $\xi > 0$ giving the heavy-tailed (Fréchet) case

  $\xi = 0$ giving the light-tailed (Gumbel) case

  $\xi < 0$ giving the short-tailed (negative Weibull) case
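
A minimal sketch of the GEV distribution function follows, with $\xi = 0$ read as the Gumbel limit. The function name, argument order and the handling of points outside the support are assumptions for illustration. The checks after the function show how the Type II and Type III laws arise as special cases.

```python
import math

def gev_cdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """GEV distribution function G(x); xi = 0 is treated as the Gumbel limit."""
    z = (x - mu) / sigma
    if xi == 0.0:
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # Outside the support: below the lower endpoint when xi > 0 (G = 0),
        # above the upper endpoint when xi < 0 (G = 1).
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

alpha = 2.0
# Type II (Frechet): mu = 1, sigma = xi = 1/alpha gives exp(-x^(-alpha)) for x > 0.
frechet_value = gev_cdf(2.0, mu=1.0, sigma=1.0 / alpha, xi=1.0 / alpha)
# Type III (negative Weibull): mu = -1, sigma = 1/alpha, xi = -1/alpha
# gives exp{-(-x)^alpha} for x < 0.
weibull_value = gev_cdf(-0.5, mu=-1.0, sigma=1.0 / alpha, xi=-1.0 / alpha)
```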

Quantiles and return levels

In terms of quantiles, take $0 < p < 1$ and define
\[ x_p = \mu - \frac{\sigma}{\xi} \left[ 1 - \{-\log(1 - p)\}^{-\xi} \right], \]
where $G(x_p) = 1 - p$.

In extreme value terminology, $x_p$ is the return level associated with the return period $1/p$.

[Figure: quantile against log y, with curves for shape = 0.2, shape = 0 and shape = -0.2.]
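
The return level formula is easy to evaluate directly. The sketch below uses $\mu = 0$, $\sigma = 1$ and the three shape values shown in the figure; the function name and the choice $p = 0.01$ (a 100-block return period) are illustrative assumptions, and $\xi = 0$ is handled by the Gumbel limit $x_p = \mu - \sigma \log\{-\log(1 - p)\}$.

```python
import math

def gev_return_level(p, mu, sigma, xi):
    """Return level x_p with G(x_p) = 1 - p, i.e. return period 1/p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    y = -math.log(1.0 - p)
    if xi == 0.0:
        return mu - sigma * math.log(y)      # Gumbel limit of the formula
    return mu - (sigma / xi) * (1.0 - y ** (-xi))

levels = {xi: gev_return_level(0.01, 0.0, 1.0, xi) for xi in (0.2, 0.0, -0.2)}
for xi, level in levels.items():
    print(xi, level)
```

For the same location and scale, a positive shape gives a much larger return level than a negative one, which is the divergence of the three curves in the return level plot.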

Max-stability
Some insight into these results is obtained using the concept of max-stability.

Definition 3 A distribution $G$ is said to be max-stable if
\[ G^k(a_k x + b_k) = G(x), \qquad k = 1, 2, \ldots, \]
for some constants $a_k$ and $b_k$.

In other words, taking powers of $G$ results only in a change of location and scale.

The connection with extremes is that a distribution is max-stable if and only if it is a GEV distribution.
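
Max-stability can be verified directly for two of the limit laws. In the sketch below the constants are $a_k = 1$, $b_k = \log k$ for the Gumbel distribution and $a_k = k^{1/\alpha}$, $b_k = 0$ for the Fréchet distribution; these follow from the definitions, while the grids, $k = 5$ and $\alpha = 2$ are arbitrary illustrative choices.

```python
import math

def gumbel_cdf(x):
    return math.exp(-math.exp(-x))

def frechet_cdf(x, alpha):
    return math.exp(-x ** (-alpha)) if x > 0 else 0.0

def max_stability_gap(G, a_k, b_k, k, grid):
    """Largest |G(a_k x + b_k)^k - G(x)| over the grid; zero if G is max-stable."""
    return max(abs(G(a_k * x + b_k) ** k - G(x)) for x in grid)

k = 5
gumbel_gap = max_stability_gap(gumbel_cdf, 1.0, math.log(k), k,
                               [x / 4 for x in range(-8, 25)])

alpha = 2.0
frechet_gap = max_stability_gap(lambda x: frechet_cdf(x, alpha),
                                k ** (1.0 / alpha), 0.0, k,
                                [x / 4 for x in range(1, 41)])
```

Both gaps are zero up to rounding error, as the definition requires.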

Outline proof of GEV as limit law

Suppose that $(M_n - b_n)/a_n$ has limit distribution $G$. Then, for large $n$,
\[ \Pr\{(M_n - b_n)/a_n \le x\} \approx G(x), \]
and so for any $k$,
\[ \Pr\{(M_{nk} - b_{nk})/a_{nk} \le x\} \approx G(x). \]
But
\[ \Pr\{(M_{nk} - b_{nk})/a_{nk} \le x\} = \left[ \Pr\{(M_n - b_{nk})/a_{nk} \le x\} \right]^k, \]
giving two expressions for the distribution of $M_n$:
\[ \Pr(M_n \le x) \approx G\left( \frac{x - b_n}{a_n} \right), \qquad \Pr(M_n \le x) \approx G^{1/k}\left( \frac{x - b_{nk}}{a_{nk}} \right), \]
so that $G$ and $G^{1/k}$ are identical apart from scaling coefficients. Thus, $G$ is max-stable and therefore GEV.

Domains of attraction
For a given distribution function $F$, is it easy to determine suitable sequences $a_n$ and $b_n$ and to know what limit $G$ will occur?

For sufficiently smooth distributions, defining the reciprocal hazard function
\[ r(x) = \frac{1 - F(x)}{f(x)}, \]
and letting
\[ b_n = F^{-1}(1 - n^{-1}), \qquad a_n = r(b_n), \qquad \xi = \lim_{x \to \infty} r'(x), \]
the limit distribution of $(M_n - b_n)/a_n$ is
\[ \exp\{-(1 + \xi x)_+^{-1/\xi}\} \quad \text{if } \xi \ne 0 \]
and
\[ \exp(-e^{-x}) \quad \text{if } \xi = 0. \]
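
As a worked instance of this recipe, take the Pareto distribution $F(x) = 1 - x^{-\alpha}$ for $x \ge 1$ (an assumed example, not from the slides). Then $r(x) = x/\alpha$, so $\xi = 1/\alpha$, $b_n = n^{1/\alpha}$ and $a_n = b_n/\alpha$. The sketch checks numerically that $F(a_n x + b_n)^n$ approaches the stated limit.

```python
import math

ALPHA = 2.0  # illustrative Pareto tail index, so xi = 1 / ALPHA = 0.5

def pareto_cdf(x):
    return 1.0 - x ** (-ALPHA) if x >= 1.0 else 0.0

def reciprocal_hazard(x):
    # r(x) = {1 - F(x)} / f(x) = x / ALPHA for this Pareto distribution
    return x / ALPHA

def limit_cdf(x, xi):
    """exp{-(1 + xi x)_+^(-1/xi)} for xi != 0."""
    t = 1.0 + xi * x
    return math.exp(-t ** (-1.0 / xi)) if t > 0 else 0.0

def sup_gap(n):
    """Largest |F(a_n x + b_n)^n - limit| on a grid over [-1.5, 10]."""
    b_n = n ** (1.0 / ALPHA)          # F^{-1}(1 - 1/n)
    a_n = reciprocal_hazard(b_n)
    xi = 1.0 / ALPHA                  # r'(x) is constant here
    grid = [x / 10 for x in range(-15, 101)]
    return max(abs(pareto_cdf(a_n * x + b_n) ** n - limit_cdf(x, xi)) for x in grid)

for n in (10, 100, 1000):
    print(n, sup_gap(n))
```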

Convergence and approximation

There has been a good deal of work on the speed of convergence of $M_n$ to the limiting regime, which depends on the underlying distribution $F$; for example, convergence is slow for maxima of $n$ Gaussian variables.

From a statistical viewpoint, this is not so useful: we use the GEV as an approximate distribution for sample maxima for finite (small?) $n$, so the key question is whether the GEV fits the available data. Assess this empirically.

Direct use of the GEV rather than the three types separately allows for flexible modelling, and ducks the question of which type is most appropriate: the data decide.

Testing for fit of one type or another is usually unhelpful, because setting (say) $\xi = 0$ can give unrealistically precise inferences. Often uncertainty in extremes is (appropriately) large, and it can be misleading to constrain inferences artificially.

Minima
All the ideas apply equally to minima, because
\[ \min(X_1, \ldots, X_n) = -\max(-X_1, \ldots, -X_n). \]
Our general discussion is for maxima, and we make this transformation without comment when we model minima.

Inference
Given observed annual maxima $X_1, \ldots, X_k$, the aim now is to make inferences on the GEV parameters $(\mu, \sigma, \xi)$. Possibilities include:

graphical techniques: historically important, remain useful for model-checking;

moment-based estimators: usually inefficient for extremes, as moments may not exist;

probability-weighted moments: widely used in hydrology, but difficult to extend to complex data;

likelihood-based techniques: widely used in statistics because

  intuitive paradigm often readily adapted to complex data;

  have unifying general approximate estimation and testing theory;

  incorporation of prior information via Bayes' theorem also uses likelihood as key ingredient.

Likelihood
Definition 4 Let $y$ be a data set, assumed to be the realisation of a random variable $Y \sim f(y; \theta)$, where $\theta$ is unknown. Then the likelihood and log likelihood are
\[ L(\theta) = L(\theta; y) = f_Y(y; \theta), \qquad \ell(\theta) = \log L(\theta). \]

The maximum likelihood estimate (MLE) $\hat\theta$ satisfies $\ell(\hat\theta) \ge \ell(\theta)$, for all $\theta$.

Often $\hat\theta$ is unique and in many cases it satisfies the score (or likelihood) equation
\[ \frac{\partial \ell(\theta)}{\partial \theta} = 0, \]
which is interpreted as a $p \times 1$ vector equation if $\theta$ is a $p \times 1$ vector.

The observed information, defined as
\[ J(\theta) = -\frac{\partial^2 \ell(\theta)}{\partial \theta \, \partial \theta^{T}}, \]
is a $p \times p$ matrix if $\theta$ has dimension $p$.

The likelihood ratio statistic (Wilks statistic) is $W(\theta) = 2\{\ell(\hat\theta) - \ell(\theta)\} \ge 0$.

Likelihood approximations
For large $n$, and if the data were generated from $f(y; \theta_0)$, then

  $\hat\theta \overset{\cdot}{\sim} N_p\{\theta_0, J(\hat\theta)^{-1}\}$, so a standard error for $\hat\theta_r$ is the square root of the $r$th diagonal element of $J(\hat\theta)^{-1}$. To obtain $\hat\theta$ and $J(\hat\theta)$, we just have to maximise $\ell(\theta)$ and to obtain the Hessian matrix at the maximum: purely numerically, if a routine to compute $\ell(\theta)$ is available;

  $W(\theta_0) \overset{\cdot}{\sim} \chi^2_p$.

We base tests and/or confidence intervals on these results, even when the sample is small (e.g. n = 20 or less). Often the approximations are adequate for practical work.

For details, see for example Chapter 4 of Davison (2003) Statistical Models, CUP.
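
The "purely numerical" route can be sketched on a one-parameter toy problem. Here the model is exponential with unknown rate (an assumed example with simulated data, chosen because the answer is known in closed form: the MLE is $1/\bar y$ and its standard error is $\hat\theta/\sqrt{n}$). The golden-section maximiser and the finite-difference Hessian are simple stand-ins for a general-purpose optimiser.

```python
import math
import random

def log_lik(rate, data):
    """Log likelihood of an exponential model with the given rate."""
    return len(data) * math.log(rate) - rate * sum(data)

def maximise(f, lo, hi, tol=1e-10):
    """Golden-section search for the maximiser of a unimodal function on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c, d = hi - g * (hi - lo), lo + g * (hi - lo)
        if f(c) > f(d):
            hi = d      # maximum lies in [lo, d]
        else:
            lo = c      # maximum lies in [c, hi]
    return 0.5 * (lo + hi)

def observed_information(f, theta, h=1e-4):
    """J(theta) = -d^2 l / d theta^2, by a central finite difference."""
    return -(f(theta + h) - 2.0 * f(theta) + f(theta - h)) / h ** 2

rng = random.Random(2008)
data = [rng.expovariate(2.0) for _ in range(50)]   # simulated, true rate 2

def ell(rate):
    return log_lik(rate, data)

mle = maximise(ell, 1e-3, 50.0)
se = 1.0 / math.sqrt(observed_information(ell, mle))
```

Only a routine for the log likelihood is supplied; the estimate and its standard error both come from numerical work on that routine.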

Interest and nuisance parameters

In practice $\theta$ usually divides into

  interest parameters $\psi_{q \times 1}$ central to the problem (often $q = 1$ in practice);

  nuisance parameters $\lambda_{(p-q) \times 1}$ whose values are not of real importance.

Let $\hat\lambda_\psi$ denote the maximum likelihood estimator of $\lambda$ when $\psi$ is fixed:
\[ \ell(\psi, \hat\lambda_\psi) \ge \ell(\psi, \lambda), \]
and define the (generalised) likelihood ratio statistic
\[ W_p(\psi) = 2\left\{ \ell(\hat\psi, \hat\lambda) - \ell(\psi, \hat\lambda_\psi) \right\} = 2\left\{ \ell(\hat\theta) - \ell(\hat\theta_\psi) \right\}. \]

If $\psi_0$ is the value of $\psi$ that generated the data from a regular model,
\[ W_p(\psi_0) \overset{\cdot}{\sim} \chi^2_q \]
for large $n$.

We often base confidence intervals and tests for $\psi_0$ on this, particularly if the profile log likelihood for $\psi$,
\[ \ell_p(\psi) = \ell(\hat\theta_\psi) = \ell(\psi, \hat\lambda_\psi), \]
is asymmetric, which is generally the case for quantiles, Value-at-Risk, and similar quantities.
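
A small sketch may make the profile construction concrete. It uses a normal sample with mean $\psi$ as the interest parameter and variance $\lambda$ as the nuisance parameter (an assumed toy model with simulated data, not from the slides), for which $\hat\lambda_\psi$ has a closed form. The confidence set keeps the values of $\psi$ with $W_p(\psi) \le 3.84$, the 95% point of $\chi^2_1$.

```python
import math
import random

def profile_log_lik(psi, y):
    """Profile log likelihood for the mean psi of a normal sample,
    with the variance (nuisance parameter) maximised out for fixed psi."""
    n = len(y)
    lam_hat_psi = sum((v - psi) ** 2 for v in y) / n   # MLE of variance given psi
    return -0.5 * n * (math.log(2.0 * math.pi * lam_hat_psi) + 1.0)

def likelihood_ratio(psi, y):
    """W_p(psi) = 2{l_p(psi_hat) - l_p(psi)}; psi_hat is the sample mean."""
    psi_hat = sum(y) / len(y)
    return 2.0 * (profile_log_lik(psi_hat, y) - profile_log_lik(psi, y))

rng = random.Random(36)
y = [rng.gauss(10.0, 2.0) for _ in range(40)]          # simulated data
psi_hat = sum(y) / len(y)

grid = [psi_hat + (i - 400) * 0.005 for i in range(801)]
# Values of psi not rejected at the 5% level
interval = [psi for psi in grid if likelihood_ratio(psi, y) <= 3.84]
print(interval[0], interval[-1])
```

In this symmetric toy model the interval is centred on $\hat\psi$; for return levels the same construction typically gives a markedly asymmetric interval.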

Non-regular models

With extremal models there is a potential difficulty: the endpoint of the distribution (if finite) is a function of the parameters, so usual asymptotic results may not hold.

Smith (1985, Biometrika) established that the limiting behaviour of the MLE depends on the value of the shape parameter $\xi$:

  when $\xi > -0.5$, maximum likelihood estimators obey the standard theory above;

  when $-1 < \xi < -1/2$, the MLE satisfies the score equation;

  when $\xi < -1$, the MLE does not satisfy the score equation.

When $\xi < -1/2$, the MLE for the endpoint converges to the endpoint faster than when $\xi > -1/2$ (good) at a convergence rate that depends on $\xi$ (bad), and other parameters converge at the usual rate (OK), so there is no simple theory.

In most environmental problems $-1 < \xi < 1$ (often $\xi \approx 0$) so maximum likelihood works fine.

If not, Bayesian methods (which do not depend on these regularity conditions) may be preferable.

Modelling with the GEV

Specification (programming) of log-likelihood function:
\[ \ell(\mu, \sigma, \xi) = \sum_{i=1}^{k} \left[ -\log\sigma - (1 + 1/\xi) \log\left\{ 1 + \xi \left( \frac{x_i - \mu}{\sigma} \right) \right\} - \left\{ 1 + \xi \left( \frac{x_i - \mu}{\sigma} \right) \right\}^{-1/\xi} \right]; \]
note this equals $-\infty$ if any $1 + \xi(x_i - \mu)/\sigma < 0$.

Numerical maximization of log-likelihood.

Calculation of standard errors from inverse of observed information matrix (also obtained numerically).

Diagnostic checks: probability plots, quantile plots, return level plots.

Comparison of competing nested models through deviance (likelihood ratio statistic), and of non-nested models by minimising the Akaike information criterion $\mathrm{AIC} = 2\{\dim(\theta) - \ell(\hat\theta)\}$, or other similar criteria.

Calculation of confidence intervals for return levels.
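
The first step, programming the log likelihood, might look like the following sketch. The function name, the treatment of $\xi = 0$ as the Gumbel limit and the five data values are illustrative assumptions (the values are made up, not the Port Pirie series).

```python
import math

def gev_log_lik(mu, sigma, xi, x):
    """GEV log likelihood for block maxima x; -inf outside the parameter support."""
    if sigma <= 0.0:
        return -math.inf
    total = 0.0
    for obs in x:
        z = (obs - mu) / sigma
        if xi == 0.0:
            # Gumbel limit of the general term
            total += -math.log(sigma) - z - math.exp(-z)
            continue
        t = 1.0 + xi * z
        if t <= 0.0:
            return -math.inf          # an observation lies outside the support
        total += -math.log(sigma) - (1.0 + 1.0 / xi) * math.log(t) - t ** (-1.0 / xi)
    return total

maxima = [3.7, 3.9, 4.0, 4.1, 4.3]    # hypothetical block maxima
print(gev_log_lik(3.9, 0.2, -0.05, maxima))
print(gev_log_lik(3.9, 0.2, -1.0, maxima))   # support violated, so -inf
```

Returning $-\infty$ rather than raising an error lets a general-purpose optimiser simply reject parameter values whose support excludes some observation.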

Port Pirie Annual Maxima data

[Figure: annual maximum sea level (metres) against year, 1930 to 1980.]

library(ismev)   # gev.fit, gev.diag and the portpirie data are assumed to come from the ismev package
data(portpirie)
portpirie.fit <- gev.fit(portpirie$SeaLevel)
$conv
[1] 0
$nllh
[1] -4.339058
$mle
[1] 3.87474692 0.19804120 -0.05008773
$se
[1] 0.02793211 0.02024610 0.09825633

The entries of $mle and $se are ordered as location, scale, shape.
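
The reported estimates and standard errors translate directly into approximate 95% confidence intervals via the normal approximation to the MLE. The sketch below only does this arithmetic on the numbers printed above; the dictionary layout and variable names are assumptions.

```python
# MLEs and standard errors copied from the gev.fit output above
mle = {"mu": 3.87474692, "sigma": 0.19804120, "xi": -0.05008773}
se = {"mu": 0.02793211, "sigma": 0.02024610, "xi": 0.09825633}

Z_975 = 1.959964  # 97.5% quantile of the standard normal distribution

# Wald interval: estimate plus or minus 1.96 standard errors
wald_intervals = {name: (mle[name] - Z_975 * se[name], mle[name] + Z_975 * se[name])
                  for name in mle}

for name, (lower, upper) in wald_intervals.items():
    print(name, round(lower, 3), round(upper, 3))
```

The interval for the shape parameter covers zero, so these data alone do not rule out the Gumbel case.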

Port Pirie data: Diagnostics

gev.diag(portpirie.fit)

[Figure: four diagnostic panels: Probability Plot (model against empirical), Quantile Plot (empirical against model), Return Level Plot (return level against return period), and Density Plot (f(z) against z).]

Port Pirie data: Use of profile likelihood

Profile log-likelihood for $\xi$:

[Figure: profile log-likelihood against shape parameter, from -0.3 to 0.3.]

Profile log-likelihood for 100-year return level:

[Figure: profile log-likelihood against return level, from 4.5 to 6.0.]

Bayesian Extremes
Markov chain Monte Carlo (MCMC), and other stochastic computation algorithms, have enabled Bayesian techniques to be applied to extreme value problems.

Advantages include

use of additional information via prior specification

no reliance on $n \to \infty$ asymptotics (except for model choice!)

ease of inference (for example, parameter transformations)

development of predictive inference through unified treatment of variation and uncertainty

Bayesian inference

Key idea: everything is treated as a random variable, so probability calculus is applied to unknown parameters and potential data.

Use Bayes' theorem to compute
\[ \Pr(\text{unknowns} \mid \text{data}) = \frac{\Pr(\text{data} \mid \text{unknowns}) \Pr(\text{unknowns})}{\Pr(\text{data})}. \]

If the unknown is a continuous parameter $\theta$, then compute
\[ \Pr(\theta \mid \text{data}) = \frac{\Pr(\text{data} \mid \theta)\, \pi(\theta)}{\int \Pr(\text{data} \mid \theta)\, \pi(\theta)\, d\theta}, \]
where the prior density $\pi(\theta)$ summarises our prior knowledge/belief about $\theta$.

Main problem is to compute integrals like the one in the denominator, for realistic models.

Often feasible using Markov chain Monte Carlo (MCMC) simulation.

MCMC


Idea is that when direct simulation from a density f is difficult, the problem can be tackled by
setting up a Markov chain whose limit density is f .

Consequently, simulation of $x_1, x_2, \ldots$ from the chain yields a series with the property that the
marginal density of $x_j$ for large enough $j$ is approximately $f$.

In other words, for large enough $N$, $x_N, x_{N+1}, \ldots$ can be regarded as a dependent series with
marginal density $f$. Hence, for example, empirical moments of this series yield approximations of
the moments of $f$.

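Metropolis-Hastings
To generate a Markov chain $\{x^{(t)}\}$ with limit density $f$:
1. Set $t = 0$ and choose $x^{(0)}$.

2. Simulate $x^*$ with arbitrary transition density $q(\cdot \mid x^{(t)})$.

3. Calculate
$$\alpha = \alpha(x^{(t)}, x^*) = \min\left\{1, \frac{f(x^*)\,q(x^{(t)} \mid x^*)}{f(x^{(t)})\,q(x^* \mid x^{(t)})}\right\}.$$

4. Define
$$x^{(t+1)} = \begin{cases} x^* & \text{with probability } \alpha, \\ x^{(t)} & \text{with probability } 1 - \alpha. \end{cases}$$

5. Set $t = t + 1$ and go to step 2.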
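
The five steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the sampler used for the Port Pirie analysis: the proposal is a Gaussian random walk (so the $q$ terms cancel in the ratio), the target is an unnormalised standard normal density, and the names `metropolis` and `log_target` are hypothetical.

```python
import math
import random

def metropolis(log_f, x0, n_iter, step, rng):
    """Random-walk Metropolis: symmetric proposal, so q cancels in the ratio."""
    x, log_fx = x0, log_f(x0)
    chain = []
    for _ in range(n_iter):
        proposal = x + rng.gauss(0.0, step)        # step 2: simulate x*
        log_fp = log_f(proposal)
        alpha = math.exp(min(0.0, log_fp - log_fx))  # step 3: acceptance probability
        if rng.random() < alpha:                   # step 4: accept, else keep x
            x, log_fx = proposal, log_fp
        chain.append(x)
    return chain

def log_target(x):
    # Unnormalised standard normal log-density: the constant is not needed
    return -0.5 * x * x

rng = random.Random(1)
chain = metropolis(log_target, x0=0.0, n_iter=20000, step=2.4, rng=rng)
kept = chain[1000:]                                # discard a burn-in period
mean = sum(kept) / len(kept)
sd = math.sqrt(sum((v - mean) ** 2 for v in kept) / len(kept))
print(mean, sd)
```

Working with log densities avoids underflow, and only ratios of $f$ are used, which is why the target needs to be known only up to a constant.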



Comments on Metropolis-Hastings

The algorithm remains valid when $f$ is only proportional to a target density function. This is why
the algorithm is so important in Bayesian applications.

Subject to regularity conditions (in particular, the possibility of moving everywhere in the
domain), the choice of $q$ is arbitrary. Simple families such as random walks are generally chosen,
though poor choices will lead to poor algorithms.

A successful algorithm will:

converge quickly: density of $x^{(t)}$ is near $f$ for small $t$;

mix well: have low autocorrelation.

These conditions facilitate the use of fairly short series of simulated data for inference, but use of
convergence diagnostics is essential, as there is usually no theoretical guarantee of convergence.

Port Pirie: MCMC analysis

[Figure: four trace plots against iteration index (1000 to 5000): location, scale, shape, and 100-year return level.]

MCMC chains for GEV parameters and (by transformation) 100-year return level in analysis of Port Pirie annual
maxima. Analysis based on vague, though proper, priors.

Port Pirie: MCMC analysis


Posterior means:

$\hat\mu = 3.87$, $\hat\sigma = 0.208$, $\hat\xi = -0.045$,

Posterior standard deviations:

$\mathrm{SD}(\mu) = 0.029$, $\mathrm{SD}(\sigma) = 0.022$, $\mathrm{SD}(\xi) = 0.099$.

Port Pirie: MCMC analysis

[Figure: four posterior density panels (probability density function against parameter value): Location, Scale, Shape, and 100-year return level.]

Posterior densities in analysis of Port Pirie annual maxima.

Limitations of the GEV limit law


The GEV limit law for block maxima requires careful reading.


It states that when (normalized) maxima have a limit, that limit will be a member of the GEV
family. It does not (by itself) guarantee the existence of a limit, and there exist wide classes of
models for which no limit exists.

One interesting case is the Poisson distribution. If $\{X_i\}$ is a sequence of Poisson variables with
mean $\lambda$, a sequence of constants $\{I_n\}$ can be found such that
$$\lim_{n \to \infty} \Pr\Big\{\max_{1 \le i \le n} X_i = I_n \text{ or } I_n + 1\Big\} = 1,$$
and this degeneracy implies that no limit distribution for normalized maxima exists.

The discreteness in the Poisson distribution causes the oscillation between $I_n$ and $I_n + 1$. It is not
the cause of the limit degeneracy, which is actually a consequence of the Poisson tail decay.

Point Process Approach


Improved Inferences
The annual maxima method can be inefficient if other data are available. Alternative methods include:


peaks over thresholds,

r-largest order statistics.

Both are special cases of a point process representation.



Example: Eskdalemuir rainfall

[Figure: hourly rainfall (mm) at Eskdalemuir plotted against time, 1970 to 1985.]

Point process


A point process $\mathcal{P}$ is simply a collection of points: times of avalanches, positions of stars in the
sky, times and epicentres of earthquakes, ...

For any suitable set $A$, we simply count the number of points in $A$, and for some (random) $n$ we
can write the process as
$$\mathcal{P} = \sum_{j=1}^{n} \delta_{X_j},$$
where the $X_j$ are the positions of the points, and $\delta_x$ puts unit mass at $x$.

The basic point process is the Poisson process.

Poisson process
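Definition 5 Let $\mathcal{X} \subset \mathbb{R}^d$, and let a function $\Lambda(A) \ge 0$ be defined for any measurable $A \subset \mathcal{X}$. A
Poisson process $\mathcal{P}$ on $\mathcal{X}$ with intensity measure $\Lambda$ satisfies

the number of points of $\mathcal{P}$ in $A$, denoted $N(A)$, has the Poisson distribution with mean $\Lambda(A)$;

if $A, B \subset \mathcal{X}$ are disjoint, then $N(A)$ and $N(B)$ are independent.

If $A = [a_1, x_1] \times \cdots \times [a_d, x_d]$, and if
$$\lambda(x_1, \ldots, x_d) = \frac{\partial^d \Lambda(A)}{\partial x_1 \cdots \partial x_d}$$
exists, then $\lambda$ is called the intensity (density) function of $\mathcal{P}$.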
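
The two defining properties suggest a direct way to simulate the simplest case. The Python sketch below assumes a homogeneous process on the unit square with constant intensity, so that $\Lambda(A)$ is the rate times the area of $A$; the function names are hypothetical, and the Poisson count is generated by counting unit-rate exponential arrivals because the standard library has no Poisson sampler.

```python
import random

def poisson_count(mean, rng):
    """Number of unit-rate exponential arrivals in [0, mean]: a Poisson(mean) draw."""
    count, total = 0, rng.expovariate(1.0)
    while total <= mean:
        count += 1
        total += rng.expovariate(1.0)
    return count

def simulate_poisson_process(rate, rng):
    """Points of a homogeneous Poisson process on [0,1] x [0,1] with constant intensity."""
    n = poisson_count(rate, rng)      # N(unit square) has mean Lambda = rate * 1
    return [(rng.random(), rng.random()) for _ in range(n)]

rng = random.Random(2008)
reps = 2000
left_counts, right_counts = [], []
for _ in range(reps):
    points = simulate_poisson_process(5.0, rng)
    # Counts in two disjoint halves of the square, each with Lambda = 2.5
    left_counts.append(sum(1 for x, _ in points if x < 0.5))
    right_counts.append(sum(1 for x, _ in points if x >= 0.5))
mean_left = sum(left_counts) / reps
mean_right = sum(right_counts) / reps
print(mean_left, mean_right)
```

Given the total count, the points are placed uniformly, which is what makes counts in disjoint sets independent Poisson variables.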



Point process limit: Basic idea






Suppose $F$ unknown, and take $X_1, \ldots, X_n \overset{\text{iid}}{\sim} F$.

Form a 2-dimensional point process $\{(i, X_i);\ i = 1, \ldots, n\}$ and characterize the behaviour of this
process in regions of the form $A = [t_1, t_2] \times [u, \infty)$.
More formally, suppose $(M_n - b_n)/a_n$ converges to the distribution
$$G(x) = \exp\left\{-(1 + \xi x)_+^{-1/\xi}\right\}.$$

Construct a sequence of point processes on $\mathbb{R}^2$ by
$$\mathcal{P}_n = \left\{\left(\frac{i}{n+1}, \frac{X_i - b_n}{a_n}\right) : i = 1, \ldots, n\right\}.$$

Then $\mathcal{P}_n \to \mathcal{P}$ as $n \to \infty$, where $\mathcal{P}$ is a Poisson process.

Poisson limit

Poisson limit when rescaling samples of sizes 10, 100, 1000, 10,000.

[Figure: four panels, one per sample size, each plotting (X-b)/a against t on [0, 1].]

Limiting Poisson process




As $n \to \infty$, on regions bounded away from $-\infty$, $\mathcal{P}_n \to \mathcal{P}$, a Poisson process defined by its intensity
measure $\Lambda$.

Consider the set $A_x = [0, 1] \times [x, \infty)$, for some $x > u$.

The Poisson property gives
$$\Pr\{\text{no points in } A_x\} = \exp\{-\Lambda(A_x)\} = \Pr\{M_n \le x\} \to \exp\left\{-(1 + \xi x)_+^{-1/\xi}\right\},$$
and so, by time-homogeneity of the process,
$$\Lambda\{(t_1, t_2) \times [x, \infty)\} = (t_2 - t_1)(1 + \xi x)_+^{-1/\xi}.$$

Statistical application
For statistical application, it is convenient to:

assume the limiting process is a reasonable approximation for finite $n$ above a high threshold, $u$;

absorb the unknown scaling coefficients into the intensity function and so work directly with
the original series;

rescale the intensity so that the annual maximum has the GEV distribution with parameters
$(\mu, \sigma, \xi)$.

This leads to modelling the series $\{(i, X_i);\ i = 1, \ldots, n\}$ as a Poisson process above $u$ with intensity
measure
$$\Lambda\{(t_1, t_2) \times (x, \infty)\} = n_y (t_2 - t_1)\{1 + \xi(x - \mu)/\sigma\}_+^{-1/\xi},$$
where $n_y$ is the number of years of observation, and the time limits are rescaled to $0 \le t_1 < t_2 \le 1$.

Likelihood



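Maximum likelihood inference is most natural.

For a region of the form $A_v = [0, 1] \times (v, \infty)$ for $v > u$, the likelihood is
$$L(A_v; \mu, \sigma, \xi) = \exp\{-\Lambda(A_v)\} \prod_{i=1}^{N_{A_v}} d\Lambda(t_i, x_i)
\propto \exp\left\{-n_y\left[1 + \xi\,\frac{v - \mu}{\sigma}\right]_+^{-1/\xi}\right\} \prod_{i=1}^{N_{A_v}} \sigma^{-1}\left[1 + \xi\,\frac{x_i - \mu}{\sigma}\right]_+^{-1/\xi - 1},$$
where $x_1, \ldots, x_{N_{A_v}}$ is an enumeration of the $N_{A_v}$ points that exceed the threshold $v$.
The parameters are the same as for the maximum model.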
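
To make the likelihood concrete, here is a minimal Python sketch of the corresponding negative log-likelihood, up to an additive constant. It is not the implementation inside any R package; the function name `pp_nllh` and its argument names are hypothetical, and the $\xi = 0$ case is handled by its Gumbel-type limit.

```python
import math

def pp_nllh(mu, sigma, xi, exceedances, v, n_years):
    """Negative log-likelihood of the Poisson process model above threshold v."""
    if sigma <= 0.0:
        return math.inf

    def log_tail(x):
        # log of {1 + xi (x - mu) / sigma}^(-1/xi); None outside the support
        s = (x - mu) / sigma
        if abs(xi) < 1e-12:
            return -s
        base = 1.0 + xi * s
        return -math.log(base) / xi if base > 0.0 else None

    lv = log_tail(v)
    if lv is None:
        return math.inf
    nll = n_years * math.exp(lv)          # Lambda(A_v): expected number of points
    for x in exceedances:
        lx = log_tail(x)
        if lx is None:
            return math.inf
        # minus log of sigma^(-1) {1 + xi (x - mu) / sigma}^(-1/xi - 1)
        nll += math.log(sigma) - (1.0 + xi) * lx
    return nll

print(pp_nllh(0.0, 1.0, 0.1, [0.5, 1.2], v=0.0, n_years=3))
```

In practice this function would be passed to a numerical optimiser, with the exceedances of the chosen threshold as data.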


Example: Eskdalemuir rainfall

[Figure: hourly rainfall (mm) at Eskdalemuir plotted against time, 1970 to 1985.]

The model is now applied to a time series of hourly rainfall aggregates, measured in mm, with a threshold
of u = 5 mm.

Example: Eskdalemuir rainfall


(rain.fit <- fpot(esk.rain,threshold=5,model="pp",
start=list(loc=10,scale=1.2,shape=0.1),npp=365.25*24))
Threshold: 5
Number Above: 356
Proportion Above: 0.0024
Estimates
     loc     scale     shape
10.13628   1.86637   0.06696

Standard Errors
    loc     scale     shape
0.35380   0.23673   0.05379

Example: Eskdalemuir rainfall


[Figure: four diagnostic panels for the fitted point process model. Probability Plot (Model against Empirical), Quantile Plot (Empirical against Model), Density Plot (Density against Quantile), Return Level Plot (Return Level against Return Period).]

Threshold methods


Let $X^*_{n,i} = (X_i - b_n)/a_n$, for $i = 1, \ldots, n$. Then, by the Poisson limit,
$$\Pr\{X^*_{n,i} > u + x \mid X^*_{n,i} > u\} \to \frac{\Lambda\{(0, 1) \times (u + x, \infty)\}}{\Lambda\{(0, 1) \times (u, \infty)\}} = \left[1 + \frac{\xi x}{\sigma + \xi(u - \mu)}\right]_+^{-1/\xi}.$$
Absorbing the unknown scaling coefficients leads to the survivor function of the Generalized Pareto
Distribution (GPD):
$$\Pr\{X_i > u + x \mid X_i > u\} = \left(1 + \frac{\xi x}{\sigma_u}\right)_+^{-1/\xi}, \quad x > 0,$$
where $\sigma_u = \sigma + \xi(u - \mu)$.
Taking the limit $\xi \to 0$ gives the exponential distribution as a special case.

Alternative inference
An equivalent inference to the point process method is to apply the GPD model to the exceedances of
a threshold u. For the rainfall data we obtain:
(rain.gpdfit <- fpot(esk.rain,threshold=5,npp=365*24))
Deviance: 1058.954
Threshold: 5
Number Above: 356
Proportion Above: 0.0024
Estimates
  scale     shape
1.52239   0.06702

Standard Errors
  scale     shape
0.11488   0.05383

Interpretation


There are now just two parameters, since the model conditions on the exceedance (one parameter
is lost corresponding to the crossing rate of u).

The estimate of $\xi$ is (almost!) identical to the point process analysis, while the change of
parameterization leads to a different value of $\sigma$.

Provided $\xi < 1$, the mean residual life exists and satisfies
$$E(X - u \mid X > u) = \frac{\sigma + \xi u}{1 - \xi},$$
which gives a simple diagnostic for threshold selection. The mean exceedance above $u$ should be
linear in $u$ at levels for which the model is valid.
This suggests looking for linearity in a plot of the empirical mean residual life.
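
The change of parameterization can be checked numerically from the two fits reported above. The Python sketch below uses the relation GPD scale = $\sigma + \xi(u - \mu)$ with the point process estimates and compares the result with the GPD scale estimate; the function names are hypothetical, and the small discrepancy is expected because the two fits are separate numerical optimisations.

```python
def gpd_scale(mu, sigma, xi, u):
    """GPD scale at threshold u implied by point process parameters (mu, sigma, xi)."""
    return sigma + xi * (u - mu)

def gpd_survivor(x, scale, xi):
    """Pr(X > u + x | X > u) under the GPD, for x > 0 and xi != 0."""
    base = 1.0 + xi * x / scale
    return base ** (-1.0 / xi) if base > 0.0 else 0.0

def mean_excess(scale, xi):
    """Mean of the GPD excess distribution; requires xi < 1."""
    return scale / (1.0 - xi)

# Point process estimates (loc, scale, shape) at threshold u = 5
pp_mu, pp_sigma, pp_xi = 10.13628, 1.86637, 0.06696
implied_scale = gpd_scale(pp_mu, pp_sigma, pp_xi, u=5.0)
fitted_scale = 1.52239            # scale estimate from the GPD fit
print(implied_scale, fitted_scale)
```

Because the implied scale is linear in $u$ and $\xi$ is fixed, `mean_excess` is also linear in the threshold, which is the property the mean residual life plot exploits.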

Mean residual life plot


mrlplot(esk.rain) gives

[Figure: Mean Residual Life Plot, mean excess (0 to 3.5) against threshold.]

Interpretation hard, but u > 2 looks reasonable; u = 5 looks high.

Threshold selection

Bias-variance trade-off: with a threshold that is too low there is bias because the model asymptotics are invalid; with a
threshold that is too high the variance is large because there are few data points.
An alternative approach is to fit the Poisson process model at many thresholds and look for parameter stability:

[Figure: three panels showing estimates of Location, Scale and Shape plotted against Threshold (3 to 5).]

Again, u = 5 looks too high.

r-largest order statistics method

Consider the vector of $r$ largest order statistics in a block (say, year) of data:
$$\left(\frac{M_n^{(1)} - b_n}{a_n}, \frac{M_n^{(2)} - b_n}{a_n}, \ldots, \frac{M_n^{(r)} - b_n}{a_n}\right),$$
rescaled in the same way as just the maximum, $M_n^{(1)}$.

Setting $u = M_n^{(r)}$ in the point process likelihood gives
$$L = \exp\left\{-\left[1 + \xi\,\frac{M_n^{(r)} - \mu}{\sigma}\right]^{-1/\xi}\right\} \prod_{i=1}^{r} \sigma^{-1}\left[1 + \xi\,\frac{M_n^{(i)} - \mu}{\sigma}\right]^{-1/\xi - 1}.$$

This is the likelihood contribution for a single block (year). The full likelihood is obtained by
multiplying such terms across the blocks.

Venice Sea Levels


rlarg.fit(venice.data)
nllh:
[1] 1149.268
mle:
[1] 120.3961096 12.6929941 -0.1148273
se:
[1] 1.34704941 0.52516203 0.01901015

Modelling Issues


Modelling issues
Extreme value data usually show:


dependence on covariate effects;

short-term dependence (storms for example);

seasonality (due to annual cycles in meteorology);

long-term trends (due to gradual climatic change);

other forms of nonstationarity (for example, the deterministic effect of tides on sea levels).

For temporal dependence there is a sufficiently wide-ranging theory which can be invoked. Other
aspects have to be handled at the modelling stage.
We now discuss how to deal with

short-term dependence;

trends and seasonality.

D(un ) condition
The usual (weak) condition which is adopted to eliminate the effect of long-range dependence is
Leadbetter's $D(u_n)$ condition:
For all $i_1 < \ldots < i_p < j_1 < \ldots < j_q$ with $j_1 - i_p > l$,
$$\left|\Pr\{X_{i_1} \le u_n, \ldots, X_{i_p} \le u_n, X_{j_1} \le u_n, \ldots, X_{j_q} \le u_n\} - \Pr\{X_{i_1} \le u_n, \ldots, X_{i_p} \le u_n\}\Pr\{X_{j_1} \le u_n, \ldots, X_{j_q} \le u_n\}\right| \le \alpha(n, l),$$
where $\alpha(n, l_n) \to 0$ for some sequence $l_n = o(n)$. This implies that rare events that are sufficiently
separated are approximately independent.
Theorem 6 If $D(u_n)$ is satisfied with $u_n = a_n x + b_n$, and if
$$\Pr\{M_n \le a_n x + b_n\} \to G(x),$$
then $G$ is a GEV distribution.
Conclusion: Same models apply for processes with (restricted) long-range dependence.

Short-term dependence: Example


Suppose $Y_1, Y_2, \ldots \overset{\text{iid}}{\sim} \exp(1)$, and $X_i = \max(Y_i, Y_{i+1})$:

[Figure: simulated series x plotted against index, 1 to 100.]

Extremes tend to cluster in pairs.

Exponential example
Marginal distribution:
$$\Pr\{X_i < x\} = \Pr\{Y_i < x, Y_{i+1} < x\} = (1 - e^{-x})^2.$$

Let $X^*_1, X^*_2, \ldots$ be an independent series with the same marginal distribution, and $M^*_n = \max\{X^*_1, \ldots, X^*_n\}$.
Then
$$\Pr\{M^*_n - \log(2n) < x\} = [1 - \exp\{-x - \log(2n)\}]^{2n} = \left(1 - \frac{e^{-x}}{2n}\right)^{2n} \to \exp(-e^{-x}) = G_2(x),$$
while, for $M_n = \max\{X_1, \ldots, X_n\}$,
$$\Pr\{M_n - \log(2n) < x\} = \Pr\{Y_1 < x + \log(2n), \ldots, Y_{n+1} < x + \log(2n)\} = \left(1 - \frac{e^{-x}}{2n}\right)^{n+1} \to \exp\left(-\tfrac{1}{2} e^{-x}\right) = G_1(x),$$
and $G_1(x) = \{G_2(x)\}^{1/2}$.
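
Both limits can be verified numerically from the exact finite-$n$ expressions above. The Python sketch below evaluates them for a large $n$ and compares with the limiting forms; the function names are hypothetical and chosen only for this illustration.

```python
import math

def independent_cdf(x, n):
    """Exact Pr{M*_n - log(2n) < x} for the independent series."""
    return (1.0 - math.exp(-x) / (2 * n)) ** (2 * n)

def dependent_cdf(x, n):
    """Exact Pr{M_n - log(2n) < x} for X_i = max(Y_i, Y_{i+1})."""
    return (1.0 - math.exp(-x) / (2 * n)) ** (n + 1)

def g2(x):
    return math.exp(-math.exp(-x))

def g1(x):
    return math.exp(-0.5 * math.exp(-x))

n, x = 10 ** 6, 0.5
print(independent_cdf(x, n), g2(x))
print(dependent_cdf(x, n), g1(x))
print(g1(x), math.sqrt(g2(x)))   # G1 = G2^(1/2)
```

The exponent in the dependent case is roughly half that in the independent case, which is where the power 1/2 relating the two limits comes from.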


Example: Moving maxima


Let $Y_0, Y_1, Y_2, \ldots \overset{\text{iid}}{\sim} F$, with
$$F(y) = \exp\left\{-\frac{1}{(a + 1)y}\right\}, \quad y > 0,$$
where $0 \le a \le 1$ is a parameter, and define the process $X_i$ by
$$X_0 = Y_0, \qquad X_i = \max\{aY_{i-1}, Y_i\}, \quad i = 1, \ldots, n.$$

[Figure: four panels of simulated series x(i) plotted against i (1 to 50), with a logarithmic vertical axis from 0.1 to 50.]

Moving maxima: Asymptotic properties


$$\Pr\{X_i \le x\} = \Pr\{aY_{i-1} \le x, Y_i \le x\} = \exp(-1/x).$$

Now, let $X^*_1, X^*_2, \ldots$ be a series of independent variables having a marginal standard Frechet distribution, and
define $M^*_n = \max\{X^*_1, \ldots, X^*_n\}$. Then,
$$\Pr\{M^*_n \le nz\} = [\exp\{-1/(nz)\}]^n = \exp(-1/z).$$

On the other hand, for $M_n = \max\{X_1, \ldots, X_n\}$,
$$\Pr\{M_n \le nz\} = \Pr\{X_1 \le nz, \ldots, X_n \le nz\} \approx \Pr\{Y_1 \le nz, \ldots, Y_n \le nz\} = \left[\exp\left\{-\frac{1}{(a + 1)nz}\right\}\right]^n = \{\exp(-1/z)\}^{\frac{1}{a+1}}.$$

It follows that
$$\Pr\{M_n \le nz\} \approx [\Pr\{M^*_n \le nz\}]^{\frac{1}{a+1}}.$$

Extremal index
The previous examples illustrate the following general result.
Theorem 7 Let $\{X_i\}$ be a stationary process and $\{X^*_i\}$ be independent variables with the same marginal
distribution. Set $M_n = \max\{X_1, \ldots, X_n\}$ and $M^*_n = \max\{X^*_1, \ldots, X^*_n\}$. Under suitable regularity conditions,
$$\Pr\{(M^*_n - b_n)/a_n \le z\} \to G_1(z)$$
as $n \to \infty$ for normalizing sequences $\{a_n > 0\}$ and $\{b_n\}$, where $G_1$ is a non-degenerate distribution function, if
and only if
$$\Pr\{(M_n - b_n)/a_n \le z\} \to G_2(z),$$
where
$$G_2(z) = G_1^{\theta}(z)$$
for a constant $\theta$ called the extremal index that satisfies $0 < \theta \le 1$.
Thus if $G_1$ is GEV, then so is $G_2$, with the same $\xi$: a strong robustness result.

Extremal index


The extremal index can also be defined as
$$\theta = \lim_{n \to \infty} \Pr\{\max(X_2, \ldots, X_{p_n}) \le u_n \mid X_1 \ge u_n\},$$
where $p_n = o(n)$, and the sequence $u_n$ is such that $\Pr\{M_n \le u_n\}$ converges.

Loosely, $\theta$ is the probability that a high threshold exceedance is the final element in a cluster of
exceedances.

Thus extremes occur in clusters whose (limiting) mean cluster size is $1/\theta$.

The mean distance between clusters is increased by a factor $1/\theta$.

In fact the distribution of a cluster maximum is the same as the marginal distribution of an
exceedance, so there is no bias in considering only cluster maxima, if we can identify clusters . . .

Consequences


When clustering occurs, the notion of return level is more complex:

if $\theta = 1$, then the 100-year event has probability 0.368 of not appearing in the next 100 years;
if $\theta = 1/10$, then on average the event also occurs ten times in a millennium, but all together: it has
probability 0.904 of not appearing in the next 100 years.

If we can estimate the tail of marginal distribution $F$ (e.g. by fitting to block maxima), then
$$\Pr(M_n \le x) \approx F(x)^{n\theta} \approx G(x),$$
where $G$ is GEV with parameters $\mu, \sigma, \xi$. The marginal quantiles are approximately
$$F^{-1}(p) \approx G^{-1}(p^{n\theta}) > G^{-1}(p^n),$$
so may be much larger than would be the case with $\theta = 1$.

A similar argument shows that ignoring $\theta$ can lead to over-estimating a return level estimated using $G$.
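
The two probabilities quoted for the 100-year event follow from a short calculation. The Python sketch below assumes, as a simplification made only for this illustration, that clusters of exceedances arrive as a Poisson process at rate $\theta$ divided by the return period; the function name is hypothetical.

```python
import math

def prob_no_event(horizon_years, return_period_years, theta):
    """Probability of no cluster of exceedances within the horizon."""
    cluster_rate = theta / return_period_years   # clusters per year
    return math.exp(-cluster_rate * horizon_years)

p_independent = prob_no_event(100, 100, theta=1.0)
p_clustered = prob_no_event(100, 100, theta=0.1)
print(round(p_independent, 3), round(p_clustered, 4))
```

With $\theta = 1$ this gives $e^{-1}$, and with $\theta = 1/10$ it gives $e^{-0.1}$, matching the values 0.368 and 0.904 quoted above up to rounding.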


$D'(u_n)$ condition

The $D'(u_n)$ condition is said to hold for the process $\{X_i\}$ if
$$\limsup_{n \to \infty}\; n \sum_{j=2}^{[n/k]} P\{X_1 > u_n, X_j > u_n\} \to 0 \quad \text{as } k \to \infty.$$

If a stationary process satisfies both the $D(u_n)$ and $D'(u_n)$ conditions, then its extremal index is
$\theta = 1$.

$D'(u_n)$ holds for many well-known processes, for example Gaussian autoregressive processes. This
raises the issue of how best to model the extremes of such processes, since at any realistic level,
dependence is likely to be strong, while the asymptotic limit is independence.

Other conditions $D''$, $D^{(2)}$, . . . facilitate extremal index calculation in special cases.

Modelling techniques
The above discussion tells us that under the weak condition $D(u_n)$ and some mild additional
conditions, the maxima of stationary series can be modelled using the GEV, and that exceedances
can be modelled using the GPD. This is a strong robustness result, but it implies that extremes will
occur in clusters of correlated observations.
Possible modelling strategies using exceedances and the GPD are:

to identify clusters and model cluster maxima only;

as above, but also estimate the extremal index empirically;

to ignore the dependence on the basis that the marginal model is valid, but to inflate standard
errors to account for reduction in independent information;

to specify an explicit model for dependence, such as a first-order Markov chain.

Example: Wooster temperatures


[Figure: two panels (left and right) of negated temperatures (Degrees Below 0 F.) against Day Index, 0 to 40, showing the clusters identified with r = 1 and r = 3.]

Simple estimates of the extremal index are based on empirical means of clusters. Here the runs method is used:
a cluster is deemed to have terminated when there are r consecutive observations below the threshold. Left:
r = 1 gives 4 clusters. Right: r = 3 gives 2 clusters.
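
The runs method is easy to state in code. The Python sketch below is an illustration on a small made-up series, not the Wooster data: the function name `runs_clusters` and the toy values are hypothetical, and the extremal index is estimated as the number of clusters divided by the number of exceedances.

```python
def runs_clusters(series, u, r):
    """Group exceedances of u into clusters; r consecutive values at or below u end a cluster."""
    clusters, current, below = [], [], 0
    for index, value in enumerate(series):
        if value > u:
            current.append(index)
            below = 0
        elif current:
            below += 1
            if below >= r:
                clusters.append(current)
                current, below = [], 0
    if current:
        clusters.append(current)
    return clusters

toy_series = [0, 5, 6, 0, 7, 0, 0, 8, 9, 0]   # hypothetical values
n_exceed = sum(1 for value in toy_series if value > 4)
clusters_r1 = runs_clusters(toy_series, u=4, r=1)
clusters_r2 = runs_clusters(toy_series, u=4, r=2)
theta_r1 = len(clusters_r1) / n_exceed
theta_r2 = len(clusters_r2) / n_exceed
print(len(clusters_r1), len(clusters_r2), theta_r1, theta_r2)
```

As in the figure, a longer run length merges nearby groups of exceedances, giving fewer clusters and a smaller estimate of the extremal index.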


Calculation of return levels




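The $m$-observation return level is
$$x_m = u + \frac{\sigma}{\xi}\left[(m \zeta_u \theta)^{\xi} - 1\right],$$
where $\sigma$ and $\xi$ are the parameters of the threshold excess generalized Pareto distribution, $\zeta_u$ is the
probability of an exceedance of $u$, and $\theta$ is the extremal index.

The natural estimates are
$$\hat\zeta_u = \frac{n_u}{n} \quad \text{and} \quad \hat\theta = \frac{n_c}{n_u},$$
where $n_c$ is the number of clusters and $n_u$ is the number of exceedances.

So simply estimate the component $\zeta_u \theta$ by $n_c/n$.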
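
A minimal Python sketch of this calculation follows. The function name `return_level` and the numerical inputs are hypothetical, chosen only to show that the level equals the threshold when the expected number of clusters in $m$ observations is one.

```python
import math

def return_level(m, u, sigma, xi, n, n_c):
    """m-observation return level, estimating zeta_u * theta by n_c / n."""
    rate = n_c / n                       # clusters per observation
    if abs(xi) < 1e-12:
        # exponential-tail limit of the GPD
        return u + sigma * math.log(m * rate)
    return u + (sigma / xi) * ((m * rate) ** xi - 1.0)

# Hypothetical inputs: 10 clusters in 1000 observations, threshold 5
level_100 = return_level(100, u=5.0, sigma=2.0, xi=0.1, n=1000, n_c=10)
level_1000 = return_level(1000, u=5.0, sigma=2.0, xi=0.1, n=1000, n_c=10)
print(level_100, level_1000)
```

Only the product $\zeta_u \theta$ enters the formula, so the number of exceedances $n_u$ is not needed for the point estimate.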


Example: Wooster temperatures

                      u = -10                       u = -20
                 r = 2         r = 4          r = 2         r = 4
n_c              31            20             43            29
sigma-hat        11.8 (3.0)    14.2 (5.2)     17.4 (3.6)    19.0 (4.9)
xi-hat           -0.29 (0.19)  -0.38 (0.30)   -0.36 (0.15)  -0.41 (0.19)
x-hat_100        27.7 (12.0)   26.6 (14.4)    26.2 (9.3)    25.7 (9.9)
theta-hat        0.42          0.27           0.24          0.16

Results may be sensitive to choice of threshold, u, and run length, r.

Example: Eskdalemuir rainfall


(rain.fit <- fpot(esk.rain,threshold=5,model="pp",
start=list(loc=10,scale=1.2,shape=0.1),npp=365.25*24))
Threshold: 5
Number Above: 356
Proportion Above: 0.0024
Estimates
     loc     scale     shape
10.13628   1.86637   0.06696

Standard Errors
    loc     scale     shape
0.35380   0.23673   0.05379

Example: Eskdalemuir rainfall


(rain.fit <- fpot(esk.rain,threshold=5,model="pp",cmax=T, r=0,
start=list(loc=10,scale=1.2,shape=0.1),npp=365.25*24))
Threshold: 5
Number Above: 356
Proportion Above: 0.0024
Clustering Interval: 0
Number of Clusters: 272
Extremal Index: 0.764
Estimates
    loc     scale     shape
9.81557   1.83937   0.04178

Standard Errors
   loc    scale    shape
0.3519   0.2406   0.0632

Example: Eskdalemuir rainfall

For finite threshold $u$, $\hat\theta$ increases with $u$, suggesting that very extreme hourly rainfall totals occur
singly.

[Figure: estimated extremal index (0.2 to 0.8) plotted against threshold.]

Non-stationarity
General results are not broad enough for application; hence, model trends, seasonality and covariate
effects by parametric or nonparametric models for the usual extreme value model parameters.
Some possibilities for parametric modelling:
$$\mu(t) = \alpha + \beta t;$$
$$\sigma(t) = \exp(\alpha + \beta t);$$
$$\xi(t) = \begin{cases} \xi_1, & t \le t_0, \\ \xi_2, & t > t_0; \end{cases}$$
$$\mu(t) = \alpha + \beta y(t).$$

Parameter estimation


Model specification (example)
$$Z_t \sim \mathrm{GEV}(\mu(t), \sigma(t), \xi(t)).$$

Likelihood (for complete parameter set $\theta$):
$$L(\theta) = \prod_{t=1}^{m} g(z_t; \mu(t), \sigma(t), \xi(t)),$$
where $g$ is the GEV model density.

Maximization of $L$ yields maximum likelihood estimates.

Standard likelihood techniques also yield standard errors, confidence intervals etc.
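
As a concrete instance, the Python sketch below codes the negative log-likelihood for a GEV model whose location is linear in time, with constant scale and shape. The parameterisation (log scale, coefficient names `beta0` and `beta1`) and the function names are assumptions of this illustration rather than anything fixed by the slides.

```python
import math

def gev_logpdf(z, mu, sigma, xi):
    """Log of the GEV density g(z; mu, sigma, xi); -inf outside the support."""
    if sigma <= 0.0:
        return -math.inf
    s = (z - mu) / sigma
    if abs(xi) < 1e-12:
        # Gumbel limit
        return -math.log(sigma) - s - math.exp(-s)
    base = 1.0 + xi * s
    if base <= 0.0:
        return -math.inf
    return -math.log(sigma) - (1.0 / xi + 1.0) * math.log(base) - base ** (-1.0 / xi)

def nllh_linear_trend(params, times, maxima):
    """Negative log-likelihood with mu(t) = beta0 + beta1 * t."""
    beta0, beta1, log_sigma, xi = params
    sigma = math.exp(log_sigma)          # keeps the scale positive
    return -sum(gev_logpdf(z, beta0 + beta1 * t, sigma, xi)
                for t, z in zip(times, maxima))

# Hypothetical data lying exactly on the trend line
times = [0.0, 1.0, 2.0]
maxima = [1.0, 1.5, 2.0]
value = nllh_linear_trend((1.0, 0.5, 0.0, 0.1), times, maxima)
print(value)
```

Setting `beta1` to zero recovers the stationary model, so nested comparisons by deviance only need two calls to an optimiser with this function.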


Model reduction


For nested models $M_0 \subset M_1$, the deviance statistic is
$$D = 2\{\ell_1(M_1) - \ell_0(M_0)\}.$$

Based on asymptotic likelihood theory, $M_0$ is rejected by a test at the $\alpha$-level of significance if
$D > c_\alpha$, where $c_\alpha$ is the $(1 - \alpha)$ quantile of the $\chi^2_k$ distribution, and $k$ is the difference in the
dimensionality of $M_1$ and $M_0$.

Model diagnostics


Assuming a fitted model
$$Z_t \sim \mathrm{GEV}(\hat\mu(t), \hat\sigma(t), \hat\xi(t)),$$
the standardized variables
$$\tilde Z_t = \frac{1}{\hat\xi(t)} \log\left\{1 + \hat\xi(t)\,\frac{Z_t - \hat\mu(t)}{\hat\sigma(t)}\right\}$$
each have the standard Gumbel distribution, with probability distribution function
$$\Pr\{\tilde Z_t \le z\} = \exp(-e^{-z}), \quad z \in \mathbb{R}.$$

Possible diagnostics:

probability plot: $\{i/(m + 1), \exp(-\exp(-\tilde z_{(i)}));\ i = 1, \ldots, m\}$

quantile plot: $\{-\log[-\log\{i/(m + 1)\}], \tilde z_{(i)};\ i = 1, \ldots, m\}$

Other extreme value models




Similar techniques are applicable for the threshold exceedance and point process models, but
threshold selection is likely to be a more sensitive issue.

Time-varying thresholds may also be appropriate, though there is little guidance on how to make
such a choice.


Example: Fremantle sea levels


Model                             Log-likelihood
Constant                          43.6
Linear in $\mu$                   49.9
Quadratic in $\mu$                50.6
Linear in $\mu$ and $\sigma$      50.7

Based on likelihood considerations, best model is linear in $\mu$.

For this model, trend is around 2mm per year and $\hat\xi = -0.125$ (0.070).

(Note: Model improves further by inclusion of SOI as a covariate for $\mu$).
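
The choice of the linear model can be reproduced from the tabulated log-likelihoods with a deviance test. The Python sketch below uses only the standard library; the one-degree-of-freedom chi-squared tail probability is computed from the complementary error function, and the function names are hypothetical.

```python
import math

def deviance(loglik_big, loglik_small):
    """Deviance statistic D = 2 (l1 - l0) for nested models."""
    return 2.0 * (loglik_big - loglik_small)

def chi2_tail_1df(d):
    """Pr(chi-squared with 1 degree of freedom > d)."""
    return math.erfc(math.sqrt(d / 2.0))

# Log-likelihoods from the table above
d_linear = deviance(49.9, 43.6)       # linear in mu against constant
d_quadratic = deviance(50.6, 49.9)    # quadratic in mu against linear
p_linear = chi2_tail_1df(d_linear)
p_quadratic = chi2_tail_1df(d_quadratic)
print(d_linear, p_linear)
print(d_quadratic, p_quadratic)
```

The first deviance is far above the 5% critical value of about 3.84 and the second is well below it, which is the likelihood argument for stopping at the linear trend.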


Example: Fremantle sea levels

[Figure: annual maximum sea level (metres) at Fremantle plotted against year, 1900 to 1980, with fitted trend.]

Fitted trend in location parameter for Fremantle annual maximum sea levels.

Example: Fremantle sea levels

[Figure: two panels. Residual Probability Plot (Model against Empirical) and Residual Quantile Plot on Gumbel scale (Empirical against Model).]

Probability and quantile plots for nonstationary GEV analysis of Fremantle annual maximum sea levels.

Example: Race times


Model       Log-likelihood   beta-hat                    sigma-hat      xi-hat
Constant    -54.5            239.3                       3.63           -0.469
                             (0.9)                       (0.64)         (0.141)
Linear      -51.8            (242.9, -0.311)             2.72           -0.201
                             (1.4, 0.101)                (0.49)         (0.172)
Quadratic   -48.4            (247.0, -1.395, 0.049)      2.28           -0.182
                             (2.3, 0.420, 0.018)         (0.45)         (0.232)

Quadratic model apparently preferable.

Example: Race times

[Figure: race time (secs., 232 to 246) plotted against year, 1975 to 1990, with fitted location models.]

Fitted models for location parameter in women's 1500 metre race times. Note quadratic model would
lead to slower races in recent and future events.

Example: Race times


Alternative exponential model
$$\mu(t) = \beta_0 + \beta_1 e^{-\beta_2 t}$$
has log-likelihood -49.5. Not so good as quadratic model, though comparison via likelihood ratio test
is invalid as models are not nested. Better behaviour for large $t$ suggests a preferable model though.

[Figure: race time (secs., 232 to 246) plotted against year, 1975 to 1990, with the fitted exponential model.]

Example: Wooster Temperature Series


[Figure: four panels (Winter, Spring, Summer, Autumn) of negated temperatures (Degrees Below Zero F.) against Day Index, 0 to 400.]

One approach to handle seasonality is to model seasons separately. This is not likely to be sufficient
here, at least with only 4 seasons.

Example: Wooster Temperature Series

[Figure: Daily Minimum Temperature (Degrees Below 0 F.) plotted against year, 1983 to 1988, with a time-varying threshold.]

Selection of a time-varying threshold for negated Wooster temperature series.

Example: Wooster Temperature Series


Model                                                        p     Log-likelihood
1. Time-homogeneous                                          3     -143.6
2. As 1. but periodic in $\mu$                               5     126.7
3. As 2. but periodic in $\log\sigma$                        7     143.6
4. As 3. but periodic in $\xi$                               9     145.9
5. As 3. plus linear trend in $\mu$ and $\log\sigma$         9     145.1
6. As 3. with separate $\xi$ for each season                 10    143.9

Likelihood considerations lead to Model 3 as preferable.

Semiparametric regression

We may relax parametric assumptions, or mix them with more flexible forms. For example, we may want to
model temperatures at sites in the Alps as dependent on altitude a(x), where x is location, plus a smooth
function of time t, giving
$$\mu(x, t) = \beta_0 + \beta_1 a(x) + g(t), \quad \text{or} \quad \mu(x, t) = \beta_0 + \beta_1 a(x) + g\{t, a(x)\}.$$

Formulation


A standard approach is then to include basis functions in the model, so
$$\beta_0 + \beta_1 a(x) + g(t) \approx \beta_0 + \beta_1 a(x) + \gamma_1 b_1(t) + \cdots + \gamma_p b_p(t),$$
where the functions $b_1(t), \ldots, b_p(t)$ may be splines, polynomials, etc., depending on the properties sought.

Often take splines, which have optimality properties, and which correspond to prediction of stochastic
processes.

Usually penalise the parameters $\gamma = (\gamma_1, \ldots, \gamma_p)^T$, for example, maximising a penalised log likelihood of
form
$$\ell(\beta, \gamma) - \lambda \gamma^T K \gamma,$$
where smoothing parameter $\lambda > 0$ controls smoothness of result (equivalent to effective degrees of
freedom).

Interpretation in terms of mixed model, by representing $\gamma \sim N_p(0, \lambda^{-1} K^{-1})$, allows data to choose degree
of smoothing.

Example: Swiss winter temperatures


[Figure: 21 panels, one per station, of temperature below threshold (degrees Celsius, 0 to 20) against year, 1970 to 1990. Stations: Grachen, Sils Maria, Arosa, Chateau d'Oex, La Brevine, Guttannen, Oberiberg, Chaumont, Andermatt, Fribourg, Haidenhaus, Heiden, Einsiedeln, Vattis, Elm, Rheinfelden, Grono, Altstatten, Thun, Ebnat Kappel, Meiringen.]

December-February temperatures (degrees C) below thresholds at 21 Swiss weather stations, for winters
1970-71 to 1997-98.

Swiss winter temperatures

[Figure: panels showing estimated lambda, sigma and xi plotted against Day, North Atlantic oscillation index, and Altitude.]

Swiss winter temperatures


Estimated 0.05 quantiles for winter temperatures at Rheinfelden (298m), Vattis (957m), and Arosa
(1821m), as a function of the North Atlantic oscillation index

[Figure: three panels (Rheinfelden, Vattis, Arosa) of temperature (degrees Celsius) against North Atlantic oscillation index.]

Issues with smoothing extremes




Care needed with choice of model: with GPD, change of threshold $u \mapsto u'$ changes scale
parameter:
$$\sigma_u \mapsto \sigma_{u'} = \sigma_u + \xi(u' - u),$$
so, for example, the formulation
$$(\sigma_u, \xi) = (\exp\{g(t)\}, h(x))$$
at threshold $u$ will become
$$(\sigma_{u'}, \xi) = \left(\exp\{g(t)\} + h(x)(u' - u),\ h(x)\right)$$
at threshold $u'$, so interpretation depends on threshold: should avoid.

Better to fit on GEV scale, or using point process formulation, for which parametrization is
invariant.

Multivariate Extremes


Multivariate extremes
Lack of data means the precision of extreme value estimates is often poor.
The only way to overcome this is to incorporate additional information, suggesting the use of
multivariate models.
Questions include:


What issues are important when contemplating multivariate extremes?

What are appropriate ways to summarize dependence in extremes?

What models are suggested by asymptotic theory?

How should inference be carried out?


Componentwise maxima
If $(X_1, Y_1), (X_2, Y_2), \ldots \overset{\text{iid}}{\sim} F(x, y)$, define
$$M_{x,n} = \max_{i=1,\ldots,n}\{X_i\} \quad \text{and} \quad M_{y,n} = \max_{i=1,\ldots,n}\{Y_i\},$$
then
$$M_n = (M_{x,n}, M_{y,n})$$
is the vector of componentwise maxima.

The asymptotic theory of multivariate extremes begins with an analysis of $M_n$ as $n \to \infty$.

The issue is partly resolved by recognizing that $\{X_i\}$ and $\{Y_i\}$ considered separately are sequences
of independent, univariate random variables, to which the earlier theory may be applied.

Marginal standardization
Representations are especially simple if we assume that both $X_i$ and $Y_i$ have the standard Frechet
distribution, with distribution function
$$F(z) = \exp(-1/z), \quad z > 0.$$
Then defining
$$M^*_n = \left(\max_{i=1,\ldots,n}\{X_i\}/n,\ \max_{i=1,\ldots,n}\{Y_i\}/n\right),$$
the marginal distributions of $M^*_n$ are standard Frechet for all $n$. Remaining questions concern
dependence of limit distribution only.

Limit distribution of componentwise maxima

Let $M^*_n = (M^*_{x,n}, M^*_{y,n})$ be componentwise maxima of independent vectors with standard Frechet marginal
distributions. Then if
$$\Pr\{M^*_{x,n} \le x, M^*_{y,n} \le y\} \overset{d}{\to} G(x, y),$$
where $G$ is a non-degenerate distribution function, $G$ has the form
$$G(x, y) = \exp\{-V(x, y)\}, \quad x > 0,\ y > 0,$$
where
$$V(x, y) = 2\int_0^1 \max\left(\frac{w}{x}, \frac{1 - w}{y}\right) dH(w),$$
and $H$ is a distribution function on $[0, 1]$ satisfying the mean constraint
$$\int_0^1 w\, dH(w) = 1/2.$$

If $H$ is differentiable with density $h$, then
$$V(x, y) = 2\int_0^1 \max\left(\frac{w}{x}, \frac{1 - w}{y}\right) h(w)\, dw.$$

Special cases


Independence: when $H$ is a measure with masses 0.5 on $w = 0$ and $w = 1$,
$$G(x, y) = \exp\{-(x^{-1} + y^{-1})\}, \quad x > 0, \ y > 0.$$

Perfect dependence: when $H$ is a measure that places unit mass on $w = 0.5$,
$$G(x, y) = \exp\{-\max(x^{-1}, y^{-1})\}, \quad x > 0, \ y > 0,$$
which is the distribution function of variables that are marginally standard Fréchet, but which are
perfectly dependent: $X = Y$ with probability 1.
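
Both special cases follow by substituting a discrete measure into the integral for $V$, which then reduces to a finite sum. The sketch below is illustrative only; the representation of $H$ as a list of (location, mass) pairs and the function names are assumptions made for this example.

```python
import math


def V_discrete(x, y, atoms):
    """V(x, y) = 2 * sum_k m_k * max(w_k / x, (1 - w_k) / y) for a discrete H.

    `atoms` is a list of (w_k, m_k) pairs: atom locations in [0, 1] and masses.
    """
    total_mass = sum(m for _, m in atoms)
    mean = sum(w * m for w, m in atoms)
    # H must be a distribution function on [0, 1] with mean 1/2.
    if abs(total_mass - 1.0) > 1e-12 or abs(mean - 0.5) > 1e-12:
        raise ValueError("H must have total mass 1 and mean 1/2")
    return 2.0 * sum(m * max(w / x, (1.0 - w) / y) for w, m in atoms)


def G_discrete(x, y, atoms):
    """Limit distribution function G(x, y) = exp{-V(x, y)}."""
    return math.exp(-V_discrete(x, y, atoms))


INDEPENDENCE = [(0.0, 0.5), (1.0, 0.5)]   # masses 0.5 at w = 0 and w = 1
PERFECT_DEPENDENCE = [(0.5, 1.0)]         # unit mass at w = 0.5
```

With $x = 2$ and $y = 3$, `V_discrete(2, 3, INDEPENDENCE)` gives $1/2 + 1/3$ and `V_discrete(2, 3, PERFECT_DEPENDENCE)` gives $\max(1/2, 1/3)$.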
Statistics of Extremes

January 2008 slide 116

Parametric models


For modelling purposes, it is usual to specify a parametric family for H or h that encompasses a
wide range of dependence types over the parametric domain.

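Standard example is the logistic model, with
$$h(w) = \tfrac{1}{2} (\alpha^{-1} - 1) \{w(1 - w)\}^{-1 - 1/\alpha} \{w^{-1/\alpha} + (1 - w)^{-1/\alpha}\}^{\alpha - 2}$$
for $0 < \alpha < 1$.

In this case
$$G(x, y) = \exp\left\{ -\left( x^{-1/\alpha} + y^{-1/\alpha} \right)^{\alpha} \right\}, \quad x > 0, \ y > 0.$$

Independence and perfect dependence arise as limits as $\alpha \to 1$ and $\alpha \to 0$ respectively.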
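
The link between the logistic density $h$ and the closed form of $G$ can be verified numerically. The sketch below is an illustration under stated assumptions: it uses a plain midpoint rule (an assumed choice, adequate for moderate $\alpha$) and the function names are hypothetical. It checks that $h$ integrates to 1, has mean 1/2, and that $2\int \max(w/x, (1-w)/y) h(w)\,dw$ agrees with $(x^{-1/\alpha} + y^{-1/\alpha})^{\alpha}$.

```python
def h_logistic(w, alpha):
    """Logistic model density h(w) on (0, 1), for 0 < alpha < 1."""
    a = 1.0 / alpha
    return (0.5 * (a - 1.0)
            * (w * (1.0 - w)) ** (-1.0 - a)
            * (w ** (-a) + (1.0 - w) ** (-a)) ** (alpha - 2.0))


def midpoint(f, m=20000):
    """Midpoint-rule approximation to the integral of f over (0, 1)."""
    step = 1.0 / m
    return step * sum(f((k + 0.5) * step) for k in range(m))


def V_numeric(x, y, alpha):
    """V(x, y) by numerical integration against the logistic density."""
    return 2.0 * midpoint(
        lambda w: max(w / x, (1.0 - w) / y) * h_logistic(w, alpha))


def V_closed(x, y, alpha):
    """Closed form of V(x, y) for the logistic model."""
    return (x ** (-1.0 / alpha) + y ** (-1.0 / alpha)) ** alpha
```

For $\alpha = 0.5$ the density simplifies to $h(w) = \tfrac{1}{2}\{w^2 + (1 - w)^2\}^{-3/2}$, which makes the two constraints easy to confirm by hand as well.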

Statistics of Extremes

January 2008 slide 117


Alternative models
A limitation of the logistic model is its symmetry. Asymmetric alternatives include the


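bilogistic model
$$h(w) = \tfrac{1}{2} (1 - \alpha)(1 - w)^{-1} w^{-2} (1 - u) u^{1 - \alpha} \{\alpha(1 - u) + \beta u\}^{-1}$$
on $0 < w < 1$, where $0 < \alpha < 1$ and $0 < \beta < 1$, and $u = u(w, \alpha, \beta)$ is the solution of
$$(1 - \alpha)(1 - w)(1 - u)^{\beta} - (1 - \beta) w u^{\alpha} = 0;$$

and the Dirichlet model
$$h(w) = \frac{\alpha \beta \Gamma(\alpha + \beta + 1) (\alpha w)^{\alpha - 1} \{\beta(1 - w)\}^{\beta - 1}}{2 \Gamma(\alpha) \Gamma(\beta) \{\alpha w + \beta(1 - w)\}^{\alpha + \beta + 1}}, \quad 0 < w < 1,$$
for parameters $\alpha > 0$ and $\beta > 0$.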


Statistics of Extremes

January 2008 slide 118

Inference for multivariate extreme value models


Inference consists of the following steps:
1. estimation of marginal distributions and transformation to standard Frechet;
2. choice of the dependence model H;
3. estimation of the parameters of H by maximum likelihood;
4. model assessment.
There is a potential gain in efficiency by estimating marginal and dependence parameters in a single
likelihood maximization.
Statistics of Extremes

January 2008 slide 119

Structure variables


Though processes may be multivariate, it may be that a univariate function, a so-called
structure variable, is the quantity of interest.

This suggests two possible methods of analysis:

1. univariate extreme value analysis of the structure variable process; or

2. multivariate analysis of the full process, followed by marginalization to the structure variable
of interest.

Statistics of Extremes

January 2008 slide 120


Inference for structure variables




Univariate inference is trivial.

For multivariate inference, if $Z = \phi(M_x, M_y)$ is the structure variable of interest, and
$M = (M_x, M_y)$ has density $g$, then
$$\Pr\{Z \le z\} = \int_{A_z} g(x, y) \, dx \, dy,$$
where $A_z = \{(x, y) : \phi(x, y) \le z\}$.

Sometimes integration can be avoided. For example, if $Z = \max\{M_x, M_y\}$,
$$\Pr\{Z \le z\} = \Pr\{M_x \le z, M_y \le z\} = G(z, z),$$
where $G$ is the joint distribution function of $M$.
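
As a worked illustration of the case $Z = \max\{M_x, M_y\}$, suppose (an assumption made only for this sketch) that both margins are standard Fréchet and that $G$ is the logistic model with parameter $\alpha$. Then $G(z, z) = \exp(-2^{\alpha}/z)$, and the level exceeded with probability $1/T$ solves $G(z, z) = 1 - 1/T$ in closed form. The function names are hypothetical; in an application the result would still have to be transformed back to the original marginal scales.

```python
import math


def G_logistic(x, y, alpha):
    """Logistic bivariate extreme value distribution with Frechet margins."""
    return math.exp(-(x ** (-1.0 / alpha) + y ** (-1.0 / alpha)) ** alpha)


def return_level_max(T, alpha):
    """T-period return level of Z = max(Mx, My): solves G(z, z) = 1 - 1/T.

    Uses G(z, z) = exp(-2**alpha / z), valid on the standard Frechet scale.
    """
    return -(2.0 ** alpha) / math.log(1.0 - 1.0 / T)
```

The return level increases with $\alpha$: weaker dependence makes a large value of the maximum more likely, so treating dependent components as independent overstates the return level of the maximum.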

Statistics of Extremes

January 2008 slide 121

Example: Sea levels

[Figure: scatterplot of Port Pirie annual maximum sea level (m) against Fremantle annual maximum sea level (m).]

Annual maximum sea levels at Port Pirie and Fremantle. Logistic model: $\hat{\alpha} = 0.922\ (0.087)$, implying
near-independence.
Statistics of Extremes

January 2008 slide 122

Analysis of structure variable

[Figure: return level (m) against return period (years).]

Return levels of structure variable $Z = \max\{M_x, (M_y - 2.5)\}$ using both univariate and multivariate
methods.
Statistics of Extremes

January 2008 slide 123

Impact of dependence on structure variables I

[Figure: return level (m) against return period (years).]

$Z = \max\{M_x, (M_y - 2.5)\}$ in logistic model analysis of Fremantle and Port Pirie annual maximum
sea-level series with $\alpha = 0, 0.25, 0.5, 0.75, 1$ ($\alpha = 0$ is lowest curve).
Statistics of Extremes

January 2008 slide 124

Impact of dependence on structure variables II

[Figure: return level (m) against return period (years).]

$Z = \min\{M_x, (M_y - 2.5)\}$ in logistic model analysis of Fremantle and Port Pirie annual maximum
sea-level series with $\alpha = 0, 0.25, 0.5, 0.75, 1$ ($\alpha = 0$ is highest curve).
Statistics of Extremes

January 2008 slide 125

Point process representation




As in the univariate case, there are threshold exceedance and point process characterizations of
extremes that enable a greater use of available information.

The point process representation, in particular, also provides some insight into the measure
function H that appears in the componentwise limit law.

Statistics of Extremes

January 2008 slide 126

Point process construction




$(X_1, Y_1), (X_2, Y_2), \ldots$ a sequence of independent variables with standard Fréchet margins.

Assume componentwise maxima convergence
$$\Pr\{M_{x,n}^* \le x, M_{y,n}^* \le y\} \to G(x, y).$$

Define sequence of point processes $\{N_n\}$ by
$$N_n = \{(n^{-1} X_1, n^{-1} Y_1), \ldots, (n^{-1} X_n, n^{-1} Y_n)\}.$$

Normalization by $n$ of standard Fréchet variables is required to obtain marginal convergence.


Statistics of Extremes

January 2008 slide 127


Point process limit




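On regions bounded from the origin $(0, 0)$, we have
$$N_n \stackrel{d}{\to} N,$$
where $N$ is a non-homogeneous Poisson process on $(0, \infty) \times (0, \infty)$.

Moreover, in terms of pseudo-polar coordinates
$$r = x + y \quad \text{and} \quad w = \frac{x}{x + y},$$
the intensity function of the limiting process $N$ is
$$\lambda(r, w) = 2 \, \frac{dH(w)}{r^2}.$$

The function $H$ is related to $G$ by
$$G(x, y) = \exp\left\{ -2 \int_0^1 \max\left( \frac{w}{x}, \frac{1 - w}{y} \right) dH(w) \right\}.$$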

Statistics of Extremes

January 2008 slide 128

Comments


The intensity function of the limit process factorizes across r and w. Loosely, the relative
magnitude of extreme events in the two variables is independent of the magnitude itself.

The limit provides an interpretation of H as the distribution of relative magnitudes of extreme


events between the two variables.

In practice, as usual, the limit process is taken to be exact for extreme enough events: in this
case for events which are sufficiently far from $(0, 0)$.

Statistics of Extremes

January 2008 slide 129

Consistency of results

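$$\Pr\{M_{x,n}^* \le x, M_{y,n}^* \le y\} = \Pr\{N_n(A) = 0\},$$
where
$$A = \{(0, \infty) \times (0, \infty)\} \setminus \{(0, x) \times (0, y)\}.$$
So,
$$\Pr\{M_{x,n}^* \le x, M_{y,n}^* \le y\} \to \Pr\{N(A) = 0\} = \exp\{-\Lambda(A)\},$$
where
$$\begin{aligned}
\Lambda(A) &= \int_A 2 \, \frac{dr}{r^2} \, dH(w) \\
&= \int_{w=0}^{1} \int_{r=\min\{x/w, \, y/(1-w)\}}^{\infty} 2 \, \frac{dr}{r^2} \, dH(w) \\
&= 2 \int_{w=0}^{1} \max\left( \frac{w}{x}, \frac{1 - w}{y} \right) dH(w).
\end{aligned}$$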

Statistics of Extremes

January 2008 slide 130


Likelihood for Poisson process model


Choosing $A = \{(x, y) : x/n + y/n > r_0\}$, for large $r_0$,
$$\Lambda(A) = 2 \int_A \frac{dr}{r^2} \, dH(w) = 2 \int_{r=r_0}^{\infty} \frac{dr}{r^2} \int_{w=0}^{1} dH(w) = 2/r_0,$$
which is constant with respect to the parameters of $H$.

Hence, assuming $H$ has density $h$,
$$L(\theta; (x_1, y_1), \ldots, (x_n, y_n)) = \exp\{-\Lambda(A)\} \prod_{i=1}^{N_A} \lambda(x_{(i)}/n, y_{(i)}/n) \propto \prod_{i=1}^{N_A} h(w_i),$$
where $w_i = x_{(i)}/(x_{(i)} + y_{(i)})$ for the $N_A$ points $(x_{(i)}, y_{(i)})$ falling in $A$.
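
Because $\Lambda(A) = 2/r_0$ does not involve the parameters of $H$, the log-likelihood reduces, up to an additive constant, to a sum of $\log h(w_i)$ over the points in $A$. The sketch below illustrates this for the logistic density. It assumes the data have already been transformed to standard Fréchet margins, and the grid search over $\alpha$ is a deliberately crude stand-in for a proper optimizer; all function names are hypothetical.

```python
import math


def h_logistic(w, alpha):
    """Logistic model density h(w) on (0, 1), for 0 < alpha < 1."""
    a = 1.0 / alpha
    return (0.5 * (a - 1.0)
            * (w * (1.0 - w)) ** (-1.0 - a)
            * (w ** (-a) + (1.0 - w) ** (-a)) ** (alpha - 2.0))


def pp_loglik(alpha, data, r0):
    """Sum of log h(w_i) over the points with (x + y) / n > r0.

    `data` is a list of (x, y) pairs on the standard Frechet scale.
    """
    n = len(data)
    total = 0.0
    for x, y in data:
        if (x + y) / n > r0:          # point falls in the region A
            total += math.log(h_logistic(x / (x + y), alpha))
    return total


def fit_alpha(data, r0, grid=None):
    """Crude maximum likelihood estimate of alpha by grid search."""
    if grid is None:
        grid = [k / 100.0 for k in range(5, 100, 5)]   # 0.05, ..., 0.95
    return max(grid, key=lambda alpha: pp_loglik(alpha, data, r0))
```

Points with $w_i$ near 1/2 favour small $\alpha$ (strong dependence), while points with $w_i$ near 0 or 1 favour $\alpha$ close to 1.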
Statistics of Extremes

January 2008 slide 131

Comments


Choice of threshold r0 is based, as usual, on empirical measures of model stability.

Alternative forms of threshold curve lead to more complicated likelihood expressions.

Simultaneous estimation of marginal and dependence features is possibly advantageous, but again
results in a considerably more complicated likelihood.

Statistics of Extremes

January 2008 slide 132

Example: Exchange rate data

[Figure: two time series panels of log-daily return against year, 1997 to 2001.]

Log-daily returns of exchange rates. Top panel: UK sterling/US dollar exchange rate. Bottom panel:
UK sterling/Canadian dollar exchange rate.
Statistics of Extremes

January 2008 slide 133

Example: Exchange rate data

[Figure: scatterplot of Sterling v. Canadian Dollars against Sterling v. US Dollars.]

Concurrent values of exchange rates.


Statistics of Extremes

January 2008 slide 134

Example: Exchange rate data

[Figure: scatterplot of standardized Canada/UK daily returns against standardized US/UK daily returns.]

Concurrent values of exchange rates after transformation to Fréchet scale.


Statistics of Extremes

January 2008 slide 135

Example: Exchange rate data

[Figure: relative frequency plot on the interval from 0 to 1.]

Fitted logistic model density $h$ in point process analysis of exchange rate data ($\hat{\alpha} = 0.434\ (0.025)$).
Statistics of Extremes

January 2008 slide 136

Example: Oceanographic data

[Figure: scatterplot of surge (m) against wave height (m).]

Simultaneous values of wave and surge height.


Statistics of Extremes

January 2008 slide 137

Example: Oceanographic data

[Figure: relative frequency plot on the interval from 0 to 1.]

Various fitted models $h$ in point process analysis of wave-surge data.


Statistics of Extremes

January 2008 slide 138

Example: Oceanographic data


Model        Log-likelihood   $\hat{\alpha}$   $\hat{\beta}$
Logistic     227.2            0.659
Bilogistic   230.2            0.704            0.603
Dirichlet    238.2            0.852            0.502

Details of various fits to oceanographic data give support for asymmetric dependence structure.
Statistics of Extremes

January 2008 slide 139

Higher-dimensional models


Limit results and characterizations have analogous forms in higher dimensions.

Extremal dependence is characterized by a dependence function $H$ on the simplex
$$S_d = \left\{ (w_1, \ldots, w_d) : w_j \ge 0, \ j = 1, \ldots, d, \ \sum_{j=1}^{d} w_j = 1 \right\}$$
subject to
$$\int_{S_d} w_j \, dH(w) = 1, \quad j = 1, \ldots, d.$$

In principle, inference techniques remain the same, but


1.

model specification is more difficult;

2.

likelihood calculation is more difficult;

3.

reliability of asymptotic results is more questionable.

Statistics of Extremes

January 2008 slide 140


Asymptotic dependence
A limitation of multivariate extreme value models (and point process equivalents) is that apart from
the degenerate case of independence, all such models are asymptotically dependent; the intuition is
that because these are asymptotic (limiting) models, the extent of dependence cannot vary with the
rareness of the event, whereas this happens in practice.


Two variables $X$ and $Y$ with the same distribution are said to be asymptotically independent if
$$\lim_{z \to z^+} \Pr\{Y > z \mid X > z\} = 0,$$
where $z^+$ is the upper end point of the common distribution; otherwise they are asymptotically dependent.

Loosely, two variables are asymptotically independent if, given an extreme value of one variable,
the probability of an equally extreme value of the other tends to zero.

Statistics of Extremes

January 2008 slide 141

Bivariate normal variables




The class of asymptotically independent variables is non-trivial, and includes, for example, all
bivariate normal variables with positive correlation.

Using extreme value models for such distributions is misleading.

The limit is independence, but this provides a poor approximation to dependence at finite
levels.

A fitted model at any specific threshold will overestimate strength of dependence on
extrapolation.

Much recent work has looked at inference for asymptotically independent models: Ledford and Tawn,
Heffernan, Coles and Tawn, Heffernan and Tawn.
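
The behaviour of bivariate normal variables can be illustrated by simulation: the conditional probability $\Pr\{Y > z \mid X > z\}$ decreases as $z$ grows, even for positive correlation. The sketch below is a Monte Carlo illustration only; the function name, sample size, seed and the correlation value used in the comment are assumptions, not taken from the slides.

```python
import math
import random


def conditional_exceedance(z, rho, n=200000, seed=1):
    """Monte Carlo estimate of Pr{Y > z | X > z} for standard bivariate
    normal (X, Y) with correlation rho."""
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    n_x = 0    # number of draws with X > z
    n_xy = 0   # number of draws with X > z and Y > z
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + scale * rng.gauss(0.0, 1.0)   # corr(X, Y) = rho
        if x > z:
            n_x += 1
            if y > z:
                n_xy += 1
    return n_xy / n_x if n_x else float("nan")


# Example: with rho = 0.5 the estimate at z = 2.5 is clearly smaller
# than the estimate at z = 1.0, and it keeps shrinking at higher levels.
```

A model fitted at a moderate threshold and then held fixed would carry the dependence seen at that threshold to all higher levels, which is the overestimation on extrapolation described above.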

Statistics of Extremes

January 2008 slide 142


Spatial Extremes

slide 143

Spatiotemporal extremes
Many environmental extremal problems are spatial or temporal in nature, or both:


heatwaves

rainfall

avalanches

forest fires

storms

sea levels

Likely to be more important in the future; prediction (on different scales) is needed for long-range
planning, short-range evacuation, . . .
Statistics of Extremes

January 2008 slide 144

Geostatistics

slide 145

Geostatistics


Statistics of spatially-defined variables

Mostly a multivariate normal theory:

remove trends in mean and dispersion in space and time

transform residuals to standard normal margins

fit suitable spatial/space-time correlation functions

inferences using WLS, (likelihood), or Bayes (McMC)

More generally, set up model for response variable $Y(x)$ with $x \in \mathcal{X}$:
$$Y(x) \mid S(x) \stackrel{\text{ind}}{\sim} f(y; \theta), \quad S(x) \sim N\{\mu(x), \Omega(x)\},$$
and use Metropolis-Hastings algorithm for inference on $S(x)$, predictions of future $Y$, etc.
(Diggle, Tawn, Moyeed, 1998), lots of Bayesians
Statistics of Extremes

January 2008 slide 145


Example: Temperature data

Statistics of Extremes

January 2008 slide 146

Example: Temperature data

[Figure: maximum temperature, June, July, August, 2001-2005. Temperature anomaly (degrees Celsius) by year, 2001 to 2005, at 17 stations: Jungfraujoch (3580 m), Säntis (2490 m), Gd-St-Bernard (2472 m), Arosa (1840 m), Davos-Dorf (1590 m), Montana (1508 m), Engelberg (1035 m), Château d'Oex (985 m), Bern-Liebefeld (565 m), Zürich-MeteoSchweiz (556 m), Bad Ragaz (496 m), Neuchâtel (485 m), Oeschberg-Koppigen (483 m), Montreux-Clarens (405 m), Locarno-Monti (366 m), Basel-Binningen (316 m), Lugano (273 m).]

Statistics of Extremes

January 2008 slide 147


Geostatistics of extremes
Basic setup:




want to model extremes of process $Y(x, t)$, $(x, t) \in \mathcal{X} \times \mathcal{T} \subset \mathbb{R}^3$

time series (maybe intermittent); could be annual maxima, or daily values, or . . .
data available at
sites $x_d \in \mathcal{X}_D = \{x_1, \ldots, x_D\} \subset \mathcal{X}$
times $\mathcal{T}_d = \{t_{d,1}, \ldots, t_{d,n_d}\}$, for $d \in \{1, \ldots, D\}$

exposition simplified if $\mathcal{T}_d \equiv \mathcal{T} = \{t_1, \ldots, t_n\}$

Aim to compute distributions of quantities such as
$$R(t) = \int_{\mathcal{X}} r(x, t) \, I\{Y(x, t) \ge y_{\text{danger}}\} \, dx,$$
where $r(x, t)$ is population at risk if $Y(x, t)$ exceeds some level $y_{\text{danger}}$ at time $t$
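
For a simulated or predicted field on a regular grid, $R(t)$ can be approximated by a Riemann sum. The sketch below is a hypothetical illustration: it assumes the field and the population density are given as equal-length lists over grid cells of common area, and the function name and arguments are not from the slides.

```python
def population_at_risk(y_values, pop_density, y_danger, cell_area):
    """Riemann-sum approximation to R(t) at one time point.

    y_values    : field values Y(x, t), one per grid cell
    pop_density : population density r(x, t), one per grid cell
    y_danger    : danger level
    cell_area   : common area of the grid cells
    """
    if len(y_values) != len(pop_density):
        raise ValueError("field and population grids must match")
    return cell_area * sum(
        r for y, r in zip(y_values, pop_density) if y >= y_danger)
```

Applying this function to each of many simulated fields gives a Monte Carlo sample from the distribution of $R(t)$.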

Statistics of Extremes

January 2008 slide 148

Approaches
Four (?) main approaches:


latent variable (often Bayesian) approach

copulas

Heffernan-Tawn

multivariate extremes (max-stable processes, Smith 1990)

Statistics of Extremes

January 2008 slide 149

Latent variable approach





Conditional on underlying process $S(x)$, observations $Y(x)$, for $x \in \mathcal{X}$, follow an extremal distribution
Examples:
$$Y(x) \mid S(x) = (\eta(x), \tau(x), \xi(x)) \stackrel{\text{ind}}{\sim} \text{GEV}\{y; S(x)\}, \quad S(x) \sim N_3\{\mu(x), \Omega(x)\}$$
$$Y(x) \mid S(x) = (\tau(x), \xi(x)) \stackrel{\text{ind}}{\sim} \text{GPD}\{y; S(x)\}, \quad S(x) \sim N_2\{\mu(x), \Omega(x)\}$$

Examples: Casson and Coles (1999, Extremes); Cooley et al. (2007, JASA)
Could use copulas to transform margins to Gaussian, then fit geostatistical models (Gaussian
anamorphosis)

Advantages: computationally feasible for large-scale problems using standard simulation techniques
(Metropolis-Hastings algorithm, . . .), possibility of estimating probabilities for complex events

Disadvantages: all extremal dependencies are incorporated through $S(x)$; marginal distributions are not
extremal; difficult to incorporate full range of possible extremal dependencies

Statistics of Extremes

January 2008 slide 150


Heffernan-Tawn (2004)

Use decomposition
$$\Pr(Y \in C) = \sum_{d=1}^{D} E_{Y_d}\{\Pr(Y_{-d} \in C_d \mid Y_d) f(Y_d)\},$$
where $C = \bigcup_d C_d$ and $C_d \cap C_c = \emptyset$, for $c \ne d$, with
$$C_d = C \cap \{y \in \mathbb{R}^D : F_{Y_d}(y_d) > F_{Y_c}(y_c), \ c = 1, \ldots, D, \ c \ne d\}$$

Use GPD for $Y_d$ and nonparametric linear model for conditional probability, and simulate to estimate
$\Pr(Y \in C)$.
Advantages: can deal with large $D$, extends to near-independence models
Disadvantages: models on different $C_d$ need not be coherent, uniqueness of representation for conditional
probability unclear, inference messy

Statistics of Extremes

January 2008 slide 151

Max-stable processes
General setup:


$Z(x) \sim \text{Fréchet}(1)$, for $x \in \mathcal{X}$, so if the $Z_m(\cdot)$ are independent and identically distributed, then
$$m^{-1} \max\{Z_1(x), \ldots, Z_m(x)\} \sim \text{Fréchet}(1)$$

Joint distribution at any subset $\mathcal{D} = \{x_1, \ldots, x_D\}$ of sites has multivariate extreme-value
distribution, and extremal coefficient
$$\theta_{\mathcal{D}} = \int_{S_D} \max_{d \in \mathcal{D}} w_d \, dH(w_1, \ldots, w_D),$$
where $D = |\mathcal{D}|$, $S_D$ is the unit simplex in $\mathbb{R}^D$, and $H$ is a measure on $S_D$ with all its marginal
expectations equal to 1; we have $1 \le \theta_{\mathcal{D}} \le D$.
Statistics of Extremes

January 2008 slide 152

Extremal coefficient


Summary of dependence in subset of variables

If $Z_1, \ldots, Z_d$ all marginally unit Fréchet, and joint distribution is max-stable, then
$$\Pr\{Z_1 \le z, \ldots, Z_d \le z\} = \exp(-\theta_{\mathcal{D}}/z), \quad z > 0,$$
where extremal coefficient $\theta_{\mathcal{D}}$ depends on subset.

Interpretation: $1 \le \theta_{\mathcal{D}} \le d$, where
$\theta_{\mathcal{D}} = 1$ corresponds to complete dependence of maxima in subset,
$\theta_{\mathcal{D}} = d$ corresponds to independence of maxima in subset

Not a complete summary of joint distribution, but basis of useful diagnostics, as can estimate $\theta_{\mathcal{D}}$ quite
easily.
Rainfall example (sketch!)
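
One simple way to estimate the extremal coefficient follows directly from the formula above: since $\Pr\{\max_j Z_j \le z\} = \exp(-\theta/z)$, the reciprocal $1/\max_j Z_j$ is exponentially distributed with rate $\theta$, whose maximum likelihood estimate is $n / \sum_i (1/\max_j z_{ij})$. The sketch below illustrates this estimator; it is one possible choice rather than necessarily the one used for the slides, it assumes exactly unit Fréchet margins, and the function names are hypothetical.

```python
import math
import random


def rfrechet(rng):
    """Draw one unit Frechet variate by inverting F(z) = exp(-1/z)."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return -1.0 / math.log(u)


def extremal_coefficient(sample):
    """Estimate theta from rows of unit Frechet values, one row per replicate.

    Uses the fact that 1 / max(row) is exponential with rate theta.
    """
    return len(sample) / sum(1.0 / max(row) for row in sample)
```

For independent pairs the estimate should be near 2, and for identical components near 1, matching the two ends of the interpretation given above.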

Statistics of Extremes

January 2008 slide 153


Madogram


Exploratory tool (analogous to variogram), designed for dealing with many extremes on a map

F-madogram (Poncet, Cooley, Naveau, 2006/7?)
$$\frac{1}{2} E[|F\{Y(x + h)\} - F\{Y(x)\}|] = \frac{1}{2} \, \frac{\theta(h) - 1}{\theta(h) + 1},$$
where $\theta(h)$ is pairwise extremal coefficient at range $h$

Extended by Naveau et al. (2006) to estimation of dependence function $V_h$ in expression
$$\Pr\{Y(x) \le y_1, Y(x + h) \le y_2\} = \exp\{-V_h(y_1, y_2)\},$$
where
$$V_h(y_1, y_2) = 2 \int_0^1 \max\left( \frac{w}{y_1}, \frac{1 - w}{y_2} \right) dH_h(w).$$

Statistics of Extremes

January 2008 slide 154

Spectral representations I
de Haan (1984):


A continuous (in probability) max-stable process $Z(x)$ may be expressed as
$$Z(x) \stackrel{D}{=} \max_k U_k f(x - T_k),$$
where $(U_k, T_k)$ are the points of a Poisson process on $\mathbb{R}_+ \times [0, 1]$ with measure $du/u^2 \times \nu(dt)$,
where $\nu$ is a finite measure on $[0, 1]$ and $f_k$ are non-negative $L^1$ functions.


Interpretation in terms of storms of sizes Uk and shape f centred at Tk

Exploited by Smith (1990), Coles (1993), Coles and Walshaw (1994), de Haan and Pereira (2006)
with various choices of f (normal, t, Laplace, circular)

Can compute joint density of Z(x1 ), Z(x2 ), but no further

Statistics of Extremes

January 2008 slide 155


Spectral representations II
Schlather (2002):


Let $V(x)$ be a stationary process on $\mathbb{R}^d$ with $\mu = E \max\{0, V(x)\} < \infty$, and let $\Pi$ be a Poisson process on
$(0, \infty)$ with intensity $\mu^{-1} ds/s^2$. If the $V_s(x)$ are independent copies of $V(x)$, then
$$Z(x) = \max_{s \in \Pi} s V_s(x)$$
is a stationary max-stable random process with unit Fréchet margins.

Interpretation in terms of maxima of random sheets

Example: $V_s(\cdot)$ stationary isotropic Gaussian processes with correlation $\rho(h)$, then
$$\log \Pr\{Z(x_1) \le z_1, Z(x_2) \le z_2\} = -\frac{1}{2} \left( \frac{1}{z_1} + \frac{1}{z_2} \right) \left( 1 + \left[ 1 - 2 \, \frac{\{\rho(h) + 1\} z_1 z_2}{(z_1 + z_2)^2} \right]^{1/2} \right)$$

Corresponding extremal coefficient $\theta(h) = 1 + 2^{-1/2} \{1 - \rho(h)\}^{1/2}$ has natural bounds (Schlather and
Tawn, 2003)
Can only represent positive dependence, but this is the most likely case in practice
Statistics of Extremes

January 2008 slide 156

Pairwise likelihood


Data $Z_1, \ldots, Z_n$ are from (unavailable) full joint density $f(z_1, \ldots, z_n; \theta)$, yielding log likelihood
$\ell(\theta) = \log f(z_1, \ldots, z_n; \theta)$.

If lower-order marginal densities are available, construct composite likelihood by taking product of
(non-independent!) densities.

Simplest example is pairwise log likelihood
$$\ell_2(\theta) = \sum_{i > j} \log f(z_i, z_j; \theta),$$
constructed from all distinct disjoint pairs of observations.

If $\theta$ is identifiable from the pairwise marginal densities, and if $Z^1, \ldots, Z^n \stackrel{\text{iid}}{\sim} f(z; \theta)$, with $Z^j = (Z_1^j, \ldots, Z_D^j)$, then
under mild regularity conditions the maximum pairwise likelihood estimator $\tilde{\theta}$ is consistent, and satisfies
$$\tilde{\theta} \mathrel{\dot\sim} N\{\theta, J(\theta)^{-1} K(\theta) J(\theta)^{-1}\} \quad \text{as } n \to \infty.$$
Statistics of Extremes

January 2008 slide 157


Example: Temperature data

Statistics of Extremes

January 2008 slide 158

Swiss summer temperature data


For illustration:



annual maximum temperature data at $D = 17$ Swiss sites, 1961-2006

fit GEV to standardised values $Y(x_d, t_j)$ with
$$\eta_{d t_j} = \eta_{d0} + \eta_{d1} t_j, \quad \tau_d, \quad \xi_d, \quad d = 1, \ldots, 17, \ j = 1, \ldots, 46,$$
and obtain $Z_{d t_j} \stackrel{\text{iid}}{\sim} \text{Fréchet}(1)$ using estimated probability integral transform

estimate surfaces for $\hat{\eta}_0, \hat{\eta}_1, \hat{\tau}, \hat{\xi}$ using splines

use likelihood estimation to obtain $\hat{\rho}_{ij}$ for each pair of sites (model checking)

use pairwise likelihood to fit stationary isotropic covariance function
$$\rho(u) = (1 - \alpha_1) + \alpha_1 \exp\{-(u/\alpha_2)^{\alpha_3}\}, \quad 0 \le \alpha_1 \le 1, \ \alpha_2, \alpha_3 > 0,$$
with estimated values $\alpha_1 = 0.62\ (0.2)$, $\alpha_2 = 360\ (100)$ km, $\alpha_3 = 1.59\ (1.29)$

simulate max-stable random fields $Z^*(x)$ from fitted model, then transform back to real scale $Y^*(x)$ and
assess . . .

Statistics of Extremes

January 2008 slide 159

Fit of marginal model

[Figure: fit of marginal model, one panel per station: Engelberg, Gd-St-Bernard, Montana, Lugano, Montreux-Clarens, Neuchâtel, Oeschberg-Koppigen, Davos-Dorf, Locarno-Monti, Château d'Oex, Bern-Liebefeld, Basel-Binningen, Bad Ragaz, Arosa, Jungfraujoch, Säntis, Zürich-MeteoSchweiz.]

Statistics of Extremes

January 2008 slide 160

Estimated correlations

[Figure: two panels of coefficient estimated from Schlather model against distance between two stations. Schlather model coefficients (red points) and their confidence intervals (gray error bars) and Weibull model curve vs. distance.]

Individual points: MLEs of correlations ± 2 SE (grey bars) as a function of distance ($10^3$ km) between pairs of
points
Solid line: fitted correlation estimated using pairwise likelihood
Left: temperature data; right: simulated data

Statistics of Extremes

January 2008 slide 161


Fit of correlation curve


Normal QQ plot of $\{g(\hat{\rho}_{ij}) - g(\tilde{\rho}_{ij})\}/\text{SE}$, for Fisher z transform $g$

[Figure: normal QQ plot, sample quantiles against theoretical quantiles.]

Statistics of Extremes

January 2008 slide 162

Pairwise fits

[Figure: pairwise fits for site 1 (altitude 1840 m) against each other site. Panels are labelled pair(1, k) with the altitudes (m) of the two sites; the second altitudes are, for k = 2, ..., 17: 496, 316, 565, 985, 1590, 1035, 2472, 3580, 366, 273, 1508, 405, 485, 483, 2490, 556.]

Statistics of Extremes

January 2008 slide 163

Simulated fields

For illustration, following pages show simulated fields $y^*(x, 2007)$ ordered according to values of
$$\int_{\mathcal{X}} y^*(x, 2007) \, dx$$

Scale is standardized relative to robust location and scale at each site

Statistics of Extremes

January 2008 slide 164

[Figure: simulated field.]

500th hottest year out of 1000 realisations for 2007.

Statistics of Extremes

January 2008 slide 165

[Figure: simulated field.]

898th hottest year out of 1000 realisations for 2007.

Statistics of Extremes

January 2008 slide 166

[Figure: simulated field.]

899th hottest year out of 1000 realisations for 2007.

Statistics of Extremes

January 2008 slide 167

Discussion


Schlather representation is very general, can envisage for other types of data:
rainfall time series at a single site, using addition of a random set (Mehdi)
rainfall in space and time?
snowfall in space (Juliette?)
etc.

Modelling issues:
What processes are sensible? Gaussian random fields? Levy processes?
How to extend to near-independence cases?
How to build in covariates? Smoothing? Pre-whitening or not?
How best to perform estimation and testing?
Bayesian inference?
How to build in physical knowledge of underlying processes?

Statistics of Extremes

January 2008 slide 168

