
Original Article

Dam safety prediction model considering chaotic characteristics in prototype monitoring data series

Structural Health Monitoring
2016, Vol. 15(6) 639–649
© The Author(s) 2016
Reprints and permissions: sagepub.co.uk/journalsPermissions.nav
DOI: 10.1177/1475921716654963
shm.sagepub.com

Huaizhi Su1,2, Zhiping Wen3, Zhexin Chen2 and Shiguang Tian4

Abstract
Support vector machine, chaos theory, and particle swarm optimization are combined to build the prediction model of
dam safety. The approaches are proposed to optimize the input and parameter of prediction model. First, the phase
space reconstruction of prototype monitoring data series on dam behavior is implemented. The method identifying
chaotic characteristics in monitoring data series is presented. Second, support vector machine is adopted to build the
prediction model of dam safety. The characteristic vector of historical monitoring data, which is taken as support vector
machine input, is extracted by phase space reconstruction. The chaotic particle swarm optimization algorithm is introduced
to determine support vector machine parameters. A chaotic support vector machine-based prediction model of
dam safety is built. Finally, the displacement behavior of one actual dam is taken as an example. The prediction capability
of the built prediction model of dam displacement is evaluated. It is indicated that the proposed chaotic support vector
machine-based model can provide more accurate forecasted results and is more suitable for efficiently identifying
the dam behavior.

Keywords
Dam safety, prediction model, support vector machine, chaos theory, particle swarm optimization

1 State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering, Hohai University, Nanjing, China
2 College of Water Conservancy and Hydropower Engineering, Hohai University, Nanjing, China
3 Department of Computer Engineering, Nanjing Institute of Technology, Nanjing, China
4 National Engineering Research Center of Water Resources Efficient Utilization and Engineering Safety, Hohai University, Nanjing, China

Corresponding author:
Huaizhi Su, State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering, Hohai University, Nanjing 210098, China.
Email: su_huaizhi@hhu.edu.cn

Introduction

The methods in mathematics, mechanics, and informatics are often used to analyze the prototype monitoring data on dam behavior and establish the prediction model of dam safety. The dam behavior is identified and assessed with the built model.1,2 In fact, the construction of a prediction model is equivalent to a machine learning problem. The monitoring data series on dam behavior is taken as the training sample set to implement the training operation and obtain a learning machine. Compared to other machine learning methods, support vector machine (SVM) has some characteristics such as perfect theory, kernel technology, global optimum, and good generalization ability. It is more suitable to solve the nonlinear regression analysis problem with high dimension.3–8

The prototype monitoring data can reflect the dynamic dam behavior under the action of environmental and external loads. Furthermore, a remarkable correlation exists in historical monitoring data on dam behavior. So, the characteristic vector of historical monitoring data can be extracted as SVM input and the trained SVM can be used to forecast the coming dam behavior. Some research work has demonstrated that the prototype monitoring data series on dam behavior has chaotic characteristics.9,10 The unordered chaotic time series presents a certain regularity, which is sensitive to initial value. It is difficult to forecast the long-term behavior according to chaotic time series. The short-term behavior is certain and predictable. After the phase space reconstruction of chaotic time series is implemented, the original rule of chaotic

system can be extracted from the historical data. The nonlinear model is selected to approximate the dynamic system characteristics in the reconstructed phase space, and the behavior within a certain period can be forecasted. Considering chaotic characteristics in prototype monitoring data series on dam safety, this article adopts the phase space reconstruction technology to extract the characteristic vector of historical monitoring data, which is taken as SVM input. The chaotic support vector machine (CSVM)-based prediction model of dam safety is studied.

The SVM parameters influence remarkably the generalization ability of SVM-based model. As the conventional approaches, the grid search method and genetic algorithm can be used to select the SVM parameters. However, the grid search method needs more computation and longer operation time. The genetic algorithm is often very complicated and needs to design different crossover or mutation modes for different optimization problems. The particle swarm optimization (PSO), which searches the optimal solution through the individual collaboration, has the advantages such as simple structure and easy implementation. However, it is difficult to ensure the quality of initial particles, and it is easy to fall into a local optimal solution at a later stage. In this article, the chaotic particle swarm optimization (CPSO) is introduced to obtain the optimal parameters of SVM model.

Phase space reconstruction and chaotic characteristics identification of prototype monitoring data series on dam behavior

Phase space reconstruction method of prototype monitoring data series

To analyze the chaotic time series, the phase space reconstruction needs to be implemented in the first place. It is considered that the evolution characteristic of any component of one system is determined by other interactional components and is the comprehensive reflection of various components interaction. The development process of any component implies the information on other components of one system. So, the phase space of one system can be reconstructed by revealing the implicit information in the component time series. For the observed components, their observations at the delay points of some fixed time are regarded as new components. The new components can be taken to reconstruct an equivalent phase space.

Assume that $x$ represents the observed component, $x(t)$ ($t = 1, 2, \ldots, N$) is a monitoring data series. The phase space reconstruction can be fulfilled as follows. The appropriate delay time $\tau$ and embedding dimension $m$ are chosen. According to $x(t)$, $\tau$, and $m$, a new series is obtained as follows

$$X(t) = \{x(t),\, x(t+\tau),\, \ldots,\, x(t+(m-1)\tau)\} \tag{1}$$

where $t = 1, 2, \ldots, M$, $M = N - (m-1)\tau$. This $m$-dimensional state space, which consists of observed value and delay time values, is the reconstructed phase space. Obviously, the determination of delay time $\tau$ and embedding dimension $m$ in equation (1) is a key step of phase space reconstruction.

The delay time $\tau$ and embedding dimension $m$ are traditionally determined separately. In fact, a certain correlation between the above two parameters exists. The quality of phase space reconstruction is affected by independently choosing appropriate delay time $\tau$ and embedding dimension $m$ and especially the determination of embedding window width $\tau_w = (m-1)\tau$.

In the C-C method proposed by Kim et al.,11 the statistics constructed with the correlation integral of embedding time series is used to describe the correlation of nonlinear time series. The delay time $\tau$ and embedding window width $\tau_w$ are estimated, and the embedding dimension is calculated. The C-C method has the advantages such as small calculation and easy operation. A detailed description on the C-C method is as follows.

Assume that $x(n)$ ($n = 1, 2, \ldots, N$) represents a monitoring data series which will be taken to carry out the phase space reconstruction, $t$ is the delay time, and $m$ is the embedding dimension. The number of phase points is $M = N - (m-1)t$. The number of vector pairs related with above $M$ points in the phase space is calculated. The vectors with the distances less than a given positive number $r$ are called related vectors. In the reconstructed phase space, the correlation integral of embedding time series $X(i)$ is defined as the proportion of related vector pair number in $M^2$ possible matching pairs, namely

$$C(m, N, r, t) = \frac{1}{M^2} \sum_{i,j=1}^{M} \theta\bigl(r - \lVert X(i) - X(j) \rVert\bigr) \tag{2}$$

where

$$\theta(x) = \begin{cases} 0, & x \le 0 \\ 1, & x > 0 \end{cases} \tag{3}$$

The time series $x(n)$ ($n = 1, 2, \ldots, N$) is divided into $t$ disjoint subsequences with the length of $N/t$, namely

$$\begin{aligned}
&\{x(1),\, x(t+1),\, x(2t+1),\, \ldots\} \\
&\{x(2),\, x(t+2),\, x(2t+2),\, \ldots\} \\
&\qquad\vdots \\
&\{x(t),\, x(t+t),\, x(2t+t),\, \ldots\}
\end{aligned}$$
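As a rough illustration of equations (1) to (3) and of the subsequence split, the following Python sketch builds the delay vectors, evaluates the correlation integral over all $M^2$ ordered pairs, and divides a series into $t$ disjoint subsequences. The function names, the choice of the maximum norm for $\lVert\cdot\rVert$, and the zero-based indexing are assumptions made for this example; the article does not specify them.

```python
def delay_embed(x, m, tau):
    """Delay vectors X(t) = {x(t), x(t+tau), ..., x(t+(m-1)tau)} of equation (1).

    Returns M = N - (m - 1) * tau vectors (zero-based indexing is used here).
    """
    n_points = len(x) - (m - 1) * tau
    if n_points <= 0:
        raise ValueError("series too short for this m and tau")
    return [[x[t + k * tau] for k in range(m)] for t in range(n_points)]


def correlation_integral(vectors, r):
    """Fraction of the M^2 ordered pairs (i, j) with distance below r.

    Follows equations (2)-(3): theta(x) = 1 only for x > 0.
    ASSUMPTION: the maximum norm is used for the distance.
    """
    m_points = len(vectors)
    related = 0
    for a in vectors:
        for b in vectors:
            dist = max(abs(p - q) for p, q in zip(a, b))
            if r - dist > 0:
                related += 1
    return related / (m_points ** 2)


def split_subsequences(x, t):
    """Divide x into t disjoint subsequences {x(l), x(t+l), x(2t+l), ...}."""
    return [x[l::t] for l in range(t)]
```

Calling `delay_embed(series, m, tau)` and passing the result to `correlation_integral` for several radii gives the quantities from which the C-C statistics are built. The double loop costs $O(M^2)$ distance evaluations, which is adequate for a sketch but slow for long series.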

The statistic of each subsequence $S(m, N, r, t)$ is calculated as follows

$$S(m, N, r, t) = \frac{1}{t} \sum_{l=1}^{t} \Bigl\{ C_l(m, N/t, r, t) - \bigl[C_l(1, N/t, r, t)\bigr]^{m} \Bigr\} \tag{4}$$

where $C_l$ is the correlation integral obtained by the $l$th subsequence.

$S(m, N, r, t)$ reflects the autocorrelation characteristic of time series. If the time series is independent and identically distributed, then for the certain $m$ and $t$, when $N \to \infty$, $S(m, N, r, t)$ is equal to 0 for all $r$. However, the actual time series is finite and there is a significant correlation. Therefore, $S(m, N, r, t)$ is not equal to 0 in general. The local maximum time can be selected when $S(m, N, r, t) \sim t$ is the null point or when $S(m, N, r, t) \sim t$ shows the minimum changes for all neighborhood radius $r$. Under above cases, the points in phase space are almost uniformly distributed, and the chaotic trajectory is completely unfolded in phase space. So the maximum and minimum radiuses $r$ corresponding to $S(m, N, r, t)$ are chosen, and the following difference is defined

$$\Delta S(m, N, t) = \max\{S(m, N, r_i, t)\} - \min\{S(m, N, r_j, t)\}, \quad i \ne j \tag{5}$$

$\Delta S(m, N, t) \sim t$ can be used to measure the maximum deviation of $S(m, N, r, t)$ for all neighborhood radius $r$. The null points of $S(m, N, r, t) \sim t$ are the same for all $m$ and $r$, and the minimum values of $\Delta S(m, N, t) \sim t$ are also the same for all $m$. The delay time $\tau$ corresponds to the minimum value among the local maximum times $t$.

It is indicated that, when $2 \le m \le 5$, $\sigma/2 \le r \le 2\sigma$, and $N \ge 500$, good results can be obtained. $\sigma$ is the standard deviation of time series. According to equations (4) and (5), the three statistics are defined as follows

$$\bar{S}(t) = \frac{1}{16} \sum_{m=2}^{5} \sum_{j=1}^{4} S(m, N, r_j, t) \tag{6}$$

$$\Delta\bar{S}(t) = \frac{1}{4} \sum_{m=2}^{5} \Delta S(m, N, t) \tag{7}$$

$$S_{cor}(t) = \lvert \bar{S}(t) \rvert + \Delta\bar{S}(t) \tag{8}$$

In the C-C method, the delay time $\tau$ is regarded as the minimum between two local maximum times $t$ corresponding to the first null point of $\bar{S}(t)$ and the first minimum of $\Delta\bar{S}(t)$, and the embedding window width $\tau_w$ is regarded as the time $t$ corresponding to the minimum of $S_{cor}(t)$.

Chaotic characteristic identification method of prototype monitoring data series

Chaotic characteristics of prototype monitoring data series can be identified by calculating the characteristic parameters on strange attractor of chaotic signal. The common parameters include the Lyapunov exponent or largest Lyapunov exponent describing the divergence rate of adjacent track, the correlation dimension describing the attractor dimension, and the Kolmogorov entropy reflecting the information frequency. Above three parameters are attractor invariants. In this article, the largest Lyapunov exponent and correlation dimension are used to identify the chaotic characteristics of prototype monitoring data series on dam behavior.

Lyapunov exponent. Chaotic system is sensitive to the initial value. The slight variation of initial system state will cause the exponential divergence of system behavior with time, but it will eventually converge to a strange attractor. Lyapunov exponent is a parameter describing the divergence rate of adjacent track. Lyapunov exponent-based method is used to identify the chaotic characteristics of prototype monitoring data series according to the diffusion motion of phase orbit.

A chaotic system has at least one Lyapunov exponent which is greater than zero. It is an important feature distinguishing strange attractor and other attractors. So, only the maximum Lyapunov exponent $\lambda_1$ is usually estimated to implement the chaotic identification of actual dynamic systems with Wolf method, Jacobian method, and small data sets-based method.12 The calculation steps of small data sets-based method with small calculation and easy operation can be described as follows:

1. Determine the delay time $\tau$, the embedding dimension $m$ and the average period $p$.
2. Reconstruct the phase space

$$X(t) = \{x(t),\, x(t+\tau),\, \ldots,\, x(t+(m-1)\tau)\}, \quad t = 1, 2, \ldots, M.$$

3. Search the nearest point $X(\hat{t})$ of every point $X(t)$ in the phase space and limit short separation, namely

$$d_t(0) = \min_{\hat{t}} \lVert X(t) - X(\hat{t}) \rVert, \quad \lvert t - \hat{t} \rvert > p \tag{9}$$

where $\hat{t} = 1, 2, \ldots, M$ and $t \ne \hat{t}$.
4. Calculate the distance $d_t(i)$ of neighborhood point for every point $X(t)$ in the phase space after $i$

discrete time steps, namely

$$d_t(i) = \lVert X(t+i) - X(\hat{t}+i) \rVert, \quad i = 1, 2, \ldots, \min(M - t,\, M - \hat{t}) \tag{10}$$

5. The geometric significance of the maximum Lyapunov exponent $\lambda_1$ is to quantify the exponential divergence of initially close orbits. So the following can be known

$$d_t(i) = d_t(0)\, e^{\lambda_1 i \Delta t} \tag{11}$$

where $\Delta t$ is the time interval of monitoring data series. The following can be obtained by taking the logarithm on both sides of equation (11)

$$\ln d_t(i) = \ln d_t(0) + \lambda_1 i \Delta t, \quad t = 1, 2, \ldots, M \tag{12}$$

Obviously, the maximum Lyapunov exponent $\lambda_1$ can be approximately regarded as the slopes of the straight lines $\ln d_t(i)/\Delta t \sim i$ ($t = 1, 2, \ldots, M$), which can be obtained using the least square method, namely

$$y(i) = \frac{1}{\Delta t} \langle \ln d_t(i) \rangle \tag{13}$$

where $\langle \cdot \rangle$ represents the average value of all $t$. A linear area of $y(i) \sim i$ is selected and the least square method is adopted to generate a regression straight line. The slope of above straight line is the maximum Lyapunov exponent $\lambda_1$.

Correlation dimension. A key feature of chaotic system is that it has the strange attractor in the phase space. A basic mathematical parameter describing the strange attractor is its dimension. G-P algorithm is extensively applied to the correlation dimension calculation of above attractor according to the time series.13,14 The basic principle of G-P algorithm is introduced as follows.

After the monitoring data series, $x(t)$ ($t = 1, 2, \ldots, N$), is taken to implement the phase space delay reconstruction, a set of space vectors as follows can be obtained

$$X(t) = \{x(t),\, x(t+\tau),\, \ldots,\, x(t+(m-1)\tau)\}, \quad t = 1, 2, \ldots, M$$

A neighborhood radius $r$ is given and the correlation integral $C(r)$ can be calculated as follows

$$C(r) = \frac{1}{M^2} \sum_{i,j=1}^{M} \theta\bigl(r - \lVert X(i) - X(j) \rVert\bigr) \tag{14}$$

Obviously, the calculation results of correlation integral are related to the value of $r$. Correlation dimension $D$ is defined as follows

$$D = \lim_{r \to 0} \frac{\ln C(r)}{\ln r} \tag{15}$$

A double logarithmic curve of $\ln C(r) \sim \ln r$ can be obtained by giving some certain values of $r$ and calculating the corresponding correlation integral $C(r)$. The least square method is adopted to fit an optimal straight line on above curve. The straight line slope is called correlation exponent.

For a chaotic series, the correlation exponent increases with the embedding dimension $m$ and finally tends to a saturation value. The saturation value is the correlation dimension $D_2$. For a random series, the correlation exponent will keep increasing as the embedding dimension $m$ increases, and there is no saturation phenomenon. Therefore, the embedding dimension $m$ is increased gradually and the corresponding correlation exponent is calculated. The chaotic series or random series can be distinguished according to the saturation phenomenon of calculated correlation exponent.

CSVM-based prediction model of dam safety

Considering the chaotic characteristics in prototype monitoring data series on dam behavior, SVM and chaos theory are combined into chaotic SVM (CSVM), which is used to build the prediction model of dam safety.

CSVM input determination

The appropriate delay time $\tau$ and embedding dimension $m$ are chosen to implement the phase space reconstruction of chaotic monitoring data series on dam behavior, $x(t)$ ($t = 1, 2, \ldots, N$). The following phase space vector sequence is obtained

$$X(t) = \{x(t),\, x(t-\tau),\, \ldots,\, x(t-(m-1)\tau)\}, \quad t = 1 + (m-1)\tau, \ldots, N \tag{16}$$

It can be known from Takens embedding theorem that, as long as the chosen delay time $\tau$ and embedding dimension $m$ are appropriate, the trajectory in the reconstructed phase space is the dynamic equivalence of original system in the sense of differential homeomorphism. So, there is a function relationship as follows

$$X(t+h) = F(X(t)) = F\{x(t),\, x(t-\tau),\, \ldots,\, x(t-(m-1)\tau)\} \tag{17}$$

where $h$ is the prediction step length.

The change process of $X(t) \to X(t+h)$ can reflect the evolution of original unknown dynamic system. For the

case such as $h = 1$, the following first component of $X(t+h)$ can be obtained from equation (17)

$$x(t+1) = f\{x(t),\, x(t-\tau),\, \ldots,\, x(t-(m-1)\tau)\} \tag{18}$$

Obviously, $\{x(t),\, x(t-\tau),\, \ldots,\, x[t-(m-1)\tau]\}$ can be regarded as the input vector of SVM which is used to fulfill the regression prediction of $x(t+1)$.

CSVM parameter selection

The SVM parameter selection has a great influence on the learning effect and generalization ability of SVM which is used to carry out the regression analysis.15–18 The grid search method is usually used to select the penalty factor $C$ and the parameter $g$ of radial basis function (RBF) kernel function. In fact, it searches the optimal parameter combination from an exhaustive parameter combination series, which takes a long time. Therefore, some intelligent optimization algorithms such as PSO are combined with SVM to search the SVM parameters. PSO first initializes randomly a particle group in a search space. The position of each particle is a solution of optimization problem. Each particle has a fitness value measuring its performance, and there is a speed which decides the flight direction and distance of each particle. Then, the particles track the current optimal particle, dynamically adjust their speeds and positions, and the optimal solution is found by iteration. In each iteration, the particle updates itself by tracking two following extremes. One is the individual extreme pbest, namely, the current optimal solution found by the particle itself. Another is the global extreme gbest, namely, the current optimal solution found by the entire population. Compared with genetic algorithm, PSO algorithm finds the optimal solution through individual collaboration. The PSO algorithm has two shortcomings.19 One is that the initialization process is random, so the quality of each particle cannot be guaranteed. Another is that it is easy to fall into the local optimal solution at a later stage.

As an improved PSO, the CPSO can take advantage of chaotic motion ergodicity to find a particle swarm with good individual quality when the particle swarm is initialized. The chaotic disturbance on the particles is carried out so as to make the solution out of local extreme interval when the particles are updated.

Logistic mapping is a typical chaotic system, and its iterative formula can be expressed as follows

$$z_{i+1} = \mu z_i (1 - z_i), \quad i = 0, 1, 2, \ldots, \quad \mu \in (2, 4] \tag{19}$$

When the control parameter $\mu = 4$ and $0 \le z_0 \le 1$, the logistic map is completely in a chaotic state. According to the chaotic motion characteristics, the optimal search can be implemented as follows. A set of chaotic variables with same number of optimized variables are generated. The chaos is added to the optimized variables with a similar way of signal carrier and the variables present the chaotic state. At the same time, the ergodic range of chaotic motion is extended to the value range of optimized variables. Then, the chaotic variables are directly taken to search. Because the chaotic motion has the characteristics such as randomness, ergodicity, and sensitivity to initial conditions, the chaos-based search is better than other random searches.

The selection on penalty factor $C$ and RBF kernel function parameter $g$ is equivalent to solving the two-dimensional (2D) optimization problem as follows

$$\begin{aligned} &\min f(x_1, x_2) \\ &\text{s.t. } a_i \le x_i \le b_i, \quad i = 1, 2 \end{aligned} \tag{20}$$

where the optimized variable $x_1$ represents the penalty factor $C$, the optimized variable $x_2$ represents the parameter $g$ of RBF kernel function. The objective function, namely, the fitness value $f$, is to average the forecast mean square errors of $k$ validation sets. Considering that a big penalty factor $C$ can cause the decrease in model generalization ability, the penalty factor $C$ should be controlled as small as possible under the premise of ensuring a certain prediction accuracy on cross validation of training set.

The CPSO can be used to select the penalty factor $C$ and the RBF kernel function parameter $g$ as follows. The flowchart is shown in Figure 1:

1. Implement the chaotic particle initialization.
• Generate randomly one 2D vector $z_1 = (z_{11}, z_{12})$, and its components are between 0 and 1. According to equation (19), $z_{i+1,j} = 4 z_{ij}(1 - z_{ij})$, $i = 1, 2, \ldots, N-1$ and $j = 1, 2$. $N$ particles, $z_1, z_2, \ldots, z_N$, can be obtained.
• Add the components of $z_i$ to the optimized variables with the signal carrier way. $x_{ij} = a_j + (b_j - a_j) z_{ij}$, $i = 1, 2, \ldots, N$ and $j = 1, 2$. $N$ particles, $x_1, x_2, \ldots, x_N$, can be generated.
• Calculate the fitness values, $f_{N \times 1}$, for all particles, and select $m$ ones with better performance among $N$ particles to form initial particle swarm $x_{m \times 2}$. For above selected $m$ particles, their fitness values $f_i$ and the first components $x_{i1}$ (namely, penalty factor $C$) are both smaller. The initialization speed is $v_{m \times 2}$ = zeros($m$, 2).
• Initialize the individual extreme, $pxbest_0 = x_{m \times 2}$. The corresponding fitness value is $pfbest_0 = f_{m \times 1}$. Initialize the global extreme, $gxbest_0$, and the corresponding fitness value, $gfbest_0$, according to the fitness values $f_i$ and the first components $x_{i1}$ (namely, penalty factor $C$) of $m$ particles.

Figure 1. CSVM parameter selection flowchart.

2. Generate randomly one 2D vector $u_0 = (u_{01}, u_{02})$, and its components are between 0 and 1.
3. For the iteration number $k = 1{:}k_{max}$ and the particle $i = 1{:}m$, the following cyclic operations are implemented.
4. Update the speeds of all particles, which are within $v_{max}$

$$v_{ki} = c_0 v_{k-1,i} + c_1 \bigl(pxbest_{k-1,i} - x_{k-1,i}\bigr) + c_2 \bigl(gxbest_{k-1} - x_{k-1,i}\bigr) \tag{21}$$

5. Update the particle position with chaotic disturbance.
• According to equation (19), $u_{1j} = 4 u_{0j}(1 - u_{0j})$, $j = 1, 2$, obtain $u_1 = (u_{11}, u_{12})$.
• Put each component of $u_1$ to the chaotic disturbance range $[-\beta, \beta]$ with the signal carrier way, $\Delta x_j = -\beta + 2\beta u_{1j}$, $j = 1, 2$. Obtain the disturbance variable $\Delta x = (\Delta x_1, \Delta x_2)$.
• Let $u_0 = u_1$.
• Calculate $x_{ki} = x_{k-1,i} + v_{ki}$, $x'_{ki} = x_{k-1,i} + v_{ki} + \Delta x$ and the corresponding fitness values $f_{ki}$, $f'_{ki}$. If $f'_{ki} < f_{ki}$, or $\lvert f'_{ki} - f_{ki} \rvert \le \varepsilon$ and the first component at the particle position is $x'_{ki1} < x_{ki1}$, then update the particle position into $x_{ki} = x'_{ki}$ and the corresponding fitness value is $f_{ki} = f'_{ki}$.
6. Update the extreme of individual particle. If $f_{ki} < pfbest_{k-1,i}$, or $\lvert f_{ki} - pfbest_{k-1,i} \rvert \le \varepsilon$ and the first component at the particle position is $x_{ki1} < pxbest_{k-1,i1}$, then the individual extreme of the particle $x_i$ is $pxbest_{ki} = x_{ki}$, and the corresponding fitness value is $pfbest_{ki} = f_{ki}$. Otherwise, $pxbest_{ki} = pxbest_{k-1,i}$ and $pfbest_{ki} = pfbest_{k-1,i}$.
7. Update the global extreme of particle swarm. If the fitness value on individual extreme of the particle $x_i$ is $pfbest_{ki} < gfbest_{k-1}$, or $\lvert pfbest_{ki} - gfbest_{k-1} \rvert \le \varepsilon$ and the first component of individual extreme is $pxbest_{ki1} < gxbest_{k-1,1}$, then the global extreme of particle swarm is $gxbest_k = pxbest_{ki}$, and the corresponding fitness value is $gfbest_k = pfbest_{ki}$. Otherwise, $gxbest_k = gxbest_{k-1}$ and $gfbest_k = gfbest_{k-1}$.
8. End the cyclic operations. According to the fitness values $gfbest_k$ and the first components $gxbest_{k1}$ on the $k_{max}$ global extremes, which are obtained in the $k_{max}$ iterations, the appropriate penalty factor $C$ and RBF kernel function parameter $g$ are determined.

Construction process for prediction model of dam safety with CSVM

Assume that $x(t)$ ($t = 1, 2, \ldots, N$) represents a monitoring data series on dam safety. According to the delay time $\tau$ and embedding dimension $m$ chosen with the proposed way, the phase space reconstruction of $x(t)$ is implemented and the chaotic characteristic is identified. If the time series is chaotic, a sample set $\{X(t), x(t+1)\}$, where $t = 1 + (m-1)\tau, \ldots, N-1$, is constructed. $X(t) \in R^m$ is the input feature vector, which is obtained according to equation (16). $x(t+1) \in R$ is the output variable. The first $n$ sets of observed data are taken as the training samples to implement the training operation of CSVM and obtain the prediction model of dam safety.

Example analysis

One roller compacted concrete gravity dam with a maximum height of 113.0 m, a crest length of 308.5 m, and a crest elevation of 179.0 m is taken as an example to demonstrate the proposed model. This dam (Figure 2) consists of six dam sections which are numbered 1–6 from left bank to right bank. The normal storage water level and the check flood level are 173.00 and 177.80 m, respectively. The pendulum measurements (Figure 2) were installed to measure the horizontal displacement of dam crest and dam body. The monitoring system was put into operation in October 2002. In this article, the horizontal displacement along the river of No. 5 dam section crest is analyzed with the proposed method. Figure 3 shows the time curve of horizontal displacement measured daily from 1 January 2003 to 31 December 2007. Based on the observations of pendulum measurement, the proposed method is used to build the prediction model of dam displacement. The

Figure 2. Layout of pendulum measurements observing horizontal displacement.

Figure 3. Time curve on observed horizontal displacement of No. 5 dam section crest.

fitting and forecasting ability of built model and conventional model is compared.

Phase space reconstruction of monitoring data series on dam displacement

The key to reconstruct the phase space of monitoring data series on dam displacement, namely, $x(t)$, $t = 1, 2, \ldots, 1825$, is to determine the related parameters. Equations (6)–(8) in the mentioned C-C method are used to determine the delay time $\tau$ and embedding dimension $m$. Figure 4 shows the relation curves of $\bar{S}(t) \sim t$, $\Delta\bar{S}(t) \sim t$, and $S_{cor}(t) \sim t$. It can be seen from Figure 4 that the time when $\Delta\bar{S}(t)$ reaches the first minimum is earlier than that when $\bar{S}(t)$ reaches null point. So, the delay time is regarded as the time when $\Delta\bar{S}(t)$ reaches the first minimum, $\tau = 26$. The embedding window width is taken as the time $t$ corresponding to the minimum of $S_{cor}(t)$, $\tau_w = 73$. The calculated embedding dimension is $m = 4$.

Figure 4. Relation curves of $\bar{S}(t) \sim t$, $\Delta\bar{S}(t) \sim t$, and $S_{cor}(t) \sim t$.

According to the delay time $\tau$ and embedding dimension $m$ determined above, the phase space reconstruction of monitoring data series, namely $x(t)$, $t = 1, 2, \ldots, 1825$, is implemented. The phase space vector series is obtained as follows

Figure 5. Relation curve between y(i) and i.

Figure 7. Relation curve between d(m) and m.

$$X(t) = \{x(t),\, x(t-26),\, x(t-52),\, x(t-78)\}, \quad t = 79, 80, \ldots, 1825 \tag{22}$$

Chaotic characteristics identification of monitoring data series on dam displacement

Maximum Lyapunov exponent calculation. The small data sets-based method is adopted to calculate the maximum Lyapunov exponent of monitoring data series, namely, $x(t)$, $t = 1, 2, \ldots, 1825$. The delay time $\tau$ and embedding dimension $m$ determined above are 26 and 4, respectively. $y(i)$ is calculated with equation (13). The relation curve between $y(i)$ and $i$ is given in Figure 5. The least square method is used to fit the linear area of above relation curve between $y(i)$ and $i$. It can be known that the straight line slope, namely the maximum Lyapunov exponent $\lambda_1$, is 0.0087.

Correlation dimension calculation. G-P algorithm is introduced to calculate the correlation dimension of monitoring data series, namely, $x(t)$, $t = 1, 2, \ldots, 1825$. The phase space is reconstructed under the conditions that the delay time is $\tau = 26$ and the embedding dimension is $m = 1, 2, \ldots, 10$. The correlation integrals $C(r)$ are calculated with equation (14) and the relation curve between $\ln C(r)$ and $\ln r$ can be given, which is shown in Figure 6. It can be seen from Figure 6 that there are obvious straight line segments in 10 curves. The straight lines are fitted using the least square method and their slopes, namely, correlation exponents $d(m)$, can be obtained. Figure 7 shows the relation curve between $d(m)$ and $m$. It can be seen from Figure 7 that the correlation exponent increases as the embedding dimension $m$ increases. When $m = 6$, it tends to the saturation value (1.85) which is the correlation dimension $D_2$.

Figure 6. Relation curve between $\ln C(r)$ and $\ln r$.

It can be known from above analysis that for the monitoring data series on dam displacement, $x(t)$, $t = 1, 2, \ldots, 1825$, its maximum Lyapunov exponent $\lambda_1$ is greater than zero, and its correlation dimension $D_2$ is fractional. So the chaotic characteristic exists in the monitoring data series on dam displacement. The proposed method can be used to build the prediction model of dam displacement.

CSVM-based prediction model of dam displacement and its performance analysis

A sample set, $\{X(t), x(t+1)\}$, $t = 79, 80, \ldots, 1824$, is constructed, where $X(t) \in R^4$ is the input feature vector obtained according to equation (22) and $x(t+1) \in R$ is

the output variable. For the 1746 sample points of dam displacement, the 1382 sample points observed from 2003 to 2006 form a training set in order to establish the prediction model of dam displacement, and the 364 sample points observed in 2007 form a test set in order to judge the prediction performance of built model.

Model parameters selection. RBF kernel function is selected and the insensitive loss function parameter is $\varepsilon = 0.001$. The grid search method and the CPSO are used to determine the penalty factor $C$ and the parameter $g$ of RBF kernel function. The ranges of $C$ and $g$ are both set as $2^{-8}$–$2^{8}$. The evaluation indexes of model performance are taken as the penalty factor $C$ and the cross validation mean square error (CVMSE). When the CVMSE difference between two parameter sets is not more than $10^{-5}$, the parameter set with smaller $C$ is better.

The grid search method uses the logarithmic form, namely, $\log_2 C$ and $\log_2 g$, to construct the grid. The change step length of $\log_2 C$ and $\log_2 g$ is 0.8, namely, the change step length of $C$ and $g$ is a factor of 1.7411. Figure 8 shows the search results on optimal parameters by the grid search method. The obtained optimal parameters are that the penalty factor $C = 84.4485$ and the RBF kernel function parameter $g = 0.0068$. The corresponding CVMSE is 0.0026, and the consuming time is 168 s.

Figure 8. Search results on optimal parameters by grid search method.

CPSO algorithm takes $x = (C, g)$ as the particle, and CVMSE as the fitness value. Table 1 lists the parameters related to CPSO algorithm. The search results on optimal parameters by CPSO algorithm are shown in Figure 9. The obtained optimal parameters are that the penalty factor $C = 63.6498$ and the RBF kernel function parameter $g = 0.0039$. The corresponding CVMSE is 0.0026, and the consuming time is 88 s.

Table 1. CPSO algorithm parameters.

Parameter                              Value
Random particle number (N)             50
Chaotic particle number (m)            10
Maximum iteration number (kmax)        10
Maximum particle speed (vmax)          0.6 × 2^8
Acceleration constant (c1)             1.5 × rand
Acceleration constant (c2)             1.7 × rand
Chaotic disturbance range [−β, β]      [−1, 1]

Prediction model of dam displacement and its performance. The determined parameters and training set of CSVM are taken to train CSVM. The

Figure 9. Search results on optimal parameters by CPSO algorithm.

Figure 10. Calculated results of CSVM-based model with optimal parameters searched by CPSO algorithm.

Table 2. Fitting and prediction abilities of two CSVM-based models with optimal parameters searched by CPSO (Model I) and by grid search method (Model II).

Model      Fitting ability                      Prediction ability                   Consuming time (s)
           Mean square    Squared correlation   Mean square    Squared correlation
           error          coefficient r^2       error          coefficient r^2
Model I    2.37e-5        0.9996                2.36e-5        0.9995                168
Model II   2.38e-5        0.9996                2.32e-5        0.9995                88
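The two evaluation indexes reported in Table 2 can be computed as in the sketch below. It assumes that the squared correlation coefficient r^2 is the squared Pearson correlation between observed and calculated values, which is a common convention for SVM regression but is not spelled out in this section. The function names and the sample values are hypothetical and are not taken from the dam data.

```python
def mean_square_error(observed, predicted):
    """Average of the squared residuals between observed and predicted values."""
    n = len(observed)
    return sum((y - f) ** 2 for y, f in zip(observed, predicted)) / n


def squared_correlation(observed, predicted):
    """Squared Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    sy = sum(observed)
    sf = sum(predicted)
    syy = sum(y * y for y in observed)
    sff = sum(f * f for f in predicted)
    sfy = sum(f * y for f, y in zip(predicted, observed))
    numerator = (n * sfy - sf * sy) ** 2
    denominator = (n * sff - sf ** 2) * (n * syy - sy ** 2)
    return numerator / denominator


# Hypothetical sample values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
mse = mean_square_error(observed, predicted)
r2 = squared_correlation(observed, predicted)
```

Note that r^2 defined this way measures linear association only: a prediction that is an exact linear function of the observations gives r^2 = 1 even when its mean square error is large, which is why the two indexes are reported together.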

CSVM-based prediction model of dam displacement can be obtained. Figure 10 shows the fitting and prediction results of the CSVM-based prediction model with the optimal parameters searched by the CPSO algorithm. Table 2 lists the evaluation indexes on the fitting and prediction performances of the two CSVM-based prediction models with the optimal parameters searched by the CPSO algorithm and by the grid search method. It can be seen from Table 2 that the two models are well matched in the fitting and prediction abilities of dam displacement; however, the CPSO algorithm spends less time than the grid search method on searching for the optimal parameters.

Conclusion

Considering the chaotic characteristics in prototype monitoring data series on dam behavior, the learning method of SVM and chaos theory are integrated to study the establishment problem on prediction model
of dam safety. The algorithms on phase space reconstruction and CPSO are developed to determine the input feature vector and the parameters of the CSVM-based learning model:

1. Based on the reconstructed phase space of prototype monitoring data series on dam behavior, a method is presented to identify the chaotic characteristics in monitoring data series on dam behavior.
2. SVM is introduced to build the prediction model of dam safety. The phase space reconstruction is implemented to extract the feature vector from the historical data in prototype monitoring data series with chaotic characteristics, which is regarded as SVM input. The CPSO algorithm is adopted to determine the key parameters of SVM. A CSVM with optimal input and parameters is proposed to fulfill the prediction model establishment of dam safety.
3. The displacement of one actual dam is taken as an example to verify the modeling efficiency and forecasting ability. It is indicated that, because the SVM key parameters and input vector are optimized, the modeling time is greatly reduced while the forecasting precision is ensured. By considering the chaotic characteristics in prototype monitoring data series on dam behavior, the proposed CSVM-based model can describe the dam structural behavior more reasonably.

Acknowledgements
The authors thank the reviewers for useful comments and suggestions that helped to improve the paper.

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research has been partially supported by National Natural Science Foundation of China (SN: 51579083, 41323001, 51139001, and 51479054), Jiangsu Natural Science Foundation (SN: BK2012036), the Doctoral Program of Higher Education of China (SN: 20130094110010), Open Foundation of State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering (SN: 20145027612), the Fundamental Research Funds for the Central Universities (SN: 2015B25414) and a project funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions (SN: 3014-SYS1401).

References

1. Mata J, Tavares de Castro A and Sa da Costa J. Constructing statistical models for arch dam deformation. Struct Control Hlth 2014; 21(3): 423–437.
2. Xu C, Yue D and Deng C. Hybrid GA/SIMPLS as alternative regression model in dam deformation analysis. Eng Appl Artif Intel 2011; 25(3): 468–475.
3. Rankovic V, Grujovic N, Divac D, et al. Development of support vector regression identification model for prediction of dam structural behaviour. Struct Saf 2014; 48: 33–39.
4. Lopez FJM, Puertas SM and Arriaza JAT. Training of support vector machine with the use of multivariate normalization. Appl Soft Comput 2014; 24: 1105–1111.
5. Widodo A, Kim EY, Son JD, et al. Fault diagnosis of low speed bearing based on relevance vector machine and support vector machine. Expert Syst Appl 2009; 36(3): 7252–7261.
6. Smola AJ and Scholkopf B. A tutorial on support vector regression. Stat Comput 2004; 14(3): 199–222.
7. Samanta B, Al-Balushi KR and Al-Araimi SA. Artificial neural networks and support vector machines with genetic algorithm for bearing fault detection. Eng Appl Artif Intel 2003; 16(7–8): 657–665.
8. Kuang FJ, Xu WH and Zhang SY. A novel hybrid KPCA and SVM with GA model for intrusion detection. Appl Soft Comput 2014; 18: 178–184.
9. Gu CS, Zhao EF, Jin Y, et al. Singular value diagnosis in dam safety monitoring effect values. Sci China Ser E 2011; 54(5): 1169–1176.
10. Su HZ, Wen ZP, Wang F, et al. Multifractal scaling behavior analysis for existing dams. Expert Syst Appl 2013; 40(12): 4922–4933.
11. Kim HS, Eykholt R and Salas JD. Nonlinear dynamics, delay times, and embedding windows. Physica D 1999; 127: 48–60.
12. Rosenstein MT, Collins JJ and De Luca CJ. A practical method for calculating largest Lyapunov exponents from small data sets. Physica D 1993; 65: 117–134.
13. Grassberger P and Procaccia I. Measuring the strangeness of strange attractors. Physica D 1983; 9(1–2): 189–208.
14. Grassberger P and Procaccia I. Characterization of strange attractors. Phys Rev Lett 1983; 50(5): 346–349.
15. Suykens JAK, Brabanter JD, Lukas L, et al. Weighted least squares support vector machines: robustness and sparse approximation. Neurocomputing 2002; 48(1–4): 85–105.
16. Su HZ, Wen ZP and Wu ZR. Study on an intelligent inference engine in early-warning system of dam health. Water Resour Manag 2011; 25(6): 1545–1563.
17. Wang XH, Mao HL, Zhu CM, et al. Damage localization in hydraulic turbine blades using kernel-independent component analysis and support vector machines. Proc IMechE, Part C: J Mechanical Engineering Science 2009; 223(2): 525–529.
18. Wu ZH and Huang NE. Ensemble empirical mode decomposition: a noise assisted data analysis method. Adv Adapt Data Anal 2009; 1(1): 1–41.
19. Kang F and Li JJ. Artificial bee colony algorithm optimized support vector regression for system reliability analysis of slopes. J Comput Civil Eng 2015; 30(5): 04015040.