
Computer Physics Communications 147 (2002) 729–732

www.elsevier.com/locate/cpc

Generalized evolutionary programming with Lévy-type mutation

Masao Iwamatsu 1

Department of Information and Computer Engineering, Kisarazu National College of Technology, Kisarazu City, Chiba 292-0041, Japan

Abstract

A new generalized evolutionary programming with Lévy-type mutation is proposed. The Lévy-type distribution is known to reproduce the Gaussian, Cauchy, and Student's t-distributions and is characterized by a power-law fat tail. This new evolutionary programming is tested on five standard test functions. The average performance of the new algorithm with Lévy-type mutation on hard optimization problems is superior to that of the original evolutionary programming with Gaussian mutation. © 2002 Elsevier Science B.V. All rights reserved.

PACS: 02.60.Pn; 02.70.Tt

Keywords: Evolutionary programming; Lévy distribution

E-mail address: iwamatsu@ph.ns.musashi-tech.ac.jp (M. Iwamatsu).
1 Present address: Department of Physics, General Education Center, Musashi Institute of Technology, Setagaya-ku, Tokyo 158-8557, Japan.

1. Introduction

Evolutionary programming [1] is one of the general metaheuristics known as evolutionary algorithms (EA) [2], which include the genetic algorithm (GA), evolution strategies (ES), and evolutionary programming (EP). Among EAs, EP [1] is attractive because it does not rely on any gradient information, relying solely on mutation (perturbation in physics), and yet its global convergence is guaranteed [3]. EP is therefore akin to the popular simulated annealing (SA) [4]. In contrast to SA, however, intensification (convergence) of the search is realized automatically in EP by the self-adaptation of the mutation [1,2], so the cumbersome planning of a cooling schedule required in SA [4] is unnecessary. In our previous paper [5], EP was successfully used to find the lowest-energy structure of silicon clusters.

In this paper, we generalize the classical EP (CEP) [1,6], whose mutation (perturbation) follows the Gaussian distribution (Gaussian mutation), and propose a more general EP (GEP) with Lévy-type mutation. A similar generalization was already proposed by Kappler [7] in evolution strategies (ES) by replacing the Gaussian mutation with the Cauchy mutation. Later, Yao et al. [6] introduced fast evolutionary programming (FEP), in which the Gaussian mutation of EP was replaced by the Cauchy mutation. Our GEP is a step forward, since the Lévy-type distribution is a general class of distributions that includes the Gaussian, the Cauchy, and the Student's t-distributions [8]. Therefore, our GEP includes CEP as well as FEP. The Lévy flight search pattern has
PII: S0010-4655(02)00386-7

also been recently discovered in the foraging behavior of the albatross [9]. A similar generalization using a Lévy-type perturbation was proposed in simulated annealing (SA) by Tsallis and Stariolo [10].

2. Algorithm

The classical evolutionary programming (CEP) to search for the global optimum of an objective function f(x_i) in n-dimensional space is as follows [1,6]:

(1) The initial population comprises M trial solutions, each taken as a pair of real-valued vectors x_i and σ_i, i = 1, ..., M, of dimension n, where x_i are the variables of the objective function f(x_i) to be optimized and σ_i the standard deviations for the Gaussian mutation. The components of each x_i, i = 1, ..., M, are selected in accordance with a uniform distribution ranging over the given range of the variables. The initial components of σ_i are all fixed to 3.0 [6].
(2) Each solution x_i, i = 1, ..., M, is scored with respect to the given objective function f.
(3) Each parent (x_i, σ_i), i = 1, ..., M, creates a single offspring (x_i', σ_i') by

    x_i'(j) = x_i(j) + σ_i(j) N_j(0, 1),                     (1)
    σ_i'(j) = σ_i(j) exp[τ' N(0, 1) + τ N_j(0, 1)],          (2)

    where x_i(j), x_i'(j), σ_i(j), and σ_i'(j), j = 1, ..., n, denote the jth components of the vectors x_i, x_i', σ_i, and σ_i', respectively. N(0, 1) denotes a Gaussian random variable with mean 0 and standard deviation 1, and N_j(0, 1) indicates that the Gaussian random number is generated anew for each component j. The factors τ and τ' are commonly set to τ = (√(2√n))⁻¹ and τ' = (√(2n))⁻¹ [2,6].
(4) If any of the values σ_i'(j) is smaller than 10⁻⁴, it is set equal to the nominal value 10⁻⁴ [1]. This replacement is necessary to avoid premature convergence to local optima [11].
(5) Each offspring vector x_i', i = 1, ..., M, is evaluated in the light of the objective function.
(6) Pairwise comparisons are conducted over the union of parents x_i and offspring x_i', i = 1, ..., M. For each solution, Q = 10 opponents are chosen at random from among all parents and offspring with equal probability. In each comparison, if the solution offers at least as low a value f(x_i) as the randomly selected opponent, it receives a "win".
(7) The M solutions out of x_i and x_i', i = 1, ..., M, that have the most "wins" are selected to be parents of the next generation. The best solution is always chosen, and global convergence is guaranteed [3].
(8) If the available computational resources are exhausted, then halt; else proceed to step (3) and continue to iterate.

In our generalized evolutionary programming (GEP), we replace the Gaussian random number σ N(0, 1) = N(0, σ) in Eq. (1) of CEP by a Lévy-type random number, which follows the Lévy-type distribution characterized by a real number q (1 ≤ q < 3):

    p_q(x) = √((q − 1)/(πσ²)) [Γ(1/(q − 1)) / Γ(1/(q − 1) − 1/2)] [1 + (q − 1)(x/σ)²]^(−1/(q−1)),   (3)

which was derived from the Tsallis generalized statistical mechanics [8]. This distribution function becomes the Gaussian distribution in the limit q → 1⁺,

    p_1(x) = (1/√(πσ²)) exp[−(x/σ)²],                        (4)

the Cauchy distribution when q = 2,

    p_2(x) = 1/(πσ [1 + (x/σ)²]),                            (5)

and Student's t-distribution when q takes the rational values q = (3 + m)/(1 + m) (0 < m < ∞). In general, q can be any real (fractional) number, and the distribution is characterized by a fat tail with the asymptotic power-law behavior ∼ x^(−2/(q−1)).
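To make steps (1)–(8) and the Lévy-type mutation concrete, the loop can be sketched in Python with NumPy. This is our own minimal sketch, not the code used in the paper: the function names (`levy_mutation`, `gep`) are ours, and refinements such as enforcing the variable bounds after mutation are omitted. It uses the fact that, for 1 < q < 3, Eq. (3) is a rescaled Student's t-distribution with ν = (3 − q)/(q − 1) degrees of freedom, so a Lévy-type variate can be drawn as x = σt/√(3 − q) with t ∼ t_ν (matching the exponent and the q = 2 Cauchy case of Eq. (5)).

```python
import numpy as np

def levy_mutation(sigma, q, rng):
    """Draw Levy-type variates distributed as p_q(x) of Eq. (3).

    For 1 < q < 3, p_q is a rescaled Student's t with
    nu = (3 - q)/(q - 1) degrees of freedom: x = sigma * t / sqrt(3 - q).
    q -> 1 recovers the Gaussian of Eq. (4); q = 2 gives the Cauchy of Eq. (5).
    """
    if q <= 1.0:  # Gaussian limit: p_1 has variance sigma^2 / 2
        return sigma * rng.standard_normal(sigma.shape) / np.sqrt(2.0)
    nu = (3.0 - q) / (q - 1.0)
    return sigma * rng.standard_t(nu, size=sigma.shape) / np.sqrt(3.0 - q)

def gep(f, bounds, q=2.5, M=50, Q=10, generations=100, seed=0):
    """Generalized EP following steps (1)-(8); minimizes f."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    n = len(lo)
    tau = 1.0 / np.sqrt(2.0 * np.sqrt(n))   # tau  = (sqrt(2 sqrt(n)))^(-1)
    taup = 1.0 / np.sqrt(2.0 * n)           # tau' = (sqrt(2 n))^(-1)
    # (1) uniform initial population; all sigma components fixed to 3.0
    x = rng.uniform(lo, hi, size=(M, n))
    sigma = np.full((M, n), 3.0)
    for _ in range(generations):
        # (3) one offspring per parent; the Gaussian N_j(0,1) of Eq. (1)
        #     is replaced by the Levy-type random number
        x_off = x + levy_mutation(sigma, q, rng)
        sigma_off = sigma * np.exp(taup * rng.standard_normal((M, 1))
                                   + tau * rng.standard_normal((M, n)))
        sigma_off = np.maximum(sigma_off, 1e-4)  # (4) nominal lower bound
        # (2), (5) score the union of parents and offspring
        pool_x = np.vstack([x, x_off])
        pool_sigma = np.vstack([sigma, sigma_off])
        scores = np.apply_along_axis(f, 1, pool_x)
        # (6) each solution meets Q random opponents; it scores a "win"
        #     when it is at least as good as the opponent
        wins = np.zeros(2 * M, dtype=int)
        for i in range(2 * M):
            opponents = rng.integers(0, 2 * M, size=Q)
            wins[i] = int(np.sum(scores[i] <= scores[opponents]))
        wins[np.argmin(scores)] = Q + 1  # (7) the best solution always survives
        survivors = np.argsort(-wins)[:M]
        x, sigma = pool_x[survivors], pool_sigma[survivors]
    return x[np.argmin(np.apply_along_axis(f, 1, x))]

if __name__ == "__main__":
    # usage: q = 1.01 approximates CEP, q = 2 corresponds to FEP
    sphere = lambda v: float(np.sum(v * v))   # test function f1
    best = gep(sphere, (np.full(10, -100.0), np.full(10, 100.0)),
               q=1.01, generations=200)
    print(sphere(best))
```

Setting q close to 1 reduces the sketch to CEP and q = 2 to FEP, so the single parameter q interpolates between the algorithms compared below.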

3. Experiments

Using our GEP, we conducted experiments on the five test functions listed in Table 1. The Sphere model (f1) is a simple convex function with a single minimum. The Griewank (f2), Rastrigin (f3), and Ackley (f4) models are all multimodal functions with many local minima. Finally, the Shekel model (f5) has a few local minima.

The dimensions of the models are all fixed to the same value, except for the Shekel model, for which the dimension is fixed to n = 4. The population size is fixed to M = 50, and the number of opponents in step (6) of the algorithm is Q = 10. The numbers of generations are fixed according to the difficulty of the models, as shown in Table 2.

Table 1
Test functions used in the experiment. In the Shekel model, x and a_i, i = 1, ..., 5, are four-dimensional vectors. The 4 elements of the 5 vectors a_i, i = 1, ..., 5, and the 5 elements c_i of the Shekel model are found, e.g., in Ref. [6]

Name          Function (range)                                                              Minimum
f1 Sphere     Σ_{i=1}^n x_i²  (−100 < x_i < 100)                                            f1(0, ..., 0) = 0
f2 Griewank   (1/4000) Σ_{i=1}^n x_i² − Π_{i=1}^n cos(x_i/√i) + 1  (−600 < x_i < 600)       f2(0, ..., 0) = 0
f3 Rastrigin  Σ_{i=1}^n [x_i² − 10 cos(2π x_i) + 10]  (−5.12 < x_i < 5.12)                  f3(0, ..., 0) = 0
f4 Ackley     −20 exp(−0.2 √((1/n) Σ_{i=1}^n x_i²)) − exp((1/n) Σ_{i=1}^n cos(2π x_i))
              + 20 + e  (−32 < x_i < 32)                                                    f4(0, ..., 0) = 0
f5 Shekel     −Σ_{i=1}^5 [(x − a_i)(x − a_i)^T + c_i]^(−1)  (0 ≤ x_i ≤ 10)                  f5(a_i) ≈ −1/c_i

We performed the experiment 50 times for each q and collected the average and standard deviation of the 50 best solutions of the 50 trials to assess the average performance of GEP. Fig. 1 shows the mean of the bests as a function of generation for the Sphere model (f1). We found that the mean with smaller q is better at the last generation (generation 1000), which suggests that the performance of CEP is better than that of GEP. The reason is simple: when there is a single minimum, the long jumps from the Lévy-type mutation deteriorate the search because the solutions x_i are scattered.

Fig. 1. The mean of the 50 bests of 50 trials of the Sphere model (f1) as a function of generation for various q. Note that q = 1.01 is close to CEP and q = 2.0 is FEP. The performance of CEP (q → 1) is better than GEP with q > 1.

Fig. 2 shows the mean of the 50 bests at the last generation (generation 5000) for the Griewank model (f2). In marked contrast to the simple Sphere model, the average performance of GEP with q > 1 is clearly superior to CEP (q = 1). The larger q becomes, the lower the mean. Since the objective function f2 has numerous minima, the Lévy-type mutation with a fat tail (realized by large q) is advantageous because an occasional long jump produced by the fat-tailed distribution can be used to escape from a local minimum.

Fig. 2. The mean and the best of the 50 bests at the last generation as a function of q for the Griewank model (f2). Now the performance of GEP is better than that of CEP.

Table 2
Comparison of CEP (with Gaussian mutation) and GEP (with Lévy mutation, q = 2.5): the means of the bests (Mean) of 50 trials at the last generation and the standard deviations (STD). The two-tailed t-test tells us whether the difference of the two means is meaningful

      Gene-    CEP (q = 1.01)               GEP (q = 2.5)                CEP−GEP
      ration   Mean (STD)                   Mean (STD)                   t-test
f1    1000     8.19 × 10⁻⁹ (2.03 × 10⁻⁹)    2.14 × 10⁻⁷ (7.87 × 10⁻⁸)    −18.45
f2    5000     1.934 (3.526)                0.459 (1.062)                +2.833
f3    2000     3.542 (3.061)                1.782 (1.819)                +3.494
f4    5000     13.93 (5.582)                10.65 (5.388)                +2.992
f5    100      −6.868 (3.093)               −5.832 (3.002)               −1.726

We compared in Table 2 the performance of CEP (approximated by GEP with q = 1.01) and GEP with q = 2.5.

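The t-values in the last column of Table 2 can be checked directly from the reported means and standard deviations. A minimal sketch (the helper is our own, assuming the unpooled two-sample form with 50 trials per algorithm):

```python
import math

def t_statistic(mean1, std1, mean2, std2, n=50):
    """Two-sample t-statistic with unpooled variances, n trials per sample."""
    return (mean1 - mean2) / math.sqrt(std1**2 / n + std2**2 / n)

# CEP (q = 1.01) vs. GEP (q = 2.5) on the Griewank model f2, from Table 2
t_f2 = t_statistic(1.934, 3.526, 0.459, 1.062)
print(round(t_f2, 2))   # → 2.83, matching the tabulated +2.833
```

With roughly 98 degrees of freedom, |t| > 1.98 is significant at the 5% level for a two-tailed test, which is consistent with the difference being meaningful for f1–f4 but not for f5 (t = −1.726).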
From this table, we concluded that the performance of GEP is superior to that of CEP for the hard optimization problems f2–f4 with many local minima. However, CEP is better than GEP for the simple problems f1 and f5.

The difference of the two means obtained from CEP and GEP can be tested using the t-test. From the t-test values of Table 2, we concluded that the difference of the averages is meaningful for f1–f4. Therefore, GEP is certainly better than CEP for hard problems, but GEP cannot do better for simple problems. A similar advantage and disadvantage of FEP over CEP was found by Yao et al. [6]. They noted that it is an example of the "no free lunch" theorem [12], which asserts that no single optimization algorithm is best on average for all optimization problems. Our Fig. 2 suggests that our GEP with q > 2 is even better than FEP (q = 2) for hard optimization problems.

4. Conclusion

We generalized the evolutionary programming (EP) algorithm by replacing the Gaussian mutation (perturbation) with the general Lévy-type mutation and proposed the generalized evolutionary programming (GEP). GEP includes both the classical EP (CEP) with the Gaussian mutation and the fast EP (FEP) with the Cauchy mutation, and it can realize a more flexible search using a fat-tailed distribution. GEP with the Lévy-type mutation is effective for hard optimization problems with many local optima, while CEP with the Gaussian mutation is more effective for simple problems with only one or a few optima. Our GEP with the Lévy-type mutation should be effective for many optimization problems encountered in physics and chemistry.

Acknowledgements

The author would like to thank Mr. K. Tanaka for his help at the initial stage of this work.

References

[1] D.B. Fogel, Comput. Math. Appl. 27 (1994) 89.
[2] T. Bäck, H.-P. Schwefel, Evol. Comput. 1 (1993) 1.
[3] D.B. Fogel, Cybernet. Syst. 25 (1994) 389.
[4] S. Kirkpatrick, C.D. Gelatt, M.P. Vecchi, Science 220 (1983) 671.
[5] M. Iwamatsu, Comput. Phys. Comm. 142 (2001) 214.
[6] X. Yao, Y. Liu, G. Lin, IEEE Trans. Evol. Comput. 3 (1999) 82.
[7] C. Kappler, in: H.-M. Voigt et al. (Eds.), Parallel Problem Solving from Nature IV, Springer, Berlin–Heidelberg, 1996, pp. 346–355.
[8] A.M.C. de Souza, C. Tsallis, Physica A 236 (1997) 52.
[9] G.M. Viswanathan, V. Afanasyev, S.V. Buldyrev, S. Havlin, M.G.E. da Luz, E.P. Raposo, H.E. Stanley, Physica A 282 (2000) 1.
[10] C. Tsallis, D.A. Stariolo, Physica A 233 (1996) 395.
[11] K. Ohkura, Y. Matsumura, K. Ueda, in: B. McKay et al. (Eds.), Simulated Evolution and Learning, Springer, Berlin–Heidelberg, 1998, pp. 10–17.
[12] D.H. Wolpert, W.G. Macready, IEEE Trans. Evol. Comput. 1 (1997) 67.
