DOI 10.1007/s12065-010-0035-y
REVIEW ARTICLE
O. Kramer
Technische Universität Dortmund, Dortmund, Germany
e-mail: oliver.kramer@tu-dortmund.de
1 Introduction
Self-adaptation is a well-known concept, not only in
computer science, meaning that a system is capable of
adapting itself autonomously. In evolutionary optimization
the term has a defined meaning: the evolutionary online control of so-called strategy variables. Strategy variables
define properties of an evolutionary algorithm, e.g., of its
genetic operators, of the selection operator or population
sizes. Their online control during the search can improve
the optimization process significantly. Self-adaptation was
originally introduced by Rechenberg [60] and Schwefel
[68] for evolution strategies (ES), later by Fogel [25] for
evolutionary programming (EP). The goal of this paper is
to give a deeper insight into the underlying principles by
overviewing typical self-adaptive operators and techniques
for the search in the space of strategy parameters. Starting
from a definition of self-adaptation and an introduction to
the covariance matrix self-adaptation evolution strategy
(CMSA-ES) by Beyer and Sendhoff [16] as a successful
example for self-adaptation in real-valued solution spaces,
this paper is structured as follows. Section 2 classifies self-adaptation within the taxonomy of parameter setting
methodologies by Eiben et al. [22]. Section 3 gives a survey of strategy parameters that are typically controlled by
means of self-adaptation, from mutation strengths in real-valued domains to self-adaptive aggregation mechanisms for global parameters. Section 4 concentrates on
techniques to explore the space of strategy variables and
to exploit knowledge from the search. Section 5 focuses on
premature convergence and describes situations in which self-adaptation might fail.
Evol. Intel.
Later, Jägersküpper [38] proved the bound
$O(n\lambda \ln \lambda)$ for the $(1+\lambda)$-ES with Rechenberg's rule.
He proposes a modified 1/5-th rule for $\lambda$ descendants, as
the regular algorithm fails for more than a single isotropic
mutation. A further runtime analysis is reported by Auger
[3], who investigated the $(1,\lambda)$-SA-ES on the one-dimensional Sphere problem. She proved sufficient conditions on the algorithm's parameters, and showed that
the convergence rate is $\frac{1}{t}\ln\|X_t\|$ with parent $X_t$ at generation
t. Her proof technique is based on Markov chains for
continuous search spaces and the Foster-Lyapunov drift
conditions that allow stability property proofs for Markov
chains. The main result is the log-linear type of the
convergence and an estimation of the convergence rate.
Although the proof is quite sophisticated, it does not show
how the number of steps scales depending on the search
space dimensions.
The work of Semenov and Terkel [70] is based on a
stochastic Lyapunov function. A supermartingale is used
to investigate the convergence of a simple self-adaptive
evolutionary optimization algorithm. A central result of
their research is that the convergence velocity of the
analyzed self-adaptive evolutionary optimization algorithm is asymptotically exponential. They use Monte Carlo simulations to validate the confidence of the martingale inequalities numerically, because these cannot be
proven.
123
Tuning
By Hand
Design of
Experiments
Control
Meta-Evolution
Deterministic
Adaptive
Self-Adaptative
Evol. Intel.
123
Evol. Intel.
2. Compute the success rate $p_s := s/g$, the ratio of successful mutations $s$ within the last $g$ generations. Change $\sigma$ according to (with $0 < \vartheta < 1$):

$\sigma := \sigma/\vartheta$ if $p_s > 1/5$, $\quad \sigma := \sigma \cdot \vartheta$ if $p_s < 1/5$, $\quad \sigma := \sigma$ if $p_s = 1/5$.  (11)

Go to step 1.
Optimal progress is problem-dependent and can be stated theoretically on artificial functions [12].
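The adaptation loop above can be sketched in a few lines of Python. The Sphere objective, the window length g = 20, and the factor ϑ = 0.85 are illustrative assumptions, not settings prescribed by the text:

```python
import random

def sphere(x):
    return sum(xi * xi for xi in x)

def one_fifth_es(n=10, generations=1000, g=20, theta=0.85, seed=1):
    """(1+1)-ES with Rechenberg's 1/5-th success rule (a sketch)."""
    rng = random.Random(seed)
    x = [rng.uniform(-5, 5) for _ in range(n)]
    fx = sphere(x)
    sigma, successes = 1.0, 0
    for t in range(1, generations + 1):
        y = [xi + sigma * rng.gauss(0, 1) for xi in x]  # isotropic mutation
        fy = sphere(y)
        if fy <= fx:                 # plus-selection: keep the better point
            x, fx = y, fy
            successes += 1
        if t % g == 0:               # adapt sigma every g generations
            ps = successes / g       # measured success rate
            if ps > 1 / 5:
                sigma /= theta       # too many successes: larger steps
            elif ps < 1 / 5:
                sigma *= theta       # too few successes: smaller steps
            successes = 0
    return fx, sigma
```

On the Sphere function the rule keeps $\sigma$ roughly proportional to the distance to the optimum, which produces the log-linear convergence behavior discussed above.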
$x := x + z \quad (13)$ with $z := (\sigma_1 N_1(0,1), \ldots, \sigma_N N_N(0,1)). \quad (14)$

[Figure: fitness development of the CMSA-ES and the SA-ES over 1,000 generations on a logarithmic scale from 1e+20 down to 1e-140.]
$\sigma_t := \sigma_{t-1} + \mathrm{round}(c \cdot N(0,1)) \quad (17)$

[Figure: fitness development of SA-INV 5-1, SA-INV 10-2, INV 1, and INV 10 (left) and step size development of SA-INV 5-1, SA-INV 10-2, and SA-INV 20-5 (right) over 100 generations.]
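The SA-INV step size is an integer (a number of inversions applied per mutation), so its meta-mutation uses a rounded Gaussian, $\sigma_t := \sigma_{t-1} + \mathrm{round}(c \cdot N(0,1))$. A minimal sketch; the lower bound of one inversion is an assumption:

```python
import random

def mutate_int_step_size(sigma, c=1.0, rng=random):
    """Meta-mutation of an integer strategy parameter, cf. Eq. (17):
    sigma_t := sigma_{t-1} + round(c * N(0, 1)).  The result is kept
    >= 1 (an assumption) so at least one inversion is always applied."""
    return max(1, sigma + round(c * rng.gauss(0, 1)))
```

Repeated application lets the number of inversions drift towards values that selection rewards.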
Table 1 Experimental comparison of 1-point and SA-1-point crossover on OneMax and 3-SAT

                 Best   Worst   Mean    Dev
OneMax
  1-point          16      93   28.88   22.3
  SA-1-point       13      71   27.68   19.7
3-SAT
  1-point          15     491   63.68  291
  SA-1-point       19     498   62.72  264
and

$o_2 := (p_1^2, \ldots, p_r^2, p_{r+1}^1, \ldots, p_N^1, r). \quad (20)$
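Self-adaptive 1-point crossover can be sketched as follows: each individual carries its crossover point r as a strategy parameter that the offspring inherit, cf. Eq. (20). Which parent donates r, and the occasional re-sampling of r with probability p_mut, are illustrative assumptions:

```python
import random

def sa_one_point_crossover(parent1, parent2, rng, p_mut=0.1):
    """Individuals are pairs (bits, r): the cut point r is encoded with
    the genome and inherited by both offspring, cf. Eq. (20)."""
    bits1, r = parent1                    # first parent donates its cut point
    bits2, _ = parent2
    if rng.random() < p_mut:              # occasionally resample r so new
        r = rng.randrange(1, len(bits1))  # cut points enter the population
    o1 = (bits1[:r] + bits2[r:], r)
    o2 = (bits2[:r] + bits1[r:], r)       # cf. Eq. (20)
    return o1, o2
```

Because r is inherited, cut points that produce fit offspring spread through the population, which is exactly the self-adaptive link discussed in the text.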
Table 2 Experimental comparison of the CMSA-ES with two variants that control the global parameters $\tau_c$ and $\mu$ self-adaptively

                   Best     Worst     Mean       Dev
CMSA-ES             8,687   10,303     9,486.9     426.4
CMSA-ES ($\tau_c$)  11,212   27,170    17,708.3   3,786.1
CMSA-ES ($\mu$)      8,180   13,394     9,730.1   1,457.4
$\sigma' := \sigma \, e^{\tau N(0,1)} \quad (23)$

$\sigma' := e^{\tau_0 N_0(0,1)} \left( \sigma_1 e^{\tau_1 N_1(0,1)}, \ldots, \sigma_N e^{\tau_1 N_N(0,1)} \right) \quad (25)$

and

$\tau_0 := \frac{c}{\sqrt{2N}} \quad (26)$

as well as

$\tau_1 := \frac{c}{\sqrt{2\sqrt{N}}}. \quad (27)$

$\sigma := \frac{1}{\rho} \sum_{i=1}^{\rho} \sigma_i \quad (29)$
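The log-normal mutation of Eqs. (25)–(27) can be sketched in a few lines; the constant c = 1 is a common but not prescribed choice:

```python
import math
import random

def sa_mutate(x, sigmas, rng, c=1.0):
    """Log-normal self-adaptation, cf. Eqs. (25)-(27): first mutate the
    step sizes with a global and a coordinate-wise log-normal factor,
    then use the new step sizes to perturb the objective variables."""
    n = len(x)
    tau0 = c / math.sqrt(2 * n)             # global learning rate, Eq. (26)
    tau1 = c / math.sqrt(2 * math.sqrt(n))  # local learning rate,  Eq. (27)
    g = math.exp(tau0 * rng.gauss(0, 1))    # shared global factor
    new_sigmas = [s * g * math.exp(tau1 * rng.gauss(0, 1)) for s in sigmas]
    new_x = [xi + s * rng.gauss(0, 1) for xi, s in zip(x, new_sigmas)]
    return new_x, new_sigmas
```

Because the factors are multiplicative and log-normal, the step sizes stay positive and can grow or shrink by the same relative amounts, which is the usual argument for this operator.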
5 Premature convergence
The phenomenon that the evolutionary search converges,
but not towards the optimal solution, is called premature
convergence. Premature convergence of the search process due to a premature decrease of mutation strengths
is among the most frequent problems of self-adaptive
mutation strength control. Evolution rewards short-term
success: the evolutionary process can get stuck in local
optima. Premature convergence is a result of a premature
mutation strength reduction or a decrease of variance in
the population. The problem is well known and experimentally well documented. Only a few theoretical works concentrate
on this phenomenon, e.g., the work of Rudolph [65] who
investigates premature convergence of self-adaptive
mutation strength control. He assumes that the (1+1)-EA
is located in the vicinity $P$ of a local solution with fitness
$f = \epsilon$. He proves that the probability to move from a
solution in this set $P$ to a solution in a set $S$ of solutions
with a fitness $f$ less than $\epsilon$ (which in particular contains
the global optimum; the rest of the search domain exhibits
fitness values $f > \epsilon$) is smaller than 1. This result even
holds for an infinite time horizon. Hence, an evolutionary
algorithm can get stuck at a non-global optimum with a
positive probability. Rudolph also proposes ideas to
overcome this problem.
Small success probabilities $p_s$ may result from disadvantageous fitness landscape and operator conditions. An
example of such disadvantageous conditions is constrained real-valued solution spaces [42]. Many numerical
problems are constrained. Whenever the optimum lies in
the vicinity of the constraint boundary, the success rates
decrease and make successful random mutations almost
impossible. Low success rates result in premature step size
reduction and finally lead to premature convergence.
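The failure mode can be reproduced with a small experiment: a (1+1)-ES with a 1/5-th rule and death penalty on the tangent problem TR2 (minimize $x_1^2 + x_2^2$ subject to $x_1 + x_2 \geq 2$). All parameter settings below are illustrative; the point is that discarded infeasible offspring count as failures, so the measured success rate collapses near the boundary and drags $\sigma$ down with it:

```python
import random

def constrained_es(generations=2000, g=20, theta=0.85, seed=3):
    """(1+1)-ES with 1/5-th rule and death penalty on TR2:
    minimize x1^2 + x2^2 subject to x1 + x2 >= 2 (optimum (1,1), f = 2).
    Infeasible offspring are discarded and counted as failures, which
    lowers the success rate near the constraint boundary."""
    rng = random.Random(seed)
    f = lambda v: v[0] ** 2 + v[1] ** 2
    feasible = lambda v: v[0] + v[1] >= 2.0
    x, sigma, successes = [5.0, 5.0], 1.0, 0
    fx = f(x)
    for t in range(1, generations + 1):
        y = [xi + sigma * rng.gauss(0, 1) for xi in x]
        if feasible(y) and f(y) <= fx:   # death penalty: infeasible fails
            x, fx = y, f(y)
            successes += 1
        if t % g == 0:                   # simplified 1/5-th rule update
            sigma = sigma / theta if successes / g > 1 / 5 else sigma * theta
            successes = 0
    return fx, sigma
```

In such runs $\sigma$ typically shrinks by many orders of magnitude while the fitness is still measurably above the optimal value 2, i.e., the search stagnates prematurely at the boundary.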
Table 3 shows the corresponding result of a (25,100)-CMSA-ES on Schwefel's problem 2.40 and the tangent

12 Schwefel's problem 2.40 [69]: minimize $f(x) := -\sum_{i=1}^{5} x_i$, subject to the constraints $g_j(x) := x_j \geq 0$ for $j = 1, \ldots, 5$ and $g_6(x) := -\sum_{i=1}^{5} i x_i + 50000 \geq 0$; the minimum is $x^* = (5000, 0, 0, 0, 0)^T$ with $f(x^*) = -5000$.
[Table 3: best, worst, mean, and dev of the achieved fitness and the average number of fitness function evaluations (ffe) of the (25,100)-CMSA-ES with death penalty on problems 2.40 and TR2; for problem 2.40 the reported values include 31.9, 127.4, 280.0, 65.2, and 40,214 ffe.]
problem TR2. The termination condition is fitness stagnation: the algorithm terminates if the fitness gain from
generation $t$ to generation $t+1$ falls below $\theta = 10^{-12}$ in
successive generations. In this case the magnitude of the
step sizes is too small to effect further improvements.
Parameters best, mean, worst and dev describe the achieved
fitness of 25 experimental runs, while ffe counts the
average number of fitness function evaluations. The results
show that the CMSA-ES with death penalty (discarding
infeasible solutions in each generation until enough feasible ones are generated) is not able to approximate the
optimum of the problem satisfactorily and shows relatively
high standard deviations dev. To avoid this problem the
application of constraint handling methods is recommended, e.g., meta-modeling of the constraint boundary,
see Kramer et al. [44].
The phenomenon of premature step size reduction at the
constraint boundary has been analyzed, in particular for
the condition that the optimum lies on the constraint
boundary or even in a vertex of the feasible search space
[42]. In such cases the evolutionary algorithm frequently
suffers from low success probabilities near the constraint
boundaries. Under simple conditions, i.e., a linear objective
function, linear constraints and a comparably simple
mutation operator, the occurrence of premature convergence due to a premature decrease of step sizes was proven. This phenomenon occurs in continuous search spaces
if the coordinate system that is the basis of mutation is not
aligned to the constraint boundary. The region of success is
cut off by the constraint, which decreases the success probability in comparison to the case of small step sizes that are
not cut off. Arnold and Brauer [2] also analyzed the behavior at
the boundary of linear constraints and modeled the distance
between the search point and the constraint boundary with
a Markov chain. Furthermore, they discuss the working of
step length adaptation mechanisms based on success
probabilities.
The following causes for premature convergence could
be derived. Stone and Smith [75] came to the conclusion
13 TR (tangent problem): minimize $f(x) := \sum_{i=1}^{N} x_i^2$, subject to the constraint $g(x) := \sum_{i=1}^{N} x_i - t > 0$ (tangent), $t \in \mathbb{R}$; for $N = k$ and $t = k$ the minimum lies at $x^* = (1, \ldots, 1)^T$ with $f(x^*) = k$. TR2 means $N = 2$ and $t = 2$.

14 The achieved fitness is the difference between the optimum and the best solution, $|f(x^*) - f(x_{best})|$.
operators. Most theoretical work on self-adaptation concentrates on mutation. A necessary condition for the success of self-adaptation is a tight link between strategy
parameters and fitness: if the quality of the search process
strongly depends on a particular strategy variable, self-adaptive parameter control is reasonable. But not every
parameter type offers a strong link; crossover points, e.g.,
suffer from a weak one.
Although self-adaptation is a rather robust online
parameter control technique, it may fail if the success
probability in the solution space is low, e.g., at the
boundary of constraints. Premature convergence can be
avoided by specialized methods that increase the success
probabilities like constraint handling methods [44]. A
further possibility is to define lower bounds on the strategy
variables. But the drawback is that a lower bound on the
strategy parameters may prevent convergence to important
values. An increase of the number of offspring solutions is
another possibility to decrease the probability of unsuccessful strategy settings, but has to be paid for with more
function evaluations. In some cases tuning of the strategy
parameters, e.g., $\tau$ in the case of Gaussian mutations for
ES, see Sect. 4.1.2, might lead to an increase of success
probabilities and is a solution to the problem of premature
convergence. Further research on the conditions that allow
successful self-adaptation will be necessary. In the future
self-adaptation might be the key property towards a
parameter-free evolutionary optimization algorithm.
Acknowledgments The author thanks Günter Rudolph and the
anonymous reviewers for their helpful comments to improve the
manuscript.
References
1. Angeline PJ (1995) Adaptive and self-adaptive evolutionary computations. In: Palaniswami M, Attikiouzel Y (eds) Computational intelligence: a dynamic systems perspective. IEEE Press, New York, pp 152–163
2. Arnold DV, Brauer D (2008) On the behaviour of the (1+1)-ES for a simple constrained problem. In: Proceedings of the 10th conference on parallel problem solving from nature – PPSN X, pp 1–10
3. Auger A (2003) Convergence results for (1,λ)-SA-ES using the theory of φ-irreducible Markov chains. In: Proceedings of the evolutionary algorithms workshop of the 30th international colloquium on automata, languages and programming
4. Bäck T (1998) An overview of parameter control methods by self-adaptation in evolutionary algorithms. Fundam Inf 35(1–4):51–66
5. Bartz-Beielstein T (2006) Experimental research in evolutionary computation: the new experimentalism. Natural computing series. Springer
6. Bartz-Beielstein T, Lasarczyk C, Preuss M (2005) Sequential parameter optimization. In: McKay B et al (eds) Proceedings of the IEEE congress on evolutionary computation – CEC, vol 1. IEEE Press, pp 773–780
7. Bartz-Beielstein T, Preuss M (2006) Considerations of budget allocation for sequential parameter optimization (SPO). In: Paquete L et al (eds) Workshop on empirical methods for the analysis of algorithms, proceedings, Reykjavik, Iceland, pp 35–40
8. Bäck T (1991) Self-adaptation in genetic algorithms. In: Proceedings of the 1st European conference on artificial life – ECAL, pp 263–271
9. Bäck T (1992) The interaction of mutation rate, selection, and self-adaptation within a genetic algorithm. In: Proceedings of the 2nd conference on parallel problem solving from nature – PPSN II, pp 85–94
10. Bäck T, Schütz M (1996) Intelligent mutation rate control in canonical genetic algorithms. In: Foundations of intelligent systems, 9th international symposium, ISMIS '96. Springer, pp 158–167
11. Berlik S (2004) A step size preserving directed mutation operator. In: Proceedings of the 6th conference on genetic and evolutionary computation – GECCO, pp 786–787
12. Beyer H-G (2001) The theory of evolution strategies. Springer, Berlin
13. Beyer H-G, Deb K (2001) On self-adaptive features in real-parameter evolutionary algorithms. IEEE Trans Evol Comput 5(3):250–270
14. Beyer H-G, Meyer-Nieberg S (2006) Self-adaptation on the ridge function class: first results for the sharp ridge. In: Proceedings of the 9th conference on parallel problem solving from nature – PPSN IX, pp 72–81
15. Beyer H-G, Schwefel H-P (2002) Evolution strategies – a comprehensive introduction. Nat Comput 1:3–52
16. Beyer H-G, Sendhoff B (2008) Covariance matrix adaptation revisited – the CMSA evolution strategy. In: Proceedings of the 10th conference on parallel problem solving from nature – PPSN X, pp 123–132
17. Coello CA (2000) Use of a self-adaptive penalty approach for engineering optimization problems. Comput Ind 41(2):113–127
18. Davis L (1989) Adapting operator probabilities in genetic algorithms. In: Proceedings of the 3rd international conference on genetic algorithms, San Francisco, Morgan Kaufmann Publishers Inc, pp 61–69
19. de Landgraaf W, Eiben A, Nannen V (2007) Parameter calibration using meta-algorithms. In: Proceedings of the IEEE congress on evolutionary computation – CEC, pp 71–78
20. DeJong K (2007) Parameter setting in EAs: a 30 year perspective. In: Parameter setting in evolutionary algorithms, studies in computational intelligence. Springer, pp 1–18
21. Eiben A, Schut MC, de Wilde A (2006) Is self-adaptation of selection pressure and population size possible? A case study. In: Proceedings of the 9th conference on parallel problem solving from nature – PPSN IX, pp 900–909
22. Eiben AE, Hinterding R, Michalewicz Z (1999) Parameter control in evolutionary algorithms. IEEE Trans Evol Comput 3(2):124–141
23. Eiben AE, Michalewicz Z, Schoenauer M, Smith JE (2007) Parameter control in evolutionary algorithms. In: Parameter setting in evolutionary algorithms, studies in computational intelligence. Springer, pp 19–46
24. Fogarty TC (1989) Varying the probability of mutation in the genetic algorithm. In: Proceedings of the 3rd international conference on genetic algorithms, San Francisco, Morgan Kaufmann Publishers Inc, pp 104–109
25. Fogel DB, Fogel LJ, Atmar JW (1991) Meta-evolutionary programming. In: Proceedings of the 25th Asilomar conference on signals, systems & computers, pp 540–545
26. Beyer H-G, Arnold DV (2003) Qualms regarding the optimality of cumulative path length control in CSA/CMA-evolution strategies. Evol Comput 11
48. Liang KH, Yao X, Liu Y, Newton CS, Hoffman D (1998) An experimental investigation of self-adaptation in evolutionary programming. In: Proceedings of the 7th international conference on evolutionary programming – EP VII, Berlin, Springer, pp 291–300
49. Maruo MH, Lopes HS, Delgado MR (2005) Self-adapting evolutionary parameters: encoding aspects for combinatorial optimization problems. In: Proceedings of EvoCOP, pp 154–165
50. Mercer RE, Sampson JR (1978) Adaptive search using a reproductive metaplan. Kybernetes 7:215–228
51. Mersch B, Glasmachers T, Meinicke P, Igel C (2006) Evolutionary optimization of sequence kernels for detection of bacterial gene starts. In: ICANN (2), pp 827–836
52. Meyer-Nieberg S, Beyer H-G (2007) Self-adaptation in evolutionary algorithms. In: Lobo FG, Lima CF, Michalewicz Z (eds) Parameter setting in evolutionary algorithms. Springer, Berlin
53. Mezura-Montes E, Palomeque-Ortiz AG (2009) Self-adaptive and deterministic parameter control in differential evolution for constrained optimization. Constraint-Handl Evol Optim 189:95–120
54. Mühlenbein H (1992) How genetic algorithms really work: mutation and hillclimbing. In: Proceedings of the 2nd conference on parallel problem solving from nature – PPSN II, pp 15–26
55. Nannen V, Eiben A (2006) A method for parameter calibration and relevance estimation in evolutionary algorithms. In: Proceedings of the 8th conference on genetic and evolutionary computation – GECCO, New York, ACM Press, pp 183–190
56. Nannen V, Eiben A (2007) Relevance estimation and value calibration of evolutionary algorithm parameters. In: IJCAI, pp 975–980
57. Ostermeier A, Gawelczyk A, Hansen N (1994) A derandomized approach to self-adaptation of evolution strategies. Evol Comput 2(4):369–380
58. Ostermeier A, Gawelczyk A, Hansen N (1995) A derandomized approach to self-adaptation of evolution strategies. Evol Comput 2(4):369–380
59. Preuss M, Bartz-Beielstein T (2007) Sequential parameter optimization applied to self-adaptation for binary-coded evolutionary algorithms. In: Parameter setting in evolutionary algorithms, studies in computational intelligence. Springer, pp 91–119
60. Rechenberg I (1973) Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Frommann-Holzboog, Stuttgart
61. Rechenberg I (1994) Evolutionsstrategie '94. Frommann-Holzboog, Stuttgart
62. Reed J, Toombs R, Barricelli NA (1967) Simulation of biological evolution and machine learning: I. Selection of self-reproducing numeric patterns by data processing machines, effects of