Abstract
This paper outlines what a genetic algorithm is, and the ways in which
genetic algorithms have been utilised to solve real world problems.
Using this information, it then examines the set of features that
would be present in the ideal genetic algorithm package, before
examining a specific package against these criteria to assess its
suitability as a teaching tool.
Contents
1 Acknowledgements
3 Basic Elements of a GA
3.1 Encoding and Population Size
3.2 Crossover Operations
3.2.1 Single Point Crossover
3.2.2 Double Point Crossover
3.2.3 Cut and Splice Crossover
3.2.4 Uniform Crossover
3.3 Mutation Operations
3.4 Fitness Evaluation
3.5 Halting
3.5.1 Resource Based Halting
3.5.2 Solution Fitness Based Halting
3.5.3 Progress Based Halting
7.2.3 createAllelesMap Method
7.3 Encoding and Population Size
7.4 Mutation and Crossover Operators
7.4.1 Mutation Operations
7.4.2 Crossover Operations
7.5 Special GA Mechanisms
7.5.1 Automatic Kick
7.5.2 Kin Competition Compensation
7.5.3 Generational Memory
7.5.4 Pre and Post Breed Processing
7.6 Fitness Evaluation
7.7 Halting
9 Appendix
9.1 Example GA Playground Input File
9.2 GA Playground Mutation Function
9.3 GA Playground Crossover Function
9.4 Screenshots
1 Acknowledgements
Project supervisor Professor VJ Rayward-Smith.
2 What is a Genetic Algorithm?
Optimization problems demand that a certain variable be maximised (or
minimised) while remaining legal within some set of constraints. These
problems are often extremely large, even to the point of NP-hardness, which
effectively means that finding the exact, or optimum, solution is infeasibly
difficult. To enumerate every possible solution, and evaluate each to
determine which is the optimum, would take an inordinate amount of time. In
certain applications, where optimality is not necessary, metaheuristics can be
used to find a good solution, often in a very short time. Metaheuristics vary
in their strategies, but all use some technique to explore the space of
possible solutions, and to find a good solution in a much shorter time than it
would take to enumerate all possible solutions. Genetic algorithms are one
example of a metaheuristic approach. [1] However, metaheuristics are only
suitable for applications where a good solution is adequate; they do not
guarantee to find the best solution (although they may stumble upon optimality).
This may sound like an unacceptable compromise, but the massive reduction
in computing time makes metaheuristics very desirable in some applications.
[2]
This quote from Dr. Goldberg sums up the aim of genetic algorithms:
to model nature, and harness its proven ability to refine solutions. A
genetic algorithm (hereafter referred to as GA) is an algorithm designed to
find good solutions to problems by mimicking the process of natural
selection. GAs are a form of metaheuristic search in that they find solutions
to hard problems, possibly even NP-hard, where it is not feasible to enumerate
all possibilities in order to find the best solution. Metaheuristics use
various techniques to arrive at an acceptable solution (defined on a problem
to problem basis) in a far shorter time. The way
in which GAs do this is to take a group of solutions, called a population,
and to breed them with each other in order to evolve the solutions towards
optimality. [7] The best solutions from each generation are used to create the
next, in a model of natural selection. Mutation is also used, as in evolution,
to randomly change these solutions in order to avoid local optima and
ensure that a large section of the solution space is explored. [12]
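The loop described above can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the design of any particular package: the class name SimpleGa, the toy fitness function (counting true genes), the use of two-way tournament selection to favour the best solutions, and all parameter values are hypothetical choices made for the example.

```java
import java.util.Random;

/** Minimal generational GA on bit strings. Every name and parameter here is illustrative. */
final class SimpleGa {
    /** Toy fitness (an assumption, not taken from the paper): the number of true genes. */
    static int fitness(boolean[] chromosome) {
        int ones = 0;
        for (boolean gene : chromosome) {
            if (gene) {
                ones++;
            }
        }
        return ones;
    }

    /** Selection: pick two individuals at random and keep the fitter one. */
    static boolean[] select(boolean[][] population, Random rng) {
        boolean[] a = population[rng.nextInt(population.length)];
        boolean[] b = population[rng.nextInt(population.length)];
        return fitness(a) >= fitness(b) ? a : b;
    }

    /** Breeding: single point crossover of two parents, then per-gene mutation. */
    static boolean[] breed(boolean[] first, boolean[] second, double mutationRate, Random rng) {
        int cut = rng.nextInt(first.length);
        boolean[] child = new boolean[first.length];
        for (int i = 0; i < child.length; i++) {
            child[i] = i < cut ? first[i] : second[i];
            if (rng.nextDouble() < mutationRate) {
                child[i] = !child[i]; // mutation flips the gene
            }
        }
        return child;
    }

    /** Evolves a random population (sizes must be positive) and returns the fittest survivor. */
    static boolean[] run(int populationSize, int genes, int generations,
                         double mutationRate, long seed) {
        Random rng = new Random(seed);
        boolean[][] population = new boolean[populationSize][genes];
        for (boolean[] individual : population) {
            for (int i = 0; i < genes; i++) {
                individual[i] = rng.nextBoolean();
            }
        }
        for (int g = 0; g < generations; g++) {
            boolean[][] next = new boolean[populationSize][];
            for (int i = 0; i < populationSize; i++) {
                next[i] = breed(select(population, rng), select(population, rng),
                                mutationRate, rng);
            }
            population = next;
        }
        boolean[] best = population[0];
        for (boolean[] individual : population) {
            if (fitness(individual) > fitness(best)) {
                best = individual;
            }
        }
        return best;
    }
}
```

A call such as SimpleGa.run(30, 10, 50, 0.01, 42L) evolves 30 solutions of 10 genes for 50 generations. A fixed seed makes the run repeatable, which is convenient when stepping through the behaviour with students.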
2.1 Natural Selection and Mutation in Nature
Within nature, members of a population are born, procreate, and die. Pro-
creation creates offspring which are a combination of the two parents, with
occasional mutation also operating on the genes. This mutation does not
necessarily have to be obvious or large. The mutation of a single gene can
have little or no effect, but equally may have large repercussions - entirely
dependent on its role within the body. It is often the case that
combinations of genes affect a certain characteristic, so that the alteration
of a gene may have no obvious effect but actually subtly alter many
characteristics.
Mutation can occur within any cell in the body, and usually occurs during
replication. There are mechanisms which reduce the amount of mutation
that is allowed to occur, but they are not infallible. There are two types of
cell in the body; somatic and germline. Germline cells produce sperm and
eggs, and all others are somatic. Therefore if the mutation occurs in the
somatic cells, then this mutation will die with the cell, but if it occurs in the
germline cells then it will be passed onto offspring - provided the organism
isn’t detrimentally affected to the point of not surviving to procreation. [11]
These mutations can be beneficial or harmful, and can provide the animal
with an advantage over the other members of the species, or cause it to
be less capable of survival than others. As Dawkins explains in 'The Blind
Watchmaker', these mutations are more likely to be detrimental than
beneficial, as 'there are more ways of being dead than being alive'. By this
he means that within the vast space of possible gene sequences, there are
few that represent living and surviving organisms, and an almost limitless
number that represent pools of non-living amino acids. [2] For example, an increase in
the capability to detect certain smells may make the animal a better hunter,
or better enable it to detect predators, and in either case would provide the
animal with an advantage over other members of that species. This would
mean that it would be more likely to survive to adulthood, and to procreate,
spreading its genes. An animal with a detrimental mutation however, such
as a reduced sense of smell, would be more likely to succumb to starvation or
attack from predators before procreation could occur. This is natural
selection, a natural feedback process which causes 'good' genes to spread
and takes 'bad' genes out of the pool. It is this interplay between entirely
random mutation and non-random selection that makes up the process of
evolution, causing species to adapt to their environment. It is a process that
takes an almost unimaginable length of time to occur.
There is little doubt ... that usually feedback mechanisms operate
to regulate the size of populations. [19]
3 Basic Elements of a GA
The basic elements of a GA are outlined in this section. It will become
clear that each element is an algorithm in its own right, and that there
are numerous choices of strategy for each step of the procedure. A GA is
simply an abstraction for a set of algorithms that, in combination, achieve
a refinement and improvement of solutions.
3.2 Crossover Operations
Crossover is the other fundamental operation needed in order to produce a
successful GA. If mutation were used alone then there would be no logical
methodology to the algorithm; it would essentially be a random solution
generator, with no higher chance of finding an acceptable solution in a
given time than any other random search. [2] Crossover is the method by
which a GA generates the next generation of solutions from the fittest
solutions of the previous generation. [13]
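As a concrete illustration of breeding two parents, the following Java sketch performs the simplest form of crossover, in which both parents are cut at the same point and their tails are exchanged. The class and method names are hypothetical, and solutions are assumed to be equal-length strings of at least two genes.

```java
import java.util.Random;

/** Illustrative single point crossover on string chromosomes (names are hypothetical). */
final class SinglePointCrossover {
    /** Exchanges the tails of two equal-length parents at index cut, giving two children. */
    static String[] cross(String parentA, String parentB, int cut) {
        if (parentA.length() != parentB.length()) {
            throw new IllegalArgumentException("parents must have equal length");
        }
        if (cut < 0 || cut > parentA.length()) {
            throw new IllegalArgumentException("cut point out of range");
        }
        String childA = parentA.substring(0, cut) + parentB.substring(cut);
        String childB = parentB.substring(0, cut) + parentA.substring(cut);
        return new String[] {childA, childB};
    }

    /** Picks the cut point at random so that each child inherits from both parents. */
    static String[] cross(String parentA, String parentB, Random rng) {
        if (parentA.length() < 2) {
            throw new IllegalArgumentException("need at least two genes");
        }
        int cut = 1 + rng.nextInt(parentA.length() - 1); // 1 .. length-1
        return cross(parentA, parentB, cut);
    }
}
```

For example, crossing "AAAAAA" with "BBBBBB" at cut point 2 gives the children "AABBBB" and "BBAAAA".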
where the solution strings represent a solution to some singular problem to
optimize, for example in certain combinatorial optimization problems. [13]
'survive' to the next generation, and the lower it is the more likely it is that
the solution will 'die off'.
3.5 Halting
Halting is an important and difficult problem in GAs. Without some form
of halting criterion that is checked at every generation, the program would
continue to run even once significant gains in fitness were no longer being
generated by new generations. There are many techniques that are used to
halt GAs, and their appropriateness depends entirely on the application, but
they fall into two main categories: those that are based on the runtime of the
algorithm, and those that are based on the quality of the solutions.
output can be guaranteed to be of a certain quality, determined in the
halting method. This is still a desirable practice, and it does not make GAs
obsolete when compared to complete enumeration and evaluation. The reason
for this is that, for such problems, there is no practical way to guarantee
optimality without enumerating all possible solutions, which can be a very
time intensive procedure. A large travelling salesman problem would demand
that an enormous number of different solutions be enumerated and evaluated
in order to find an optimal solution, whereas a GA could be run for an
arbitrary amount of time, and would tend to reach better solutions as the
iterations proceed. To implement solution based halting, the algorithm must
be provided with information about what is an acceptable solution, in terms
of the fitness evaluation method. In this way the algorithm can check at each
generation whether the best solution of that generation exceeds the
acceptability level, and halt if it does. If not, the algorithm simply
proceeds with the next iteration of the generation loop. This acceptability
level can be set to a certain value, supplied by the user, or it can be
derived by some algorithm within the system that ensures bounded
suboptimality.
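The per-generation check described above can be sketched as follows. This is only an illustration: the class HaltingPolicy and its fields are hypothetical, fitness is assumed to be maximised, and a generation limit is added as a simple runtime-based safeguard so that the loop ends even if the acceptability level is never reached.

```java
/** Illustrative halting check, evaluated once per generation (names are hypothetical). */
final class HaltingPolicy {
    private final double acceptableFitness; // solution based criterion
    private final int maxGenerations;       // runtime based safeguard

    HaltingPolicy(double acceptableFitness, int maxGenerations) {
        this.acceptableFitness = acceptableFitness;
        this.maxGenerations = maxGenerations;
    }

    /** True when the best fitness is acceptable or the generation budget is spent. */
    boolean shouldHalt(int generation, double bestFitness) {
        return bestFitness >= acceptableFitness || generation >= maxGenerations;
    }
}
```

The generation loop would call shouldHalt with the current generation number and the best fitness found in that generation, and stop as soon as it returns true.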
4 Background Information on GAs
4.1 The Birth of GA
Work conducted in the 1950s and 1960s on cellular automata started the idea
of using GAs to solve the optimisation problems inherent in engineering.
[12] In the 1980s research into GAs began in earnest, and an international
conference for the field was established. As early as 1995 there were several
successful examples of GA optimisation being used in industry. Texas
Instruments used GAs to design chips that were smaller but maintained
functionality, and General Electric applied them to critical designs such as
the engine plans leading to the development of the Boeing 777 engine. US West
uses GAs to design fiber-optic cable networks, cutting design times from two
months to two days, and saving US West $1 million to $10 million on each
network design. [3] Genetic algorithm derived designs have now even been used
in satellites by NASA, with the development of an aerial being taken
completely out of engineers' hands. The orbit of those satellites is now even
determined with the use of a genetic algorithm. [25][26]
testing. If you were to imagine a plane as a GA, and the passenger as a GA
user, then you could imagine the stress that the thought of a failure would
cause. [9]
5 Current Applications of Genetic Algorithms
Genetic algorithms have been very successful in their transfer from academia
to real world application. They can now be found in use in most large
companies in some way, in a variety of roles. Normally a technology must be
adapted for each and every use, but by their very nature GAs do not require
this; they are inherently flexible. Theoretically, all one must do to adapt a
GA to a certain problem is to design a genetic representation for solutions
and write a fitness evaluation function for those solutions. The rest of the
system will have been duplicated many times before, and commercial and
open source packages are now available that allow quick and simple building
of GAs. Although the crossover and other operators in such a package may not
be optimal for a given problem, they will operate and allow a solution to be
found. Some packages allow these operators and much more to be customized,
while others allow only the very minimum of change: the representation and
fitness function.
In one particular paper by Wiggins and Papadopoulos a GA was constructed
where the fitness of solutions was calculated by several individual methods
that analysed the solution for certain criteria; the amalgam of these figures
then formed the overall fitness. By using these sectional evaluators
the problem was broken down into understandable rules. However, this
research produced little in the way of listenable output. [18] Better output
has been achieved in other systems, where the fitness evaluators are more
appropriate, or where human evaluation is also made part of the selection
process. [20] Some systems have even been advanced to the point where they
can generate short pieces in real time, and take a real player's response as
input to dynamically update the fitness function. Computer generated
music has received much media attention in the past, which may well
lead to increased funding for research, and consequently an improvement in
the quality of artificial composition software. One example is a competition
run by BBC Radio 3 where 3 pieces of music were played; one was genuine
Bach, one was imitation Bach written by a professor of music, and one was
generated by a GA.
6 Package Chosen for Assessment
The package that has been chosen for assessment is GA Playground, written
by Ariel Dolan. It is implemented in Java, and is usable in two forms:
either as an application, or as an applet. It has been chosen because of its
simplicity, and therefore its possible suitability as a teaching tool; indeed,
it is designed as an instructional and research tool. It is built in
a modular fashion, so that the user need only alter one or more of these
modules in order to run a customised problem. The problem itself is stored
in a definition file, in ASCII text; for an example of such a file see
Appendix 9.1.
methods in order to provide a working problem class: getValue, draw and
createAllelesMap. Although it is not essential that a draw method is
included, it is reasonably important for a teaching tool to have this kind of
visual feedback for the student.
Population Size=30
This determines the population size of the algorithm, and in the
GaaFileInput file it is recommended that this variable be kept between 10
and 100.
The encoding of the problem is also highly customizable; a variable in
the definition file allows the user to set the size of each solution in genes.
Number of Genes=10
Once the size is set then the representation and organization is built
in using the fitness function. No other part of the GA needs information
about what each gene represents, unless some form of intelligent mutation is
implemented. The fitness function will evaluate each solution, and combine
information from the value of each gene into a fitness value.
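As an illustration of how a fitness function combines the gene values, the sketch below evaluates the expression used by the example problem in Appendix 9.1, x1*x2*x3*x4*x5/(x6*x7*x8*x9*x10). It is a standalone example under stated assumptions: the class and method names are hypothetical, the genes are assumed to have already been decoded into integers, and this is not the signature of GA Playground's own getValue method.

```java
/** Illustrative fitness for the example problem in Appendix 9.1 (names are hypothetical). */
final class SimpletonFitness {
    /** Product of the first five genes divided by the product of the last five. */
    static double evaluate(int[] genes) {
        if (genes.length != 10) {
            throw new IllegalArgumentException("expected exactly 10 genes");
        }
        double numerator = 1.0;
        double denominator = 1.0;
        for (int i = 0; i < 5; i++) {
            numerator *= genes[i];       // x1 .. x5
            denominator *= genes[i + 5]; // x6 .. x10
        }
        return numerator / denominator;
    }
}
```

With every gene restricted to the range 1 to 10 (Min Value and Max Value in Appendix 9.1), the largest value this function can return is 100000, reached when x1 to x5 are 10 and x6 to x10 are 1. This matches the Exit Value given in that file.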
GA Type=1
The first type of mutation (GA Type=1) takes the entry for the gene's
position from the alleles array, draws a random value between that allele's
minimum and maximum, and writes the encoded value back into the
solution. The second type of mutation (GA Type=2) simply swaps the gene at
the selected position with the gene at another position chosen at random
(see Appendix 9.2).
GA Type=2
In both types the first step is to use the user defined crossover rate value
to determine whether or not crossover is to take place. This is usually the
case, and the algorithm will proceed, although occasionally the first parent
will simply be passed to the next generation unchanged. In type 1 GAs the
algorithm will then take a random number, split the chromosomes at that
point, and concatenate the first part of one chromosome with the remaining
substring of the other in order to create the child. This is a simple
implementation of single point crossover, as explained in section 3.2.1. In
type 2 GAs the algorithm again selects a random point, and keeps the genes of
the first chromosome up to that point. It then loops through the second
chromosome from its start, attaching to the end of the new solution each gene
that is not already present (see Appendix 9.3). In this way no gene is
duplicated in the child.
Stagnation Limit=19
Kin Competition Factor=0.9
basic level, and from a teaching point of view, these methods could be very
useful for reporting certain values back to the user. For example, the
generation number and the best fitness value for the generation could be
output for analysis and graphing, to teach students the way that
metaheuristics home in on the optimum: rapid improvement, slowing to gradual
change before an eventual stop at, or near to, optimality.
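The kind of reporting suggested above could look like the following sketch, which collects one line per generation in a comma separated form suitable for graphing. The class GenerationLog and its methods are hypothetical and stand alone; how such a recorder would be called from GA Playground's own processing methods is not shown here and would depend on the package's actual interfaces.

```java
/** Illustrative per-generation recorder producing CSV text (names are hypothetical). */
final class GenerationLog {
    private final StringBuilder csv = new StringBuilder("generation,best_fitness\n");

    /** Appends one row holding the generation number and its best fitness value. */
    void record(int generation, double bestFitness) {
        csv.append(generation).append(',').append(bestFitness).append('\n');
    }

    /** Returns the header and all recorded rows as a single CSV string. */
    String toCsv() {
        return csv.toString();
    }
}
```

Plotting the second column against the first would be expected to show the rapid early improvement and later levelling off described above.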
7.7 Halting
Halting is handled by GA Playground on a simple fitness value threshold.
Within the problem definition file there are the following two
user-definable lines.
Exit Value=100000
Exit Tolerance=0
The variable 'Exit Value' defines at what fitness value the evolution should
stop. The best solution of every generation is compared to this value, and
if it is reached or exceeded then evolution will stop, and that solution will
be offered as the best solution. The 'Exit Tolerance' variable is the degree
to which the solution can deviate from the Exit Value. In this case, with an
Exit Tolerance of 0, the fitness must reach or exceed the Exit Value itself,
but if Exit Tolerance were 10 then the occurrence of a solution of fitness
99991 would cause the algorithm to halt.
This simple halting strategy does limit the package somewhat, and it does
mean that it is essential to know roughly what fitness value is acceptable,
whereas with other halting strategies this can be more loosely defined.
However, as a teaching tool this is perfectly acceptable, and there is no
need for a more complex halting strategy.
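The arithmetic of the threshold can be made explicit with a short sketch. It assumes a maximisation problem and assumes that the tolerance simply lowers the threshold by the stated amount; the class and method names are hypothetical, and the package's own comparison may differ in detail.

```java
/** Illustrative exit test with a tolerance below the target (names are hypothetical). */
final class ExitCheck {
    /** True when bestFitness is within exitTolerance below exitValue, or above it. */
    static boolean shouldExit(double bestFitness, double exitValue, double exitTolerance) {
        return bestFitness >= exitValue - exitTolerance;
    }
}
```

Under these assumptions, an Exit Value of 100000 with an Exit Tolerance of 10 halts on any fitness of 99990 or more, so a solution of fitness 99991 stops the run, while with an Exit Tolerance of 0 only a fitness of at least 100000 does.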
8 Conclusion and Evaluation
8.1 Suitability of GA Playground as a Teaching Tool
Genetic algorithms have proven to be enormously effective and flexible
in their application. They have quickly been put into service in many fields,
as described in section 5, and it is important that there be a suitable
teaching tool for this important field. GA Playground is aimed at this sort of
audience, and does have all of the features that this paper has outlined as
necessary for such an application. [5] It is an effective package
for building simple genetic algorithms, and although it does have limitations,
these do not necessarily make it inappropriate as a teaching tool. In fact, a
myriad of options might even be detrimental to its effectiveness in such a
capacity, as they could overwhelm students, who would be too distracted by
the minutiae to fully grasp the basics of GAs.
GA Playground also offers advanced log generation. It can be customized
to include much data about the algorithm, listing populations at each
generation, or the current method of breeding. In this way the output could be
used as a valuable educational tool, stepping through the output of a
simple problem to determine the operation of a GA. [4] The crossover methods
used in the package are simple, but adequate. The most basic option, single
point crossover, is used in type 1 GAs and this is an ideal method to use as
a teaching tool as the code is simple enough to understand easily and the
output is obvious, once the random crossover point is known. There is no
need to explain more complex crossover strategies until the basics of genetic
algorithms are understood.
8.2 Conclusion
In summary, GA Playground is a reasonably powerful package, whilst
remaining relatively simple to operate, an essential quality for teaching
tools. By abstracting customizable functions from the rest of the package,
customization is made relatively safe and open to experimentation. The
problem definition file and the fitness evaluation are all that need to be
changed in order to enter a custom problem into GA Playground. However, the
need for graphical output in a teaching tool is undeniable, so it is
necessary either to write a draw function for a custom problem as well, or to
use one of the problems that come built into the package. The range of
problems supplied is wide and standard, so it is highly likely that any
examples listed in text books or lecture slides would be found here. This
makes it a highly attractive prospect as a teaching tool, as it can be used
straight after (a very simple) installation.
The package is also available without charge, under an open source license.
Therefore there is no licensing fee for the institution using it, and with larger
class sizes or numbers of workstations this is a huge advantage that this
package has over other, commercial packages.
The platform on which it was developed is also a benefit, as many com-
puter science degrees are now based around learning Java, as is the case at
UEA. Therefore, students will already be familiar with the implementation
language for the fitness function, and there will be no learning curve before
customization can occur.
The crossover and mutation functions of the package are acceptable, even
if mutation is slightly limited. However, for an instructional tool there is
only one requirement: that they function properly. Once students
are taught the essential concepts of evolutionary algorithms they will
be able to learn various strategies that may exceed the capabilities of GA
Playground, but it will still be perfectly capable of teaching them the basics,
and in its limitations will teach much more.
References
[1] Bundy, Alan (Ed.), Artificial Intelligence Techniques; A Comprehensive
Catalogue. Springer Publishing, Section 104.
[2] Richard Dawkins, The Blind Watchmaker. Penguin Books, 1986.
[3] Sharon Begley, Gregory Beals, Software au Naturel. Newsweek, pp. 70-71,
May 8th 1995.
[4] Ariel Dolan, GA Playground Documentation,
http://www.aridolan.com/ga/gaa/gaa.html
[5] Jaron Lanier, One-Half of a Manifesto: Why Stupid Software Will Save
the Future from Neo-Darwinian Machines. WIRED Magazine, Issue 8.12,
pp. 158-179, December 2000.
[6] David Pescovitz, Monsters in a Box. WIRED Magazine, Issue 8.12, pp.
340-347, December 2000.
[7] Davis, Lawrence, The Handbook of Genetic Algorithms. Van Nostrand
Reinhold, New York, 1991.
[8] Koza, John R., Jones, Lee W., Keane, Martin A., Streeter, Matthew
J., and Al-Sakran, Sameer H., Towards Automated Design of Industrial
Strength Analog Circuits by Means of Genetic Programming. Kluwer
Academic Publishing, Genetic Programming Theory and Practice 2, Chapter
8, pp. 121-142, 2004.
[9] David E. Goldberg, Department of General Engineering, University of
Illinois at Urbana-Champaign, From Genetic Design and Evolutionary
Optimization to the Design of Conceptual Machines. IlliGAL Report No.
98008, May 1998.
[10] Michael R. Garey, David S. Johnson, "Computers and Intractability -
A Guide to the Theory of NP-Completeness", Bell Laboratories, New
Jersey, 1979.
[11] Bryan Sykes, The Seven Daughters of Eve, Corgi Publishing, 2001.
[12] Wikipedia - The Free Encyclopedia, Entry for Genetic Algorithm,
http://en.wikipedia.org/wiki/Genetic-algorithm, March 2006.
[13] Wikipedia - The Free Encyclopedia, Entry for Crossover Strategies,
http://en.wikipedia.org/wiki/Crossover-genetic-algorithm, March 2006.
[16] Gautam Naik, Back to Darwin: In Sunlight and Cells, Science Seeks
Answers to High Tech Puzzle, The Wall Street Journal, January 16th
1996.
[17] Biles, J, GenJam: A Genetic Algorithm for Generating Jazz Solos. Pro-
ceedings of the 1994 International Computer Music Conference. Aarhus,
Denmark: International Computer Music Association. 1994.
[26] NASA Website, Exploring the Universe - Evolvable Systems,
http://www.nasa.gov/centers/ames/research/exploringtheuniverse/exploringtheuniverse-evolvablesystems.html
9 Appendix
9.1 Example GA Playground Input File
[Integers]
Problem Code=1
Population Size=30
Number of Genes=10
Map Order=0
Def Order=0
GA Type=1
MinMax Type=1
Crossover Type=1
Mutation Type=1
Selection Type=1
Inversion Type=1
Stagnation Limit=19
Degrade Limit=4
Survivors Percent=20
Redundancy Factor=1
Number of Variables=10
User Defined Integer=0
[Reals]
Crossover Rate=1
Mutation Rate=0.01
Inversion Rate=0
Shuffle Rate=0.7
Inversion Shuffle=0
Kick Distribution=1
Exit Value=100000
Exit Tolerance=0
Min Value=1
Max Value=10
Step Value=1
Default Value=1
Kin Competition Factor=0.9
User Defined Real=0
[Strings]
Title=Simpleton GA Problem
Description=A trivial maximum problem: x1*x2*x3*x4*x5/(x6*x7*x8*x9*x10)
Alleles Map File=None
Alleles Def File=None
Map Delimiter=,
Input String 1=x1*x2*x3*x4*x5/(x6*x7*x8*x9*x10)
Input String 2=None
User Defined Expression=None
User Defined String=None
[Flags]
Status Help=True
Text Window=False
Graphic Window=True
Sound=False
Logging=False
User Defined Flag=false
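The layout of this file, bracketed section names followed by key=value lines, is simple to read programmatically. The sketch below is a hypothetical standalone reader written for illustration; it is not GA Playground's own input code (the GaaFileInput class is not reproduced here), and the class and method names are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative reader for "[Section]" / "key=value" definition text (names are hypothetical). */
final class DefinitionFile {
    /** Returns a map from section name to that section's key/value pairs, in file order. */
    static Map<String, Map<String, String>> parse(String text) {
        Map<String, Map<String, String>> sections = new LinkedHashMap<>();
        Map<String, String> current = null;
        for (String rawLine : text.split("\\R")) { // \R matches any line break
            String line = rawLine.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith("[") && line.endsWith("]")) {
                current = new LinkedHashMap<>();
                sections.put(line.substring(1, line.length() - 1), current);
            } else if (current != null) {
                int eq = line.indexOf('=');
                if (eq > 0) {
                    // Split on the first '=' only, so values may contain '=' themselves.
                    current.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
                }
            }
        }
        return sections;
    }
}
```

All values are kept as strings, so a caller would convert them as needed, for example with Integer.parseInt for entries under [Integers] and Double.parseDouble for entries under [Reals].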
9.2 GA Playground Mutation Function
public String mutation(String chrom, double rate) {
    int i, j, size;
    char kar, kar2;
    double n;
    String st;
    StringBuffer sb = new StringBuffer(chrom);
    size = chrom.length();
    switch (gaType) {
        case 1:
            // Type 1: replace a mutated gene with a random value drawn
            // between that allele's minimum and maximum.
            for (i=0;i<size;i++) {
                if (GaaMisc.flip(rate)) {
                    n = alleles[i].min + (double) Math.random()*(alleles[i].max-alleles[i].min);
                    kar = alleles[i].encodeValue(n);
                    sb.setCharAt(i,kar);
                }
            }
            break;
        case 2:
            // Type 2: swap a mutated gene with the gene at a random position.
            for (i=0;i<size;i++) {
                if (GaaMisc.flip(rate)) {
                    j = (int) Math.floor(Math.random()*size);
                    Integer int1 = new Integer(i);
                    Integer int2 = new Integer(j);
                    if (!int2.equals(int1)) {
                        kar = sb.charAt(i);
                        kar2 = sb.charAt(j);
                        sb.setCharAt(i,kar2);
                        sb.setCharAt(j,kar);
                    }
                }
            }
            break;
    }
    st = sb.toString();
    return st;
}
9.3 GA Playground Crossover Function
public String crossover(String chrom1, String chrom2) {
    int i, pos;
    String s1, s2, s;
    char kar;
    s = "";
    switch (gaType) {
        case 1:
            // Type 1: single point crossover; head of chrom1, tail of chrom2.
            if (GaaMisc.flip(crossoverRate)) {
                pos =(int) Math.floor((Math.random()*chrom1.length()));
                s1 = chrom1.substring(0,pos);
                s2 = chrom2.substring(pos);
                s = s1.concat(s2);
            }
            else
                s = chrom1;
            break;
        case 2:
            // Type 2: keep the head of chrom1, then append each gene of
            // chrom2 that is not already present in the child.
            if (GaaMisc.flip(crossoverRate)) {
                pos =(int) Math.floor((Math.random()*chrom1.length()));
                s = chrom1.substring(0,pos);
                StringBuffer sb = new StringBuffer(s);
                for (i=0;i<chrom2.length();i++) {
                    kar = chrom2.charAt(i);
                    if (s.indexOf(kar) == -1) {
                        sb.append(kar);
                    }
                    s = sb.toString();
                }
            }
            else
                s = chrom1;
            break;
    }
    return s;
}
9.4 Screenshots