
An analysis of the ideal abstract genetic algorithm package, and an
evaluation of a specific package in relation to these criteria, with
specific focus on its suitability as a teaching tool
Matthew W. A. Sparkes BSc
School of Computing Sciences,
University of East Anglia,
Norwich,
United Kingdom,
NR4 7TJ
matt.sparkes@gmail.com
April 6, 2006

Abstract
This paper will clearly outline what a genetic algorithm is, and in
what capacity genetic algorithms have been utilised to solve real world
problems. Taking this information it will then examine the set of
features that would be present in the ideal genetic algorithm package,
before examining a specific package against these criteria to assess its
suitability as a teaching tool.

Contents
1 Acknowledgements

2 What is a Genetic Algorithm?
2.1 Natural Selection and Mutation in Nature
2.2 Evolution as a Paradigm for Problem Solving

3 Basic Elements of a GA
3.1 Encoding and Population Size
3.2 Crossover Operations
3.2.1 Single Point Crossover
3.2.2 Double Point Crossover
3.2.3 Cut and Splice Crossover
3.2.4 Uniform Crossover
3.3 Mutation Operations
3.4 Fitness Evaluation
3.5 Halting
3.5.1 Resource Based Halting
3.5.2 Solution Fitness Based Halting
3.5.3 Progress Based Halting

4 Background Information on GAs
4.1 The Birth of GA
4.2 Problems Associated with GA/GP
4.2.1 Complexity and Reliability

5 Current Applications of Genetic Algorithms
5.1 Electrical Circuit Design
5.2 The Arts
5.3 Gene Sequence Analysis

6 Package Chosen for Assessment

7 Evaluation of Existing GA Package
7.1 Ease of Use
7.2 Problem Customisation
7.2.1 getValue Method
7.2.2 draw Method
7.2.3 createAllelesMap Method
7.3 Encoding and Population Size
7.4 Mutation and Crossover Operators
7.4.1 Mutation Operations
7.4.2 Crossover Operations
7.5 Special GA Mechanisms
7.5.1 Automatic Kick
7.5.2 Kin Competition Compensation
7.5.3 Generational Memory
7.5.4 Pre and Post Breed Processing
7.6 Fitness Evaluation
7.7 Halting

8 Conclusion and Evaluation
8.1 Suitability of GA Playground as a Teaching Tool
8.1.1 Limitations of GA Playground
8.1.2 Benefits of GA Playground
8.2 Conclusion

9 Appendix
9.1 Example GA Playground Input File
9.2 GA Playground Mutation Function
9.3 GA Playground Crossover Function
9.4 Screenshots

1 Acknowledgements
Project supervisor Professor VJ Rayward-Smith.

2 What is a Genetic Algorithm?
Optimization problems demand that a certain variable be maximised, whilst
remaining legal within some set of constraints. These problems are often
extremely large in their nature, and are frequently NP-hard, which
effectively means that finding the exact, or optimum, solution is infeasibly
difficult. To enumerate every possible solution, and evaluate each one to
determine which is the optimum, would take an inordinate amount of time. In
certain applications, where optimality is not necessary, metaheuristics can
be used to find a good solution, often in a very short time. Metaheuristics
vary in their strategies, but all use some technique to explore the space of
possible solutions, and to find a good solution in a much shorter time than
it would take to enumerate all possible solutions. One example of a
metaheuristic approach is the genetic algorithm. [1] However, metaheuristics
are only suitable for applications where a good solution is adequate, as they
do not guarantee to find the best solution (although they may stumble upon
optimality). This may sound like an unacceptable compromise, but the massive
reduction in computing time makes metaheuristics very desirable in some
applications. [2]

"Three billion years of evolution can't be wrong. It's the most
powerful algorithm there is." [16]

This quote from Dr. Goldberg sums up the aim of genetic algorithms:
to model nature, and harness its proven ability to refine solutions. A
genetic algorithm (hereafter referred to as GA) is an algorithm designed to
find good solutions to problems by mimicking the process of natural
selection. GAs are a form of metaheuristic search in that they find solutions
to hard problems, possibly even NP-hard ones, where it is not feasible to
enumerate all possibilities in order to find the best solution. Like other
metaheuristics, they aim to arrive at an acceptable solution (defined on a
problem to problem basis) in a far shorter time. The way in which GAs do
this is to take a group of solutions, called a population, and to breed them
with each other in order to evolve the solutions towards optimality. [7] The
best solutions from each generation are used to create the next, in a model
of natural selection. Mutation is also used, as in evolution, to randomly
change these solutions in order to avoid local optima and ensure that a
large section of the solution space is explored. [12]

2.1 Natural Selection and Mutation in Nature
Within nature, members of a population are born, procreate, and die.
Procreation creates offspring which are a combination of the two parents,
with occasional mutation also operating on the genes. This mutation does not
necessarily have to be obvious or large. The mutation of a single gene can
have little or no effect, but equally may have large repercussions,
depending entirely on the role of that gene within the body. It is often the
case that combinations of genes affect a certain characteristic, so that the
alteration of a gene may have no obvious effect, but actually subtly alter
many characteristics. Mutation can occur within any cell in the body, and
usually occurs during replication. There are mechanisms which reduce the
amount of mutation that is allowed to occur, but they are not infallible.
There are two types of cell in the body: somatic and germline. Germline
cells produce sperm and eggs, and all others are somatic. Therefore if the
mutation occurs in the somatic cells, then this mutation will die with the
cell, but if it occurs in the germline cells then it will be passed on to
offspring, provided the organism isn't detrimentally affected to the point
of not surviving to procreation. [11] These mutations can be beneficial or
harmful, and can provide the animal with an advantage over the other members
of the species, or cause it to be less capable of survival than others. As
Dawkins explains in 'The Blind Watchmaker', these mutations are more likely
to be detrimental than beneficial, as 'there are more ways of being dead
than being alive'. By this he means that within the vast space of possible
gene sequences, there are few that represent living and surviving organisms,
and an almost limitless number that represent pools of non-living amino
acids. [2] For example, an increase in the capability to detect certain
smells may make the animal a better hunter, or better enable it to detect
predators, and in either case would provide the animal with an advantage
over other members of that species. This would mean that it would be more
likely to survive to adulthood, and to procreate, spreading its genes. An
animal with a detrimental mutation however, such as a reduced sense of
smell, would be more likely to succumb to starvation or attack from
predators before procreation could occur. This is natural selection, a
natural feedback process which causes 'good' genes to spread, and takes
'bad' genes out of the pool. It is this interplay between entirely random
mutation and non-random selection that makes up the process of evolution,
causing species to adapt to their environment. It is a process that takes
an almost unimaginable length of time to occur.

There is little doubt ... that usually feedback mechanisms operate
to regulate the size of populations. [19]

2.2 Evolution as a Paradigm for Problem Solving


The powerful refinement and improvement abilities of natural selection can
be harnessed to solve combinatorial optimization problems using a computer.
By creating a model of an environment, where the organisms become potential
solutions to the problem, and genes become variables modeling that solution,
we can recreate natural selection to 'breed' solutions that increase in
fitness with each generation. We can simulate all processes of evolution:
procreation can be modeled by combining two or more solutions in certain
ways, mutation can be modeled using random number generators, and natural
selection and death can be modeled using a fitness evaluation method and
selecting which solutions will 'survive' to the next generation. In this way
we can explore the search space, refining our solutions, and avoiding local
optima by including random mutation, some of which will be detrimental and
not survive to procreation, and some of which will be beneficial and will
steer the solutions towards unexplored space.
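
The mapping from evolution to computation described above can be sketched
in a few lines. The following is a minimal illustration only, not taken from
any particular package: the function name evolve, its parameters, and the
choice of the toy 'OneMax' problem (fitness is the number of 1 genes) are all
assumptions made for the example.

```python
import random

def evolve(fitness, n_genes=20, pop_size=10, generations=50, p_mut=0.05, seed=0):
    """Toy GA over bit strings; returns the fittest individual found."""
    rng = random.Random(seed)
    # Initial population: random bit strings play the role of organisms.
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        # Natural selection: only the fitter half survives to breed.
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            mum, dad = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_genes)      # procreation: combine two parents
            child = mum[:cut] + dad[cut:]
            # Mutation: flip each gene with a small probability.
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

# OneMax: the fitness of a solution is simply its number of 1 genes.
best = evolve(fitness=sum)
```

Because the survivors are carried over unchanged, the best fitness in this
sketch can never decrease from one generation to the next; many real GAs
make a different choice here.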

3 Basic Elements of a GA
The basic elements of a GA are outlined in this section, and it will become
clear that each element is indeed an algorithm in its own right, and that
there are numerous choices of strategy for each step of the procedure. A GA
is simply an abstraction over a set of algorithms that, in combination,
achieve a refinement and improvement of solutions.

3.1 Encoding and Population Size


Population size in a GA is a variable that can be hard to decide upon. Small
populations allow a greater number of generations to iterate in a given
time, whereas larger populations provide more variety. However, given that
within a relatively small number of generations the solutions will begin to
converge on a similar theme, it is beneficial to choose a value for the
population size that is towards the small end of the scale. The variation in
such an algorithm does not tend to lie in large numbers, but instead in the
mutation operators imposed on that population. Small variations, if
beneficial, will propagate through a small population just as they will
through a large one, and the greater number of generations will allow just
as much chance for mutation as the greater number of solutions in a large
population. [2] The initial population needs to contain at least two
solutions, since crossover requires a pair of parents; with a single
solution nothing could be recombined.

3.2 Crossover Operations
Crossover is, alongside mutation, one of the two fundamental operations
needed in order to produce a successful GA. If mutation were used alone then
there would be no logical methodology to the algorithm, and it would
essentially be a random solution generator, with no higher chance of finding
an acceptable solution in a given time than any other such generator. [2]
Crossover is the method by which a GA generates the next generation of
solutions from the fittest solutions of the previous generation. [13]

3.2.1 Single Point Crossover


Single point crossover is the simplest form of crossover operation. It
takes the genes of a pair of solutions and splits both at the same arbitrary
point; the tail or head sections are then exchanged. In this way the pair of
parent solutions can create a pair of children that share certain aspects of
both solutions. If the two sections that make up the new child both contain
features that are beneficial then a successful evolution has occurred, and a
solution that exceeds the fitness of previous solutions has been created. [13]
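
As a rough sketch of the operation just described, assuming solutions are
equal-length Python sequences (strings here); the function name
single_point_crossover is illustrative only:

```python
import random

def single_point_crossover(a, b, rng=random):
    """Exchange the tails of two equal-length parents at one random cut."""
    assert len(a) == len(b) and len(a) >= 2
    # Cut positions 1..len-1 keep both the head and the tail non-empty.
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

# Two easily distinguished parents make the exchanged tail visible.
c1, c2 = single_point_crossover("AAAAAA", "BBBBBB", random.Random(1))
```

Each child begins with genes from one parent and ends with genes from the
other, and every gene position keeps its original meaning.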

3.2.2 Double Point Crossover


With double point crossover the strategy is similar to single point, but two
cut points are chosen and the section of genes between them is exchanged, so
the transferred section does not have to include the tail or head of a
solution. This enables greater flexibility in the sections altered, and also
gives genes a more even chance of exchange, whereas in the single point
strategy any point chosen is guaranteed to swap the end gene, and will
favor genes towards the edge of the solution. [13]
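
A minimal sketch of double point crossover under the same assumptions as
before (equal-length sequences, illustrative function name):

```python
import random

def two_point_crossover(a, b, rng=random):
    """Exchange the section between two random cut points."""
    assert len(a) == len(b) and len(a) >= 2
    # Two distinct cut positions in 0..len, ordered so that i < j.
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

c1, c2 = two_point_crossover("AAAAAA", "BBBBBB", random.Random(2))
```

Note that in this sketch the cut points may fall at the very ends of the
solution, in which case the exchanged section does include the head or tail.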

3.2.3 Cut and Splice Crossover


Cut and splice is a technique that does not maintain solution length, which
means that it is not acceptable for all GAs. Each parent is cut at its own
independently chosen point before the tails are exchanged, so the children
may differ in length from their parents. In both double and single point
crossover the length of the solution is kept, and therefore each gene
position can have a specific meaning. For example, in a GA that designed
possible engine parts each gene could represent a quality of that part such
as height or thickness. In single and double point crossover these values
would be swapped around but maintain their context and meaning; in a
strategy which varies solution length this is not possible. Instead, cut and
splice is better suited to problems where the solution string as a whole
encodes the answer and individual positions carry no fixed meaning, for
example in certain combinatorial optimization problems. [13]
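
The length-changing behaviour can be seen in a short sketch. As before, the
function name and the string representation are assumptions for
illustration:

```python
import random

def cut_and_splice(a, b, rng=random):
    """Cut each parent at its own random point and exchange the tails."""
    assert len(a) >= 2 and len(b) >= 2
    cut_a = rng.randrange(1, len(a))   # independent cut point in each parent
    cut_b = rng.randrange(1, len(b))
    return a[:cut_a] + b[cut_b:], b[:cut_b] + a[cut_a:]

c1, c2 = cut_and_splice("AAAA", "BBBBBBBB", random.Random(5))
```

The total number of genes is conserved across the pair, but either child may
be longer or shorter than its parents.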

3.2.4 Uniform Crossover


In uniform crossover each bit is considered independently, and is swapped
between the two parents with a fixed probability (commonly 0.5). In half
uniform crossover the number of differing bits between the two parents is
calculated, and this number is divided by two. The resulting number is the
number of non-matching bits that will be exchanged.
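
Both variants can be sketched as follows. The names uniform_crossover and
half_uniform_crossover, the p_swap parameter, and the choice to pick the
exchanged bits at random are assumptions of this example:

```python
import random

def uniform_crossover(a, b, p_swap=0.5, rng=random):
    """Swap each position between the parents with probability p_swap."""
    c1, c2 = list(a), list(b)
    for i in range(len(a)):
        if rng.random() < p_swap:
            c1[i], c2[i] = c2[i], c1[i]
    return c1, c2

def half_uniform_crossover(a, b, rng=random):
    """Swap exactly half (rounded down) of the positions where parents differ."""
    c1, c2 = list(a), list(b)
    differing = [i for i in range(len(a)) if a[i] != b[i]]
    for i in rng.sample(differing, len(differing) // 2):
        c1[i], c2[i] = c2[i], c1[i]
    return c1, c2

a = [0, 0, 0, 0, 1, 1]
b = [1, 1, 1, 1, 1, 1]
# The parents differ in four positions, so exactly two bits are exchanged.
h1, h2 = half_uniform_crossover(a, b, random.Random(3))
```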

3.3 Mutation Operations


To fully equip a GA with the ability to find a solution as close as possible
to optimality within a given time it is desirable to use a combination of
crossover and mutation operations. With solely crossover operations there is
a distinct possibility that the algorithm would work towards a local optimum
and remain there: as promising solutions are crossed with others, a local
optimum would quickly propagate through the solutions in use, and stall the
algorithm. With a mutation operation involved as well, random variations are
thrown into contention throughout the cycle of finding a solution, and this
may eventually enable an algorithm to escape from a local optimum, in order
to pursue other avenues. [2]
Mutation is an essential part of any successful genetic algorithm. Without
mutation an algorithm would simply combine the fittest solutions. This
would mean that the generations would quickly converge into an amalgam
of good solutions, and cease to improve. Mutation can be performed in a
number of ways, and each will be appropriate for different problems. [15]
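
One common form, shown here only as an illustrative sketch for solutions
encoded as lists of bits, flips each gene independently with a small
probability. The function name and the rate parameter are assumptions:

```python
import random

def bit_flip_mutation(genes, rate, rng=random):
    """Return a copy of genes with each bit flipped with probability rate."""
    return [1 - g if rng.random() < rate else g for g in genes]

original = [0, 1, 0, 1, 1, 0, 0, 1]
mutant = bit_flip_mutation(original, rate=0.1, rng=random.Random(7))
```

A rate of 0 leaves the solution untouched and a rate of 1 inverts it
entirely; useful values lie near the low end, so that most of a good
solution is preserved.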

3.4 Fitness Evaluation


The fitness evaluation section of the GA is very important, and determines
how good a certain solution is. Once each generation has been created the
fitness evaluation method ensures that each solution has a fitness value
determined for it. It may not be necessary to calculate a value for all
solutions, if some have been carried over intact from the previous
generation. Fitness evaluation mimics the environment in which members of a
species would live. The higher the fitness value, the more likely a solution
is to 'survive' to the next generation, and the lower it is the more likely
it is that the solution will 'die off'.
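
Both ideas in this section, reusing fitness values for unchanged solutions
and making survival more likely for fitter solutions, can be sketched as
below. The names evaluate and roulette_select are hypothetical, and fitness
proportionate ('roulette wheel') selection is just one of several possible
selection schemes; it assumes non-negative scores with at least one above
zero.

```python
import random

def evaluate(population, fitness, cache):
    """Score every individual, computing fitness only for unseen ones."""
    scores = []
    for individual in population:
        key = tuple(individual)
        if key not in cache:            # carried-over solutions hit the cache
            cache[key] = fitness(individual)
        scores.append(cache[key])
    return scores

def roulette_select(population, scores, k, rng=random):
    """Pick k individuals with probability proportional to their score."""
    return rng.choices(population, weights=scores, k=k)

cache = {}
population = [[1, 0], [1, 0], [1, 1]]
scores = evaluate(population, sum, cache)
parents = roulette_select(population, scores, k=2, rng=random.Random(0))
```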

3.5 Halting
Halting is an important and difficult problem in GAs. Without some form
of halting criterion that is checked at every generation, the program would
continue to run even once significant gains in fitness were no longer being
generated by new generations. There are many techniques that are used to
halt GAs, and their appropriateness depends entirely on the application, but
they fall into two main categories: those that are based on the runtime of
the algorithm, and those that are based on the quality of the solutions.

3.5.1 Resource Based Halting


It is often the case that a GA can only be allocated a certain amount of
a resource, specifically time or computing cycles, to complete. In real time
critical systems it may be vital to arrive at a solution within a given period,
and in this case it is the role of the algorithm to find the best solution possible
in that time. Even in non-critical systems such as GPS route planners it is
necessary to impose some time constraints to avoid annoying the user; it
is unlikely that a user would be willing to wait hundreds of hours to find
the optimal route when 5 seconds of calculation may find a route near to
optimality. In cases such as this the halting criteria are time or computing
cycle based, rather than being associated with the fitness of the final solution.
The algorithm will check, at the end of every generation cycle, to see if it
has exceeded its allocated time, or if it is likely to in the next cycle, and
the algorithm can then be halted. In extremely time critical systems this
anticipation of possible time overruns can be used to avoid the algorithm
exceeding its time constraints in-between halting checks. The number of
generations can also be the limiting factor, in a very loose time sensitive
case, which although not accurate in terms of time constraints is very simple
to implement.
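
A time-based check with anticipation of overruns might look like the
following sketch. The function name, the use of the longest generation seen
so far as the estimate for the next one, and the injectable clock are all
assumptions made for illustration.

```python
import time

def run_with_time_budget(step, budget_seconds, clock=time.monotonic):
    """Call step() once per generation until the budget would be exceeded."""
    start = clock()
    longest = 0.0          # duration of the slowest generation seen so far
    generations = 0
    while True:
        elapsed = clock() - start
        # Halt if time is up, or if one more generation would likely overrun.
        if elapsed + longest > budget_seconds:
            break
        t0 = clock()
        step()
        longest = max(longest, clock() - t0)
        generations += 1
    return generations

# A fake clock that advances one 'second' per generation keeps this
# demonstration deterministic.
now = [0.0]
def fake_step():
    now[0] += 1.0

completed = run_with_time_budget(fake_step, 3.5, clock=lambda: now[0])
```

With a budget of 3.5 units and generations costing 1 unit each, the loop
stops after the third generation rather than starting a fourth that would
overrun. Replacing the clock with a generation counter gives the simpler
generation-limited variant mentioned above.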

3.5.2 Solution Fitness Based Halting


If time is less of a constraint, then halting can be based on the fitness of
the solutions. This is more desirable from a reliability point of view, as
the output can be guaranteed to be of a certain quality, determined in the
halting method. Running a GA until a quality threshold is met remains
preferable to complete enumeration and evaluation. The reason for this is
that, for hard problems, optimality can generally only be established by
enumerating all possible solutions, which can be a very time intensive
procedure. A large traveling salesman problem would demand that an enormous
number of different solutions be enumerated and evaluated in order to find
an optimal solution, whereas a GA can be run for an arbitrary amount of
time, and will tend to reach better solutions as the iterations proceed. To
implement solution based halting, the algorithm must be provided with
information about what is an acceptable solution, in terms of the fitness
evaluation method. In this way the algorithm can check at each generation
whether the best solution of that generation exceeds the acceptability
level, and halt if it does. If not then the algorithm simply proceeds with
the next iteration of the generation loop. This acceptability level can be
set to a certain value, derived from the user, or it can be derived by some
algorithm within the system that ensures bounded suboptimality.
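
The generation loop with an acceptability check can be sketched as follows.
The names are hypothetical, and the max_generations safety cap is an
addition of this example, included so that the loop cannot run forever if
the threshold is never reached.

```python
def run_until_acceptable(population, fitness, next_generation, threshold,
                         max_generations=10_000):
    """Iterate until the best solution meets the threshold (or the cap)."""
    for generation in range(max_generations):
        best = max(population, key=fitness)
        if fitness(best) >= threshold:
            return best, generation          # acceptable solution found
        population = next_generation(population)
    return max(population, key=fitness), max_generations

# Toy demonstration: 'solutions' are integers that improve by one each
# generation, and a solution is acceptable once it reaches 5.
best, generations_used = run_until_acceptable(
    population=[0],
    fitness=lambda x: x,
    next_generation=lambda pop: [x + 1 for x in pop],
    threshold=5,
)
```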

3.5.3 Progress Based Halting


Another method that can be used is to monitor the progress that the al-
gorithm is making, in terms of numerical improvements correlating to the
fitness of the best solution of each successive generation. This progress can
be recorded and analyzed, in order to determine the most reasonable time to
stop the algorithm. It is highly likely that the algorithm will go through a
very short initial phase of confusion, especially if the initial population was
randomly generated and extremely varied, and then go through a period of
rapid improvement, before tailing off in a curve. There will be exceptions
to this curve, in that mutation will occasionally throw up a new solution
that avoids a local optimum, and causes a period of noticeable growth again.
However, there will come a time when the progress that the algorithm makes
will be negligible. This can be used as a halting strategy in many ways, for
example the algorithm could be instructed to halt if 3 successive generations
did not improve the best solution by more than 1%.
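
The example rule above can be sketched as a small check over the recorded
history of best fitness values. This reading of the rule (each of the last
three generations gained no more than 1% over its predecessor), the function
name, and the assumption that fitness is being maximised are choices made
for this illustration.

```python
def should_halt(best_history, patience=3, min_gain=0.01):
    """True if none of the last `patience` generations improved the best
    fitness by more than `min_gain`, relative to the generation before it."""
    if len(best_history) <= patience:
        return False                      # not enough history yet
    recent = best_history[-(patience + 1):]
    for prev, cur in zip(recent, recent[1:]):
        if cur - prev > min_gain * abs(prev):
            return False                  # still making meaningful progress
    return True

stalled = should_halt([10, 20, 30, 30.1, 30.2, 30.3])
improving = should_halt([10, 20, 30, 40])
```

In the first history the last three steps each gain well under 1%, so the
check signals a halt; in the second the algorithm is still improving
rapidly.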

4 Background Information on GAs
4.1 The Birth of GA
Work conducted in the 1950s and 1960s in cellular automata started the idea
of using GAs to solve the optimisation problems inherent in engineering.
[12] In the 1980s actual research into GAs started, and an international
conference for the field was established. As early as 1995 there were
several successful examples of GA optimisation being used in industry. Texas
Instruments used GAs to design chips that were reduced in size but
maintained functionality. General Electric applied them to critical designs
such as the engine plans leading to the development of the Boeing 777
engine. US West used GAs to design fiber-optic cable networks, cutting
design times from two months to two days, and saving US West $1 million to
$10 million on each network design. [3] Genetic algorithm derived designs
have now even been used in satellites by NASA, with the development of an
aerial being taken completely out of engineers' hands. The orbit of those
satellites is now even determined with the use of a genetic algorithm.
[25][26]

4.2 Problems Associated with GA/GP


4.2.1 Complexity and Reliability
GA derived solutions are sometimes so complex or convoluted that no human
programmer could decipher what is going on inside them. Goldberg talks of
the difference between conceptual machines and material machines, i.e. an
algorithm and a vehicle engine respectively. The methods that GAs use to
design systems are not necessarily methodical, so the finished code, no
matter how effective, may be all but indecipherable to the human user. This
means that sometimes full testing is not possible, and code that appears to
work completely cannot be proven to work in all cases. To some extent this
is a normal problem with testing, in that not all cases can be tested; but
if the code is readable then a talented tester can devise test cases that
will likely trip the system up, which is not possible with highly complex
code. This creates ethical problems with implementing GA derived designs in
mission critical or real time applications, bearing in mind that a failure
in air traffic control, life support hardware etc could be fatal, that
failure in a financial institution could be disastrous in other ways, and
that GAs are also used to develop actual mechanical devices. Goldberg tells
an amusing story about an airline passenger worrying about the design of the
plane and its testing. If you were to imagine a plane as a GA, and the
passenger as a GA user, then you could imagine the stress that the thought
of a failure would cause. [9]

5 Current Applications of Genetic Algorithms
Genetic algorithms have been very successful in their transfer from academia
to real world application. They can now be found in use in most large com-
panies in some way, in a variety of uses. Normally a technology must be
adapted for each and every use, but by their very nature GAs do not require
this; they are inherently flexible. Theoretically all one must do to adapt a
GA to a certain problem is to design a genetic representation for solutions,
and write a fitness evaluation function for those solutions. The rest of the
system will have been duplicated many times before, and commercial and
open source packages are now available that allow quick and simple building
of GAs. Although the crossover operators etc in such a package may not be
optimal for a given problem, they will operate, and allow a solution to be
found. Some packages allow these operators and much more to be customized,
and others only allow the very minimum of change: the representation and
fitness function.

5.1 Electrical Circuit Design


21 previous patents have been either duplicated or exceeded in performance
by devices designed by genetic algorithms, showing that these algorithms
are capable of producing the same output as educated engineers. In fact 2
devices have been created that are original, and would be patentable if not
for the fact that a computer designed them. There are patent laws that
forbid the submission of applications for designs that have been derived by
a computer. [8]

5.2 The Arts


Genetic algorithms have also been used extensively in research into artistic
endeavors. Their evolutionary approach is conducive to work in this field, as
’goodness’ of some piece of art, be it visual or aural, can be evolved, rather
than programmed. This is very useful when the criteria for good art are so
vague and subjective, as the fitness evaluation can be replaced by a human
evaluator initially. There has even been work to connect the fitness function
directly to physiological signals given off by the brain, which would allow
rapid evaluation. [21] There has also been work to replicate the human ca-
pability to improvise, in one case an attempt to produce jazz solos. [17][18]

In one particular paper by Wiggins and Papadopoulos a GA was constructed
where the fitness of solutions was calculated by several individual methods
that analysed the solution for certain criteria; the amalgam of these figures
then went to create the overall fitness. By using these sectional evaluators
the problem was broken down into understandable rules. However, this re-
search produced little in the way of listenable output. [18] Better output
has been achieved in other systems, where the fitness evaluators are more
appropriate, or where human evaluation is also made part of the selection
process. [20] Some systems have even been advanced to the point where they
can generate short pieces in real time, and take a real player's response as
input, to dynamically update the fitness function. Computer generated music
has received much media attention in the past, which may well lead to
increased funding for research, and consequently an improvement in the
quality of artificial composition software. One example is a competition
run by BBC Radio 3 where 3 pieces of music were played; one was genuine
Bach, one was imitation Bach written by a professor of music, and one was
generated by a GA.

5.3 Gene Sequence Analysis


In a bizarre twist, genetic algorithms have also been applied to the study of
genetics. In sequence analysis, two or more gene sequences are analysed in
order to discover correlations. In this way it is possible to determine whether
two species evolved from a common ancestor. [14] Genetic algorithms can
be used in this field, to discover correlations in a relatively fast time, as the
sequence analysis problem is NP-hard. The genes of the GA's solutions will
therefore represent actual genes, and the abstract concept borrowed from
evolutionary biology is put to work back in the field it came from. [24] Many
algorithms that do not utilise GAs are already in existence, such as BLAST
and FASTA, but the GA package SAGA (Sequence Alignment by Genetic
Algorithm) has been reported to outperform these in speed to optimality.
[22][23]

6 Package Chosen for Assessment
The package that has been chosen for assessment is GA Playground, written
by Brian Dolan. It is implemented in Java, and is usable in two forms;
either as an application, or as an applet. It has been chosen because of its
simplicity, and therefore its possible suitability as a teaching tool - indeed
it is actually designed as an instructional and research tool. It is built in
a modular fashion, so that the user need only alter one or more of these
modules in order to run a customised problem. The problem itself is stored
in a definition file, in ASCII text; for an example of such a file see
Appendix 9.1.

7 Evaluation of Existing GA Package


7.1 Ease of Use
One of the most important ways in which to evaluate a package such as this,
in regard to its suitability as a teaching tool, is the ease with which it
could be learned by new users. The user needs to be able to focus on
learning the concepts involved, rather than trying to gain familiarity with
a package that has a steep learning curve. The GUI of GA Playground is quite
intuitive, and minimal in its options, so users could be running problems
almost instantly. This particular package is written in Java, so is
particularly easy to install as well; the only supporting software required
to run it is a Java runtime environment, which comes as standard on most OS
platforms. Java provides the additional benefit of being runnable from
within a browser window.

7.2 Problem Customisation


GA Playground comes with several classes that represent various common
problems. As a teaching tool these problems will usually suffice, and most
benchmark problems to be found in textbooks will also be found here. If,
however, a specific problem is required then the user will need to input
this problem into the package. This is no trivial task, but can be done by
writing a replacement GaaFunction class. This is the only class that should
be adapted by the user, with all functionality that is not problem specific
being abstracted in other files. The GaaFunction class should contain three
methods in order to provide a working problem class: getValue, draw and
createAllelesMap. Although it is not essential that a draw method is
included, it is reasonably important for a teaching tool to have this kind
of visual feedback for the student.

7.2.1 getValue Method


This method should calculate the fitness of a given solution, when passed to
the method in string format. The method should return a value as a double,
and this will be used to determine which solutions remain in subsequent
generations.

7.2.2 draw Method


If graphic output is needed in the GA then the draw method should be im-
plemented. This allows the current solutions, or best solution to be displayed
in a pane on the GA Playground window. It is almost always desirable to
see a graphical representation of the solutions in order to ensure that the
constraints you have placed upon a solution are correct. A solution may
begin to emerge that is legal under the definition, but the definition
itself may be wrong or have an omission; in this case it is vital to have
visual reporting to inform the user.

7.2.3 createAllelesMap Method


The createAllelesMap is required only when a mapping file is not supplied,
and the mapping table should be generated by the program. It is often the
case that GA Playground can automatically generate this.

7.3 Encoding and Population Size


Population size is easily customizable within GA Playground: the problem
definition file contains the following line.

Population Size=30

This determines the population size of the algorithm, and in the
GaaFileInput file it is recommended that this variable be kept between 10
and 100. The encoding of the problem is also highly customizable: a variable
in the definition file allows the user to set the size of each solution in
genes.

Number of Genes=10

Once the size is set, the representation and organization of the genes are
defined implicitly by the fitness function. No other part of the GA needs
information about what each gene represents, unless some form of intelligent
mutation is implemented. The fitness function will evaluate each solution,
and combine information from the value of each gene into a fitness value.
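
To illustrate how settings of this 'name=value' form can be read, the sketch
below parses a few such lines. It is written in Python purely for
illustration and is not GA Playground's own (Java) loading code; only the
setting names quoted in this paper are taken from the package, and the full
file format is assumed rather than known.

```python
def parse_definition(text):
    """Collect 'name=value' lines into a dictionary of strings."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:                           # ignore lines without an '='
            settings[key.strip()] = value.strip()
    return settings

# Only lines quoted in this paper are used as example input.
example = "Population Size=30\nNumber of Genes=10\n"
settings = parse_definition(example)
population_size = int(settings["Population Size"])
number_of_genes = int(settings["Number of Genes"])

# The recommended range of 10 to 100 can then be checked before a run.
within_recommended_range = 10 <= population_size <= 100
```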

7.4 Mutation and Crossover Operators


7.4.1 Mutation Operations
The mutation operator in GA Playground is stored in the GAAMutation
class, in the mutation method (see section 9.2) which takes the chromosome
and a mutation factor as arguments. It offers two types of mutation, depen-
dent upon the value the user has placed in the following line in the definition
file (section 9.1).

GA Type=1

For each gene selected for mutation, the first type (GA Type=1) draws a
new value at random between the minimum and maximum held for that position
in the alleles array, encodes it, and writes it back into the solution. The
second type (GA Type=2) swaps each selected gene with the gene at another,
randomly chosen, position in the chromosome.
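
The two styles of mutation can be sketched in a few lines of Java. This is a simplified illustration rather than the package's own code, which is listed in section 9.2: the class name, the character range used for gene values and the explicit Random parameter are all assumptions made here for clarity.

```java
import java.util.Random;

// Illustrative versions of the two mutation styles described above.
// Assumption: genes are single characters drawn from a contiguous range.
final class MutationSketch {

    // Type 1 style: each gene is, with probability rate, replaced by a value
    // drawn uniformly from [min, max].
    static String resample(String chrom, double rate, char min, char max, Random rng) {
        StringBuilder sb = new StringBuilder(chrom);
        for (int i = 0; i < sb.length(); i++) {
            if (rng.nextDouble() < rate) {
                sb.setCharAt(i, (char) (min + rng.nextInt(max - min + 1)));
            }
        }
        return sb.toString();
    }

    // Type 2 style: each gene is, with probability rate, swapped with the gene
    // at a random position, so the collection of genes is preserved.
    static String swap(String chrom, double rate, Random rng) {
        StringBuilder sb = new StringBuilder(chrom);
        for (int i = 0; i < sb.length(); i++) {
            if (rng.nextDouble() < rate) {
                int j = rng.nextInt(sb.length());
                char tmp = sb.charAt(i);
                sb.setCharAt(i, sb.charAt(j));
                sb.setCharAt(j, tmp);
            }
        }
        return sb.toString();
    }
}
```

The swap style never creates or destroys a gene value, which makes it the natural choice for problems where a solution is an ordering, such as the traveling salesman problem.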

7.4.2 Crossover Operations


GA Playground offers two kinds of crossover method (see section 9.3), depen-
dent upon the value the user has placed in the following line in the definition
file (section 9.1).

GA Type=2

In both types the first step is to use the user defined crossover rate value
to determine whether or not crossover is to take place. This is usually the
case, and the algorithm will proceed, although occasionally a solution will
simply be passed to the next generation unchanged. In type 1 GAs the algorithm
then picks a random point and splits the chromosomes there, joining the first
part of one parent to the remainder of the other in order to create the child.
This is a simple implementation of single point crossover, as explained in
section 3.2.1. In type 2 GAs the algorithm again selects a random point and
keeps the first parent's genes up to that point. It then works through the
second parent from the start, appending each gene that is not already present
in the child (see section 9.3). The child is therefore a reordering of genes
rather than a straight splice, and no gene from the second parent is added
twice.
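
A hedged sketch of the two behaviours, following the listing in section 9.3, is given below. Unlike the package's function, the cut point is passed in as a parameter so that the result is repeatable; the class and method names are hypothetical.

```java
// Illustrative crossover with an explicit cut point so results are repeatable.
final class CrossoverSketch {

    // Single point: head of parent 1 followed by tail of parent 2.
    static String singlePoint(String p1, String p2, int pos) {
        return p1.substring(0, pos) + p2.substring(pos);
    }

    // Type 2 style: head of parent 1, then every gene of parent 2 that is not
    // yet in the child, in the order it appears in parent 2.
    static String orderPreserving(String p1, String p2, int pos) {
        StringBuilder child = new StringBuilder(p1.substring(0, pos));
        for (int i = 0; i < p2.length(); i++) {
            char gene = p2.charAt(i);
            if (child.indexOf(String.valueOf(gene)) == -1) {
                child.append(gene);
            }
        }
        return child.toString();
    }
}
```

With parents "ABCDE" and "EDCBA" and a cut point of 2, single point crossover gives "ABCBA", which repeats A and B and loses D and E. The type 2 style gives "ABEDC", which still contains every gene exactly once.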

7.5 Special GA Mechanisms


As well as the standard mutation and crossover operators, GA Playground
includes what are called ’special mechanisms’ to aid the algorithm, which are
examined here.

7.5.1 Automatic Kick


If there has been no improvement in N generations, where N is a user defined
variable, then the package can 'kick' the solution space. This is a mutation
on a large scale. To customise the value for N the user must change a variable
within the problem definition file, called Stagnation Limit, shown below. For
an example of a full problem definition file see section 9.1.

Stagnation Limit=19
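
The bookkeeping behind such a mechanism can be sketched as a simple counter. The class below is an illustration under two stated assumptions, namely that higher fitness is better and that a kick resets the count; it is not taken from the GA Playground source.

```java
// Tracks how many consecutive generations have passed without improvement.
// Assumptions: higher fitness is better, and a kick resets the counter.
final class StagnationTracker {
    private final int stagnationLimit;
    private double bestSoFar = Double.NEGATIVE_INFINITY;
    private int generationsWithoutImprovement = 0;

    StagnationTracker(int stagnationLimit) {
        this.stagnationLimit = stagnationLimit;
    }

    // Call once per generation; returns true when a kick should be applied.
    boolean kickDue(double bestFitnessThisGeneration) {
        if (bestFitnessThisGeneration > bestSoFar) {
            bestSoFar = bestFitnessThisGeneration;
            generationsWithoutImprovement = 0;
            return false;
        }
        generationsWithoutImprovement++;
        if (generationsWithoutImprovement >= stagnationLimit) {
            generationsWithoutImprovement = 0;
            return true;
        }
        return false;
    }
}
```

With a limit of 19, as in the line above, the tracker would signal a kick after the nineteenth consecutive generation in which the best fitness failed to improve.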

7.5.2 Kin Competition Compensation


There is no major problem inherent in allowing multiple identical solutions
in a given generation. The solutions will naturally begin to converge after a
certain number of generations, as the algorithm gets closer to the optimum,
and fluctuate upon mutation. This means that it is not unlikely that identical
solutions will sometimes occupy a generation. There is no specific need to
eradicate this, although it may be beneficial to check for duplicates in order
to reduce the number of times the fitness evaluation code is run. However,
GA Playground includes a strategy to reduce the occurrence of this, or at
least to reduce the effect that such an occurrence would have. If it detects
that duplicates are present, then the first solution is given the correct fitness
value, and subsequent identical solutions are given a reduced fitness value.
The reduction factor is set by the line below in the problem definition file
(see section 9.1).

Kin Competition Factor=0.9
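
One way such a penalty might be applied is sketched below. It assumes that every repeat of a chromosome has its fitness multiplied once by the factor; whether GA Playground applies the factor once or compounds it for each further duplicate is not stated here, so this should be read as an illustration only. The class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative kin competition penalty.
// Assumption: every repeat of a chromosome is scaled once by the factor.
final class KinCompetitionSketch {

    static double[] adjust(String[] chromosomes, double[] fitness, double factor) {
        Set<String> seen = new HashSet<>();
        double[] adjusted = new double[fitness.length];
        for (int i = 0; i < chromosomes.length; i++) {
            // Set.add returns false when the chromosome was already present.
            boolean firstOccurrence = seen.add(chromosomes[i]);
            adjusted[i] = firstOccurrence ? fitness[i] : fitness[i] * factor;
        }
        return adjusted;
    }
}
```

For a population of "AB", "AB" and "CD" with raw fitness values 10, 10 and 4 and a factor of 0.9, the adjusted values would be 10, 9 and 4.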

There is a parallel to this strategy to be found in natural evolution, the
GA Playground documentation argues, in the self governing of populations
based on availability of resources. [4] If a greater number of one species
exists, then it will be more difficult for that greater number to support
itself due to a finite amount of food or water, or some other limit on
resources. However, this argument is slightly flawed: each member of a given
species in an environment has an equal chance of survival, and it is down to
chance that some survive and others do not. The same is true of a GA with no
kin competition compensation; solutions will only survive as long as there are
not other, better solutions. Therefore the inclusion of such a mechanism is
controversial, and its merit would have to be investigated further before a
firm statement could be made on the wisdom of such an inclusion. It can,
however, be all but dismissed in regard to assessing the package as a teaching
tool. One would only wish to teach the fundamentals of genetic algorithms, and
would not need to teach the idiosyncrasies of GA Playground.

7.5.3 Generational Memory


Every solution in the population has two strings associated with it. One
is the chromosome, or the collection of genes, which defines the solution's
characteristics. The other is the memory string, whose use is not essential,
but which can be used to record historical data on the solution. This may
be useful in drawing a graph of the progress of the generations; a list of the
best solution from each generation could be recorded and used to illustrate
progress to students. Richard Dawkins uses a similar figure in 'The Blind
Watchmaker' by showing lists of successive generations. [2]

7.5.4 Pre and Post Breed Processing


GA Playground provides two functions, one of which is automatically called
immediately prior to the crossover operations, and one of which is called
after crossover execution. The pre breed function allows the inclusion of
code to perform some computation on the last generation, just before creation
of the new generation, whereas the post breed function allows computation on
the new generation, prior to fitness evaluation or any other operators. This
could be useful in a number of ways, and provides a level of customisation
higher than any simple parameter setting could. On a very basic level, and
from a teaching point of view, these methods could be very useful for
reporting certain values back to the user. For example, the generation number
and best fitness value for the generation could be output for analysis and
graphing, to teach students the way that metaheuristics home in on the
optimum: rapid improvement, slowing to gradual change before an eventual stop
at, or near to, optimality.
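
The kind of reporting described here could be as simple as the sketch below, which collects one row per generation in a form that is easy to graph. The class is hypothetical and is not wired to GA Playground's actual hook signatures, which are not reproduced in this paper. It is assumed that the method would be called from the pre breed function, where the previous generation has already been evaluated.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical progress recorder producing one CSV row per generation.
final class ProgressLog {
    private final List<String> rows = new ArrayList<>();

    ProgressLog() {
        rows.add("generation,bestFitness");
    }

    // Record the best (highest) fitness value found in one generation.
    void record(int generation, double[] fitness) {
        double best = Double.NEGATIVE_INFINITY;
        for (double f : fitness) {
            best = Math.max(best, f);
        }
        rows.add(generation + "," + best);
    }

    // All rows joined into a single CSV document.
    String toCsv() {
        return String.join("\n", rows);
    }
}
```

Plotting the second column against the first would give students the characteristic curve described above.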

7.6 Fitness Evaluation


Fitness evaluation in GA Playground is fully customizable, although this is
true of all GA packages. The fitness evaluation is unique to each and every
problem, and must be custom written for each. GA Playground provides an
empty function in which to place your code, which is already linked to the
rest of the code; it will be called automatically upon the formation of each
new generation. The package comes with source code for many problems
already present, including most common benchmark problems such as the
traveling salesman problem and knapsack filling problem.

7.7 Halting
Halting is handled by GA Playground on a simple fitness value threshold.
Within the problem definition file there are the following two lines, which
are user definable.

Exit Value=100000
Exit Tolerance=0

The variable 'Exit Value' defines at what fitness value the evolution should
stop. The best solution of every generation is compared to this value, and
if it is reached or exceeded then evolution will stop, and that solution will
be offered as the best solution. The Exit Tolerance variable is the degree to
which the solution can deviate from the Exit Value. In this case the fitness
must reach or exceed the Exit Value itself, but if Exit Tolerance was 10 then
the occurrence of a solution of fitness 99991 would cause the algorithm to halt.
This simple halting strategy does limit the package somewhat, and it does
mean that it is essential to know roughly what fitness value is acceptable,
whereas with other strategies this can be more loosely defined. However, as
a teaching tool this is perfectly acceptable, and there is no need for a more
complex halting strategy.
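
The test itself reduces to a single comparison, sketched below. The sketch assumes that fitness is being maximised and that the tolerance is subtracted from the target; the class and method names are hypothetical.

```java
// Illustrative threshold halting test.
// Assumptions: fitness is maximised and the tolerance is subtracted from the target.
final class HaltingSketch {

    static boolean shouldHalt(double bestFitness, double exitValue, double exitTolerance) {
        return bestFitness >= exitValue - exitTolerance;
    }
}
```

With an Exit Value of 100000 and an Exit Tolerance of 10 the threshold becomes 99990, so a best fitness of 99991 halts the run while 99989 does not.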

8 Conclusion and Evaluation
8.1 Suitability of GA Playground as a Teaching Tool
Genetic algorithms have proven to be enormously effective and flexible
in their application. They have quickly been put into service in many fields,
as described in section 5, and it is important that there be a suitable
teaching tool for this field. GA Playground is aimed at this sort of
demographic, and has all of the features that this paper has outlined as
necessary for such an application. [5] It is an effective package for building
simple genetic algorithms, and although it does have limitations, these do not
necessarily make it inappropriate as a teaching tool. In fact, myriad options
might even be detrimental to its effectiveness in such a capacity, as they
could overwhelm students, who would be too distracted by the minutiae to
fully grasp the basics of GAs.

8.1.1 Limitations of GA Playground


The single mechanism in place for halting is somewhat limiting, and it would
be advantageous to be able to customize this. In section 3.5 several halting
strategies were outlined, and only one of these, in its simplest form, is
implemented here. How much of a problem this is depends entirely upon the
level at which the package would be used as a teaching tool; to teach various
halting strategies would be a reasonably high level exercise. In a situation
like that it is entirely likely that a more complex or even custom made
package would be used. Therefore it is forgivable that halting strategy
customization is not offered here, and as GA Playground is open source it is
even possible that adding different halting strategies could be given as an
assignment.

The mutation methods are simplistic, and do not offer any intelligent
mutation. However, they are still very effective, and the lack of advanced
features is not detrimental to the package's suitability as a teaching tool;
the methods offered are adequate for instructional purposes.

8.1.2 Benefits of GA Playground


As stated in section 7.6, many common benchmark problems are included
with GA Playground, so problems in popular text books would already be
present in the package. Therefore, this package has the benefit of being
ready to use as a teaching tool without any modification.

GA Playground also offers advanced log generation. It can be customized
to include much data about the algorithm, listing populations at each
generation, or the current method of breeding. In this way the output could be
used as a valuable educational tool, stepping through the output of a simple
problem to determine the operation of a GA. [4]

The crossover methods used in the package are simple, but adequate. The most
basic option, single point crossover, is used in type 1 GAs, and this is an
ideal method to use for teaching, as the code is simple enough to understand
easily and the output is obvious once the random crossover point is known.
There is no need to explain more complex crossover strategies until the basics
of genetic algorithms are understood.

8.2 Conclusion
In summary, GA Playground is a reasonably powerful package, whilst remaining
relatively simple to operate, an essential quality for teaching tools. By
abstracting customizable functions from the rest of the package, customization
is made relatively safe and open to experimentation. The problem definition
file and the fitness evaluation are all that need to be changed in order to
enter a custom problem into GA Playground. However, the need for graphical
output in a teaching tool is undeniable, so it is necessary either to write a
draw function for the custom problem as well, or to use one of the problems
that come built into the package. The range of problems supplied is wide and
standard, so it is highly likely that any examples listed in text books or
lecture slides would be found here. This makes it a highly attractive prospect
as a teaching tool, as it can be used straight after (a very simple)
installation.
The package is also available without charge, under an open source license.
Therefore there is no licensing fee for the institution using it, and with larger
class sizes or numbers of workstations this is a huge advantage that this
package has over other, commercial packages.
The platform on which it was developed is also a benefit, as many com-
puter science degrees are now based around learning Java, as is the case at
UEA. Therefore, students will already be familiar with the implementation
language for the fitness function, and there will be no learning curve before
customization can occur.
The crossover and mutation functions of the package are acceptable, even
if mutation is slightly limited. However, for an instructional tool there is
only one requirement, and that is that they function properly. Once students
are taught the essential concepts of evolutionary algorithms they will
be able to learn various strategies that may exceed the capabilities of GA
Playground, but it will still be perfectly capable of teaching them the basics,
and in its limitations will teach much more.

References
[1] Bundy, Alan (Ed.), Artificial Intelligence Techniques; A Comprehensive
Catalogue. Springer Publishing, Section 104.
[2] Richard Dawkins, The Blind Watchmaker. Penguin Books, 1986.
[3] Sharon Begley, Gregory Beals, Software au Naturel. Newsweek, pp. 70-71,
May 8th 1995.
[4] Ariel Dolan, GA Playground Documentation,
http://www.aridolan.com/ga/gaa/gaa.html
[5] Jaron Lanier, One-Half of a Manifesto: Why Stupid Software Will Save
the Future from Neo-Darwinian Machines. WIRED Magazine, Issue 8.12,
pp. 158-179, December 2000.
[6] David Pescovitz, Monsters in a Box. WIRED Magazine, Issue 8.12, pp.
340-347, December 2000.
[7] Davis, Lawrence, The Handbook of Genetic Algorithms. Van Nostrand
Reinhold, New York, 1991.
[8] Koza, John R., Jones, Lee W., Keane, Martin. A., Streeter, Matthew
J., and Al-Sakran, Sameer H. Towards Automated Design of Industrial
Strength Analog Circuits by Means of Genetic Programming. Kluwer Academic
Publishing, Genetic Programming Theory and Practice 2, Chapter
8, pp. 121-142, 2004.
[9] David E. Goldberg, Department of General Engineering, University of
Illinois at Urbana-Champaign, From Genetic Design and Evolutionary
Optimization to the Design of Conceptual Machines. IlliGAL Report No.
98008, May 1998.
[10] Michael R. Garey, David S. Johnson, Computers and Intractability -
A Guide to the Theory of NP-Completeness, Bell Laboratories, New
Jersey, 1979.
[11] Bryan Sykes, The Seven Daughters of Eve, Corgi Publishing, 2001.
[12] Wikipedia - The Free Encyclopedia, Entry for Genetic Algorithm,
http://en.wikipedia.org/wiki/Genetic-algorithm, March 2006.

[13] Wikipedia - The Free Encyclopedia, Entry for Crossover Strategies,
http://en.wikipedia.org/wiki/Crossover-genetic-algorithm, March 2006.
[14] Wikipedia - The Free Encyclopedia, Entry for Sequence Analysis,
http://en.wikipedia.org/wiki/Sequence-alignment March 2006.
[15] Wikipedia - The Free Encyclopedia, Entry for Mutation Strategies,
http://en.wikipedia.org/wiki/Mutation-genetic-algorithm, March 2006.
[16] Gautam Naik, Back to Darwin: In Sunlight and Cells, Science Seeks
Answers to High Tech Puzzle, The Wall Street Journal, January 16th
1996.
[17] Biles, J, GenJam: A Genetic Algorithm for Generating Jazz Solos.
Proceedings of the 1994 International Computer Music Conference. Aarhus,
Denmark: International Computer Music Association. 1994.
[18] George Papadopoulos and Geraint Wiggins, A Genetic Algorithm for
the Generation of Jazz Melodies, citeseer.ist.psu.edu/87682.html
[19] Ehrlich and Holm, The Process of Evolution. McGraw-Hill Publications,
1963.
[20] Bruce L Jacob, Composing With Genetic Algorithms, University of
Michigan, International Computer Music Conference, Banff Alberta,
September 1995.
[21] Website for European Workshop on Evolutionary Music and Art,
http://evonet.lri.fr/eurogp2006/
[22] BLAST Website, http://www.ncbi.nlm.nih.gov/BLAST/
[23] FASTA Documentation, University of Virginia,
ftp://ftp.virginia.edu/pub/fasta/README
[24] Cedric Notredame & Desmond G. Higgins, SAGA: Sequence Alignment
by Genetic Algorithm, EMBL outstation, The European Bioinformatics
Institute, Cambridge, March 4, 1996.
[25] West Lafayette, Genetic Algorithms 'naturally Select' Better Satellite
Orbits, http://www.spacemart.com/reports/Genetic Algorithms naturally Select Better Satelli
Oct 15, 2001.
[26] NASA Website, Exploring the Universe - Evolvable Systems,
http://www.nasa.gov/centers/ames/research/exploringtheuniverse/exploringtheuniverse-evolvablesystems.html

9 Appendix
9.1 Example GA Playground Input File
[Integers]
Problem Code=1
Population Size=30
Number of Genes=10
Map Order=0
Def Order=0
GA Type=1
MinMax Type=1
Crossover Type=1
Mutation Type=1
Selection Type=1
Inversion Type=1
Stagnation Limit=19
Degrade Limit=4
Survivors Percent=20
Redundancy Factor=1
Number of Variables=10
User Defined Integer=0

[Reals]
Crossover Rate=1
Mutation Rate=0.01
Inversion Rate=0
Shuffle Rate=0.7
Inversion Shuffle=0
Kick Distribution=1
Exit Value=100000
Exit Tolerance=0
Min Value=1
Max Value=10
Step Value=1
Default Value=1
Kin Competition Factor=0.9
User Defined Real=0

[Strings]
Title=Simpleton GA Problem
Description=A trivial maximum problem: x1*x2*x3*x4*x5/(x6*x7*x8*x9*x10)
Alleles Map File=None
Alleles Def File=None
Map Delimiter=,
Input String 1=x1*x2*x3*x4*x5/(x6*x7*x8*x9*x10)
Input String 2=None
User Defined Expression=None
User Defined String=None

[Flags]
Status Help=True
Text Window=False
Graphic Window=True
Sound=False
Logging=False
User Defined Flag=false

9.2 GA Playground Mutation Function
public String mutation(String chrom, double rate) {

    int i, j, size;
    char kar, kar2;
    double n;
    String st;
    StringBuffer sb = new StringBuffer(chrom);
    size = chrom.length();

    switch (gaType) {
        case 1:
            // Replace each selected gene with a random value in its allowed range.
            for (i=0;i<size;i++) {
                if (GaaMisc.flip(rate)) {
                    n = alleles[i].min + (double) Math.random()*(alleles[i].max-alleles[i].min);
                    kar = alleles[i].encodeValue(n);
                    sb.setCharAt(i,kar);
                }
            }
            break;
        case 2:
            // Swap each selected gene with the gene at a random position.
            for (i=0;i<size;i++) {
                if (GaaMisc.flip(rate)) {
                    j = (int) Math.floor(Math.random()*size);
                    Integer int1 = new Integer(i);
                    Integer int2 = new Integer(j);
                    if (!int2.equals(int1)) {
                        kar = sb.charAt(i);
                        kar2 = sb.charAt(j);
                        sb.setCharAt(i,kar2);
                        sb.setCharAt(j,kar);
                    }
                }
            }
            break;
    }

    st = sb.toString();
    return st;
}

9.3 GA Playground Crossover Function
public String crossover(String chrom1, String chrom2) {

    int i, pos;
    String s1, s2, s;
    char kar;

    s = "";

    switch (gaType) {
        case 1:
            // Single point crossover: head of chrom1 joined to tail of chrom2.
            if (GaaMisc.flip(crossoverRate)) {
                pos =(int) Math.floor((Math.random()*chrom1.length()));
                s1 = chrom1.substring(0,pos);
                s2 = chrom2.substring(pos);
                s = s1.concat(s2);
            }
            else
                s = chrom1;
            break;
        case 2:
            // Head of chrom1, then each gene of chrom2 not already present.
            if (GaaMisc.flip(crossoverRate)) {
                pos =(int) Math.floor((Math.random()*chrom1.length()));
                s = chrom1.substring(0,pos);
                StringBuffer sb = new StringBuffer(s);
                for (i=0;i<chrom2.length();i++) {
                    kar = chrom2.charAt(i);
                    if (s.indexOf(kar) == -1) {
                        sb.append(kar);
                    }
                    s = sb.toString();
                }
            }
            else
                s = chrom1;
            break;
    }

    return s;
}

9.4 Screenshots


Figure 1: Screenshot of GA Playground’s initial screen.
