Creating a fitness function that is the right fit for the problem at hand


Aaron Carl T. Fernandez
May 25, 2017

Abstract
Genetic Algorithms are an appealing tool for solving optimization problems
[1]. To encode a problem for a Genetic Algorithm, one needs to address
questions regarding the initial population, the crossover rate, the
mutation rate, the stopping criteria, the type of selection operator, and
the fitness function to be used. While all of these parameters and
operators are of comparable importance and directly affect the performance
of a Genetic Algorithm, this paper concentrates on how to create a fitness
function that is the right fit for the problem at hand, and attempts to
determine a better way of defining one. Creating an effective fitness
function is often difficult even for experienced developers [2]; evidence
of this difficulty can be seen in publications such as [3], [4], and [5].
A guide to designing fitness functions in a generalized way has, however,
been formulated by Josh Wilkerson and Daniel Tauritz [6], and this paper
outlines it in the succeeding sections.

1 Overview of Genetic Algorithm


Genetic Algorithms (GAs) are adaptive methods, based on the genetic
processes of biological organisms, which may be used to solve search and
optimisation problems. Over many generations, natural populations evolve
according to the principles of natural selection and “survival of the
fittest”. By mimicking this process, Genetic Algorithms are able to
“evolve” solutions to real-world problems, provided they have been suitably
encoded. The basic principles of Genetic Algorithms were first laid down
rigorously by Holland [7]. Genetic Algorithms work with a population of
“individuals”, each representing a possible solution to a given problem.
Each individual is assigned a “fitness score” according to how good a
solution to the problem it is. Highly fit individuals are given
opportunities to “reproduce” by “cross breeding” with other individuals in
the population. This produces new individuals as “offspring”, which share
some features taken from each “parent”. The least fit members of the
population are less likely to be selected for reproduction, and so “die
out”. A whole new population of possible solutions is thus produced by
selecting the best individuals from the current “generation” and mating
them to produce a new set of individuals. This new generation contains a
higher proportion of the characteristics possessed by the good members of
the previous generation. In this way, over many generations, good
characteristics are spread throughout the population. By favouring the
mating of the fitter individuals, the most promising areas of the search
space are explored. If the Genetic Algorithm has been designed well, the
population will converge to an optimal solution to the problem.
Since Genetic Algorithms are designed to simulate a biological process,
much of the relevant terminology is borrowed from biology. However, the
entities that this terminology refers to in Genetic Algorithms are much sim-
pler than their biological counterparts [8]. The basic components common
to almost all Genetic Algorithms are:

• a fitness function for optimization

• a population of individuals

• selection of which individuals will reproduce

• crossover to produce next generation of individuals

• random mutation of individuals in new generation
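
The five components listed above can be sketched as a minimal loop. The
following Python sketch is illustrative only and is not taken from the
cited works: the OneMax toy problem (maximise the number of 1 bits),
tournament selection, one-point crossover, and every parameter value are
assumptions chosen for brevity.

```python
import random

def onemax(bits):
    """Fitness function (assumed toy problem): the number of 1 bits."""
    return sum(bits)

def tournament(population, fitness, rng, k=3):
    """Selection: the fittest of k randomly drawn individuals reproduces."""
    return max(rng.sample(population, k), key=fitness)

def one_point_crossover(a, b, rng):
    """Crossover: a prefix of one parent joined to a suffix of the other."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(bits, rng, rate=0.01):
    """Mutation: flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]

def run_ga(fitness, n_bits=20, pop_size=30, generations=50, seed=0):
    rng = random.Random(seed)
    # Population of individuals: random bit strings.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _generation in range(generations):
        offspring = []
        for _child in range(pop_size):
            mother = tournament(population, fitness, rng)
            father = tournament(population, fitness, rng)
            child = one_point_crossover(mother, father, rng)
            offspring.append(mutate(child, rng))
        population = offspring
        # Remember the best individual seen so far.
        best = max(population + [best], key=fitness)
    return best
```

Calling run_ga(onemax) returns the best bit string found. Only the fitness
function passed in is problem specific; the other four components are
generic.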

2 Definition of Fitness
Fitness is the quantitative representation of natural and sexual selection
within evolutionary biology. It describes individual reproductive success
and is equal to the average contribution to the gene pool of the next
generation that is made by an individual. A fitness score in a Genetic
Algorithm is analogous to this concept of fitness in natural selection: it
quantifies the adaptability of an individual, or “candidate solution”. It
is the value assigned to an individual based on how close that individual
is to the optimal solution. Put simply, the greater the fitness score, the
better the solution the individual contains. Candidate solutions with
better fitness scores for a given problem are therefore more likely to
survive and join the next generation in a Genetic Algorithm.
The fitness function is the most crucial aspect of any Genetic Algorithm.
It is the problem-specific procedure that assigns a fitness score to an
individual, providing a measure of performance with respect to a particular
set of parameters. The evaluation of an individual representing a set of
parameters is independent of the evaluation of any other individual. The
fitness score of that individual, however, is always defined with respect
to the other individuals of the current population.
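One way to picture the distinction between an independent evaluation and a
population-relative fitness score is fitness-proportionate selection. The
Python sketch below is an illustration under assumptions, not the method of
any cited work: the function name is hypothetical and the raw evaluation
scores are assumed to be non-negative.

```python
def selection_probabilities(raw_scores):
    """Turn independent raw evaluations into population-relative fitness.

    Each individual's share is its raw score divided by the population
    total, so the same raw score can yield a different fitness in a
    different population. Assumes all raw scores are non-negative.
    """
    total = sum(raw_scores)
    if total == 0:
        # Degenerate case: every individual is equally (un)fit.
        return [1.0 / len(raw_scores)] * len(raw_scores)
    return [score / total for score in raw_scores]
```

For instance, a raw score of 2 receives a share of 0.25 in the population
[2, 2, 4] but a share of 0.5 in the population [2, 1, 1].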

3 Characteristics of a “good” Fitness Function
The goal of a fitness function is to guide the Genetic Algorithm through
the problem environment to an optimal solution, so it must be devised for
each problem to be solved. Given a particular “candidate solution”, the
fitness function returns a single numerical “fitness score” or “figure of
merit”, which is supposed to be proportional to the “utility” or “ability”
of the individual which that candidate solution represents. The
effectiveness of the algorithm as a whole is therefore directly related to
the effectiveness of the fitness function it uses.
For many problems, particularly function optimisation, the fitness function
should simply measure the value of the function. Because the fitness is
calculated repeatedly in a Genetic Algorithm, its calculation should be
sufficiently fast; a slow fitness computation can make the whole algorithm
exceptionally slow. A good fitness function should possess the following
characteristics:

• It should be sufficiently fast to compute.

• It should quantitatively measure how fit a given solution is, or how fit
the individuals that can be produced from the given solution are.

In some cases, calculating the fitness score directly might not be possible
due to the inherent complexities of the problem at hand. In such cases,
fitness approximation is done to suit the need.

4 Creating a Fitness Function


The best way to start creating a fitness function is to look at examples
from the specific problem area, since the fitness function depends
completely on the problem being solved. It is usually difficult to give a
set of general guidelines for designing adequate fitness functions, but
Josh Wilkerson and Daniel Tauritz have investigated and formalized the
process that experts go through when creating a fitness function, and have
developed a guide to the design of high-performance fitness functions [6].
The Traveling Salesman Problem (TSP) [9] is used here as a running example
to illustrate the step-by-step procedure that Wilkerson and Tauritz
formulated for creating a generalized yet effective fitness function. The
specification of the TSP is that, given an adjacency matrix A, the Genetic
Algorithm should be able to find the shortest Hamiltonian circuit of the
graph that A represents.
Step 1. Identify each solution requirement from the problem
specifications. In order to generate an effective fitness function, a valid
solution to the problem must first be defined. The fitness function must
address every aspect of a problem’s defined solution in order to properly
guide the Genetic Algorithm.
The TSP has two apparent requirements for a candidate solution to be valid:
1. The path must be a Hamiltonian circuit (i.e., it must visit every node
exactly once and then return to the starting node).
2. The Hamiltonian circuit must be the shortest possible.
Step 2. Classify the problem requirements.
The proposed method for doing so is to follow the “Requirement
Classification Taxonomy” formulated by Wilkerson and Tauritz, shown in
Fig. 1. Once the solution requirements have been determined from the
problem specifications, the next step is to determine an appropriate
solution representation and EA configuration with which to solve the
problem, both of which are also based on the problem specifications. (EA
stands for Evolutionary Algorithm; a Genetic Algorithm is a type of
Evolutionary Algorithm.)

Figure 1: Requirement Classification Taxonomy formulated by Wilkerson and
Tauritz

Problem Requirement versus Algorithm Induced Requirement. Aside from the
problem requirements already identified in Step 1, there may be solution
requirements that arise from the algorithm selected. These cannot be
expected to appear in the problem specifications, since for a given problem
there may be a number of possible algorithms to use. They are called
“Algorithm Induced Requirements”, and they entail additional fitness
function components which have nothing to do with the problem being solved.
For the TSP example, assume a standard EA is used and each candidate
solution is an array of graph node identifiers indicating the order in
which nodes are visited. For example, if a candidate solution is
C = [a, b, c, a], then the path that C takes is node a, node b, node c,
then back to node a. To keep this example simple, assume that the EA
operators will only generate valid paths through the graph, so the fitness
function does not need to check path validity. There are no obvious
algorithm induced requirements for this EA configuration and
representation, so this example will only have problem based requirements.
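The array representation described above can be sketched in Python as
follows. This is a minimal illustration under assumptions: nodes a, b, c
are encoded as the integer indices 0, 1, 2, the adjacency matrix is a list
of lists in which a non-zero entry marks an edge, and the helper names
edges_of and follows_graph are hypothetical.

```python
def edges_of(path):
    """Consecutive (from_node, to_node) pairs visited by a candidate path."""
    return list(zip(path, path[1:]))

def follows_graph(path, A):
    """True if every move in the path uses an edge present in A.

    The running example assumes the EA operators never produce an illegal
    move, so the fitness function would not need to call this check.
    """
    return all(A[u][v] != 0 for u, v in edges_of(path))

# A triangle graph on nodes a=0, b=1, c=2 (assumed example data).
A = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]
C = [0, 1, 2, 0]  # a -> b -> c -> back to a
```

Here edges_of(C) gives the moves (0, 1), (1, 2), (2, 0), and
follows_graph(C, A) holds because each move corresponds to a non-zero
entry of A.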
Phenotypic versus Genotypic. A phenotypic requirement is based on some
aspect of a candidate solution’s performance in the problem environment,
independent of the candidate’s genetic representation. All problem
requirements are therefore phenotypic, since problem specifications are
stated independently of any specific algorithm. A genotypic requirement is
based on some aspect of a candidate solution’s genetic structure, which
makes all algorithm induced requirements genotypic.
A phenotypic requirement can sometimes be converted into a genotypic
requirement, as shown by the dashed edge from Phenotypic to Genotypic in
Fig. 1. This is possible when the desired candidate solution behavior can
be mapped to a specific genetic configuration.
For the TSP example, both requirements are problem requirements, so they
fall under the phenotypic classification. However, the second requirement
(i.e., the Hamiltonian circuit must be the shortest possible) can be
converted to genotypic based on the representation being used for candidate
solutions. Instead of counting the number of steps taken when traversing
the graph, the number of elements in the candidate solution array can be
used to calculate the path length. This will speed up the calculation of
the component fitness.
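The difference between the two ways of measuring path length can be
sketched as follows. This Python sketch rests on an assumption that the
text leaves implicit: every edge has unit cost, so path length means the
number of steps. With weighted edges the array length alone would not give
the tour cost. The function names are hypothetical.

```python
def steps_by_traversal(path, A):
    """Phenotypic measure: walk the graph and count the steps taken."""
    steps = 0
    for u, v in zip(path, path[1:]):
        if A[u][v] == 0:
            raise ValueError(f"no edge from {u} to {v}")
        steps += 1
    return steps

def steps_from_genotype(path):
    """Genotypic measure: read the step count off the array length.

    A path listing k nodes takes k - 1 steps; no graph lookup is needed.
    """
    return max(len(path) - 1, 0)
```

For any path that is legal in the graph, the two functions agree, but the
genotypic version never consults the adjacency matrix.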
Tractable versus Intractable. A requirement is classified as tractable if
the resulting fitness function component is able to calculate the true
fitness value, and as intractable if the component can only give an
approximation to the true fitness value.

In Fig. 1 you can see that genotypic requirements can only be classified as
tractable. The reasoning behind this restriction is that genotypic
requirements are based on the genetic structure of a candidate solution,
which implies that calculating the true fitness for such a requirement
should be feasible as long as the candidate solution representation is
practical. A phenotypic fitness function component, by contrast, may
calculate either the true or an approximate fitness for its requirement.
In the running TSP example, the first requirement was classified as
phenotypic, so it can be either tractable or intractable. For this example,
assume that resources are such that it is feasible to navigate a full
Hamiltonian circuit of the graph, if such a circuit is discovered. Since
the problem space is manageable and navigating it is feasible, it is
possible to calculate the true fitness for the first requirement, and the
requirement is classified as tractable. The second TSP requirement was
converted to the genotypic classification, so it is also classified as
tractable, by the reasoning given above.
Decision versus Optimization. A decision requirement is one that a solution
either satisfies or does not, while an optimization requirement admits
intermediate levels of satisfaction. These intermediate levels serve as a
gradient for the fitness function, which it needs in order to be effective.
A decision requirement can be transformed into an optimization requirement.
For example, suppose that a requirement is that the output of a candidate
solution must be in sorted order. Clearly, the output is either sorted or
it is not, so the requirement is classified as a decision requirement. To
transform it into an optimization requirement, a method is needed for
determining how close to being sorted the output is, and that measure is
then used for the fitness function component. Likewise, some requirements
may already be classified as optimization but will still need to be
expanded upon in order to generate a more effective fitness function.
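The sorting example can be made concrete with one possible closeness
measure. The choice of measure is an assumption made here for illustration
(the fraction of adjacent pairs that are in order); other measures, such as
counting inversions, would serve equally well.

```python
def is_sorted(seq):
    """Decision form: the output is either sorted or it is not."""
    return all(a <= b for a, b in zip(seq, seq[1:]))

def sortedness(seq):
    """Optimization form: fraction of adjacent pairs in ascending order.

    Returns 1.0 for a sorted sequence and lower values the more adjacent
    pairs are out of order, which gives the search a gradient to follow.
    """
    if len(seq) < 2:
        return 1.0
    in_order = sum(1 for a, b in zip(seq, seq[1:]) if a <= b)
    return in_order / (len(seq) - 1)
```

Under this measure [1, 3, 2, 4] scores higher than [4, 3, 2, 1], although
is_sorted rejects both.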
In the TSP example, the first requirement is stated as a decision problem,
i.e., either the path is a Hamiltonian circuit or it is not. The next step
is therefore to decide on a way to convert this requirement into an
optimization problem. One possible conversion is to reward each unique node
visited and to penalize both revisiting a node and not returning to the
starting node (if necessary, a penalty could also be applied for illegal
moves through the graph). Assuming that this conversion is acceptable, the
requirement is classified as a decision problem with a conversion to an
optimization problem. The second TSP requirement is already an optimization
problem, i.e., the shorter the path the better. However, the requirement is
a minimization problem, so it must be converted to maximization. Taking the
inverse of the value is one method of performing this conversion; assume
for the running example that this is the method chosen.
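Both conversions can be sketched in Python. The scoring details are
assumptions made for illustration and are not specified in the text: one
point is awarded per unique node, one point is deducted per revisit and one
for failing to return to the start, the result is clamped at zero and
divided by the number of nodes, and nodes are integer indices into A. The
function names are hypothetical, and path length is taken to be the number
of steps.

```python
def check_ham_circ(path, A):
    """Graded score in [0, 1] for how close a path is to a Hamiltonian circuit."""
    n = len(A)
    if not path:
        return 0.0
    closed = len(path) > 1 and path[0] == path[-1]
    # Ignore the final return to the start when counting visits.
    body = path[:-1] if closed else path
    unique = len(set(body))
    revisits = len(body) - unique
    raw = unique - revisits - (0 if closed else 1)
    return max(raw, 0) / n

def inverse_length(path):
    """Turn 'shorter is better' into 'larger is better' via the inverse."""
    steps = len(path) - 1
    return 1.0 / steps if steps > 0 else 0.0
```

On a three-node graph the circuit [0, 1, 2, 0] scores 1.0, the open path
[0, 1, 2] scores 2/3, and [0, 1, 1, 0], which revisits node 1 and never
reaches node 2, scores 1/3.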
Step 3. Implement. The last step is to combine all of the fitness function
components into the fitness function for the problem. This step is largely
dependent on the developer’s desired format for the fitness function. One
option is to combine the fitness function components into a single large
function in which the component fitness values are combined into a single
fitness value. This option may work well in some cases; however, combining
the various component fitness values can sometimes be difficult. Weighting
the component fitnesses can often be challenging and, even if a good
weighting scheme is developed, it is still possible for components to
conspire to increase their own component fitness values to the detriment of
the fitness function overall. A second option is to use Multi-Objective EA
(MOEA) [10] methods to calculate a fitness value based on the Pareto front
generated by treating each fitness function component as a separate
dimension. This allows the component fitness values to be optimized without
the potential trouble of component weighting and conspiracies.
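The idea of a Pareto front over component fitness values can be sketched
with the dominance relation alone. The Python sketch below is not a full
MOEA; it only shows how candidates are compared without weights, assuming
every component is to be maximized and each candidate is summarized as a
tuple of component scores. The function names are hypothetical.

```python
def dominates(a, b):
    """True if score tuple a is at least as good as b in every component
    and strictly better in at least one (all components maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(points):
    """The score tuples that no other tuple in `points` dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points)]
```

Given the score tuples (1.0, 0.2), (0.5, 0.5), (0.4, 0.4) and (0.2, 1.0),
only (0.4, 0.4) is dominated, by (0.5, 0.5); the other three form the
front, and none of them has to be ranked against the others by a weighting
scheme.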
In the TSP example, the first requirement was classified as a phenotypic,
tractable, decision problem with optimization conversion. Its component
will therefore take the problem space (i.e., the matrix A) as an argument,
in addition to a candidate solution, and will calculate the true fitness
using the method discussed. Suppose this method is implemented as a
function called CheckHamCirc which takes an individual S and the adjacency
matrix A as arguments. The second requirement was classified as a
(converted) genotypic, tractable, optimization problem. The fitness
function component for the second requirement will therefore take only a
candidate solution as an argument and will calculate the true fitness for
the requirement by calculating the inverse of the path length. Suppose this
component is implemented as a function called InverseLength that takes a
candidate solution as an argument. The last step in this example is to
generate the fitness function F for the problem from the fitness function
components. One option is to combine the components into a single
expression:
F(S) = CheckHamCirc(S, A) + InverseLength(S)
If the component fitness values are normalized to fall in the same range
(e.g., [0,100]), then there will likely be no problem with using this
fitness function. However, another option is to implement this fitness
function as a two-dimensional MOEA.
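The single-expression option can be sketched as follows. The Python code is
a minimal illustration under assumptions: each component is taken to return
a value in [0, 1], which is then rescaled to [0, 100] before summing, and
the two component bodies are simplified, hypothetical stand-ins for
CheckHamCirc and InverseLength rather than definitive implementations.

```python
from functools import partial

def make_fitness(components, scale=100.0):
    """Build F(S) as the sum of component scores rescaled to [0, scale]."""
    def fitness(candidate):
        # Clamp each component to [0, 1] so none can exceed its share.
        return sum(scale * min(max(c(candidate), 0.0), 1.0)
                   for c in components)
    return fitness

def check_ham_circ(S, A):
    """Stand-in: share of nodes covered, or 0 if S does not return to start."""
    closed = len(S) > 1 and S[0] == S[-1]
    return len(set(S)) / len(A) if closed else 0.0

def inverse_length(S):
    """Stand-in: inverse of the number of steps in S."""
    steps = len(S) - 1
    return 1.0 / steps if steps > 0 else 0.0

# Assumed example data: a triangle graph on nodes 0, 1, 2.
A = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]

# Bind A so that both components take only the candidate S.
F = make_fitness([partial(check_ham_circ, A=A), inverse_length])
```

With these stand-ins, F([0, 1, 2, 0]) is 100 plus 100/3, while the open
path [0, 1, 2] earns nothing from the first component and 50 from the
second.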

5 Conclusion
Formulating a fitness function varies depending on the problem at hand. The
general design guide presented in this paper can be very useful, but it is
still subject to each problem’s goal and to the developer’s interpretation
of the problem. A fitness function is the right fit for a specific problem
if it guides the Genetic Algorithm through the search space towards an
optimal solution effectively and efficiently. Bad fitness functions, on the
other hand, can easily leave the Genetic Algorithm trapped in a local
optimum and cause it to lose its power of discovery. This paper concludes
that the aforementioned guide to creating a fitness function can be
classified as a “better way” of defining one, since the proposed method is
systematic and has been reported, when properly applied, to be capable of
generating a fitness function that is competitive with an expertly designed
fitness function [6].

References
[1] T. Bäck, Evolutionary algorithms in theory and practice: evolution
strategies, evolutionary programming, genetic algorithms. Oxford Univ.
Press, 1996.
[2] J. Branke and Y. Jin, Transactions on Evolutionary Computation, vol. 9.
3 ed., 2005.
[3] A. Rodrigues, P. De Mattos Netos, and T. Ferreira, “A prime step
in the time series forecasting with hybrid methods: The fitness
function choice,” International Joint Conference on Neural Networks,
pp. 2703-2710, 2009.
[4] C. Yalcon, “Evolving aggregation behavior for robot swarms: A cost
analysis for distinct fitness functions,” International Symposium on
Computer and Informational Sciences, p. 14, 2008.
[5] L. Doitsidis and N. Tsourveloudis, “An empirical study for fitness
function selection in fuzzy logic controllers for mobile robot navigation,”
Annual Conference on IEEE Industrial Electronics, pp. 3868-3873, Nov
2006.

[6] J. Wilkerson and D. Tauritz, “Outlining a practitioner’s guide to
fitness function design,” Proceedings of the 4th Annual ISC Research
Symposium.

[7] J. Holland, “Adaptation in natural and artificial systems,” 1975.

[8] M. Mitchell, Genetic Algorithms: An Overview. Complexity. 1995.

[9] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction
to algorithms. The MIT Press, 2014.

[10] K. Deb, Multi-objective optimization using evolutionary algorithms.
John Wiley & Sons, 2008.
