
Evolution

Here's a very oversimplified description of how evolution works in biology:
Organisms (animals or plants) produce a number of offspring which are almost, but not entirely, like themselves
Variation may be due to mutation (random changes)
Variation may be due to reproduction (offspring have some characteristics from each parent)
Some of these offspring may survive to produce offspring of their own; some won't
The better adapted offspring are more likely to survive
Over time, later generations become better and better adapted

Genetic algorithms use this same process to evolve better programs

GENETIC ALGORITHM
A biologically inspired model of intelligence: the principles
of biological evolution are applied to find solutions to difficult
problems
The problems are not solved by reasoning logically about them;
rather populations of competing candidate solutions are spawned
and then evolved to become better solutions through a process
patterned after biological evolution
Less worthy candidate solutions tend to die out, while those that
show promise of solving a problem survive and reproduce by
constructing new solutions out of their components

GENETIC ALGORITHM
A GA begins with a population of candidate problem solutions
Candidate solutions are evaluated according to their ability to
solve problem instances: only the fittest survive and combine
with each other to produce the next generation of possible
solutions
Thus increasingly powerful solutions emerge in a Darwinian
universe
This method is heuristic in nature and it was introduced by John
Holland in 1975

Basic Genetic Algorithm


Start with a large population of randomly generated attempted solutions to a problem
Repeatedly do the following:
    Evaluate each of the attempted solutions
    Keep a subset of these solutions (the best ones)
    Produce the next generation from these solutions (using inheritance and mutation)
Quit when you have a satisfactory solution (or you run out of time)

GENETIC ALGORITHM
Basic Algorithm
begin
    set time t = 0;
    initialise population P(t) = {x_1^t, x_2^t, ..., x_n^t} of solutions;
    while the termination condition is not met do
    begin
        evaluate fitness of each member of P(t);
        select some members of P(t) for creating offspring;
        produce offspring by genetic operators;
        replace some members with the new offspring;
        set time t = t + 1;
    end
end
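
The loop above can be sketched in Python as follows. This is only an illustrative skeleton, not a definitive implementation: the function name `run_ga` and its callback parameters (`init_population`, `fitness`, `select`, `reproduce`, `is_done`) are hypothetical names introduced here, and the sketch assumes that higher fitness is better and that the offspring returned by `reproduce` form the whole next population.

```python
def run_ga(init_population, fitness, select, reproduce, is_done, max_generations=1000):
    """Generic GA loop: evaluate, select, produce offspring, replace, repeat."""
    population = init_population()            # P(0)
    t = 0                                     # time (generation) counter
    while True:
        # Evaluate the fitness of each member of P(t).
        scored = [(fitness(x), x) for x in population]
        best_score, best = max(scored, key=lambda pair: pair[0])
        # Termination: good enough, or out of time.
        if is_done(best_score) or t >= max_generations:
            return best, t
        parents = select(scored)              # members chosen to create offspring
        population = reproduce(parents)       # next generation P(t + 1)
        t += 1
```

A toy usage, where individuals are plain integers and fitness is the value itself: `run_ga(lambda: [0, 1, 2, 3], lambda x: x, lambda s: [x for _, x in sorted(s, reverse=True)[:2]], lambda ps: ps + [p + 1 for p in ps], lambda score: score >= 10)`.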

GENETIC ALGORITHM
Representation of Solutions: The Chromosome
Gene: A basic unit, which represents one characteristic of the
individual. The value of each gene is called an allele
Chromosome: A string of genes; it represents an individual i.e. a
possible solution of a problem. Each chromosome represents a
point in the search space
Population: A collection of chromosomes
An appropriate chromosome representation is important for the
efficiency and complexity of the GA

GENETIC ALGORITHM
Evaluation/Fitness Function
It is used to determine the fitness of a chromosome
Creating a good fitness function is one of the challenging tasks of
using GA

GENETIC ALGORITHM
Fitness Function
The fitness function can be the score of the classification
accuracy of the rule-set over a set of provided training examples
Often other criteria may be included as well, such as the
complexity of the rules or the size of the rule set

GENETIC ALGORITHM
Selection Operators (Algorithms)
They are used to select parents from the current population
The selection is primarily based on the fitness. The better the
fitness of a chromosome, the greater its chance of being selected
to be a parent
The most popular method of selection is Proportionate Selection

GENETIC ALGORITHM
Reproduction Operators
Genetic operators are applied to chromosomes that are selected to
be parents, to create offspring
There are two basic types: crossover and mutation
Crossover operators create offspring by recombining the
chromosomes of selected parents
Mutation is used to make small random changes to a
chromosome in an effort to add diversity to the population

GENETIC ALGORITHM
Reproduction Operators: Mutation
Mutation is another important genetic operator
Mutation takes a single candidate and randomly changes some
aspect (gene) of it
For example, mutation may randomly select a bit in the pattern
and change it, switching a 1 to a 0 or to # (don't care)

Simple example

Suppose your organisms are 32-bit computer words
You want a string in which all the bits are ones
Here's how you can do it:
    Create 100 randomly generated computer words
    Repeatedly do the following:
        Count the 1 bits in each word
        Exit if any of the words have all 32 bits set to 1
        Keep the ten words that have the most 1s (discard the rest)
        From each word, generate 9 new words as follows:
            Pick a random bit in the word and toggle (change) it
Note that this procedure does not guarantee that the next generation will have more 1 bits, but it's likely
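
A minimal Python sketch of this mutation-only procedure is given below. The function names and the `max_generations` safety limit are assumptions added for illustration; the population size (100), the number kept (10), and the children per kept word (9) come from the steps above. The ten kept words are carried over alongside their 90 mutated copies, so each generation again has 100 words.

```python
import random

WORD_BITS = 32

def count_ones(word):
    """Fitness: the number of 1 bits in the word."""
    return bin(word).count("1")

def toggle_random_bit(word, rng):
    """Mutation: flip one randomly chosen bit of the word."""
    return word ^ (1 << rng.randrange(WORD_BITS))

def evolve_all_ones(rng, population_size=100, keep=10,
                    children_per_parent=9, max_generations=10_000):
    """Return the generation at which an all-ones word appears, or None."""
    population = [rng.getrandbits(WORD_BITS) for _ in range(population_size)]
    for generation in range(max_generations):
        if any(count_ones(w) == WORD_BITS for w in population):
            return generation
        # Keep the words with the most 1s, discard the rest.
        survivors = sorted(population, key=count_ones, reverse=True)[:keep]
        population = list(survivors)
        for parent in survivors:
            population.extend(
                toggle_random_bit(parent, rng) for _ in range(children_per_parent)
            )
    return None  # hypothetical cut-off reached without success
```

Calling `evolve_all_ones(random.Random(0))` runs one seeded experiment; the number of generations needed varies from seed to seed.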



Realistic Example
Suppose you have a large number of (x, y) data points
For example, (1, 4), (3, 9), (5, 8), ...
You would like to fit a polynomial (of up to degree 1) through these data points
That is, you want a formula y = mx + c that gives you a reasonably good fit to the actual data
Here's the usual way to compute goodness of fit:
    Compute the sum of (actual y - predicted y)^2 for all the data points
    The lowest sum represents the best fit
You can use a genetic algorithm to find a pretty good solution

Realistic Example
Your formula is y = mx + c
Your unknowns are m and c, where m and c are integers
Your representation is the array [m, c]
Your evaluation function for one array is:
    For every actual data point (x, y), compute the predicted value ŷ = mx + c
    Find the sum of (y - ŷ)^2 over all data points
    The sum is your measure of "badness" (larger numbers are worse)
Example: For [m, c] = [5, 7] and the data points (1, 10) and (2, 13):
    ŷ = 5x + 7 = 12 when x is 1
    ŷ = 5x + 7 = 17 when x is 2
    Now compute the badness:
        (10 - 12)^2 + (13 - 17)^2 = 2^2 + 4^2 = 20
    If these are the only two data points, the badness of [5, 7] is 20
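
The evaluation function can be written in a few lines of Python. This is a small sketch; the function name `badness` and its argument order are choices made here for illustration.

```python
def badness(m, c, points):
    """Sum of squared differences between actual y and predicted y = m*x + c."""
    return sum((y - (m * x + c)) ** 2 for x, y in points)

# The example from the slide: [m, c] = [5, 7] with points (1, 10) and (2, 13).
print(badness(5, 7, [(1, 10), (2, 13)]))  # prints 20
```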

Realistic Example
Your GA might be as follows:
    Create two-element arrays of random numbers
    Repeat 50 times (or any other number):
        For each of the arrays, compute its badness (using all data points)
        Keep the best arrays (with low badness)
        From the arrays you keep, generate new arrays as follows:
            Convert the numbers in the array to binary, toggle one of the bits at random
        Quit if the badness of any of the solutions is zero
    After all 50 trials, pick the best array as your final answer
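
One possible Python rendering of these steps is sketched below. Several details are assumptions that the slide leaves open: genes are taken to be non-negative integers of `BITS = 3` bits each, the population holds 10 arrays of which 5 are kept, and all function names are hypothetical. Because the search is random, a zero-badness answer is likely on small problems but not guaranteed.

```python
import random

BITS = 3  # assumed gene width, so m and c range over 0..7

def badness(genes, points):
    """Sum of squared errors of y = m*x + c over all data points."""
    m, c = genes
    return sum((y - (m * x + c)) ** 2 for x, y in points)

def mutate(genes, rng):
    """Toggle one random bit in the binary form of one randomly chosen gene."""
    child = list(genes)
    i = rng.randrange(len(child))
    child[i] ^= 1 << rng.randrange(BITS)
    return child

def fit_line(points, rng, population_size=10, keep=5, trials=50):
    """Return the best [m, c] array found after at most `trials` generations."""
    population = [[rng.randrange(2 ** BITS), rng.randrange(2 ** BITS)]
                  for _ in range(population_size)]
    for _ in range(trials):
        population.sort(key=lambda g: badness(g, points))
        if badness(population[0], points) == 0:
            break                                  # perfect fit found
        survivors = population[:keep]              # keep the arrays with low badness
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(population_size - keep)]
        population = survivors + children
    return min(population, key=lambda g: badness(g, points))
```

For instance, `fit_line([(1, 5), (3, 9)], random.Random(0))` searches for a line through the two points (1, 5) and (3, 9).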

Realistic Example
(x, y) : {(1, 5), (3, 9)}
[2 7] [1 3] (initial random population, where m and c represent genes)
    ŷ = 2x + 7 = 9 when x is 1
    ŷ = 2x + 7 = 13 when x is 3
    Badness: (5 - 9)^2 + (9 - 13)^2 = 4^2 + 4^2 = 32
    ŷ = 1x + 3 = 4 when x is 1
    ŷ = 1x + 3 = 6 when x is 3
    Badness: (5 - 4)^2 + (9 - 6)^2 = 1^2 + 3^2 = 10
Now, let's keep the one with low badness: [1 3]
Binary representation: [001 011]
Apply mutation to generate a new array: [011 011], i.e. [3 3]
Now we have [1 3] [3 3] as the new population, considering that we keep the two best individuals

Realistic Example
(x, y) : {(1, 5), (3, 9)}
[1 3] [3 3] (current population)
    ŷ = 1x + 3 = 4 when x is 1
    ŷ = 1x + 3 = 6 when x is 3
    Badness: (5 - 4)^2 + (9 - 6)^2 = 1 + 9 = 10
    ŷ = 3x + 3 = 6 when x is 1
    ŷ = 3x + 3 = 12 when x is 3
    Badness: (5 - 6)^2 + (9 - 12)^2 = 1 + 9 = 10
Let's keep [3 3] (both arrays have the same badness, so either could be kept)
Representation: [011 011]
Apply mutation to generate a new array: [010 011], i.e. [2 3]
Now we have [3 3] [2 3] as the new population

Realistic Example
(x, y) : {(1, 5), (3, 9)}
[3 3] [2 3] (current population)
    ŷ = 3x + 3 = 6 when x is 1
    ŷ = 3x + 3 = 12 when x is 3
    Badness: (5 - 6)^2 + (9 - 12)^2 = 1 + 9 = 10
    ŷ = 2x + 3 = 5 when x is 1
    ŷ = 2x + 3 = 9 when x is 3
    Badness: (5 - 5)^2 + (9 - 9)^2 = 0^2 + 0^2 = 0
Solution found: [2 3]
y = 2x + 3
Note: The badness does not have to reach zero. The search can also stop at some other threshold value.

GENETIC ALGORITHM
Reproduction Operators: Crossover
Crossover operation takes two candidate solutions and divides
them, swapping components to produce two new candidates

GENETIC ALGORITHM
Reproduction Operators: Crossover
The figure illustrates crossover on bit string patterns of length 8
The operator splits them and forms two children whose initial segment comes from one parent and whose tail comes from the other

Input Bit Strings:    11#0101#    #110#0#1
Resulting Strings:    11#0#0#1    #110101#

The simple example again
Suppose your individuals are 32-bit computer words, and you want a string in which all the bits are ones
Here's how you can do it:
    Create 100 randomly generated computer words
    Repeatedly do the following:
        Count the 1 bits in each word
        Exit if any of the words have all 32 bits set to 1
        Keep the 10 words that have the most 1s (discard the rest)
        From each word, generate 9 new words as follows:
            Choose one of the words
            Take the first half of this word and combine it with the second half of some other word

The simple example again
Half from one, half from the other:

A = 0110 1001 0100 1110 1010 1101 1011 0101
B = 1101 0100 0101 1010 1011 0100 1010 0101
-------------------------------------------
C = 0110 1001 0100 1110 1011 0100 1010 0101
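
On 32-bit words this half-and-half combination can be done with two bit masks. The sketch below assumes that "first half" means the leftmost (most significant) 16 bits as the words are written above; the names `HIGH_HALF`, `LOW_HALF` and `half_crossover` are introduced here for illustration.

```python
HIGH_HALF = 0xFFFF0000  # leftmost 16 bits of a 32-bit word
LOW_HALF = 0x0000FFFF   # rightmost 16 bits

def half_crossover(a, b):
    """Child takes the first (left) half of a and the second (right) half of b."""
    return (a & HIGH_HALF) | (b & LOW_HALF)

A = 0b0110_1001_0100_1110_1010_1101_1011_0101
B = 0b1101_0100_0101_1010_1011_0100_1010_0101
C = half_crossover(A, B)
print(f"{C:032b}")  # prints 01101001010011101011010010100101
```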

Mutation vs Crossover
In the simple example of 32-bit words (trying to get all 1s):
    The (two-parent, no mutation) approach, if it succeeds, is likely to succeed much faster
        Because up to half of the bits change each time, not just one bit
    However, without mutation, it may not succeed at all
        By pure bad luck, maybe none of the first randomly generated words have (say) bit 17 set to 1
        Then there is no way a 1 could ever occur in this position, as we are not changing individual bits separately
    Another problem is lack of genetic diversity
        Maybe some of the first generation did have bit 17 set to 1, but none of them were selected for the second generation
The best technique in general turns out to be crossover with mutation

GENETIC ALGORITHM
Reproduction Operators: Crossover
The place of split in the candidate solution is an arbitrary choice.
This split may be at any point in the solution
This splitting point may be randomly chosen or changed
systematically during the solution process
Crossover can unite an individual that is doing well in one
dimension with another individual that is doing well in the other
dimension

GENETIC ALGORITHM
Reproduction Operators: Crossover
Two types: single point crossover and uniform crossover
Single point crossover
This operator takes two parents and randomly selects a single point between two genes to cut both chromosomes into two parts (this point is called the cut point)
The first part of the first parent is combined with the second part of the second parent to create the first child
The first part of the second parent is combined with the second part of the first parent to create the second child
Parents:     1000010    1110001
Children:    1000001    1110010
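
A short Python sketch of single point crossover on bit strings follows. The function name and the optional `cut` argument are assumptions for illustration; when `cut` is omitted, a cut point between two genes is drawn at random.

```python
import random

def single_point_crossover(parent1, parent2, cut=None, rng=random):
    """Swap the tails of two equal-length chromosomes at one cut point."""
    if len(parent1) != len(parent2):
        raise ValueError("parents must have the same length")
    if cut is None:
        cut = rng.randrange(1, len(parent1))  # a point strictly inside the string
    child1 = parent1[:cut] + parent2[cut:]
    child2 = parent2[:cut] + parent1[cut:]
    return child1, child2

# The parents shown above, cut after the fifth gene.
print(single_point_crossover("1000010", "1110001", cut=5))
# prints ('1000001', '1110010')
```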

GENETIC ALGORITHM
Reproduction Operators: Crossover
Uniform crossover
The value of each gene of an offspring's chromosome is randomly taken from either parent
This is equivalent to multiple point crossover
Parents:      1000010    1110001
Offspring:    1010010
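
Uniform crossover can be sketched in Python as below. The function name is hypothetical, and the sketch assumes each gene is drawn from either parent with equal probability, which the slide does not specify.

```python
import random

def uniform_crossover(parent1, parent2, rng=random):
    """Build one offspring by picking each gene at random from either parent."""
    return "".join(rng.choice(pair) for pair in zip(parent1, parent2))

child = uniform_crossover("1000010", "1110001", random.Random(0))
print(child)  # one possible offspring; positions where the parents agree never change
```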

GENETIC ALGORITHM
Example
Find a number: 001010
You have thought of this binary number. If you write a program to
find it, then there are 2^6 = 64 possibilities
If you find it with the help of a Genetic Algorithm, then the
program gives a number and you tell it the fitness of that number
The fitness score is the number of correctly guessed bits


GENETIC ALGORITHM
Example
Find a number: 001010
Step 1. Chromosomes produced, with their fitness scores:
A) 010101 - 1
B) 111101 - 1
C) 011011 - 4*
D) 101100 - 3*
Best ones: C & D
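
The scores above can be reproduced with a one-line fitness function in Python. This is a sketch; the function name `fitness` and the default `target` argument are choices made here.

```python
def fitness(guess, target="001010"):
    """Number of bit positions where the guess matches the hidden number."""
    return sum(g == t for g, t in zip(guess, target))

for name, guess in [("A", "010101"), ("B", "111101"), ("C", "011011"), ("D", "101100")]:
    print(name, guess, fitness(guess))
```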


GENETIC ALGORITHM
Example
Find a number: 001010
      Mating     New Variants    Evaluation
C)    01:1011    01:1100 (E)     3
D)    10:1100    10:1011 (F)     4*
C)    0110:11    0110:00 (G)     4*
D)    1011:00    1011:11 (H)     3

Selection of F & G

GENETIC ALGORITHM
Example
      Mating     New Variants    Evaluation
F)    1:01011    1:11000 (H)     3
G)    0:11000    0:01011 (I)     5*
F)    101:011    101:000 (J)     4*
G)    011:000    011:011 (K)     4

Selection of I and J

GENETIC ALGORITHM
Example
      Mating     New Variants    Evaluation
I)    0010:11    0010:00 (L)     5
J)    1010:00    1010:11 (M)     4
I)    00101:1    00101:0 (N)     6 (success) *
J)    10100:0    10100:1 (O)     3

In this game success was achieved after 16 questions, which is 4 times faster than checking all possible 2^6 = 64 combinations

GENETIC ALGORITHM
Example
Mutation was not used in this example. Mutation would have
been necessary if, e.g., there was a 0 in the third bit of all the initial
individuals. In that case, no matter how the individuals are
combined, we can never change this bit into 1. Mutation takes
evolution out of a dead end.


Eight Queens Problem


The problem is to place 8 queens on a chess board so that none of them can attack the other. A chess board can be considered a plain board with eight columns and eight rows.

Eight Queens Problem


The possible cells that the Queen can move to when placed in a particular square are shaded

Eight Queens Problem


We need a scheme to denote the board's position at any given time

26834531 (for example, reading each digit as the row of the queen in the corresponding column)


Eight Queens Problem


Now we need a fitness function, a function by which we can tell which board position is nearer to our goal. Since we are going to select the best individuals at every step, we need to define a method to rate these board positions.

One fitness function can be to count the number of Queens that do not attack others

Eight Queens Problem


Fitness Function:
Q1 can attack NONE
Q2 can attack NONE
Q3 can attack Q6
Q4 can attack Q5
Q5 can attack Q4
Q6 can attack Q5
Q7 can attack Q4
Q8 can attack Q5

Fitness = number of Queens that can attack none
Fitness = 2
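
This fitness function can be sketched in Python. The sketch rests on two assumptions not stated on the slides: each digit of the board string is read as the row of the queen in that column, and a queen counts as attacking another whenever they share a row or a diagonal, even if a third queen stands between them. The function names are hypothetical.

```python
def parse(digits):
    """Turn a board string such as '26834531' into a list of row numbers."""
    return [int(d) for d in digits]

def attacks(board, i, j):
    """True if the queens in columns i and j share a row or a diagonal."""
    return board[i] == board[j] or abs(board[i] - board[j]) == abs(i - j)

def fitness(board):
    """Number of queens that attack no other queen."""
    n = len(board)
    return sum(
        all(not attacks(board, i, j) for j in range(n) if j != i)
        for i in range(n)
    )

print(fitness(parse("26834531")))  # prints 2
```

A board on which no queen attacks another scores 8 under this function.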

Eight Queens Problem


Choose an initial population of board configurations
Evaluate the fitness of each individual (configuration)
Choose the best individuals from the population for crossover

Eight Queens Problem


Suppose the following individuals are chosen for crossover

85727135

45827165

Eight Queens Problem


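Using Crossover
Parents: 85727135 and 45827165
Children: one child takes the head segment 8572 from the first parent and the other takes the head segment 4582 from the second parent, each combined with the tail of the other parent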

Eight Queens Problem


Mutation, flip bits at random
45827165
0100 0101 1000 0010 0111 0001 0110 0101
0100 0101 1000 0010 0111 0001 0011 0101
45827135
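
The encoding used in this mutation step, four bits per digit, can be sketched in Python as follows. The helper names `encode`, `decode` and `flip` are hypothetical. Note that flipping arbitrary bits can produce values outside 1 to 8 (or even two-digit values), so a practical version would need to repair or reject such boards; that check is left out of this sketch.

```python
def encode(board):
    """Write each digit of the board string as a 4-bit binary group."""
    return "".join(f"{int(d):04b}" for d in board)

def decode(bits):
    """Read the bit string back in groups of four bits."""
    return "".join(str(int(bits[i:i + 4], 2)) for i in range(0, len(bits), 4))

def flip(bits, positions):
    """Return a copy of the bit string with the given positions toggled."""
    out = list(bits)
    for p in positions:
        out[p] = "1" if out[p] == "0" else "0"
    return "".join(out)

bits = encode("45827165")
# In the example above, two bits of the seventh group change (0110 becomes 0011).
print(decode(flip(bits, [25, 27])))  # prints 45827135
```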

Eight Queens Problem


This process is repeated until an individual with the required fitness level is found. If no such individual is found, then the process is repeated further until the overall fitness of the population or any of its individuals gets very close to the required fitness level. An upper limit on the number of iterations is usually put to end the process in finite time.

Eight Queens Problem


Solution!

46827135


GENETIC ALGORITHM
Selection Process
It is used to select parents from the current population. The
selection is primarily based on the fitness. The better the
fitness of a chromosome, the greater its chance of being
selected to be a parent
The rate at which a selection algorithm selects individuals
with above average fitness is called the selective pressure
If there is not enough selective pressure, the population will
fail to converge upon a solution. If there is too much, the
population may not have enough diversity and may converge
prematurely

GENETIC ALGORITHM
Selection Process: Random Selection
Random Selection:
Individuals are selected randomly with no reference to fitness
at all
All the individuals, good or bad, have an equal chance of
being selected


GENETIC ALGORITHM
Selection Process: Proportional Selection
Proportional Selection:
We can select the fittest chromosomes
However, the selection of only the fitter chromosomes may
result in the loss of a correct gene value which may be present
in a less fit member
One way to overcome this risk is to assign probability of
selection to each chromosome based on its fitness
In this way even the less fit members have some chance of
surviving into the next generation

GENETIC ALGORITHM

Selection Process: Proportional Selection


The probability of selection of a chromosome i may be calculated as
    p_i = fitness_i / \sum_j fitness_j
Example
Chromosome    Fitness    Selection Probability
1             7          7/14
2             4          4/14
3             2          2/14
4             1          1/14



GENETIC ALGORITHM
Selection Process: Proportional Selection
Chromosomes are selected based on their fitness relative to
the fitness of all other chromosomes
For this, all the fitness values are added to form a sum S and each
chromosome is assigned a relative fitness (which is its fitness
divided by the total fitness S)
A process similar to spinning a roulette wheel is adopted to
choose a parent; the better a chromosome's relative fitness,
the higher its chances of selection

GENETIC ALGORITHM
Selection Process: Proportional Selection
Once a parent is selected, the wheel is given a spin for finding
the second parent. If the same chromosome is selected as
the second parent, it is rejected and the wheel is spun
again
After finding a pair, a second pair is selected, and so on
A chromosome may get selected several times and appear as a
parent several times
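
A minimal Python sketch of roulette wheel selection with the "spin again" rule for the second parent is shown below. The function names are hypothetical, and the sketch assumes all fitness values are non-negative and that at least two chromosomes have positive fitness (otherwise the re-spin loop could not end).

```python
import random

def selection_probabilities(fitnesses):
    """Relative fitness of each chromosome: its fitness divided by the sum S."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]

def spin(fitnesses, rng):
    """Spin the wheel once and return the index of the selected chromosome."""
    pointer = rng.uniform(0, sum(fitnesses))
    cumulative = 0.0
    for index, f in enumerate(fitnesses):
        cumulative += f
        if pointer < cumulative:
            return index
    return len(fitnesses) - 1  # guard against floating-point round-off

def select_pair(fitnesses, rng):
    """Pick two different parents; re-spin while the second equals the first."""
    first = spin(fitnesses, rng)
    second = spin(fitnesses, rng)
    while second == first:
        second = spin(fitnesses, rng)
    return first, second
```

For example, `select_pair([7, 4, 2, 1], random.Random(0))` returns the indices of two distinct parents, with fitter chromosomes chosen more often over many calls.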


GENETIC ALGORITHM
Selection Process: Proportional Selection
Advantage
Selective pressure varies with the distribution of fitness
within a population. If there is a lot of fitness difference
between the more fit and less fit chromosomes, then the
selective pressure will be higher
Disadvantage
As the population converges upon a solution, the selective
pressure decreases, which may hinder the GA from finding
better solutions

GENETIC ALGORITHM

Selection Process: Tournament Selection


Tournament Selection:
One parent is selected by comparing a subset b of the
available chromosomes, and selecting the fittest; a second
parent may be selected by repeating the process
The selection pressure increases as b increases.
Value of b = 2 is most commonly used
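
Tournament selection is short to express in Python. In this sketch the subset of size b is drawn without replacement, which is an assumption (some variants draw with replacement), and the function name is hypothetical.

```python
import random

def tournament_select(fitnesses, rng, b=2):
    """Pick b distinct chromosomes at random and return the index of the fittest."""
    contenders = rng.sample(range(len(fitnesses)), b)
    return max(contenders, key=lambda i: fitnesses[i])

rng = random.Random(0)
first_parent = tournament_select([7, 4, 2, 1], rng)
second_parent = tournament_select([7, 4, 2, 1], rng)  # repeat for the second parent
```

With b equal to the population size the fittest chromosome always wins, which illustrates why selection pressure grows with b.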


GENETIC ALGORITHM

Selection Process: Tournament Selection


Its advantage is that the worse individuals of the population
will have very little probability of selection, whereas the
best individuals will not dominate the selection process,
thus ensuring diversity


GENETIC ALGORITHM
Selection Process: Rank based selection
Rank Based Selection:
Rank based selection uses the rank ordering of the fitness
values to determine the probability of selection and not the
fitness values themselves
This means that the selection probability is independent of
the actual fitness value
Ranking therefore has the advantage that a highly fit
individual will not dominate in the selection process as a
function of the magnitude of its fitness

GENETIC ALGORITHM

Selection Process: Rank based selection


The population is sorted from best to worst according to the fitness
Each chromosome is then assigned a new fitness based on a linear ranking function
    New Fitness = (P - r) + 1
where P = population size, r = fitness rank of the chromosome
If P = 11, then a chromosome of rank 1 will have a New Fitness of 10 + 1 = 11 & a chromosome of rank 6 will have 5 + 1 = 6


GENETIC ALGORITHM
Selection Process: Rank based selection
A user adjusted slope can also be incorporated
    New Fitness = {(P - r)(max - min)/(P - 1)} + min
where max and min are set by the user to determine the slope (max - min)/(P - 1) of the function
Let P = 11, max = 8, min = 3,
then a chromosome of rank 1 will have a New Fitness of 10*5/10 + 3 = 8
& a chromosome of rank 6 will have 5*5/10 + 3 = 5.5
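
Both ranking functions translate directly into Python. The function and parameter names below are chosen here for illustration; rank 1 is assumed to be the best chromosome, as in the examples.

```python
def linear_rank_fitness(P, r):
    """New Fitness = (P - r) + 1, for population size P and rank r (1 = best)."""
    return (P - r) + 1

def scaled_rank_fitness(P, r, max_fitness, min_fitness):
    """New Fitness = (P - r)(max - min)/(P - 1) + min, with a user-set slope."""
    return (P - r) * (max_fitness - min_fitness) / (P - 1) + min_fitness

print(linear_rank_fitness(11, 1), linear_rank_fitness(11, 6))    # prints 11 6
print(scaled_rank_fitness(11, 1, 8, 3), scaled_rank_fitness(11, 6, 8, 3))  # prints 8.0 5.5
```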


GENETIC ALGORITHM
Termination Requirement
The GA continues until some termination requirement is met,
such as
- having a solution whose fitness exceeds some threshold
- a pre-specified number of generations has evolved
- the fitness of solutions becomes stable & stops improving



GENETIC ALGORITHM

Population Size
Number of individuals present in an iteration (generation)
If the population size is too large, the processing time is high
and the GA tends to take longer to converge upon a
solution (because less fit members have to be selected to
make up the required population)
If the population size is too small, the GA is in danger of
premature convergence upon a sub-optimal solution (all
chromosomes will soon have identical traits). This is
primarily because there may not be enough diversity in
the population to allow the GA to escape local optima

Genetic Algorithms
Advantages of genetic algorithms:
Often outperform brute force approaches by
randomly jumping around the search space
Ideal for problem domains in which near-optimal
(as opposed to exact) solutions are
adequate
Disadvantages of genetic algorithms:
Might not find any satisfactory partial solutions
Tuning can be a challenge