Global Optimisation
Optimisation methods aim to find the values of the variable(s) in the objective function that produce the minimum or maximum value, as required. Objective functions are of two types, deterministic and stochastic. When the objective function is a calculated value in the model (deterministic), we simply find the combination of parameter values that optimises this calculated value. When the objective function is a simulated random variable, we must first decide on a statistical measure associated with that variable to be optimised; the optimisation algorithm must then run a simulation for each set of decision-variable values and record that statistic. Many optimisation methods are available in the literature and implemented in commercial software.
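As a toy sketch of the stochastic case (the model, the costs and the demand distribution below are invented for the example, not taken from the text), the objective is a statistic, here the mean simulated cost, and one simulation is run for each candidate decision value:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_cost(order_qty, n_runs=2000):
    """Toy stochastic objective: average cost of an order quantity under random demand."""
    demand = rng.poisson(50, size=n_runs)                # simulated random demand
    overstock = np.maximum(order_qty - demand, 0) * 1.0  # holding cost per unsold unit
    shortage = np.maximum(demand - order_qty, 0) * 4.0   # penalty per unit of unmet demand
    return float(np.mean(overstock + shortage))          # statistic to optimise: the mean cost

# run one simulation per candidate decision value and keep the best statistic
candidates = range(40, 71)
best_q = min(candidates, key=simulate_cost)
```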
The history of global optimisation begins with the simulation-based optimisation research of the 1970s and the invention of the genetic algorithm by John Holland [1]. A genetic algorithm belongs to the class of population-based adaptive stochastic optimisation procedures, in which randomness is an inherent part of the search. The randomness may be present as noise in measurements, as Monte Carlo randomness in the search procedure, or both. The basic idea behind the genetic algorithm is to mimic a simple picture of Darwinian natural selection in order to find good solutions: the operations of mutation, selection and fitness evaluation are applied repeatedly.
A little later, in 1978, Aimo Törn introduced his clustering algorithm for global optimisation [2]. The method improves upon earlier local search algorithms that needed multiple starts from several points distributed over the whole optimisation region. The clustering algorithm avoids the drawback of multi-start (where many starting points are used), namely repeated convergence to the same minimum. It avoids this repeated determination of local minima in three steps, which may be applied iteratively: (1) sample points in the region of interest, (2) transform the sample to obtain points grouped around the local minima, and (3) use a clustering technique to group these points. Starting a single local optimisation from each cluster then determines the local minima and, thus, also the global minimum.
A little later, in 1983, another global optimisation algorithm, the simulated annealing method, was proposed by Kirkpatrick et al. to mimic the annealing process in metallurgy [3,4]. In the annealing process a metal in the molten state (at very high temperature) is slowly cooled so that the system stays close to thermodynamic equilibrium and settles into a low-energy, well-ordered state.
The method of differential evolution (DE), another global optimisation method, grew out of Kenneth Price's attempts to solve the Chebyshev polynomial fitting problem in 1996 [8]. The crucial idea behind DE is its scheme for generating trial parameter vectors. Initially, a population of points (p points in d-dimensional space) is generated and evaluated for fitness. Then, for each point p_i, three different points p_a, p_b and p_c are randomly chosen from the population and combined into a new point p_z. This p_z is subjected to a crossover with the current point p_i with a crossover probability cr, yielding a candidate point, say p_u, which is evaluated and, if found better than p_i, replaces it in the population.
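The following is a minimal sketch of such a DE scheme (the common DE/rand/1/bin variant); the differential weight F, the crossover rate cr, the population size and the sphere test function are illustrative assumptions, not values given in the text.

```python
import numpy as np

def differential_evolution(f, bounds, pop_size=20, F=0.8, cr=0.9, max_gen=200, seed=0):
    """Minimal DE/rand/1/bin sketch: mutate with three random points, crossover, greedy selection."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T          # bounds given as [(lo, hi), ...]
    d = len(lo)
    pop = lo + rng.random((pop_size, d)) * (hi - lo)    # initial population of points
    fit = np.array([f(p) for p in pop])                 # evaluate fitness of each point
    for _ in range(max_gen):
        for i in range(pop_size):
            # choose three distinct points p_a, p_b, p_c different from p_i
            a, b, c = rng.choice([k for k in range(pop_size) if k != i], 3, replace=False)
            pz = pop[a] + F * (pop[b] - pop[c])          # combine them into a trial direction
            # binomial crossover between p_z and the current point p_i with probability cr
            mask = rng.random(d) < cr
            mask[rng.integers(d)] = True                 # ensure at least one component from p_z
            pu = np.clip(np.where(mask, pz, pop[i]), lo, hi)
            # greedy selection: keep the candidate only if it is better
            fu = f(pu)
            if fu < fit[i]:
                pop[i], fit[i] = pu, fu
    best = fit.argmin()
    return pop[best], fit[best]

# usage: minimise the sphere function in 5 dimensions
best_x, best_f = differential_evolution(lambda x: float(np.sum(x**2)), [(-5, 5)] * 5)
```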
All population-based methods of global optimisation have a probabilistic character inherent in them. As a result, one cannot obtain certainty in their results unless they are permitted an indefinitely large number of search attempts: the larger the number of attempts, the greater the probability that they will find the global optimum, but even then certainty is not reached. Secondly, all of them adapt themselves to the surface on which they search for the global optimum. Each of these methods operates with a number of parameters that may be changed at will to make it more effective. This choice is often problem dependent, and for obvious reasons: a particular choice may be extremely effective in some cases but ineffective (or even counterproductive) in others. Additionally, there are trade-offs among those parameters. These features make all of these methods a subject of trial and error.
i = 0, 1, ..., k. The best-so-far point can be updated at each step k as follows:

x_best^(k+1) = x^(k)          if f(x^(k)) < f(x_best^(k))
x_best^(k+1) = x_best^(k)     otherwise

The position of particle i at time t is written as

X_i(t) = (X_i1(t), X_i2(t), ..., X_in(t))                                              (4.1)

and the velocity and position of each particle are updated as

V_ij(t+1) = w V_ij(t) + c1 r1 (P_ij - X_ij(t)) + c2 r2 (P_gj - X_ij(t))                (4.2)

X_ij(t+1) = X_ij(t) + V_ij(t+1)

The three terms on the right-hand side of (4.2) correspond to the inertia factor, the particle memory influence (self confidence) and the swarm influence (swarm confidence) acting on the current position.
These updates are applied for i = 1, 2, ..., M and j = 1, 2, ..., n. The parameters c1 and c2 are called acceleration coefficients and satisfy c1 + c2 <= 4 to guarantee the convergence of the particles. The parameter w is the inertia weight, introduced to accelerate the convergence speed of the PSO. The vector P_i = (P_i1, P_i2, ..., P_in) is the best previous position (the position giving the best fitness value) experienced by particle i and is denoted pbest. The vector P_g = (P_g1, P_g2, ..., P_gn) is the position of the best particle (with the best fitness value) among all the particles in the swarm and is denoted gbest. r1 and r2 are two different random numbers uniformly distributed in (0, 1). Empirical studies show that the PSO performs well when w varies linearly from 0.9 to 0.4 over the run.
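The sketch below implements the updates (4.1)-(4.2) for minimisation, with w decreasing linearly from 0.9 to 0.4 as noted above; the swarm size, the choice c1 = c2 = 2, the iteration budget and the Rosenbrock test function are illustrative assumptions rather than values from the text.

```python
import numpy as np

def pso(f, bounds, n_particles=30, c1=2.0, c2=2.0, max_iter=200, seed=0):
    """Sketch of the PSO updates in (4.2): inertia, self confidence, swarm confidence."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T
    n = len(lo)
    X = lo + rng.random((n_particles, n)) * (hi - lo)   # positions X_i(t)
    V = np.zeros((n_particles, n))                      # velocities V_i(t)
    P = X.copy()                                        # pbest positions P_i
    pbest = np.array([f(x) for x in X])
    g = P[pbest.argmin()].copy()                        # gbest position P_g
    for t in range(max_iter):
        w = 0.9 - 0.5 * t / max_iter                    # inertia weight: 0.9 -> 0.4 over the run
        r1, r2 = rng.random((n_particles, n)), rng.random((n_particles, n))
        V = w * V + c1 * r1 * (P - X) + c2 * r2 * (g - X)   # velocity update, eq. (4.2)
        X = np.clip(X + V, lo, hi)                          # position update
        fx = np.array([f(x) for x in X])
        better = fx < pbest                              # update particle memories (pbest)
        P[better], pbest[better] = X[better], fx[better]
        g = P[pbest.argmin()].copy()                     # update swarm best (gbest)
    return g, pbest.min()

# usage: minimise the Rosenbrock function in 2 dimensions
best_x, best_f = pso(lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2, [(-2, 2), (-2, 2)])
```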
converged to a single solution). Holland also presented a proof of convergence to the global optimum value for the case where the chromosomes are binary vectors.
The GA generates new solutions from existing ones using two operators, the so-called crossover and mutation operations. The crossover operator is the most important operator of the GA. In crossover, two chromosomes, called parents, are generally combined to form new chromosomes, called offspring. The parents are selected from the existing chromosomes in the population with a preference towards fitness, so that the offspring are expected to inherit good genes from their parents. By iteratively applying the crossover operator, genes of good chromosomes are expected to appear more frequently in the population, leading to convergence to an overall good solution. The mutation operator introduces random changes into the characteristics of chromosomes. Mutation is generally applied at the gene level of the chromosome. A new chromosome produced by mutation will not differ significantly from the original one; nevertheless, mutation plays a critical role in the GA. It reintroduces genetic diversity into the population and prevents the search from becoming trapped in a local optimum.
Reproduction involves the selection of chromosomes for the next generation. In the most general case, the fitness of an individual determines the probability of its survival into the next generation. Different selection procedures have been proposed in the literature; proportional selection, ranking selection and tournament selection are the most popular [24].
Selection of two parents x^i = (x_1^i, x_2^i, x_3^i, ..., x_n^i) and x^j = (x_1^j, x_2^j, x_3^j, ..., x_n^j) for crossover is performed by a nonlinear ranking selection procedure [19]. In this procedure, with the population ranked by fitness, the selection probability of the chromosome of rank i is

P(x^i) = q' (1 - q)^(i-1),   where   q' = q / (1 - (1 - q)^N),

N is the population size and q is the selection probability of the best chromosome. After the selection probability of each chromosome is determined, roulette wheel selection is adopted to select good chromosomes. This kind of selection procedure needs neither the individual chromosomes' raw fitness values nor a fitness-scaling transformation, which helps prevent premature convergence.
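A small sketch of this selection step is given below; the population, its fitness values and q = 0.1 are illustrative, and the normalisation q' = q / (1 - (1 - q)^N) with N the population size is the usual form assumed here.

```python
import numpy as np

def ranking_probabilities(fitness, q=0.1):
    """Nonlinear ranking: assign P(x^i) = q'(1-q)^(i-1) to the chromosome of rank i."""
    N = len(fitness)
    order = np.argsort(fitness)                 # rank chromosomes; smallest objective = rank 1
    q_prime = q / (1.0 - (1.0 - q) ** N)        # normalisation so the probabilities sum to 1
    probs = np.empty(N)
    probs[order] = q_prime * (1.0 - q) ** np.arange(N)   # (1-q)^(i-1) for i = 1..N
    return probs

def roulette_select(population, probs, rng):
    """Roulette wheel selection driven by the ranking probabilities."""
    idx = rng.choice(len(population), p=probs)
    return population[idx]

rng = np.random.default_rng(0)
pop = rng.uniform(-5, 5, size=(10, 3))          # 10 chromosomes of length 3 (illustrative)
probs = ranking_probabilities(np.array([float(np.sum(x**2)) for x in pop]))
parent1, parent2 = roulette_select(pop, probs, rng), roulette_select(pop, probs, rng)
```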
After the selection, the offspring x̃^i and x̃^j are created using the following scheme [20-22]:

x̃^i = a x^i + (1 - a) x^j
x̃^j = a x^j + (1 - a) x^i

where a is a random number between -0.5 and 1.5.
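A minimal sketch of this crossover follows; the clipping of offspring back to the box constraints is an added assumption, not part of the scheme above.

```python
import numpy as np

def arithmetic_crossover(x_i, x_j, lo, hi, rng):
    """Offspring as above: x~^i = a*x^i + (1-a)*x^j and x~^j = a*x^j + (1-a)*x^i."""
    a = rng.uniform(-0.5, 1.5)                  # random blending coefficient
    child_i = a * x_i + (1.0 - a) * x_j
    child_j = a * x_j + (1.0 - a) * x_i
    # keep offspring inside [lo, hi] (assumption for box-constrained problems)
    return np.clip(child_i, lo, hi), np.clip(child_j, lo, hi)
```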
Mutation
A widely used mutation operator in real-coded genetic algorithms is non-uniform mutation [23]. The mutation scheme of the algorithm is as follows. From a chromosome x^i = (x_1^i, x_2^i, x_3^i, ..., x_n^i), the mutated chromosome x^(i+1) = (x_1^(i+1), x_2^(i+1), x_3^(i+1), ..., x_n^(i+1)) is created component-wise as

x_j^(i+1) = x_j^i + Δ(i, x_j^u - x_j^i)    if r < 0.5
x_j^(i+1) = x_j^i - Δ(i, x_j^i - x_j^l)    otherwise
where i is the current generation number and r is a uniformly distributed random number in [0, 1]. x_j^u and x_j^l are the upper and lower bounds of the j-th component of the mutated chromosome respectively. The function Δ(i, y) given below takes values in the interval [0, y]:

Δ(i, y) = y (1 - u^((1 - i/MaxIter)^b))

where u is a uniformly distributed random number in the interval [0, 1], MaxIter is the maximum number of iterations and b is a parameter determining the strength of the mutation operator. In Romara we set b = 5.
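A sketch of this non-uniform mutation is shown below with b = 5 as above; the per-gene mutation probability p_mut and the value of MaxIter are illustrative assumptions.

```python
import numpy as np

def delta(gen, y, b, max_iter, rng):
    """Delta(i, y) = y*(1 - u**((1 - i/MaxIter)**b)); shrinks towards 0 as generations pass."""
    u = rng.random()
    return y * (1.0 - u ** ((1.0 - gen / max_iter) ** b))

def nonuniform_mutation(x, gen, lo, hi, rng, b=5, p_mut=0.1, max_iter=200):
    """Mutate each gene x_j up or down depending on a coin flip r, as in the scheme above."""
    x = x.copy()
    for j in range(len(x)):
        if rng.random() < p_mut:                 # per-gene mutation probability (assumption)
            if rng.random() < 0.5:               # r < 0.5: move towards the upper bound
                x[j] = x[j] + delta(gen, hi[j] - x[j], b, max_iter, rng)
            else:                                # otherwise: move towards the lower bound
                x[j] = x[j] - delta(gen, x[j] - lo[j], b, max_iter, rng)
    return x
```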
Local Technique
This technique helps to concentrate the points in the region S around the global minimum [21]. The procedure of the local technique is as follows.
(1) For each j = 1, 2, ..., n, select a random number in [-0.5, 1.5]; here x_j^best denotes the j-th component of the best chromosome x^best.
(3) Replace the worst point x^worst in S with the new point x̃ if f(x̃) < f(x^worst).
The general constrained optimisation problem can be stated as

min f(x),   x ∈ R^n

subject to
g_i(x) ≤ 0,   i = 1, 2, ..., m
h_j(x) = 0,   j = 1, 2, ..., k
a_i ≤ x_i ≤ b_i,   1 ≤ i ≤ n
x = (x_1, x_2, ..., x_n)                                                               (*)
where f(x) is the objective function, g_i(x) and h_j(x) are the inequality and equality constraints respectively, and a_i and b_i are the lower and upper bounds of the search space for x_i. This formulation of the constraints is not restrictive, since an inequality constraint of the form g_i(x) ≥ 0 can also be represented as -g_i(x) ≤ 0, and an equality constraint h_j(x) = 0 is equivalent to the two inequality constraints h_j(x) ≤ 0 and -h_j(x) ≤ 0. The most common approach to solving constrained optimisation problems is the use of a penalty function. The purpose of the penalty function is to transform the constrained nonlinear programming (CNLP) problem into an unconstrained NLP (UNLP) problem by building a single objective function that penalises constraint violations. The new single objective function can then be minimised with an unconstrained optimisation algorithm; this is the main reason for the popularity of the penalty function approach. The drawback of this approach is the difficulty of selecting suitable penalty values: if the penalty values are high, the minimisation algorithms tend to become trapped in local minima, and if the penalty values are low, they can hardly detect feasible optimal solutions.
Penalty Function Method
Various works in the literature have addressed this issue. One modification, recommended in [14], is to change the penalty values dynamically as the iterations progress. The penalty function is generally defined as follows [14]:

F(x) = f(x) + h(t) H(x),   x ∈ R^n

where f(x) is the original objective function of the CNLP problem, h(t) is a dynamically modified penalty value, t is the algorithm's current iteration number and H(x) is a penalty factor defined as
H(x) = Σ_{i=1..m} θ(q_i(x)) q_i(x)^γ(q_i(x)),   with   q_i(x) = max{0, g_i(x)}

where the g_i(x) are the constraints described in (*). The functions h(·), θ(·) and γ(·) are problem dependent. According to [14], γ(q_i(x)) = 1 if q_i(x) < 1 and γ(q_i(x)) = 2 otherwise, while θ(q_i(x)) = 10 if q_i(x) < 0.001, θ(q_i(x)) = 20 if q_i(x) < 0.1, θ(q_i(x)) = 100 if q_i(x) < 1, and θ(q_i(x)) = 300 otherwise.
The penalty value itself grows with the iteration number and is taken as h(t) = t√t.
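The sketch below assembles the penalty function F(x) = f(x) + h(t) H(x) with the staged θ and γ assignments given above and h(t) = t√t; the example objective and constraint at the end are illustrative, not from the text.

```python
import numpy as np

def theta(q):
    """Multi-stage assignment for the penalty weight (values as in [14])."""
    if q < 0.001:
        return 10.0
    if q < 0.1:
        return 20.0
    if q < 1.0:
        return 100.0
    return 300.0

def gamma(q):
    """Power of the penalty term: 1 for small violations, 2 otherwise."""
    return 1.0 if q < 1.0 else 2.0

def penalised_objective(f, constraints, x, t):
    """F(x) = f(x) + h(t)*H(x) with h(t) = t*sqrt(t) and q_i(x) = max(0, g_i(x))."""
    q = np.array([max(0.0, g(x)) for g in constraints])   # constraint violations
    H = sum(theta(qi) * qi ** gamma(qi) for qi in q)
    return f(x) + t * np.sqrt(t) * H

# usage (illustrative): minimise x0^2 + x1^2 subject to x0 + x1 >= 1, i.e. g(x) = 1 - x0 - x1 <= 0
f = lambda x: x[0] ** 2 + x[1] ** 2
g = [lambda x: 1.0 - x[0] - x[1]]
value = penalised_objective(f, g, np.array([0.2, 0.3]), t=10)
```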
V consists of the vectors of inequality constraints g and equality constraints h for the problem. Compute the Jacobian ∇_x V, whose Moore-Penrose inverse is to be used; the Moore-Penrose inverse is defined as ∇_x V^+ = ∇_x V^T (∇_x V ∇_x V^T)^(-1).
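As a quick numerical check of this definition (which applies when the matrix has full row rank), the formula can be compared against numpy's built-in pseudoinverse; the matrix used here is illustrative.

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])                        # illustrative 2x3 Jacobian with full row rank
pinv_formula = A.T @ np.linalg.inv(A @ A.T)            # A^T (A A^T)^{-1}
assert np.allclose(pinv_formula, np.linalg.pinv(A))    # matches numpy's Moore-Penrose inverse
```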
14. Jun Sun et al., Using Quantum-Behaved Particle Swarm Optimization Algorithm to Solve Non-linear Programming Problems, International Journal of Computer Mathematics, 84(2), 2007, 261-272.
15. Leandro dos Santos Coelho, A Quantum Particle Swarm Optimizer with Chaotic Mutation Operator, Chaos, Solitons and Fractals, 37, 2008, 1409-1418.
16. Maolong Xi et al., Quantum-behaved Particle Swarm Optimization with Elitist Mean Best Position, Complex Systems and Applications - Modelling, Control and Simulations, 14(S2), 2007, 1643-1647.
17. Leandro dos Santos Coelho, Gaussian Quantum-behaved Particle Swarm