
PSO Algorithm Combined with Neural Network Training Study

Xiaorong Cheng, Dong Wang, Kun Xie

School of Computer Science and Technology North China Electric Power University Baoding, China
Abstract: Neural networks have usually been trained beforehand as multilayer feed-forward networks, but such training may fall into a local minimum. In this article, particle swarm optimization is improved so that it can solve optimization problems over discrete variables. At the same time, the crossover operation of the genetic algorithm is introduced to form a hybrid particle swarm optimization. Combining this with the neural network method, the weight training of the neural network is transformed into a function optimization, with the error function used as the definition of particle fitness. Finally, in information filtering, the efficiency of multilayer (BP) training and of the particle swarm optimization is compared.

Keywords: neural network; multilayer feed-forward neural network; hybrid particle swarm optimization; F1 test

Jujie Zhang
Marketing Department Henan Puyang Power Supply Company Puyang, China



The multilayer feed-forward neural network is a common neural network structure model. It is a hierarchical network; Fig. 1 sketches the construction of a three-layer (or multilayer) feed-forward neural network, which includes an input layer (a single layer), hidden layers (one or more) and an output layer (also a single layer). As the model was studied and evolved, a new training algorithm for multilayer feed-forward neural networks was gradually established, named the error back-propagation method, i.e. the BP algorithm [9]. Its chief principle is as follows. First, the training sample is applied to the neural network; the neuron activation values travel through the intermediate layers from the input layer to the output layer, and the final output of the output layer is the network's response to the input. Then the connection weights are modified layer by layer, from the output layer back through the intermediate layers to the input layer, so as to decrease the offset between the expected and the actual output. The essence of the BP algorithm is to take the quadratic sum of the network error (1) as the objective function and find the point where the objective function reaches its minimum value



The BP network has been widely used in many fields, for it has the abilities of self-organization, self-learning and self-adaptation; its principle is simple and it is easy to implement. However, the BP network also has limitations: low learning efficiency, low convergence speed and a tendency to fall into local minima; in particular, its capacity for learning and generalization is overly sensitive to the selection of the network structure [1]. Many improvements of the learning algorithm have been proposed to make up for these defects [2-4]. J. Kennedy and R. Eberhart first used PSO as a network training algorithm [5]. Later, Van den Bergh proposed initializing neural network weights by PSO [6]. Reference [7] trained a product-unit neural network using cooperative particle swarm optimizers to classify models. Lovbjerg, Rasmussen and Krink imported the crossover operation of evolutionary computation into PSO, forming the HPSO model [8]. In this paper, the author first improves on the defects of the basic particle swarm optimization; then the selection and crossover operations of the genetic algorithm are fitted into PSO to form a hybrid particle swarm optimization; finally, the computation results of the BP algorithm and PSO are analysed and compared on a learning model in information filtering.

(Input layer, hidden layer, output layer)

Figure 1. Multilayer feed-forward neural network

978-1-4244-4507-3/09/$25.00 ©2009 IEEE

by the gradient method (2). Suppose that N is a feed-forward neural network and K a given sample set. Iterative learning is carried out for sample p of K. Suppose that w(t) is the network weight (or threshold) matrix at iteration t, and E(w(t)) the corresponding quadratic error sum. The optimum value of w is searched for through gradient iteration:



$$E_P = \frac{1}{2}\sum_{j}\left(d_{Pj} - o_{Pj}\right)^2, \qquad (1)$$

$$w(t+1) = w(t) - \eta\,\frac{\partial E(w)}{\partial w}, \qquad (2)$$

with $\frac{\partial E}{\partial w^{(l)}_{ij}} = \delta^{(l)}_i\, o^{(l-1)}_j$ and $\delta^{(l)}_i = \big(\sum_k \delta^{(l+1)}_k w^{(l+1)}_{ki}\big)\, f'\big(I^{(l)}_i\big)$, where $I^{(l)}_i$ represents the net input of neuron i in layer l, and $o^{(l-1)}_j$ the output of neuron j in layer (l-1).

PSO, advanced by Kennedy and Eberhart in 1995, is an evolutionary computing technique based on swarm intelligence, inspired by the behaviour laws of bird flocks, fish schools and human society. The basic PSO is as follows:

$$v^{k+1}_{id} = v^{k}_{id} + c_1 r_1\,(p^{k}_{id} - x^{k}_{id}) + c_2 r_2\,(p^{k}_{gd} - x^{k}_{id}), \qquad (7)$$

$$x^{k+1}_{id} = x^{k}_{id} + v^{k+1}_{id}. \qquad (8)$$
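As an illustration, the basic PSO update (7)-(8) can be sketched in Python. This is a minimal sketch, not the paper's implementation; the parameter values, the initial search range [-5, 5] and the function name `pso` are our assumptions:

```python
import random

def pso(f, dim, n_particles=20, iters=300, c1=2.0, c2=2.0, vmax=1.0):
    """Minimize f over dim continuous variables with the basic PSO update (7)-(8)."""
    xs = [[random.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in xs]                      # individual extreme values p_id
    pbest_f = [f(x) for x in xs]
    g = min(pbest, key=f)[:]                        # global extreme value p_gd
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                v = (vs[i][d]
                     + c1 * r1 * (pbest[i][d] - xs[i][d])   # cognitive term, cf. (7)
                     + c2 * r2 * (g[d] - xs[i][d]))         # social term, cf. (7)
                vs[i][d] = max(-vmax, min(vmax, v))         # v limited to [-vmax, vmax]
                xs[i][d] += vs[i][d]                        # position update, cf. (8)
            fx = f(xs[i])
            if fx < pbest_f[i]:                             # update individual extreme value
                pbest_f[i], pbest[i] = fx, xs[i][:]
                if fx < f(g):                               # update global extreme value
                    g = xs[i][:]
    return g, f(g)
```

For instance, minimizing the 2-D sphere function sum of x squared drives the swarm towards the origin.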

In formulas (7) and (8), the learning factors $c_1$ and $c_2$ are nonnegative constants; $r_1$ and $r_2$ are random numbers in $[0,1]$; $v \in [-v_{max}, v_{max}]$, where $v_{max}$ is a positive constant, a parameter of the particle swarm optimization used to limit the velocity and set by the user. The particle position $x$ is a solution of the problem; the velocity $v$ is the amount by which the position $x$ is to be amended; $p_{id}$ and $p_{gd}$ represent the individual extreme value and the global extreme value respectively.

The basic PSO has weak search capability. Many test functions are multi-peaked with complex shapes, and PSO has not been strictly proved, in theory, to converge to the global extremum for arbitrary functions, so for complex test functions it is difficult to obtain satisfactory results. In addition, if the number of particles is selected inappropriately, the diversity of the particles easily disappears during the calculation, leading to premature convergence, so that the algorithm fails to converge to the global extreme point; its optimization of integer variables is also far from ideal.

In response to these problems, we improve the basic PSO appropriately. In formulas (7) and (8), $x$ and $v$ are restricted to integers and $c_1$ and $c_2$ are positive integers. For the random factors $r_1$ and $r_2$ we operate with a new method of random valuation: the second and third terms of (7) are chosen as integer values from the intervals $[0, c_1(p^k_{id} - x^k_{id})]$ (or $[c_1(p^k_{id} - x^k_{id}), 0]$) and $[0, c_2(p^k_{gd} - x^k_{id})]$ (or $[c_2(p^k_{gd} - x^k_{id}), 0]$).

Similar to a biological neural network, the weights of the network are adjusted whenever a new training sample is given, and the adjustment is made in reverse, from the output layer to the input layer. The detailed computation is as follows. The weight between the hidden layer and the output layer is adjusted by

$$V_{j,n}(i+1) = V_{j,n}(i) + \eta\,\delta_n O_n, \qquad (3)$$

$$\delta_n = O_n (1 - O_n)\,(d_{p,n} - O_{p,n}). \qquad (4)$$

The weight between the input layer and the hidden layer is adjusted by

$$W_{i,j}(k+1) = W_{i,j}(k) + \eta\,O_n \delta_n, \qquad (5)$$

$$\delta_n = O_n (1 - O_n) \sum_j \delta_j W_{i,j}. \qquad (6)$$

Here $\eta$ is the learning rate, i.e. the gain coefficient of the weight update, and $\alpha$ is the inertial (momentum) coefficient used to adjust the convergence speed of learning; usually both $\eta$ and $\alpha$ are numbers between 0 and 1. The BP algorithm transforms the problem of matching the training-sample output to the target output into a nonlinear optimization problem, obtaining the weights through gradient-descent iteration. It can be seen from the formulas that the bigger $\eta$ is, the faster the convergence; but if $\eta$ is oversized, the iteration goes into oscillation. With a stationary learning rate, in flat regions of the error surface the partial derivative of the error with respect to the weights is tiny, which leads to small adjustment amplitudes and slow convergence. The local-minimum problem also arises from the low convergence speed: after training has been executed for a certain time, the decrease rate of the overall network error becomes very slow, or even zero, although the offset between the expected and the actual output is still large.
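A minimal sketch of the BP updates (3)-(6) for a single-hidden-layer sigmoid network; the function names, network size, learning rate and training scheme below are our assumptions, not the paper's:

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_bp(samples, n_hidden=3, eta=0.5, epochs=2000, seed=0):
    """Train a single-hidden-layer sigmoid network with the BP updates:
    output delta cf. (4), hidden delta cf. (6), weight steps cf. (3) and (5)."""
    rnd = random.Random(seed)
    n_in = len(samples[0][0])
    W = [[rnd.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]  # input -> hidden
    V = [rnd.uniform(-1, 1) for _ in range(n_hidden)]                         # hidden -> output
    for _ in range(epochs):
        for x, d in samples:
            # forward pass
            h = [sigmoid(sum(W[j][i] * x[i] for i in range(n_in))) for j in range(n_hidden)]
            o = sigmoid(sum(V[j] * h[j] for j in range(n_hidden)))
            # backward pass: deltas travel from the output layer to the input layer
            delta_o = o * (1 - o) * (d - o)                     # cf. formula (4)
            for j in range(n_hidden):
                delta_h = h[j] * (1 - h[j]) * delta_o * V[j]    # cf. formula (6)
                V[j] += eta * delta_o * h[j]                    # cf. formula (3)
                for i in range(n_in):
                    W[j][i] += eta * delta_h * x[i]             # cf. formula (5)
    def predict(x):
        h = [sigmoid(sum(W[j][i] * x[i] for i in range(n_in))) for j in range(n_hidden)]
        return sigmoid(sum(V[j] * h[j] for j in range(n_hidden)))
    return predict
```

For example, trained on the AND function (with a constant bias input), the network drives the output towards 1 only for the input (1, 1).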

For an integer $N$ with $N \neq 0$, an integer value is randomly selected from the interval $[0, N]$ (or $[N, 0]$ when $N < 0$); denote this random-value operation by $rand(N)$. The iterative formulas of the new PSO can then be expressed as

$$v^{k+1}_{id} = v^{k}_{id} + rand\big(c_1 (p^{k}_{id} - x^{k}_{id})\big) + rand\big(c_2 (p^{k}_{gd} - x^{k}_{id})\big), \qquad (9)$$

$$x^{k+1}_{id} = x^{k}_{id} + v^{k+1}_{id}. \qquad (10)$$

Here position $x$ and velocity $v$ are integers, $c_1$ and $c_2$ are positive constants, and $|v_{id}| < v_{max,d}$, where the constant $v_{max,d}$ is the $d$-dimensional velocity limit.

PSGA is a hybrid PSO that mainly improves on the individual data structure and the crossover approach of the genetic algorithm. Following the individual data structure of PSO, PSGA adds individual experience information to the data structure, so that each individual's optimal fitness and its code are kept during the search. In PSGA, the crossover of parent individuals does not simply make the two individuals

exchange code segments directly: a location is selected to cross the current code $x_j$ of individual $j$ with the experience-optimal code $p_i$ of individual $i$, generating an intermediate code $(p_i x_j)$. A crossing bit is then randomly selected between the intermediate code and the current code $x_i$ of individual $i$ (it cannot be the same as the previous cross location), producing the offspring of individual $i$; this offspring retains the experience code $p_i$. The offspring of individual $j$ is generated symmetrically, as shown in Fig. 2. If the fitness of the new-generation code is better than the fitness of the experience code, the individual's experience code is updated.

IV. EXPERIMENT TEST

$$F_1 = \frac{2 \times Accuracy \times Integrity}{Accuracy + Integrity}.$$


From the formula, the range of F1 is from 0 to 1. When none of the documents involved in the classification is classified correctly, F1 = 0; when all of them are classified correctly, F1 = 1. Only when both accuracy and integrity are high can F1 attain a high value, so F1 reflects the two evaluation indexes of accuracy and integrity together.

In order to verify the effectiveness of the proposed hybrid particle swarm algorithm in computing the weights, the BP algorithm is compared with PSGA experimentally on the same training set; the corresponding precision and recall rates are shown in Table I. In the PSGA training experiments, the original keywords, after removal of stop words and some low-frequency and high-frequency words, are trained by the particle swarm algorithm. After around 270 iterations of training, the average fitness and the best fitness no longer change and the algorithm stops training. Decoding the best-fitness individual and testing it against the BP-trained model gives the F1 values shown in Fig. 2. From the above model, PSGA is better at optimizing the learning training.

In information filtering, the quality of the keywords obtained by learning training affects the efficiency of information filtering. Applying PSGA to the learning training not only promotes the efficiency of training, but also gives a better promotion of accuracy and integrity.

V. CONCLUSION

Through the comparison of the BP and PSGA algorithms, we know that PSGA has a clear advantage in the training of neural networks. The efficiency of the basic particle swarm optimization is promoted obviously when it is improved for discrete-variable problems, and it converges fast. Particle swarm optimization is a simple heuristic algorithm; compared with other bionic optimization algorithms, it needs less coding and fewer parameters. However, its mathematical basis is relatively weak, and at present it lacks a profound and universally significant theoretical analysis.
TABLE I. ACCURACY AND INTEGRITY

                    Accuracy          Integrity
Category           BP      PSGA     BP      PSGA
Political         0.817   0.835    0.795   0.803
Economy           0.862   0.871    0.855   0.878
Education         0.891   0.893    0.850   0.881
Entertainment     0.861   0.872    0.869   0.875

Neural networks are often used for learning training in information filtering. In this experiment, 500 documents were downloaded involving politics, education, economy, entertainment and so on. Keywords quoted in these documents are required to express the central idea of each paragraph, such as crisis, finance, independence, length of schooling and so on. Each particle stands for a keyword, and it adjusts its own state through three aspects: the particle's current position, its experience position and its neighbours' positions. The idea of the crossover operation in PSGA also inherits the information-interaction model of particle swarm optimization. If individuals $i$ and $j$ need crossover, the operation is defined with $\otimes_1$ and $\otimes_2$ standing for crossover at the random positions $\theta_1$ and $\theta_2$ (random natural numbers) and $p_i$ standing for the experience code of individual $i$. The process of crossover is shown in the formulas:

$$X_i = x_i \otimes_1 (p_i \otimes_2 x_j), \qquad (11)$$

$$X_j = x_j \otimes_1 (p_j \otimes_2 x_i). \qquad (12)$$
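One plausible reading of the PSGA crossover (11)-(12) can be sketched as follows, assuming integer-string codes of equal length and single-point crossover at the two distinct random positions; the concrete operator below is our assumption, since the text does not fully specify it:

```python
import random

def cross(a, b, pos):
    """Single-point crossover: keep a[:pos], take the tail of b."""
    return a[:pos] + b[pos:]

def psga_offspring(xi, xj, pi, pj, rnd=random):
    """Sketch of the PSGA crossover (11)-(12): each parent's current code is
    crossed with an intermediate code built from its own experience code and
    the partner's current code, at two distinct random positions."""
    n = len(xi)                                   # codes assumed equal length, n >= 3
    t2 = rnd.randrange(1, n)                      # crossing position for the intermediate code
    t1 = rnd.choice([p for p in range(1, n) if p != t2])  # second position, t1 != t2
    mid_i = cross(pi, xj, t2)                     # intermediate code (p_i x_j)
    Xi = cross(xi, mid_i, t1)                     # offspring of individual i, cf. (11)
    mid_j = cross(pj, xi, t2)                     # symmetric intermediate code for j
    Xj = cross(xj, mid_j, t1)                     # offspring of individual j, cf. (12)
    return Xi, Xj
```

The offspring thus mixes three sources of information: the individual's current code, its experience code and its partner's code.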

In these formulas, $X_i$ and $X_j$ are the genetic codes of individuals $i$ and $j$ after the crossover operation. $X_i$ and $X_j$ replace $x_i$ and $x_j$ respectively, and the fitness of every individual is calculated once all particles in the colony have completed the crossover operation. If the current fitness of particle $i$ is better than the fitness of its experience code, the experience code $p_i$ is updated. In formula (11), to generate the progeny $X_i$ of individual $i$, the code $x_j$ is first crossed with $p_i$ of individual $i$ at position $\theta_2$, generating two offspring, and one of them is chosen as the intermediate code. This crossover model ensures that an individual can use three kinds of information to modify its chromosome code.

The keyword test outcomes are verified with accuracy and integrity:

$$Accuracy = \frac{\text{correctly classified documents}}{\text{documents the system actually identified}}, \qquad (13)$$

$$Integrity = \frac{\text{correctly classified documents}}{\text{documents the system should have identified}}. \qquad (14)$$

The F1 test value is introduced to consider accuracy and integrity together, following [11][12].
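The evaluation measures (13), (14) and the F1 value can be computed as follows (the function and parameter names are ours):

```python
def accuracy(correct, identified):
    """Formula (13): correctly classified documents / documents the system identified."""
    return correct / identified

def integrity(correct, should_identify):
    """Formula (14): correctly classified documents / documents the system should identify."""
    return correct / should_identify

def f1(acc, integ):
    """F1 = 2 * accuracy * integrity / (accuracy + integrity); F1 lies in [0, 1]."""
    if acc + integ == 0:
        return 0.0
    return 2 * acc * integ / (acc + integ)
```

For example, the Political row of Table I gives BP an F1 of about 0.806 (accuracy 0.817, integrity 0.795).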

REFERENCES

[1] Li Zuoyong, Peng Lihong. An exploration of the uncertainty relation satisfied by BP network learning ability and generalization ability[J]. Science in China Ser. F Information Sciences, 2004, 47(2): pp. 137-150.
[2] Chen Guo. Analysis of influence factors for forecasting precision of artificial neural network model and its optimizing[J]. Pattern Recognition and Artificial Intelligence, 2005, 18(5): pp. 528-534. (in Chinese)
[3] Li Zuoyong, Yi Yongzhi. Quantitative relation between learning ability and generalization ability of BP neural network[J]. Acta Electronica Sinica, 2003, 31(9): pp. 1341-1344. (in Chinese)
[4] Pan Hao, Wang Xiaoyong, Chen Qiong, et al. Application of BP neural network based on genetic algorithm[J]. Computer Application, 2005, 25(12): pp. 2777-2779. (in Chinese)
[5] Kennedy J, Eberhart R C. Particle swarm optimization[C]. Proc. IEEE Int'l Conf. on Neural Networks, Piscataway, NJ, 1995.
[6] F. van den Bergh. Particle Swarm Weight Initialization in Multi-layer Perceptron Artificial Neural Networks, accepted for ICAI, Durban, South Africa, 1999.
[7] Van den Bergh F, Engelbrecht A P. Training Product Unit Networks Using Cooperative Particle Swarm Optimizers[C]. Proc. of the Third Genetic and Evolutionary Computation Conference, Anchorage, Alaska, USA, 1998.
[8] Lovbjerg M, Rasmussen T K, Krink T. Hybrid Particle Swarm Optimization with Breeding and Subpopulations[R]. IEEE International Conference on Evolutionary Computation, San Diego, 2000: pp. 561-567.
[9] Rumelhart D E, McClelland J L (1986). Parallel Distributed Processing: Explorations in the Microstructures of Cognition. Bradford Books.
[10] Wang Wenjian. Optimize of BP neural network model[J]. Computer Engineering and Design, 2000, (6): pp. 21-23.
[11] Liu Jingjing. The Improvement and Applications upon the Particle Swarm Optimization[D]. Wuhan University of Technology.
[12] X. D. Duan, C. R. Wang. A Hybrid Algorithm Inspired by the Idea of Particle Swarm Algorithm, Congress on Evolutionary Computation (CEC), 2005.

Figure 2. F1 test of PSGA and of BP