
INTERNATIONAL JOURNAL OF IMAGING SCIENCE AND ENGINEERING (IJISE)


Modified Standard Backpropagation Algorithm with Optimum Initialization for Feedforward Neural Networks
1

V.V. Joseph Rajapandian (1), N. Gunaseeli (2)
(1) Lecturer, Dept. of Computer Science, The American College, Madurai. vvjosephraj@yahoo.com
(2) Lecturer, Dept. of MCA, K.L.N. College of Engineering. parkavithai@yahoo.co.in

Abstract: A Modified Backpropagation (MBP) algorithm with optimum initialization, an approach for the learning process of multilayer feedforward neural networks, is proposed. A common complaint about the Standard Backpropagation (SBP) algorithm is that it is very slow: even simple problems may take hundreds of iterations to converge. The SBP algorithm reduces only nonlinear errors, and much work has therefore been done in search of faster methods. One such approach is a modified form of the SBP algorithm, which minimizes the sum of the squares of the linear and nonlinear errors for all output units, leading to a more efficient training process. Proper initialization also plays a key role in building robust neural networks; the optimum initialization method is therefore used for the weights, ensuring that the outputs of the neurons lie in the active region and that the range of the activation function is fully utilized. The proposed method is implemented on the 2-bit parity (XOR), 4-bit parity checker and encoder problems and produces good results.

Index Terms: linear error, modified standard backpropagation algorithm (MBP), neural network (NN), nonlinear error, standard backpropagation (SBP), optimum initialization.

I. INTRODUCTION

Neural network technology offers impressive capabilities: neural networks have been proposed for tasks ranging from battlefield management to minding the baby. These applications mostly use the Standard Backpropagation (SBP) network, the most successful of the current algorithms. Although successfully used in many real-world applications, the SBP algorithm suffers from a number of shortcomings: many iterations are required to train even a small network on a simple problem. Much work has therefore been done in search of faster methods [3]-[6]. A new alternative algorithm, considerably faster than the SBP algorithm, was introduced in [1]. This algorithm uses a modified form of the conventional SBP algorithm: it minimizes the sum of the squares of the linear and nonlinear errors for all output units and for the current pattern, with the quadratic linear error weighted by a coefficient λ. To find the linear output error, the desired output summation is calculated by inverting the output-layer nonlinearity. By differentiating the optimization criterion with respect to the hidden learning coefficients, estimates of the linear and nonlinear hidden errors can be determined. These estimates, along with the input vectors of the respective nodes, are used to produce an updated set of weights using the modified form of the SBP algorithm. Training patterns are run through the network until convergence is reached.

Feedforward neural networks are usually too slow for most applications. One approach to speeding up training is to estimate optimal initial weights [7]-[10]. The proposed MBP algorithm with optimum initialization therefore aims at determining the optimal bias and magnitude of the initial weight vectors based on multidimensional geometry [2]. This method ensures that the outputs of the neurons lie in the active region and that the range of the activation function is fully utilized.

A. Limitations in the Previous Work

In the previous work [1], the MBP algorithm takes both linear and nonlinear errors into account and thus improves the convergence rate, but a proper initialization method is not given. Proper initialization always plays a key role in robust neural networks. Therefore, Modified Standard Backpropagation with optimum initialization is proposed.

B. Organization of the Paper

This paper is organized as follows. Section I introduces the MBP algorithm, explains the need for optimum initialization, and notes the limitations of the previous work [1] which are rectified here. Section II briefly explains the MBP algorithm [1] and the optimum initialization algorithm [2]. Section III presents the proposed algorithm. Section IV discusses the experimental results, with the resulting tables and graphs. Section V concludes the paper and outlines future work.

II. (A) MBP ALGORITHM

MBP stands for the Modified form of Back-Propagation; it is a local adaptive learning scheme performing supervised batch learning in multilayer perceptrons. The learning rule is first derived for a single neuron. Let us develop the algorithm for a neuron j located in a given layer s of the network. For the chosen pattern, assume that for this neuron the desired nonlinear
IJISE,GA,USA,ISSN:1934-9955,VOL.1,NO.3, JULY 2007



output is known. Then the desired summation signal is directly calculated by inverting the nonlinearity. The linear and nonlinear current outputs of the neuron are, respectively,

u_j^{[s]} = \sum_{i=0}^{n_{s-1}} w_{ji}^{[s]} y_i^{[s-1]}    (1)

y_j^{[s]} = f(u_j^{[s]}) = \frac{1}{1 + e^{-u_j^{[s]}}}    (2)

noting that there are n_{s-1} + 1 inputs to the jth neuron. The nonlinear and linear errors are, respectively,

e_{1j}^{[s]} = d_j^{[s]} - y_j^{[s]}    (3)

e_{2j}^{[s]} = ld_j^{[s]} - u_j^{[s]}    (4)

where ld_j^{[s]} is given by

ld_j^{[s]} = f^{-1}(d_j^{[s]})    (5)

Let us then define the new optimization criterion E_p for the jth neuron and the current pattern p,

E_p = (e_{1j}^{[s]})^2 + \lambda (e_{2j}^{[s]})^2    (6)

where \lambda is a weighting coefficient. The weight update rule is obtained by applying gradient descent to E_p,

\Delta w_{ji}^{[s]} = -\mu \, \partial E_p / \partial w_{ji}^{[s]} = \mu \left[ e_{1j}^{[s]} f'(u_j^{[s]}) + \lambda e_{2j}^{[s]} \right] y_i^{[s-1]}    (7)

where \mu is the learning rate (the constant factor 2 is absorbed into \mu).
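To make the update rule concrete, the per-neuron quantities in (1)-(7) can be sketched in Python. This is a minimal sketch rather than the authors' implementation; the function names, the default values of λ and μ, and the clipping of the target before inverting the sigmoid are our own assumptions.

```python
import math

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def inv_sigmoid(d, eps=1e-6):
    # f^-1(d) of equation (5); clip the target away from 0/1 so the inverse exists
    d = min(max(d, eps), 1.0 - eps)
    return math.log(d / (1.0 - d))

def mbp_update(w, y_prev, d, lam=0.5, mu=0.1):
    """One MBP update for a single neuron (equations (1)-(7)).

    w      : weight list, w[0] being the bias weight
    y_prev : outputs of the previous layer, with y_prev[0] = 1 (bias input)
    d      : desired nonlinear output for the current pattern
    """
    u = sum(wi * yi for wi, yi in zip(w, y_prev))   # summation output, (1)
    y = sigmoid(u)                                  # nonlinear output, (2)
    e1 = d - y                                      # nonlinear error, (3)
    e2 = inv_sigmoid(d) - u                         # linear error, (4)-(5)
    fprime = y * (1.0 - y)                          # sigmoid derivative f'(u)
    # (7): gradient step on Ep = e1^2 + lam * e2^2, factor 2 absorbed in mu
    return [wi + mu * (e1 * fprime + lam * e2) * yi
            for wi, yi in zip(w, y_prev)]
```

Repeated application of `mbp_update` drives both the linear and the nonlinear error of the neuron down; the λ term acts directly on the pre-activation u, which is what distinguishes the rule from the SBP update.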

Compared with the SBP updating equations, this algorithm differs by the term \mu \lambda e_{2j}^{[s]} y_i^{[s-1]}. The experimental results show that this algorithm converges faster than the SBP algorithm for the chosen activation function and for an appropriate choice of \lambda.

II. (B) AN OPTIMUM WEIGHT INITIALIZATION METHOD FOR IMPROVING TRAINING SPEED IN FEEDFORWARD NEURAL NETWORKS

Proper selection of the initial weights is a key factor in training the network. This section describes the weight initialization method.

Consider a single-hidden-layer perceptron. Let x = [x_1, x_2, ..., x_N] be the input vector, H = [h_1, h_2, ..., h_H] the values of the hidden nodes, and O = [o_1, o_2, ..., o_M] the output vector. The weight matrix between the input and hidden layers is an N x H matrix W = (w_{ij}); similarly, the weight matrix between the hidden and output layers is an H x M matrix V = (v_{ij}). The training patterns are denoted x^{(s)}, s = 1, 2, ..., p. The weighted sum a_j of the jth hidden neuron can be represented as

a_j = w_{0j} + \sum_{i=1}^{N} w_{ij} x_i^{(s)}    (8)

where w_{0j} is the threshold of the jth hidden node. According to Yam and Chow [2], the weights w_{ij} are assumed to be independent, identically distributed uniform random variables on (-w_{max}, w_{max}), the range within which the entries of the weight matrix lie. For this investigation the conventional sigmoid function is used,

h_j = f(a_j) = \frac{1}{1 + e^{-a_j}}    (9)

The active region is taken to be where the derivative of the sigmoid exceeds one-twentieth of its maximum value, i.e.,

|a_j| \le 4.36    (10)

If P(a_j) is regarded as a hyperplane, then the distance between the hyperplanes P(-4.36) and P(4.36) should be greater than or equal to the maximum possible distance between two points of the input space, which is given by

D_{in} = \sqrt{\sum_{i=1}^{N} [\max(x_i) - \min(x_i)]^2}    (11)

The distance between the two hyperplanes P(-4.36) and P(4.36) is

8.72 / \|w_j\|    (12)

where \|w_j\| = \sqrt{\sum_i w_{ij}^2}. Equating (11) and (12), the length of the weight vector is obtained as

\|w_j\| = 8.72 / D_{in}    (13)

Since the w_{ij} are i.i.d. uniform random variables on (-w_{max}, w_{max}), the length of the weight vector is approximated by

\|w_j\| \approx \sqrt{N \, E[w^2]}    (14)

where E[w^2] = w_{max}^2 / 3 is the second moment of the weights between the input and hidden layers. Thus the maximum magnitude of the weights is

w_{max} = \frac{8.72}{D_{in}} \sqrt{3/N}    (15)

The threshold weight is evaluated by the formula

w_{0j} = -\sum_{i=1}^{N} c_i w_{ij}    (16)

where c_i denotes the centre of the range of the ith input. Similarly, the v_{max} value is obtained as

v_{max} = 15.10 / \sqrt{H}    (17)

Using the boundary values w_{max} and v_{max} and the threshold values obtained above, the remaining weights are initialized at random within these limits.

III. PROPOSED ALGORITHM

Step 1: Initialization: From layer s = 1 to L, set all y_0^{[s-1]} to values different from 0 (0.5, for example). Obtain the values of w_{max} and v_{max} using formulas (15) and (17). Initialize all the weights at random values within the limits given by w_{max} and v_{max}. Choose a small value for the learning rate \mu.

Step 2: Select training pattern: Select an input/output pattern to be presented to the network.
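The initialization scheme of Section II(B) can be sketched as follows, computing the bounds (15) and (17) and the thresholds (16) from the training patterns. This is a minimal sketch under our reading of [2]; the function name, argument names and return structure are our own.

```python
import math
import random

def optimum_init(patterns, n_hidden, n_out, seed=0):
    """Initial weights within the Yam-Chow bounds (equations (11), (15)-(17)).

    patterns : list of input vectors x^(s)
    Returns (W, w0, V): W is N x H, w0 holds the H hidden thresholds,
    V is H x M.
    """
    rng = random.Random(seed)
    n_in = len(patterns[0])
    lo = [min(x[i] for x in patterns) for i in range(n_in)]
    hi = [max(x[i] for x in patterns) for i in range(n_in)]
    # (11): diagonal of the input hyper-box
    d_in = math.sqrt(sum((h - l) ** 2 for h, l in zip(hi, lo)))
    # (15): bound on the input-to-hidden weights
    w_max = (8.72 / d_in) * math.sqrt(3.0 / n_in)
    # (17): bound on the hidden-to-output weights
    v_max = 15.10 / math.sqrt(n_hidden)
    W = [[rng.uniform(-w_max, w_max) for _ in range(n_hidden)]
         for _ in range(n_in)]
    # (16): thresholds centre each hidden hyperplane on the input box
    centre = [(h + l) / 2.0 for h, l in zip(hi, lo)]
    w0 = [-sum(centre[i] * W[i][j] for i in range(n_in))
          for j in range(n_hidden)]
    V = [[rng.uniform(-v_max, v_max) for _ in range(n_out)]
         for _ in range(n_hidden)]
    return W, w0, V
```

With binary inputs, for example, D_in is the diagonal of the unit hypercube, and every initial pre-activation a_j then falls inside the active region |a_j| <= 4.36 of (10).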
Step 3: Run the selected pattern p through the network: for each layer s (s = 1, ..., L) and for each node j, calculate:
- the summation outputs: equation (1);
- the nonlinear outputs: equation (2).

Step 4: Error signals:
For the output layer (s = L):
- calculate the desired summations: equation (5);
- calculate the nonlinear output errors: equation (3);
- calculate the linear output errors: equation (4).
For the hidden layers (s = L-1 down to 1):
- evaluate the nonlinear estimation errors: equation (3);
- evaluate the linear estimation errors: equation (4).
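Putting the pieces together, Steps 1 to 6 can be sketched end-to-end on the 2-bit XOR problem. This is an illustrative sketch, not the authors' C implementation: the targets are set to 0.1/0.9 so that the inverse sigmoid in (5) is finite, the hidden layer is for brevity updated with the standard backpropagated error rather than the linear/nonlinear estimates of Step 4, and the parameter values are our own.

```python
import math
import random

def sigmoid(u):
    u = max(-60.0, min(60.0, u))  # guard against overflow
    return 1.0 / (1.0 + math.exp(-u))

def train_xor(epochs=2000, lam=0.1, mu=0.1, n_hidden=2, seed=1):
    """Steps 1-6 sketched on 2-bit XOR; returns the MSE per epoch."""
    rng = random.Random(seed)
    X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
    D = [0.1, 0.9, 0.9, 0.1]          # targets kept inside (0, 1)
    # Step 1: initialize within the bounds (15) and (17)
    d_in = math.sqrt(2.0)
    w_max = (8.72 / d_in) * math.sqrt(3.0 / 2)
    v_max = 15.10 / math.sqrt(n_hidden)
    W = [[rng.uniform(-w_max, w_max) for _ in range(3)]   # incl. bias column
         for _ in range(n_hidden)]
    V = [rng.uniform(-v_max, v_max) for _ in range(n_hidden + 1)]
    mse_log = []
    for _ in range(epochs):
        sq = 0.0
        for x, d in zip(X, D):        # Step 2: select a training pattern
            xi = x + [1.0]
            # Step 3: forward pass, equations (1)-(2)
            h = [sigmoid(sum(w * v for w, v in zip(row, xi))) for row in W]
            hb = h + [1.0]
            u = sum(v * hh for v, hh in zip(V, hb))
            y = sigmoid(u)
            # Step 4: error signals, equations (3)-(5)
            e1 = d - y
            e2 = math.log(d / (1.0 - d)) - u
            delta_out = e1 * y * (1.0 - y) + lam * e2
            # Step 5: update weights (output: MBP rule (7); hidden: SBP)
            for j in range(n_hidden):
                dh = delta_out * V[j] * h[j] * (1.0 - h[j])
                for i in range(3):
                    W[j][i] += mu * dh * xi[i]
            for j in range(n_hidden + 1):
                V[j] += mu * delta_out * hb[j]
            sq += e1 * e1
        mse_log.append(sq / len(X))
    return mse_log  # Step 6: the caller tests the MSE for convergence
```

The λ term in `delta_out` pulls the output pre-activation directly towards f^-1(d), which is the mechanism the paper credits for the faster convergence.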

Step 5: Updating the synaptic coefficients: For every node j of each layer s = 1 to L, modify the synaptic coefficients.

Step 6: Testing for the end of training: Various stopping criteria can be tested. The mean square error of the network output can be used as a convergence test, or the program can be run for a fixed number of iterations. If the stopping condition is not satisfied, go back to Step 2.

IV. EXPERIMENTAL RESULTS

The significance of the method can only be appreciated when sample problems are implemented with the proposed algorithm. All the problems were simulated in C on a Pentium III 1.6 GHz system, and the results obtained with the MBP algorithm with optimum initialization are tabulated below. The problems under discussion are the 2-bit XOR, the 4-bit parity checker and the encoder.

TABLE I
SIMULATION RESULTS FOR MBP WITH THE INITIALIZATION METHOD

Problem (parameters)                      Structure   Epochs   MSE         Time (s)
XOR (λ = 0.10, μ = 0.0001)                2-1-1       1895     0.03        1.07
                                          2-2-1       850      0.03        0.6
                                          2-3-1       600      0.03        0.42
4-bit parity checker (λ = 10,             4-1-1       191      0.0389771   0.345
μ = 0.0001)                               4-2-1       40       0.039501    0.052
                                          4-3-1       36       0.039862    0.05
Encoder (λ = 0.5, μ = 0.0001)             2-1-4       44       0.072899    0.116
                                          2-2-4       10       0.0725      0.05

A. Comparison with the SBP Algorithm

This section takes up the 4-bit parity check problem, one of the sample problems of this paper. This example confirms that the MBP algorithm with optimum initialization and suitable learning parameters surpasses the SBP algorithm both in the total number of iterations and in the learning time. Table II compares the results of the 4-bit parity check problem implemented by both the SBP and the MBP algorithm with optimum initialization.

TABLE II
RESULTS OF THE 4-BIT PARITY CHECKER PROBLEM WITH TWO HIDDEN NODES

MSE readings were taken at intervals of 50 epochs and are listed in the table below.

TABLE III
A COMPARISON BETWEEN THE SBP AND MBP ALGORITHMS FOR THE 4-BIT PARITY CHECKER PROBLEM

Epoch   SBP (MSE)   MBP (MSE)
50      0.057522    0.03891
100     0.055513    0.03961
150     0.05235     0.04125
200     0.051507    0.04408
250     0.048750    0.04796



Fig. 1. A comparison between the SBP and MBP algorithms for the 4-bit parity checker problem (MSE versus epochs, 0 to 300).

Table III and Figure 1 give a clear picture of how the MBP algorithm with optimum initialization improves the convergence of the error towards a minimum value compared with the SBP algorithm.

V. CONCLUSION

The feedforward network is trained with the proposed MBP algorithm using the optimum weight initialization method. It is a supervised learning method, and weight updating is done in batch mode. Problems such as XOR, the encoder and the 4-bit parity checker were successfully implemented using the proposed method for training the network. The rate of convergence of the network depends on the parameters used and on the initial weights. The maximum magnitude of the weights is obtained by the method proposed by Yam and Chow [2], which is shown to improve the convergence rate. Here, the lower and upper bounds are given by the maximum magnitude, and the weights are drawn at random, with a different seed at every run, within this range; the consistent behaviour under different initializations shows the robustness of the network. The parameters used in the algorithm, such as μ = 0.0001 and λ = 10, were carefully set after a number of trial runs. The method can be applied to any problem by assigning proper random boundaries and suitable parameters. The following suggestions and enhancements may be incorporated to improve the training rate further.

1. The parameter values used in this work were derived on a trial-and-error basis [1]. More careful measures may be taken to select the parameters through efficient techniques.
2. Undoubtedly, the MBP algorithm with the optimum weight initialization method for feedforward neural networks has a faster convergence rate, but a comparative study of this method against other known fast training methods remains to be done.

REFERENCES

[1] S. Abid, F. Fnaiech, and M. Najim, "A fast feedforward training algorithm using a modified form of the standard backpropagation algorithm," IEEE Transactions on Neural Networks, vol. 12, pp. 424-430, 2001.
[2] J.Y.F. Yam and T.W.S. Chow, "A weight initialization method for improving training speed in feedforward neural network," Neurocomputing, vol. 30, pp. 219-232, 2000.
[3] M.R. Azimi-Sadjadi and R.J. Liou, "Fast learning process of multilayer neural nets using recursive least squares technique," IEEE Transactions on Signal Processing, vol. 40, Feb. 1992.
[4] M.R. Azimi-Sadjadi and S. Citrin, "Fast learning process of multi-layer neural nets using recursive least squares technique," in Proc. IEEE Int. Conf. Neural Networks, Washington, DC, May 1989.
[5] N.B. Karayiannis and A.N. Venetsanopoulos, "Fast learning algorithms for neural networks," IEEE Transactions on Circuits and Systems II, vol. 39, pp. 453-474, 1992.
[6] J.F. Shepanski, "Fast learning in artificial neural systems: multilayer perceptron training using optimal estimation," in Proc. IEEE Int. Conf. Neural Networks, New York: IEEE Press, 1988, vol. 1, pp. 465-472.
[7] Y.F. Yam and T.W.S. Chow, "Determining initial weights of feedforward neural networks based on least square method," Neural Processing Letters, vol. 2, pp. 13-17, 1995.
[8] Y.F. Yam and T.W.S. Chow, "A new method in determining the initial weights of feedforward neural networks," Neurocomputing, vol. 16, pp. 23-32, 1997.
[9] S. Osowski, "New approach to selection of initial values of weights in neural function approximation," Electronics Letters, vol. 29, pp. 313-315, 1993.
[10] G. Thimm and E. Fiesler, "High-order and multilayer perceptron initialization," IEEE Transactions on Neural Networks, vol. 8, pp. 349-359, 1997.

