
A FIXED POINT IMPLEMENTATION OF THE BACKPROPAGATION
LEARNING ALGORITHM

R. Kimball Presley, M.S.
Department of Electrical Engineering
Tennessee Technological University
Cookeville, Tennessee 38505, U.S.A.

Roger L. Haggard, Ph.D.
Department of Electrical Engineering
Tennessee Technological University
Cookeville, Tennessee 38505, U.S.A.

Abstract - In hardware implementations of digital artificial neural networks, the amount of logic that can be utilized is limited. Due to this limitation, learning algorithms that are to be executed in hardware must be implemented using fixed point arithmetic. Adapting the backpropagation learning algorithm to a fixed point arithmetic system requires many approximations, scaling techniques, and the use of lookup tables. These methods are explained. The convergence results for a test example using fixed point, floating point, and hardware implementations of the backpropagation algorithm are presented.

INTRODUCTION
INTRODUCTION
The use of artificial neural networks holds promise in solving many problems that traditional digital computers have had only limited success in solving. ASICs and neurocomputers are still very expensive, so most artificial neural network systems exist as computer simulations. These simulations require a large number of floating point calculations. This means that the simulations must be run on fast machines such as mainframes, workstations, or high performance PCs to produce solutions in a timely manner. The high performance machines are also expensive. Thus artificial neural networks are still out of reach for many applications.
At Tennessee Technological University, research is being conducted on the construction of a multilayer perceptron trained by the backpropagation algorithm (BPA) [1]. The network is an attached processor in a low performance personal computer. It is being implemented using Xilinx field programmable gate arrays (FPGAs) on a standard PC/AT expansion card. The network is compatible with a reconfigurable system being developed at the university [2].
Due to limitations on the amount of hardware that could be placed in the FPGAs, the BPA was adapted to use fixed point arithmetic. The hardware accelerator designed to simulate multilayer perceptrons using the backpropagation training algorithm is called FANN-BACK. FANN-BACK is a proof-of-concept research effort in implementing a neurocomputer using an FPGA-based reconfigurable system. Software simulators of the fixed and floating point versions of the backpropagation algorithm were written in Borland Turbo C++, version 1.00. These simulations were run on a DOS-compatible, 386SL, 25 MHz personal computer with 4 Mbytes of RAM.
The hardware design and simulation of FA"-BACK
requited a significantly higher level of CoIflPuting power than
the two softwaresimulators. The design was completed using
Viewlogic Workview Version 4.1.0. Within Workview,
ViewDraw was used to design the hardware and ViewSim
was used to simulate FA"-BACK. The s o w used in
the hardware design and simulation was run on a Sun
Microsystems
1+ with 16 Mbytes of RAM.
Adapting the BPA to a fixed point arithmetic system
required approximations, scaling techniques, and the use of
lookup tables. The design steps included the coding of a
floatingpoint simulator, the coding of a fixedpoint simulator,
and the design of FA"-BACK'S hardware. Thew design
steps are covered in the next sections.

SYSTEM DESIGN

The scope of the systems investigated consisted of networks with one hidden layer and up to 25 nodes per layer. Each node in the network used the bipolar sigmoidal activation function. The hidden and output layers each contained one additional node that was used for a threshold input. For simplicity in the hardware, the learning rate used in the weight update was set to one. These algorithms were based on the backpropagation algorithm presented by Freeman and Skapura [3].
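As a point of reference for the simulators described next, the following is a minimal sketch in C of a forward pass through one layer with a threshold node and a bipolar sigmoidal activation. The activation form 2/(1 + exp(-0.7*net)) - 1, the constant 0.7, and the indexing convention are assumptions, not code or equations taken from the paper.

#include <math.h>

#define MAX_NODES 26   /* up to 25 nodes per layer plus one threshold node */

/* Assumed bipolar sigmoidal activation, mapping any net input into (-1, +1). */
static double bipolar_sigmoid(double net)
{
    return 2.0 / (1.0 + exp(-0.7 * net)) - 1.0;
}

/* Forward pass through one layer. Index 0 is used here for the threshold
   node, held at a constant activation of one; this indexing is an assumption. */
static void forward_layer(const double in[], int n_in,
                          double out[], int n_out,
                          const double w[][MAX_NODES])
{
    int i, j;

    out[0] = 1.0;                        /* threshold (bias) node */
    for (j = 1; j < n_out; j++) {
        double net = 0.0;
        for (i = 0; i < n_in; i++)
            net += w[j][i] * in[i];
        out[j] = bipolar_sigmoid(net);
    }
}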
Floating Point Simulation Program
There were three major reasons for completing this program. One, it tested the basic approach to the problem and the control flow of the algorithm. Two, it produced a measure of the time required for the network to converge when implemented in software. And three, it provided the skeleton structure for the fixed point adaptation of the backpropagation training algorithm.

The floating point simulation program was divided into major tasks that were executed in procedures. The main program initialized all files, read the setup data, and sequenced through the training patterns until the error was within the tolerance. The procedures randomly assigned initial weights, executed the forward propagation of input patterns, calculated the error of the epoch, calculated the backpropagated error, adjusted the weights, and calculated the output and its derivative using the bipolar sigmoidal function.
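A minimal sketch of that control flow in C follows; the procedure names and signatures are hypothetical stand-ins for the thesis's actual routines, and their bodies are omitted.

/* Skeleton of the floating point simulator's control flow as described above. */
void assign_random_weights(void);
void forward_propagate(int pattern);     /* bipolar sigmoid outputs and derivatives  */
void backpropagate_error(int pattern);   /* error terms for hidden and output layers */
void adjust_weights(int pattern);        /* weight update with learning rate of one  */
double epoch_error(void);                /* total error over the training patterns   */

void train(int num_patterns, double tolerance)
{
    int p;
    double err;

    assign_random_weights();
    do {
        for (p = 0; p < num_patterns; p++) {
            forward_propagate(p);
            backpropagate_error(p);
            adjust_weights(p);
        }
        err = epoch_error();
    } while (err > tolerance);
}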

Fixed Point Simulation Program


After successful results were obtained using the floating point backpropagation algorithm simulation program, the algorithm was adapted to execute all calculations using fixed point numbers. This was done by changing all variables that were of the C language types float and double to int (integer) types, scaling them to retain the necessary accuracy, and compensating for the scaling.
The floating point simulation program was modified by adding a procedure after every calculation to check the result and determine the maximum and minimum values typically encountered. This program was run several times with different initial weights. From this investigation, the minimum number encountered determined that a scale factor of at least ten was required to retain the needed accuracy. To make the multiplication and division by the scale factor use minimal hardware, a scale factor that was a power of two was used. The smallest power of two that was also greater than or equal to ten, 16, was chosen. The maximum number encountered suggested that 16 bits were needed to avoid overflow after the scale factor of 16 was chosen.
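A minimal sketch of the resulting representation, assuming 16-bit integers scaled by 16 (four fractional bits); the helper names and the truncating conversion are illustrative assumptions, not code from the thesis.

#include <stdint.h>

#define SCALE 16            /* power-of-two scale factor chosen above */

typedef int16_t fixed_t;    /* 16 bits were found sufficient to avoid overflow */

/* Convert a real value into the scaled fixed point representation
   (truncation is assumed; the thesis may round differently). */
static fixed_t to_fixed(double x)
{
    return (fixed_t)(x * SCALE);
}

/* Convert back to a real value, e.g. for printing results. */
static double to_real(fixed_t x)
{
    return (double)x / SCALE;
}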
The basic control flow (the structure of the loops and conditionals) of the fixed point simulation program was unchanged from that of the floating point program. Since int data types require less memory to store than float or double types, the number of nodes in each layer could be increased. The maximum number of nodes was increased in the fixed point program to 49 per layer. This was done in case the loss of accuracy encountered as a result of using fixed point data required more hidden layer nodes for convergence. All of the assumptions and decisions made for minimal hardware limited the scope of problems that FANN-BACK and the fixed point simulation program could solve. The number of inputs had to be kept at three or fewer.
There were two major changes made to adapt the backpropagation training algorithm to fixed point numbers. These were compensating for the scale factor and handling the sigmoidal function and the derivative of the sigmoidal function. The methods that were used to handle each of these are discussed below.
When two numbers multiplied by the scale factor were added or subtracted, no special compensation for the scale factor was required. However, when two scaled numbers were multiplied, the result had to be divided by the scale factor to obtain the proper result. The division was simply executed in software in the fixed point simulator. The division by the scale factor was very simple in hardware, since dividing by 16 is equivalent to shifting the result of the multiplication right by four places, or simply using the sixteen bits numbered four through 19.
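A minimal sketch of these two cases in C, again assuming the 16-bit, scale-by-16 representation; widening the product to 32 bits before the shift is an implementation assumption.

#include <stdint.h>

#define SCALE_SHIFT 4       /* dividing by the scale factor of 16 == shifting right four places */

typedef int16_t fixed_t;

/* Sums and differences of scaled values need no compensation. */
static fixed_t fixed_add(fixed_t a, fixed_t b)
{
    return (fixed_t)(a + b);
}

/* The product of two scaled values carries the scale factor twice, so the
   result is shifted right by four places; equivalently, bits 4 through 19
   of the 32-bit product are kept. */
static fixed_t fixed_mul(fixed_t a, fixed_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;
    return (fixed_t)(product >> SCALE_SHIFT);
}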
A major change in the structure of the fixed point program over that of the floating point program was the removal of the two procedures that mathematically calculated the sigmoidal output and its derivative. These two procedures were replaced with procedures that used lookup tables. This was done to simulate the mechanism by which the sigmoid and its derivative would be calculated in FANN-BACK's hardware.
In the calculation of the sigmoidal output and its derivative, the net input to the node was used as an address into a lookup table, which supplied the node's output value. Fig. 1 illustrates this lookup table mechanism. The derivative of the sigmoidal function was calculated with a similar lookup table.

Fig. 1. Output Calculation Using a Lookup Table.
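As a software analogue of the mechanism in Fig. 1, the following is a minimal sketch of table-based output and derivative calculation, assuming an eleven-bit table indexed by the scaled net term; the offset scheme and table layout are assumptions.

#include <stdint.h>

#define TABLE_SIZE   2048   /* eleven-bit address: scaled net terms -1024..1023     */
#define TABLE_OFFSET 1024   /* assumed offset so negative net terms index from zero */

typedef int16_t fixed_t;

/* Lookup tables for the sigmoid and its derivative, filled from data files
   generated by a separate program. */
static fixed_t sigmoid_table[TABLE_SIZE];
static fixed_t deriv_table[TABLE_SIZE];

/* The scaled net input to a node is used directly as a table address.
   Clamping of out-of-range net terms is omitted for brevity. */
static fixed_t node_output(fixed_t net)
{
    return sigmoid_table[net + TABLE_OFFSET];
}

static fixed_t node_output_derivative(fixed_t net)
{
    return deriv_table[net + TABLE_OFFSET];
}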


These functions could have been calculated using C code, but the program was written to simulate the operation of the hardware that was to be used in FANN-BACK. To accomplish this, the data for the lookup tables of the sigmoidal function and its derivative were calculated in a separate program and stored in data files. Equation (1) was used to calculate the data for the sigmoidal function lookup table.

Equation (2) was used to calculate the data for the derivative lookup table. In Equations (1) and (2), the constant scale was 16 and X was 0.7.
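Assuming that Equations (1) and (2) take the standard bipolar sigmoid form 2/(1 + exp(-0.7*net)) - 1 and its derivative, with all quantities scaled by 16, a rough sketch of the table-generation program might look as follows. The equation forms, file names, and file format are assumptions; the shift-by-one adjustment to the derivative table discussed below is not applied here.

#include <math.h>
#include <stdio.h>

#define SCALE  16
#define LAMBDA 0.7          /* the constant of 0.7 cited for Equations (1) and (2) */

/* Generate the sigmoid and derivative lookup table data for scaled net terms
   -1024..1023 (an eleven-bit address range). */
int main(void)
{
    FILE *fs = fopen("sigmoid.dat", "w");
    FILE *fd = fopen("deriv.dat", "w");
    int scaled_net;

    if (fs == NULL || fd == NULL)
        return 1;

    for (scaled_net = -1024; scaled_net <= 1023; scaled_net++) {
        double net = (double)scaled_net / SCALE;
        double f   = 2.0 / (1.0 + exp(-LAMBDA * net)) - 1.0;
        double df  = 0.5 * LAMBDA * (1.0 + f) * (1.0 - f);

        fprintf(fs, "%d\n", (int)(f * SCALE));
        fprintf(fd, "%d\n", (int)(df * SCALE));
    }
    fclose(fs);
    fclose(fd);
    return 0;
}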


The lookup tables were read into one-dimensional arrays by the main program. Within the one-dimensional arrays, the index of an element corresponded to the net value used to access the element.

To avoid undesirable effects due to quantization of these functions, a large number of outputs corresponding to net inputs were stored in the lookup table. Eleven-bit addresses were used in the lookup table, so the values for the scaled net terms between -1024 and 1023 were stored in the lookup table. This corresponded to a range of net inputs of [-64, 63.75].

The floating point derivative approaches zero at -∞ and +∞, but never reaches zero. This means that there is always some weight adjustment. However, when the derivative is below 1/16, the scaled fixed point derivative is zero and no weight adjustment is made. This problem was solved by shifting the entire derivative function upward by adding one to each stored value in the lookup table. This obviously introduced some error, the effects of which have not yet been fully investigated.

FANN-BACK Hardware

The hardware for FANN-BACK's data path was placed into four Xilinx XC3090 field programmable gate arrays (FPGAs). FANN-BACK was designed to execute all calculations required for training and recall. The initial weights, the number of layers, the number of nodes in each layer, and the error tolerance were loaded into the 84 Kbyte memory subsystem. The net terms, inputs and outputs, weights, backpropagated error terms, desired outputs, sigmoidal lookup table, and derivative lookup table were each stored in separately addressable and controllable memory modules. This was done to avoid memory access conflicts in the control of the data path.

The logic for FANN-BACK's data path was organized as three sections of one pipeline that were activated sequentially. Activating the sections sequentially required 221 pipe cycles per epoch using the test example. A pipe cycle consisted of 18 system clock cycles with a period of 330 ns. Therefore, each epoch required 1.31 ms (221 × 18 × 330 ns) on FANN-BACK.

FANN-BACK's data path and memory subsystems were designed and simulated. However, the controller has not been designed and the system has not been constructed yet.

RESULTS

The floating point, fixed point, and FANN-BACK neural network simulators were trained to function as a 2-input EXOR function using a network structure with five hidden layer nodes. The number of epochs required to train the network, as well as the duration of time required to train, were recorded for ten random weight initializations. Due to the complexity in building simulation command files, FANN-BACK was only simulated for one of these weight initializations. Table I shows the results obtained from the simulations.

Table I. Results of Simulations.

All ten floating point simulations converged, while ten of eleven fixed point simulations converged. The times shown for FANN-BACK to converge were based on the time per epoch and the fact that FANN-BACK executes the same calculations as the fixed point simulator.

The above results show that for the 2-input EXOR example, the fixed point simulation generally took slightly longer to converge and required many more epochs compared to the floating point simulation. On the other hand, the fixed point structure resulted in simpler hardware as designed in FANN-BACK's data path. This hardware was 12 to 19 times faster than either of the software simulators.

There are limitations on the current versions of the fixed point simulator and FANN-BACK. They are limited to 2-input networks until more work can be done on the fixed point algorithm. Larger numbers of inputs result in networks for which the epoch error oscillates around local minima. This problem is being investigated.

REFERENCES

[1] R. Kimball Presley, "FANN-BACK: An FPGA-Based Artificial Neural Network Trained by the Backpropagation Algorithm," M.S. Thesis, Tennessee Technological University, Cookeville, TN, 1994.
[2] Leo J. Klaes, "A PC/AT Interface for an FPGA-Based Reconfigurable System," M.S. Thesis, Tennessee Technological University, Cookeville, TN, 1994.
[3] James A. Freeman and David M. Skapura, Neural Networks: Algorithms, Applications, and Programming Techniques, Addison-Wesley, Reading, MA, 1991.

