Abstract - We describe how several optimization problems can be rapidly solved by highly interconnected networks of simple analog processors. Analog-to-digital (A/D) conversion was considered as a simple optimization problem, and an A/D converter of novel architecture was designed. A/D conversion is a simple example of a more general class of signal-decision problems which we show could also be solved by appropriately constructed networks. Circuits to solve these problems were designed using general principles which result from an understanding of the basic collective computational properties of a specific class of analog-processor networks. We also show that a network which solves linear programming problems can be understood from the same concepts.

I. INTRODUCTION

We have shown in earlier work [1], [2] how highly interconnected networks of simple analog processors can collectively compute good solutions to difficult optimization problems. For example, a network was designed to provide solutions to the traveling salesman problem. This problem is of the NP-complete class [3] and the network could provide good solutions during an elapsed time of only a few characteristic time constants of the circuit. This computation can be considered as a rapid and efficient contraction of the possible solution space. However, a globally optimal solution to the problem is not guaranteed; the networks compute locally optimal solutions. For the traveling salesman problem, even among the extremely good solutions, the topology of the optimization surface in the solution space is very rough; many good solutions are at least locally similar to the best solution, and a complicated set of local minima exist. In difficult problems of recognition and perception, where rapidly calculated good solutions may be more beneficial than slowly computed globally optimal solutions, collective computation in circuits of this design may be of practical use.

We have recently found that several less complicated optimization problems which are not of the NP-complete class can be solved by networks of analog processors. The two circuits described in detail here are an A/D converter and a circuit for solving linear programming problems. These networks are guaranteed of obtaining globally optimal solutions since the solution spaces (in the vicinity of specific initial conditions) have no local minima. The A/D converter is actually one simple example of a class of problems for which appropriately constructed collective networks should rapidly provide good solutions. The general class consists of signal decomposition problems in which the goal is the calculation of the optimum fit of an integer coefficient combination of basis functions (possibly a nonorthogonal set) to an analog signal. The systematic approach we have developed to design such networks should be more broadly applicable.

Fahlman [4] has suggested a rough classification of parallel-processor architectures based upon the complexity of the messages that are passed between processing units. At the highest complexity are networks in which each processor has the power of a complete von Neumann computer, and the messages which are passed between individual processors can be complicated strings of information. The simplest parallel architectures are of the “value-passing” type. Processor-to-processor communication between local computations consists of a single binary or analog value. The collective analog networks considered here are in this class; each processor makes a simple computation or decision based upon its analysis of many analog values (information) it receives in parallel from other processors in the network. Our motivation for studying the computational properties of circuits with this organization arose from an attempt to understand how known biophysical properties and architectural organization of neural systems can provide the immense computational power characteristic of the brains of higher animals.

In our theoretical modeling of neural circuits [1], [2], [5], [6], each neuron is a simple analog processor, while the rich connectivity provided in real neural circuits by the synapses formed between neurons are provided by the parallel communication lines in the value-passing analog processor networks. Hence, in addition to designs for conventional implementation with electrical components, the circuits and design principles described here add to the known repertoire of neural circuits which seem neurobiologically plausible. In general, a consideration of such circuits provides a methodology for assigning function to anatomical structure in real neural circuits.

Manuscript received August 27, 1985; revised. This work was supported in part by the National Science Foundation under Grant PCM-8406049.
D. W. Tank is with the Molecular Biophysics Research Department, AT&T Bell Laboratories, Murray Hill, NJ 07974.
J. J. Hopfield is with the Division of Chemistry and Biology, California Institute of Technology, Pasadena, CA 91125.
IEEE Log Number 8607497.
binary word of the converter, a network of feedback resistors connecting the outputs of one amplifier to the inputs of the others, a set of resistors (top row) which feed different constant current values into the input lines of the amplifiers, and another set of resistors (second row) which inject current onto the input lines of the amplifiers which are proportional to the analog input voltage $x$, which is to be converted by the circuit. For the present we assume that the output voltages ($V_i$) of the amplifiers can range between a minimum of 0 V and a maximum of 1 V. Thus as described above for the variables in (1), the $V_i$ range over the domain [0, 1]. We further assume that the value of $x$ in volts is the numerical value of the input which is to be converted. The converter network is operating properly when the integer value of the binary word represented by the output states of the amplifiers is numerically equal to the analog input voltage. In terms of the variables defined above, this criterion can be written as

$$\sum_{i=0}^{3} V_i 2^i = x. \tag{2}$$

The circuit of Fig. 2 is organized so that this expression always holds.

The strategy employed in creating this design is to consider A/D conversion as a simple example of an optimization problem. If the word $V_3V_2V_1V_0$ is to be the “best” digital representation of $x$, then two criteria must be fulfilled. The first is that each of the $V_i$ have the value of 0 or 1, or at least be close enough to these values so that a separate comparator circuit can establish digital logic levels. The second criterion is that the particular set of 1's and 0's chosen is that which “best” represents the analog signal. This second criterion can be expressed, in a least-squares sense, as the choice of $V_i$ which minimize the energy function

$$E = \frac{1}{2}\left(x - \sum_{i=0}^{3} V_i 2^i\right)^2. \tag{3}$$

$$-\sum_{i=0}^{3} 2^{(2i-1)}\, V_i (V_i - 1). \tag{4}$$

The structure of this term was chosen to favor digital representations. Note that this term has minimal value when, for each $i$, either $V_i = 1$ or $V_i = 0$. Although any set of (negative) coefficients will provide this bias towards a digital representation, the coefficients in (4) were chosen so as to cancel out the diagonal elements in (3). The elimination of diagonal connection strengths will generally lead to stable points only at corners of the space. The term (4) equally favors all corners of the space, and does not favor any particular digital answer. Thus the total energy $E$ which contains the sum of the two terms in (3) and (4) has minimal value when the $V_i$ are a digital representation close to $x$.

This completes the energy function for the A/D converter. It can be expanded and rearranged into the form

$$E = -\frac{1}{2}\sum_{j=0}^{3}\sum_{i \ne j = 0}^{3} \left(-2^{(i+j)}\right) V_i V_j - \sum_{i=0}^{3} \left(-2^{(2i-1)} + 2^i x\right) V_i. \tag{5}$$

This is of the form of (1) if we identify the connection matrix elements and the input currents as

$$T_{ij} = -2^{(i+j)}, \qquad I_i = \left(-2^{(2i-1)} + 2^i x\right). \tag{6}$$

The complete circuit for this 4-bit A/D converter with components as defined above is the network shown in Fig. 2. The inverting output of each amplifier is connected to the input of the other amplifiers through a resistor of conductance $2^{(i+j)}$. The other input currents to each amplifier are provided through resistors of conductance $2^i$ connected to the input voltage $x$ and through resistors of conductance $2^{(2i-1)}$ connected to a -1-V reference potential. These numbers for the resistive connections on the feedback network and the input lines represent the ap-
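As a quick numerical check on (5) and (6), the following minimal Python sketch builds the connection matrix and input currents for the 4-bit converter and evaluates E at the 16 corners of the state space. It is an illustration under stated assumptions rather than the circuit simulation used in the paper: the function names are hypothetical, the constant term x^2/2 is dropped, and only corner states (each V_i equal to 0 or 1) are examined.

```python
from itertools import product

N_BITS = 4  # the 4-bit converter described in the text


def connection_matrix(n_bits=N_BITS):
    """T_ij = -2^(i+j) for i != j, with zero diagonal, following (6)."""
    return [[0.0 if i == j else -float(2 ** (i + j)) for j in range(n_bits)]
            for i in range(n_bits)]


def input_currents(x, n_bits=N_BITS):
    """I_i = -2^(2i-1) + 2^i * x, following (6)."""
    return [-(2.0 ** (2 * i - 1)) + (2 ** i) * x for i in range(n_bits)]


def energy(v, t, cur):
    """E = -1/2 sum_{i != j} T_ij V_i V_j - sum_i I_i V_i, the form of (5)."""
    n = len(v)
    quad = sum(t[i][j] * v[i] * v[j] for i in range(n) for j in range(n))
    lin = sum(cur[i] * v[i] for i in range(n))
    return -0.5 * quad - lin


def best_corner(x, n_bits=N_BITS):
    """Brute-force search over all 2^n corners; returns the integer value of the word."""
    t = connection_matrix(n_bits)
    cur = input_currents(x, n_bits)
    corners = product((0, 1), repeat=n_bits)  # v[i] is the bit of weight 2^i
    v_min = min(corners, key=lambda v: energy(v, t, cur))
    return sum(bit << i for i, bit in enumerate(v_min))


if __name__ == "__main__":
    for x in (0.2, 7.3, 10.6, 14.9):
        print(x, "->", best_corner(x))
```

At a corner with integer value n the energy reduces to (x - n)^2/2 minus the dropped constant, so the lowest corner is the integer nearest x. The sketch says nothing about which corner the circuit dynamics actually reach from a given initial condition.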
536 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS, VOL. CAS-33, NO. 5, MAY 1986
be scaled for power dissipation) for the feedback connections are

$$T_{ij} = -\frac{2^{(i+j)}}{V_{BB}}$$

while the input voltage $x$ will be fed into the $i$th amplifier through a resistor of conductance $2^{(4+i)}/V_H$, and the constant current is provided through resistors of conductance $(2^{(i-1)} + (2^{(2i-1)}/V_R))$ connected to the $-V_R$ reference voltage.

Fig. 3. The digital word computed in simulations of the circuit shown in Fig. 2 as a function of the analog input voltage $x$. The initial conditions for each of the calculations is $u_i = 0$, for all $i$.

the network. These initial conditions are defined by the input voltages ($u_i$) on the amplifiers at the time that the calculation is initiated. In Fig. 3 is plotted the value of the binary word $V_3V_2V_1V_0$ computed by the network as a function of the value of $(x + 0.5)$ for the initial conditions $u_i = 0$. The response is the staircase function characteristic of an A/D converter. In a real circuit, separate electronics which would ground the input lines of the amplifiers before each convergence would be required to implement the initial conditions ($u_i = 0$) used in these simulations. If the input lines are not zeroed before each calculation, then the circuit exhibits hysteresis as the input voltage $x$ is being continuously varied. For example, if $x$ is slowly turned up through the same series of values used in the calculation of Fig. 3, but, instead of zeroing the input lines before a simulated convergence, we allow the $u_i$ to retain the values stabilized at the end of the previous calculation, we obtain the response shown in Fig. 4. Slowly turning down the $x$ input from its maximum value would provide a response which is the “inverse” of Fig. 4. (The value for any $x$, in the experiment with $x$ descending, is equal to 15.0 minus the value for $(16.0 - x)$ in the experiment with $x$ ascending.) Some stable states of the network are skipped under this set of initial conditions.

Fig. 4. The results of a calculation similar to that described in Fig. 3 except that the initial conditions were determined by the $u_i$ which stabilized during the previous calculation. Calculations were performed with monotonically increasing values of the analog input voltage $x$, starting at $x = 0$ V.

One can understand this hysteresis, and its absence for the $u_i = 0$ initial conditions, by considering the topology of the energy surface for fixed $x$ and how it changes as $x$ is varied. In Fig. 5 is shown a stylized representation of the energy surface for two different $x$ values. The energy at specific locations in state space is represented, with energy value along the vertical axis. Different corners of state space near the global minimum in $E$ (with value $E = 0$) are indicated along the curve by the set of indices $V_3V_2V_1V_0$. As shown in Fig. 5(a), the energy function for $x = 7$ V has a deep minimum at the corner of state space which is the digital representation of 7, and has local minima at higher $E$ values at the digital representations of 6 and 8. Al-
TANK AND HOPFIELD: NEURAL OPTIMIZATION NETWORKS 537
Fig. 5. A schematic drawing of the energy surface in the vicinity of the global minima for two analog input voltages.
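fier, with

$$T_{\sigma t,\,\sigma' t'} = -\sum_{i=1}^{N} e^{-\left[((i-t)/\sigma)^2 + ((i-t')/\sigma')^2\right]}$$

$$I_{\sigma t} = \sum_{i=1}^{N} x_i\, e^{-((i-t)/\sigma)^2} - \frac{1}{2}\sum_{i=1}^{N} e^{-2((i-t)/\sigma)^2}. \tag{13}$$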
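The connection strengths and input currents in (13) can be tabulated numerically. The sketch below is a hedged illustration, not the authors' code. It assumes Gaussian basis functions exp(-((i - t)/sigma)^2) sampled at i = 1, ..., N, takes T as minus the dot product of two basis functions, and takes I as the dot product with the signal minus half the squared norm of the basis function (the sign convention that reduces to (6) in the A/D case). The function names and the small set of (sigma, t) pairs are hypothetical.

```python
import math
from itertools import product

N = 20  # number of signal samples (illustrative choice)
# Hypothetical set of basis functions, each a (sigma, t) pair.
BASIS = [(1.5, 4.0), (1.5, 10.0), (1.5, 16.0), (3.0, 7.0), (3.0, 13.0)]


def gaussian(sigma, t, n=N):
    """Sampled basis function exp(-((i - t) / sigma)^2) for i = 1..n."""
    return [math.exp(-(((i - t) / sigma) ** 2)) for i in range(1, n + 1)]


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def network_parameters(signal, basis=BASIS):
    """Return (T, I): T_kk' = -(e_k . e_k') for k != k', I_k = x . e_k - (e_k . e_k) / 2."""
    funcs = [gaussian(s, t, len(signal)) for s, t in basis]
    m = len(funcs)
    t_mat = [[0.0 if k == l else -dot(funcs[k], funcs[l]) for l in range(m)]
             for k in range(m)]
    cur = [dot(signal, f) - 0.5 * dot(f, f) for f in funcs]
    return t_mat, cur


def energy(v, t_mat, cur):
    """E = -1/2 sum T_kk' V_k V_k' - sum I_k V_k (constant term omitted)."""
    m = len(v)
    quad = sum(t_mat[k][l] * v[k] * v[l] for k in range(m) for l in range(m))
    return -0.5 * quad - sum(cur[k] * v[k] for k in range(m))


def best_selection(signal, basis=BASIS):
    """Brute-force the 0/1 selection of basis functions with the lowest energy."""
    t_mat, cur = network_parameters(signal, basis)
    return min(product((0, 1), repeat=len(basis)),
               key=lambda v: energy(v, t_mat, cur))


if __name__ == "__main__":
    # Signal built from basis functions 1 and 3, so the expected selection is known.
    x = [a + b for a, b in zip(gaussian(*BASIS[1]), gaussian(*BASIS[3]))]
    print(best_selection(x))
```

Brute force is used here only because the basis set is tiny; the point of the network is to avoid such enumeration.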
This function describes a network which has an energy minimum (with $E = 0$) when the “best” digital combination of basis functions are selected (with $V_k = 1$) to describe the signal. The expression (7) can be expanded and rearranged to give

$$E = \frac{1}{2}\sum_{k}\sum_{k' \ne k} \left(\vec{\epsilon}_k \cdot \vec{\epsilon}_{k'}\right) V_k V_{k'} - \sum_{k}\left[\left(\vec{x} \cdot \vec{\epsilon}_k\right) - \frac{1}{2}\left(\vec{\epsilon}_k \cdot \vec{\epsilon}_k\right)\right] V_k + \frac{1}{2}\left(\vec{x} \cdot \vec{x}\right).$$

This is a function which is comprised of terms which are linear and quadratic in the $V_k$'s. It is, therefore, of the form (1) (plus a constant), if we define

$$T_{kk'} = -\left(\vec{\epsilon}_k \cdot \vec{\epsilon}_{k'}\right)$$

$$I_k = \left[\left(\vec{x} \cdot \vec{\epsilon}_k\right) - \frac{1}{2}\left(\vec{\epsilon}_k \cdot \vec{\epsilon}_k\right)\right].$$

Hence, for the general decomposition/decision problem mapped onto the computational network in Fig. 1, the connection strengths between amplifiers correspond to the dot products of the corresponding pairs of basis functions while the input currents correspond to the convolution of the corresponding basis function with the signal and the addition of a constant bias term.

The A/D converter described earlier can be seen to be a simple example of this more general circuit. In the A/D case, the signal is one-dimensional and consists of only an analog value sampled at a single time point. The basis functions are the values $2^n$; $n = 0, \ldots, (N - 1)$ which are a complete set over the integers in the limited domain $[0, 2^N - 1]$. The binary word output of the circuit is comprised of the coefficients which describe a linear summation of the basis functions which is closest, in the least squares sense, to the input signal.

For the A/D converter problem and the Gaussian decomposition/decision network just described, the basis functions which span the signal space are not orthogonal. For an orthogonal set, by definition, the connection strengths (9) would all vanish. For example, if the signal consists of $N$ analog-sampled points of a differentiable function, and the basis functions were sines and cosines (a Fourier decomposition network), then the computational circuit would have no feedback connections since these basis functions are orthogonal. In this case, the independent computations made by each amplifier are the convolution of the signal with the particular basis function represented. This is just the familiar rule for calculating Fourier coefficients: all decisions are independent. In general, one can interpret the connection strengths in the decomposition/decision networks as the possible effect of one decision being tested ($V_i$) on another ($V_j$); these effects should be zero for orthogonal basis functions.

IV. THE LINEAR PROGRAMMING NETWORK

The linear programming problem can be stated as the attempt to minimize a cost function

$$\pi = \vec{A} \cdot \vec{V} \tag{14}$$

where $\vec{A}$ is an $N$-dimensional vector of coefficients for the $N$ variables which are the components of $\vec{V}$. This minimization is to be accomplished subject to a set of $M$ linear
constraints among the variables:

$$\vec{D}_j \cdot \vec{V} \ge B_j, \qquad j = 1, \ldots, M \tag{15}$$

Fig. 8. (Circuit diagram. Recoverable labels: VARIABLES amplifiers with input currents $-A_i$, CONSTRAINTS amplifiers with input currents $-B_j$, and connections $D_{ji}$ and $-D_{ji}$ between the two sets.)

$$\psi_j = f(u_j), \qquad u_j = \vec{D}_j \cdot \vec{V} - B_j$$

where

$$f(z) = 0, \quad z \ge 0; \qquad f(z) = -z, \quad z < 0. \tag{16}$$

This function provides for the output of the $\psi_j$ amplifiers to be a large positive value when the corresponding constraint equation it represents is being violated. (The specific form of $f(z)$ used here was chosen for convenience in building a corresponding real circuit and the stability proof only depends upon $f$ being a function of the variable $z = \vec{D}_j \cdot \vec{V} - B_j$ (see below).) If we assume that the response time of the $\psi_j$ is negligible compared to that of the variable amplifiers, then the circuit equation for the variable ampli-

Since $C_i$ is positive and $g^{-1}(V_i)$ is a monotone increasing function, this sum is nonnegative and

$$\frac{dE}{dt} \le 0; \qquad \frac{dE}{dt} = 0 \rightarrow \frac{dV_i}{dt} = 0, \quad \text{for all } i. \tag{21}$$

Thus as for the network in Fig. 1, the time evolution of the system is a motion in state space which seeks out a minima to $E$ and stops. The network in Fig. 8 should not show any oscillation even though there are nonsymmetric connection strengths between the two sets of $V_i$ and $\psi_j$ amplifiers, as long as the $\psi_j$ are sufficiently fast.

A small computational network was constructed out of conventional electronic components to solve a 2-variable
problem with four constraints using the network organization of Fig. 8. A simple op amp/diode active clamp circuit was used to provide the nonlinear $f$ input-output function. The equations of constraint for the two variables ($x$ and $y$) were

$$y \le 5, \qquad -x \le 5, \qquad \frac{5}{12}x - y \le \frac{35}{12}, \qquad \frac{5}{2}x + y \le \frac{35}{2}. \tag{22}$$

These equations defined the connection strengths ($D_{ji}$) and the input currents ($B_j$) for the $\psi_j$ amplifiers. In the $xy$ plane characterizing the solution space, they defined the simplex shown in Fig. 9. A microcomputer-based data acquisition system was used to control the circuit and to measure the output voltages of the $V_1$ and $V_2$ amplifiers which corresponded to the $x$, $y$ solutions, as a function of rapidly changing sets of input currents supplied to the input lines of these amplifiers. As indicated in Fig. 8, these input currents correspond to the coefficients $A_i$ in the cost function which is to be minimized. For this simple 2-variable problem, the cost function can be geometrically thought of as a plane defined by the equation $z = A_1 x + A_2 y$ hovering above the $xy$ plane, and the direction of the gradient of that plane $A_1\hat{x} + A_2\hat{y}$ can be represented by a vector in the $xy$ solution plane. The lowest point on the portion of this cost plane lying above the feasible solution space in the $xy$ plane lies above the optimum simplex point. As the cost function is changed, the cost plane tilts in a new direction, the gradient projection in the $xy$ plane rotates, and the optimum simplex point may also change. We recorded the values of $x$ and $y$ computed by the network for a set of cost functions. The operating points of the circuit are plotted in Fig. 9. Each diamond represents the location in the $xy$ space at which the network stabilized at as the cost-plane gradient vector (indicated by the array of short line segments emanating from the origin) was swept in a circle. The circuit was stable at the optimum simplex points corresponding to the correct constrained choice for a given gradient direction.

Fig. 9. A plot of the measured values of $x$ and $y$ for the linear programming network described in the text, as a function of the gradient of the optimization plane. The set of gradients is depicted by their projections onto the $x$, $y$ plane, drawn as vectors from the origin.

In another experiment, the variable amplifiers were artificially slowed using large input capacitance and the trajectory followed by $V_1$ and $V_2$ was collected by rapid data sampling as the gradient was rapidly switched in direction. The trajectory is shown in Fig. 10. The network follows the gradient until it reaches a constraint wall which it then follows until the optimum simplex is reached. Since the solution space is always convex for linear programming problems, the network is guaranteed to find the optimum solution.

Fig. 10. The trajectory of $x$ and $y$ for the circuit described in the text as the gradient (indicated by the two vectors from the origin) is rapidly switched.

V. CONCLUSIONS

We have demonstrated how interconnected networks of simple analog processors can be used to solve decomposition/decision problems and linear programming problems. Networks for both problems were designed using conceptual tools which allow one to understand the influence of complicated feedback in highly interconnected networks of analog processors. There appears to be a large class of computation problems for which this simple concept of an “energy” function generates a complete stable circuit design without the need for a detailed dynamic analysis of stability. The function produces the required values of the many resistors from a short statement of the overall problem.

The two basic computations (digital decomposition and the linear programming network) are quite different computations in several respects. In the decomposition/decision networks discussed, the answers are digital, and this requirement that the stable states of the network lie on the corners of the solution space determines the highly nonlinear input-output relations for the variable amplifiers. Also, the equations of motion for the individual elements in the
network are of no intrinsic relevance to the problem to be solved; they are a program which is used to compute the correct solution. In contrast, the amplifiers for the variables in the linear programming network are linear and furthermore the circuit equations of the linear programming network (17) have a more direct relationship to the problem to be solved; the constraint relationships are explicitly represented. This is similar to conventional methods of analog computation in which the processing elements are chosen to compute specific terms in a differential equation to be solved. In fact, a computational circuit similar to that in Fig. 8 has been described [7]. Here we have analytically shown the stability of this circuit design and illustrate it as a limiting case of more general networks for which the circuit equations do not necessarily relate to the problem to be solved. Another distinction is that the signal decision/deconvolution circuit makes a decision on the basis of the absolute values of its analog inputs, while the linear programming circuit decisions are based only on the relative values of the input amplitudes $A$. This self-scaling property is often desired in signal processing and pattern recognition.

The practical usefulness of analog computational networks remains to be determined. Here, we have demonstrated that for “simple” computational tasks and well-defined initial conditions, the networks can sometimes be guaranteed of finding the global optimum solution. The major advantage of these architectures is their potential combination of speed and computational power [1]. Interesting practical uses of such circuits for complicated problems necessitate huge numbers of connections (resistors) and amplifiers. Such circuits might be built in integrated circuit technology. Work has begun on questions of the microfabrication of extensive resistive connection matrices [8], [9]. Optical implementations of such circuits are also feasible [10].

REFERENCES

[1] J. J. Hopfield and D. W. Tank, “‘Neural’ computation of decisions in optimization problems,” Biological Cybern., vol. 52, pp. 141-152, 1985.
[2] J. J. Hopfield and D. W. Tank, “Collective computation with continuous variables,” in Disordered Systems and Biological Organization, E. Bienenstock, F. Fogelman, and G. Weisbuch, Eds., Berlin, Germany: Springer-Verlag, 1985.
[3] M. R. Garey and D. S. Johnson, Computers and Intractability. New York: Freeman, 1979.
[4] S. E. Fahlman, “Three flavors of parallelism,” in Proc. of the Fourth National Conf. of the Canadian Society for Computational Studies of Intelligence, Saskatoon, Sask., Canada, May 1982.
[5] J. J. Hopfield, “Neurons with graded response have collective computational properties like those of two-state neurons,” Proc. Natl. Acad. Sci. U.S.A., vol. 81, pp. 3088-3092, 1984.
[6] J. J. Hopfield, “Neural networks and physical systems with emergent collective computational abilities,” Proc. Natl. Acad. Sci. U.S.A., vol. 79, pp. 2554-2558, 1982.
[7] I. B. Pyne, “Linear programming on an electronic analogue computer,” Trans. AIEE, Part I (Comm. & Elect.), vol. 75, 1956.
[8] M. Sivilotti, M. Emerling, and C. Mead, “A novel associative memory implemented using collective computation,” 1985 Conf. on VLSI's, H. Fuchs, Ed., Rockville, MD: Computer Science Press, 1985, p. 329.
[9] L. D. Jackel, R. E. Howard, H. P. Graf, R. Straughn, and J. Denker, “Artificial neural networks for computing,” in Proc. of the 29th Int. Symp. on Electron, Ion, and Photon Beams, to be published in the J. Vac. Sci. Tech.
[10] D. Psaltis and N. Farhat, “Optical information processing based on an associative-memory model of neural nets with thresholding and feedback,” Opt. Lett., vol. 10, pp. 98-100, 1985.

David W. Tank was born in Cleveland, OH, on June 3, 1953. He received his undergraduate education from Case Western Reserve University, Cleveland, OH, and Hobart College, Geneva, NY. He received the Ph.D. degree in physics from Cornell University, Ithaca, NY in 1983.
From 1983 to 1984 he was a Post-Doctoral fellow at AT&T Bell Laboratories in Murray Hill, NJ. He has remained at Bell Laboratories, joining the Molecular Biophysics Research Department in 1984. His research interests concern the biophysics of individual nerve cells, neural representations and coding, and the computational properties of neural circuits.

John J. Hopfield received the B.A. degree from Swarthmore College in 1954 and the Ph.D. degree in physics from Cornell University, Ithaca, NY, in 1958.
His research has included work on electron transfer in photosynthesis, accuracy and proofreading in biomolecular synthesis, studies of “neural” networks in biological computation, and optical properties and impurity levels of semiconductors. He is currently Roscoe G. Dickinson Professor of Chemistry and Biology at the California Institute of Technology and a member of the Molecular Biophysics Research Department at AT&T Bell Laboratories, Murray Hill, NJ.