Summary
In [1] the authors consider extending the concept of Hopfield associative memory to neural networks that use higher-order neurons. The state of these higher-order neurons depends not only on the states of their neighbors, but also on their second- and higher-order correlations. The authors argue that the assumption of higher-order neurons is biologically meaningful: the brain is densely packed with neurons in close vicinity of each other, and under such conditions neighboring neurons can affect the synaptic weights of other connections in addition to their own. Using such a model, the authors show that the storage capacity of Hopfield networks can be improved. More specifically, they obtain storage capacities that are polynomial in N with exponents linear in p, the order of the network; in other words, $M = O(N^{p/2})$. Note, however, that the number of stored patterns is still proportional to the number of synapses: for a network of size N, we get more synapses by considering higher-order correlations. Rate: 10+/10
The update rule of a higher-order neuron is
$$s_i = f\Big(\sum_j w_{ij}\, s_j + \sum_{j,k} w_{ijk}\, s_j s_k + \dots - \theta\Big) \qquad (1)$$
where $f(\cdot)$ is a non-linear function and $\theta$ is the firing threshold. Figure 1 illustrates the schematic of a higher-order neuron.
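As a concrete illustration of (1), here is a minimal sketch of one update sweep for a network truncated at third order, assuming $f(\cdot)$ is the sign non-linearity; the function and variable names are my own choices, not the authors'.

```python
import numpy as np

def higher_order_update(s, W2, W3, theta=0.0):
    """One synchronous sweep of update rule (1), truncated at order three.

    s     : current state vector of +/-1 values, shape (N,)
    W2    : second-order weights w_ij, shape (N, N)
    W3    : third-order weights w_ijk, shape (N, N, N)
    theta : firing threshold
    """
    # Pairwise contribution: sum_j w_ij s_j
    h = W2 @ s
    # Second-order correlation contribution: sum_{j,k} w_ijk s_j s_k
    h += np.einsum('ijk,j,k->i', W3, s, s)
    # f(.) taken as the sign non-linearity
    return np.where(h - theta >= 0, 1, -1)
```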
The overall problem is the same as that of Hopfield associative memory: we have M binary patterns of length N, denoted by $X = \{x^\mu\}$, which we would like to memorize using multiconnected neural networks. The update rule is given by equation (1). [disadvantage][question] The authors have assumed that the absolute values of the higher-order connection weights scale with the second-order ones, i.e.
$$|w_{ijk}| \approx \gamma\, |w_{ij}| \qquad (2)$$
where $\gamma$ is a scaling factor.
[very important] With the assumption of $\theta = 0$, the learning rule for the weights is an extended version of the Hopfield learning rule:
$$w_{ij} = \langle s_i s_j \rangle, \quad w_{ijk} = \gamma\, \langle s_i s_j s_k \rangle, \quad \dots, \quad w_{ijk \dots p} = \gamma^{p-2}\, \langle s_i s_j s_k \dots s_p \rangle \qquad (3)$$
where $\langle \cdot \rangle$ denotes averaging over all patterns in the training set. Using an analysis similar to that of Hopfield, the authors break down the linear input sum of each neuron into a desired term and an interference term. Assuming the patterns to be
stochastically independent, the interference term becomes a Gaussian random variable as a consequence of the central limit theorem. The stability condition then reduces to this Gaussian random variable being smaller than the desired term, and requiring the probability of the complementary event to be small yields the maximum number of patterns that can be stored.
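The signal-versus-interference decomposition is easy to check numerically. The sketch below clamps a standard (second-order) Hebbian network at a stored pattern and measures the crosstalk; all sizes are arbitrary choices of mine, and the CLT comparison is only an order-of-magnitude estimate.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 500, 100

# M independent random +/-1 patterns
X = rng.choice([-1, 1], size=(M, N))
W = (X.T @ X) / M            # second-order Hebbian weights

# Linear input to each neuron with the network clamped at pattern 0
h = W @ X[0]
# Desired term is x_i^0 * N / M; the rest is interference (crosstalk)
interference = h - X[0] * (N / M)

print("interference mean:", interference.mean())
print("interference std :", interference.std())
# CLT predicts an approximately Gaussian interference with std of order sqrt(N/M)
print("CLT order estimate:", np.sqrt(N / M))
```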
Results
1. [very important][advantage][good for report] For $\gamma \le 1/N$, the capacity is $(1 + \gamma N)^2$ times that of Hopfield networks.
2. [very important][advantage][good for report] For $\gamma > 1/\sqrt{N}$, the capacity is $2^{p-1} N^{(p-1)/2}$ times that of Hopfield networks. Here, p is the order of the network.
In the above expression, note that the exponent is linear in p. [important] For networks with orders two and three, the pattern retrieval capacity is:
$$M = \frac{(1 + \gamma N)^2}{1 + 2^{5/2}\, \gamma^{1/2} N^{1/2} + 3\, \gamma^2 N}\; M_{Hopfield} \qquad (4)$$
This formula is valid over the whole range of values of $\gamma$. [important][idea] An interesting point to note is that the number of stored patterns is still proportional to the number of synapses: for a network of size N, we get more synapses by considering higher-order correlations. It has also been found that the proportionality factor between the number of stored bits and the number of synapses is a decreasing function of the synaptic order.
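To get a feel for equation (4), the following sketch evaluates the capacity ratio $M / M_{Hopfield}$ at a few illustrative values of $N$ and $\gamma$ (the chosen numbers are hypothetical, for illustration only).

```python
def capacity_ratio(N, gamma):
    """M / M_Hopfield from equation (4) for a network of orders two and three."""
    num = (1.0 + gamma * N) ** 2
    den = 1.0 + 2.0 ** 2.5 * gamma ** 0.5 * N ** 0.5 + 3.0 * gamma ** 2 * N
    return num / den

for gamma in (1e-6, 1e-3, 1.0):
    print(f"N=1000, gamma={gamma:g}: M/M_Hopfield = {capacity_ratio(1000, gamma):.2f}")
```

As $\gamma \to 0$ the ratio tends to one, while for $\gamma$ of order one it grows in proportion to N, consistent with the $N^{(p-1)/2}$ scaling at p = 3.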
[disadvantage][question] The authors have assumed that the absolute values of the higher-order connection weights scale with the second-order ones, i.e.
$$|w_{ijk}| \approx \gamma\, |w_{ij}| \qquad (5)$$
[very important][idea] With the assumption of $\theta = 0$, the learning rule for the weights is an extended version of the Hopfield learning rule:
$$w_{ij} = \langle s_i s_j \rangle, \quad \dots, \quad w_{ijk \dots p} = \gamma^{p-2}\, \langle s_i s_j s_k \dots s_p \rangle \qquad (6)$$
where $\langle \cdot \rangle$ denotes averaging over all patterns in the training set. [idea][My opinion]: I think that in order to fully utilize the power of higher-order neurons, not only should the learning rule be based on higher-order correlations, but the neural update rule itself should also look like (1). If the neural update rule is the same as the traditional one, then even with the best learning rule, which is the pseudo-inverse rule, one cannot learn more than a linear number of patterns. [advantage] Using the proposed scheme, the authors obtain polynomial storage capacities with exponents linear in p, the order of the network. [idea][important] The authors have also investigated the effect of noise during the learning phase.
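Putting the update rule (1) and the learning rule (6) together, a toy retrieval experiment (truncated at order three, with every size, the value of $\gamma$, and the noise level being arbitrary choices of mine) might look like this:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, gamma = 64, 30, 0.5

# Store M random +/-1 patterns with learning rule (6), truncated at order three
X = rng.choice([-1, 1], size=(M, N))
W2 = (X.T @ X) / M
W3 = gamma * np.einsum('mi,mj,mk->ijk', X, X, X) / M

# Probe: a stored pattern with 10% of its bits flipped
probe = X[0].copy()
flip = rng.choice(N, size=N // 10, replace=False)
probe[flip] *= -1

# A few synchronous sweeps of update rule (1) with theta = 0
s = probe
for _ in range(5):
    h = W2 @ s + np.einsum('ijk,j,k->i', W3, s, s)
    s = np.where(h >= 0, 1, -1)

print("overlap with the stored pattern:", (s * X[0]).mean())
```

An overlap near 1.0 indicates successful retrieval; for simplicity this sketch does not zero out the self-connection terms in W2 and W3.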
References
[1] P. Peretto and J. J. Niez, "Long term memory storage capacity of multiconnected neural networks," Biological Cybernetics, vol. 54, no. 1, pp. 53-63, 1986.