
A Hierarchical Self-organizing Associative Memory for Machine Learning

Guided By
Prof. V. S. Patil
Introduction: A biological point of view

Memory is a critical component for understanding and developing natural intelligent machines/systems.

The question is: how?
Self Organizing Map

• Characteristics:
  * Self-organization
  * Unsupervised learning

[Figure: SOM neuron arrangement — winning neuron, nearest-neighbour neuron, other neurons and remote neurons, driven by a system clock; II: information index, ID: information deficiency]
Kohonen has demonstrated a neural learning structure involving networks that perform dimensionality reduction by converting the feature space into a topologically ordered similarity graph, map, or clustering diagram.
• This network contains two layers of nodes - an input layer and a
mapping (output) layer in the shape of a two-dimensional grid.

• The input layer acts as a distribution layer.

• The number of nodes in the input layer is equal to the number of features or attributes associated with the input.

• Each node of the mapping layer also has the same number of
features as there are input nodes.

• The network is fully connected in that every mapping node is connected to every input node.

• The mapping nodes are initialized with random numbers.

• Each actual input is compared with each node on the mapping grid.
• The "winning" mapping node is defined as the one with the smallest Euclidean distance between the mapping node vector and the input vector.

• The input thus maps to a given mapping node. The value of the
mapping node vector is then adjusted to reduce the Euclidean
distance.

• In addition, all of the neighboring nodes of the winning node are adjusted proportionally. In this way, the multi-dimensional (in terms of features) input nodes are mapped to a two-dimensional output grid.

• After all of the input is processed (usually after hundreds or thousands of repeated presentations), the result should be a spatial organization of the input data into clusters of similar (neighboring) regions.
S.O.M. Algorithm

1. Initialize weights with small random values.

2. Calculate the distances from the new input to all units according to dj = ||i(k) – mj(k)||, where i(k) is the input and mj(k) is the weight vector of unit j.

3. Select unit c as the winner such that dc = minj dj.

4. Update the weights of the winning unit.

Repeat steps 2–4 until enough iterations have been performed, then STOP.
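A minimal Python/NumPy sketch of this training loop, under assumed simplifications: the neighborhood update covers only the winner and its immediate grid neighbors, the learning rate is a fixed illustrative constant, and all names are placeholders rather than the original implementation.

```python
import numpy as np

def train_som(data, grid_h=10, grid_w=10, epochs=100, lr=0.1):
    """Minimal SOM sketch: data is an (n_samples, n_features) array scaled to [0, 1]."""
    n_features = data.shape[1]
    # 1. Initialize weights with small random values
    weights = np.random.rand(grid_h, grid_w, n_features) * 0.1
    for _ in range(epochs):
        for x in data:
            # 2. Euclidean distance from the input to every mapping node
            dists = np.linalg.norm(weights - x, axis=2)
            # 3. Winner = mapping node with the smallest distance
            wi, wj = np.unravel_index(np.argmin(dists), dists.shape)
            # 4. Move the winner (and its immediate neighbors) toward the input
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = wi + di, wj + dj
                    if 0 <= ni < grid_h and 0 <= nj < grid_w:
                        factor = 1.0 if (di, dj) == (0, 0) else 0.5  # neighbors move less
                        weights[ni, nj] += lr * factor * (x - weights[ni, nj])
    return weights
```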
Associative learning algorithm

Basic learning element

[Figure: interface model of a processing element (PE) with two inputs I1 and I2, one output O, and stored probabilities P00, P01, P10, P11]

The memory architecture consists of a multilayer array of processing elements (PEs).

Its organization follows a general self-organizing learning array concept.

The figure gives the interface model of an individual PE, which consists of two inputs (I1 and I2) and one output (O). Each PE stores observed probabilities P00, P01, P10 and P11, corresponding to the four different combinations of inputs I1 and I2 ({I1,I2} = {00}, {01}, {10}, {11}).
[Figure: example distribution of input data points in the (I1, I2) plane, scaled to [0, 1] and divided at 0.5 into four quadrants with counts n00, n10, n01, n11]

• The figure gives an example of a possible distribution of the observed input data points (scaled to the range [0, 1]).

• Probabilities are estimated from p00 = n00/ntot, p01 = n01/ntot, p10 = n10/ntot and p11 = n11/ntot, where n00, n01, n10 and n11 are the numbers of data points located in (I1 < 0.5 & I2 < 0.5), (I1 < 0.5 & I2 > 0.5), (I1 > 0.5 & I2 < 0.5) and (I1 > 0.5 & I2 > 0.5), respectively, and ntot is the total number of data points defined as

• ntot = n00 + n01 + n10 + n11
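A minimal sketch of this quadrant counting and probability estimation for a single PE (function and variable names are illustrative, not from the original implementation):

```python
import numpy as np

def estimate_pe_probabilities(points):
    """points: non-empty (n, 2) array of observed (I1, I2) pairs scaled to [0, 1]."""
    i1, i2 = points[:, 0], points[:, 1]
    # Count data points in each quadrant of the (I1, I2) space
    n00 = np.sum((i1 < 0.5) & (i2 < 0.5))
    n01 = np.sum((i1 < 0.5) & (i2 >= 0.5))
    n10 = np.sum((i1 >= 0.5) & (i2 < 0.5))
    n11 = np.sum((i1 >= 0.5) & (i2 >= 0.5))
    ntot = n00 + n01 + n10 + n11
    # Observed probabilities stored by the PE
    return {"p00": n00 / ntot, "p01": n01 / ntot,
            "p10": n10 / ntot, "p11": n11 / ntot}
```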
• Based on the probability distribution p00, p01, p10 and p11 of an
individual PE, each PE decides its output function value F by
specifying its truth table as shown in Table 1.

For example, during training each PE counts its input data points in n00, n01, n10 and n11 and estimates the corresponding probabilities p00, p01, p10 and p11.

The objective of the training stage for each PE is to discover the potential relationship between its inputs. This relationship is remembered as the corresponding probabilities and is used to make associations during the testing stage.

Considering the example in Fig. 2, this particular PE finds that most of its input data points are distributed in the lower-left corner (I1 < 0.5 & I2 < 0.5).

Therefore, if this PE only knows that one of the input signals satisfies I1 < 0.5, it will associatively predict that the other signal most likely also satisfies I2 < 0.5.
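A small illustrative sketch of this kind of prediction from the stored probabilities (the function name and the use of the dictionary returned by the estimation sketch above are assumptions):

```python
def predict_i2_given_i1_low(probs):
    """Given only I1 < 0.5, predict the more likely region for I2
    by comparing the stored probabilities p00 and p01."""
    # p00 ~ P(I1 < 0.5, I2 < 0.5), p01 ~ P(I1 < 0.5, I2 >= 0.5)
    return "I2 < 0.5" if probs["p00"] >= probs["p01"] else "I2 >= 0.5"
```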
Three types of associations

• IOA: input-only association
• OOA: output-only association
• INOUA: input-output association

[Figure: association modes of a PE — (a) IOA: I1 → I2f; (b) OOA: Of → {I1f, I2f}; (c) INOUA: {I1, Of} → I2f. Legend: defined signal, undefined signal, make association]
Probability based associative learning algorithm

• Case 1: Given the values of both inputs, decide the output value.

With V(I1) = m and V(I2) = n:

V(O) = [p(I1=1, I2=1, F=1) / p(I1=1, I2=1)]·V11 + [p(I1=0, I2=1, F=1) / p(I1=0, I2=1)]·V01
     + [p(I1=1, I2=0, F=1) / p(I1=1, I2=0)]·V10 + [p(I1=0, I2=0, F=1) / p(I1=0, I2=0)]·V00

where V11 = mn; V01 = (1 − m)n; V10 = m(1 − n); V00 = (1 − m)(1 − n).
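A minimal sketch of the Case 1 computation, assuming the joint probabilities p(I1, I2, F=1) and the pairwise probabilities p(I1, I2) are available from the training counts (the dictionary layout and names are illustrative):

```python
def case1_output(p_joint, p_pair, m, n):
    """V(O) from soft input values V(I1)=m, V(I2)=n.
    p_joint[(a, b)] ~ p(I1=a, I2=b, F=1); p_pair[(a, b)] ~ p(I1=a, I2=b)."""
    weights = {(1, 1): m * n, (0, 1): (1 - m) * n,
               (1, 0): m * (1 - n), (0, 0): (1 - m) * (1 - n)}
    v_o = 0.0
    for key, w in weights.items():
        if p_pair[key] > 0:  # conditional probability p(F=1 | I1, I2) times weight
            v_o += (p_joint[key] / p_pair[key]) * w
    return v_o
```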
• Case 2: Given the value of one input and an undefined output, decide the value of the other input.

For instance:

V(I2) = [p(I1=1, I2=1) / p(I1=1)]·V(I1) + [p(I1=0, I2=1) / p(I1=0)]·(1 − V(I1))

where p(I1=1) = p10 + p11 and p(I1=0) = p00 + p01.
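A corresponding sketch for Case 2, using the quadrant probabilities stored by the PE (names are illustrative and follow the estimation sketch above):

```python
def case2_other_input(probs, v_i1):
    """V(I2) from a soft value V(I1) only; probs holds p00, p01, p10, p11."""
    p_i1_1 = probs["p10"] + probs["p11"]   # p(I1 = 1)
    p_i1_0 = probs["p00"] + probs["p01"]   # p(I1 = 0)
    term1 = (probs["p11"] / p_i1_1) * v_i1 if p_i1_1 > 0 else 0.0
    term0 = (probs["p01"] / p_i1_0) * (1 - v_i1) if p_i1_0 > 0 else 0.0
    return term1 + term0
```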
• Case 3: Given the value of the output, decide the values of both inputs.

V(I1) = [p(F=1, I1=1) / p(F=1)]·V(O) + [p(F=0, I1=1) / p(F=0)]·(1 − V(O))

V(I2) = [p(F=1, I2=1) / p(F=1)]·V(O) + [p(F=0, I2=1) / p(F=0)]·(1 − V(O))
• Case 4: Given the value of one input and the output, decide the other input value.

For instance:

V(I2) = [p(I1=1, F=1, I2=1) / p(I1=1, F=1)]·V̂11 + [p(I1=0, F=1, I2=1) / p(I1=0, F=1)]·V̂01
      + [p(I1=1, F=0, I2=1) / p(I1=1, F=0)]·V̂10 + [p(I1=0, F=0, I2=1) / p(I1=0, F=0)]·V̂00
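A sketch of the Case 3 computation, under the assumption that the PE has also recorded the joint probabilities of its output F with each input during training (the bookkeeping containers p_f, p_f_i1 and p_f_i2 are assumptions, not quantities defined above):

```python
def case3_both_inputs(p_f, p_f_i1, p_f_i2, v_o):
    """Decide V(I1) and V(I2) from a soft output value V(O).
    p_f[f] ~ p(F=f); p_f_i1[(f, 1)] ~ p(F=f, I1=1); p_f_i2[(f, 1)] ~ p(F=f, I2=1)."""
    def term(joint, f, weight):
        # Conditional probability p(I=1 | F=f), weighted by the belief in F=f
        return (joint[(f, 1)] / p_f[f]) * weight if p_f[f] > 0 else 0.0
    v_i1 = term(p_f_i1, 1, v_o) + term(p_f_i1, 0, 1 - v_o)
    v_i2 = term(p_f_i2, 1, v_o) + term(p_f_i2, 0, 1 - v_o)
    return v_i1, v_i2
```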
Memory network architecture and operation
[Figure: feed forward operation (left) and feedback operation (right) of a 4-layer array of 24 PEs (6 per layer, numbered 1–24) driven by 6 input data signals; "?" marks undefined signals]


• A feed forward network structure for the proposed memory architecture. For simplification, the figure shows 4 layers with 6 PEs per layer and 6 input signals. The bold lines from PE 1 to PE 11 and from PE 18 to PE 21 are two examples of the distant connections.

Feedback operation is essential for the network to make correct associations and to recover the missing parts (undefined signals) of the input data.
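A rough structural sketch of how such a layered PE array could be wired, under strong simplifying assumptions: each PE simply picks two source signals from the previous layer (distant connections across layers are omitted), and its output function is a stand-in rather than the trained truth table described earlier.

```python
import random

class PE:
    """Placeholder processing element: holds two input indices into the previous layer."""
    def __init__(self, src1, src2):
        self.src1, self.src2 = src1, src2

    def forward(self, prev_signals):
        i1, i2 = prev_signals[self.src1], prev_signals[self.src2]
        # Stand-in output function; a trained PE would evaluate its truth table F
        return 1 if (i1 >= 0.5) == (i2 >= 0.5) else 0

def build_network(n_inputs=6, n_layers=4, pes_per_layer=6, seed=0):
    rng = random.Random(seed)
    layers, width = [], n_inputs
    for _ in range(n_layers):
        layers.append([PE(rng.randrange(width), rng.randrange(width))
                       for _ in range(pes_per_layer)])
        width = pes_per_layer
    return layers

def feed_forward(layers, inputs):
    signals = list(inputs)
    for layer in layers:
        signals = [pe.forward(signals) for pe in layer]
    return signals
```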
Simulation analysis
Hetero-associative memory:
Iris database classification
The class ID is coded using M-bit code redundancy. Since there are 3 classes in the database, M*3 bits are used to code the class ID.

The input features are encoded with an N-bit sliding-bar coding mechanism (a bar of L bits whose position depends on the feature value V − Min), and the class identity labels occupy 3M bits (M bits per class).

In this simulation: N = 80, L = 20, M = 30.

[Figure: sliding-bar coding of the features, the 3M-bit class identity labels for Class 1, Class 2 and Class 3, and the neuron association pathway]
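A minimal sketch of one plausible reading of this sliding-bar coding, in which the scaled feature value sets the starting position of a bar of L consecutive ones inside an N-bit word (this interpretation and all names are assumptions, not the original encoder):

```python
import numpy as np

def sliding_bar_encode(value, v_min, v_max, n_bits=80, bar_len=20):
    """Encode a scalar feature as an N-bit word containing a bar of L ones
    whose starting position slides with the scaled feature value."""
    frac = (value - v_min) / (v_max - v_min)        # scale to [0, 1]
    start = int(round(frac * (n_bits - bar_len)))   # bar position
    code = np.zeros(n_bits, dtype=np.uint8)
    code[start:start + bar_len] = 1
    return code
```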

Classification accuracy: 96%


Auto-associative memory:
Panda image recovery
64 x 64 binary panda image:
pi = (x1, x2, ..., xn), n = 4096, where xi = 1 for a black pixel and xi = 0 for a white pixel.
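A small sketch of this vectorization for an arbitrary black-and-white image (PIL-based; the file name is a placeholder and the 128 threshold is an assumption):

```python
import numpy as np
from PIL import Image

def image_to_binary_vector(path, size=(64, 64)):
    """Load an image, binarize it, and flatten it to a vector of n = 4096 bits
    with 1 for black pixels and 0 for white pixels."""
    img = Image.open(path).convert("L").resize(size)
    pixels = np.asarray(img)
    return (pixels < 128).astype(np.uint8).ravel()   # dark pixels -> 1

# vec = image_to_binary_vector("panda.png")  # placeholder file name
```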
Conclusion and future research

• A hierarchical associative memory architecture that uses probability-based associations is proposed for machine learning.

• Through the associative learning algorithm, each processing element in the network learns the statistical data distribution and uses this information for input data association and prediction.

• Simulation results on both classification and image recovery applications show the effectiveness of the proposed method.
Future Research:
• Multiple-input (>2) association mechanism
• Dynamic self-configurability
• Hardware implementation
• Facilitate goal-driven learning
