out = f(net), where net = Σ (i = 1 to n) xi·wi
f(net) = 1 if net > s, and f(net) = 0 if net ≤ s
Unit III Artificial Neural Network
A.S.S.Murugan, SL/EEE, KLNCE, Pottapalayam 3.6
A neuron is an abstract model of a natural neuron, as illustrated in Figs. We have inputs x1, x2, ..., xm coming into the neuron. These inputs are the stimulation levels of a natural neuron. Each input xi is multiplied by its corresponding weight wi, then the product xi·wi is fed into the body of the neuron. The weights represent the biological synaptic strengths in a natural neuron. The neuron adds up all the products for i = 1, ..., m. The weighted sum of the products is usually denoted as net in the neural network literature, so we will use this notation. That is, the neuron evaluates net = x1w1 + x2w2 + ... + xmwm.
In mathematical terms, given two vectors x = (x1, x2, ..., xm) and w = (w1, w2, ..., wm), net is the dot (or scalar) product of the two vectors, x·w = x1w1 + x2w2 + ... + xmwm. Finally, the neuron
computes its output y as a certain function of net, i.e., y = f (net). This function is called the
activation (or sometimes transfer) function. We can think of a neuron as a sort of black box,
receiving input vector x then producing a scalar output y. The same output value y can be sent
out through multiple edges emerging from the neuron.
Fig. (a)A neuron model that retains the image of a natural neuron. (b) A further abstraction of Fig. (a).
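The computation just described can be sketched in a few lines of Python; the function and variable names here are illustrative, not taken from the text:

```python
# Sketch of the abstract neuron above: net is the weighted sum of the
# inputs, and the output y is the activation function f applied to net.
def neuron_output(x, w, f):
    net = sum(xi * wi for xi, wi in zip(x, w))  # net = x1*w1 + ... + xm*wm
    return f(net)                               # y = f(net)

# A step (threshold) activation with threshold s = 0.
def step(net):
    return 1 if net > 0 else 0

x = [1.0, 0.5, -0.2]   # input vector (stimulation levels)
w = [0.4, 0.6, 1.0]    # synaptic weights
print(neuron_output(x, w, step))   # net = 0.4 + 0.3 - 0.2 = 0.5, so y = 1
```

The same scalar output y could then be fanned out along any number of outgoing edges, as the text notes.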
Back Propagation Network (BPN)
It is a multi-layer feed-forward network trained with an extended gradient-descent based delta learning rule.
Fig. Structure of biological neuron
The artificial neuron was designed to mimic the first order characteristics of the biological
neuron. McCulloch and Pitts suggested the first synthetic neuron in the early 1940s. In essence, a
set of inputs are applied, each representing the output of another neuron. Each input is multiplied
by a corresponding weight, analogous to a synaptic strength, and all of the weighted inputs are
then summed to determine the activation level of the neuron. If this activation exceeds a certain
AI Applications to Power Systems
threshold, the unit produces an output response. This functionality is captured in the artificial
neuron known as the threshold logic unit (TLU) originally proposed by McCulloch and Pitts.
Fig. Artificial neuron structure (perceptron model)
Figure shows a model that implements this idea. Despite the diversity of network paradigms, nearly all are based upon this neuron configuration. Here a set of inputs labeled X1, X2, ..., Xn is applied from the input space to the artificial neuron. These inputs, collectively referred to as the input vector X, correspond to the signals arriving at the synapses of a biological neuron. Each signal is multiplied by an associated weight W1, W2, ..., Wn before it is applied to the summation block.
The activation a is given by
a = X1W1 + X2W2 + ... + XnWn
which may be represented more compactly as
a = Σ (i = 1 to n) XiWi
The output y is then given by y = f(a), where f is an activation function. In the McCulloch-Pitts perceptron model a hard limiter was used as the activation function, defined as
f(a) = 1 if a > s, and f(a) = 0 otherwise.
The threshold s will often be zero. The activation function is sometimes called a step-function.
Researchers have also tried other non-linear activation functions, such as the sigmoid and the Gaussian; the neuron responses for different activation functions are shown in Fig. 3.3
Network Architectures
Network architectures can be categorized into three main types: feedforward networks, recurrent networks (feedback networks) and self-organizing networks. This classification of networks was proposed by Kohonen [1990]. A network is feedforward if all of the hidden and output neurons receive inputs from the preceding layer only. The input is presented to the input layer and propagated forwards through the network. The output never forms a part of its own
input. A recurrent network has at least one feedback loop, i.e., a cyclic connection, which means that at least one of its neurons feeds its signal back to the inputs of other neurons. The behavior
of such networks may be extremely complex.
Haykin divides networks into four classes [Haykin, 1994]: 1) single-layer feedforward
networks, 2) multilayer feedforward networks, 3) recurrent networks, and 4) lattice structures. A
lattice network is a feedforward network, which has output neurons arranged in rows and
columns.
Layered networks are said to be fully connected if every node in each layer is connected to
all of the nodes in the following layer. If any of the connections is missing, then the network is said to be partially connected. Partially connected networks can be formed if some prior information about
the problem is available and this information supports the use of such a structure. The following
treatment of networks applies mainly to feed forward networks (single layer networks, MLP,
RBF, etc.). The designation n-layer network refers to the number of computational node layers or, equivalently, the number of weight-connection layers. Thus the input node layer is not taken into account.
Feed-forward networks
Feed-forward ANNs allow signals to travel one way only, from input to output. There is no feedback (no loops), i.e. the output of any layer does not affect that same layer. Feed-forward ANNs tend to be straightforward networks that associate inputs with outputs. They are extensively used in pattern recognition. This type of organisation is also referred to as bottom-up
or top-down.
Fig.3.7 An example of a simple feedforward network
Figure A feedforward network
Feedback networks
Feedback networks can have signals travelling in both directions by introducing loops in
the network. Feedback networks are very powerful and can get extremely complicated. Feedback
networks are dynamic; their 'state' is changing continuously until they reach an equilibrium
point. They remain at the equilibrium point until the input changes and a new equilibrium needs
to be found. Feedback architectures are also referred to as interactive or recurrent, although the
latter term is often used to denote feedback connections in single-layer organisations.
Multilayered/non-multilayered - Topology of the network architecture
(i) Multilayered
The back propagation model is multilayered since it has distinct layers such as input, hidden,
and output. The neurons within each layer are connected with the neurons of the adjacent layers
through directed edges. There are no connections among the neurons within the same layer.
(ii) Non-multilayered
We can also build a neural network without such distinct layers as input, output, or hidden. Every neuron can be connected with every other neuron in the network through directed edges, and every neuron may serve as both input and output. A typical example is the Hopfield model.
Non-recurrent/recurrent - Directions of output
(i) Non-recurrent (feedforward only)
In the backpropagation model, the outputs always propagate from left to right in the
diagrams. This type of output propagation is called feedforward. In this type, outputs from the
input layer neurons propagate to the right, becoming inputs to the hidden layer neurons, and then
outputs from the hidden layer neurons propagate to the right becoming inputs to the output layer
neurons. Neural network models with feedforward connections only are called non-recurrent. Incidentally, "backpropagation" in the backpropagation model should not be confused with feedbackward propagation. Backpropagation refers to the backward adjustment of the weights, not to backward movement of outputs from neurons.
(ii) Recurrent (both feedforward and feedbackward)
In some other neural network models, outputs can also propagate backward, i.e., from right
to left. This is called feedbackward. A neural network in which the outputs can propagate in
both directions, forward and backward, is called a recurrent model. Biological systems have
such recurrent structures. A feedback system can be represented by an equivalent feedforward
system.
Single-layer Feed forward Networks
A single layer feed forward network represents the simplest form of neural network. In such a network there are only two layers, an input layer and an output layer. The phrase single layer refers to the output layer of neurons (computation nodes). The input layer is not counted as a layer because no computation is done in it. The inputs are multiplied by weights denoted by W. For instance, the input X1 is multiplied by a weight W1. The same is done for the rest of the inputs as well. Finally a weight vector comprising all the weights is formed. The results of all the multiplications of the inputs and weights are then fed to the summer, where addition is executed. The output of the summer is then fed to the linear threshold unit. If the input to the summer is above the threshold level, an output of 1 is produced; else, an output of 0 occurs. All the data can be presented to the network in binary form (example: 1 and 0) or in bipolar form (example: 1 and -1). Figure illustrates the block diagram of a single layer feed forward network.
Fig. Block Diagram of Single Layer Feed forward Network
The simplest choice of neural network is the following weighted sum:
y(x) = Σ (i = 0 to d) wi·xi
where d is the dimension of the input space, x0 = 1 and w0 is the bias parameter. The input vector x can be considered as a set of activations of the input layer. In classification problems y(x) is called a discriminant function, because y(x) = 0 can be interpreted as a decision boundary. The weight vector w determines the orientation of the decision plane and the bias parameter w0 determines its distance from the origin. In regression problems the use of this kind of network is limited: only (d-1)-dimensional hyperplanes can be modeled. An example of a single-layer network is a linear associative memory, which associates an output vector with an input vector.
yk(x) = Σ (i = 0 to d) wki·xi
where again x0 = 1 and wk0 is the bias parameter. The connection from input i to output k is weighted by the weight parameter wki.
Figure The simplest neural networks. Computation is done in the second layer of nodes.
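The discriminant behaviour of such a single-layer unit can be sketched as follows; the weight values are illustrative, chosen only to show the two sides of the decision boundary:

```python
# Hedged sketch of a single-layer discriminant function:
# y(x) = w0 + w1*x1 + ... + wd*xd, with y(x) = 0 as the decision boundary.
def discriminant(x, w, w0):
    return w0 + sum(wi * xi for wi, xi in zip(w, x))

w, w0 = [1.0, -1.0], 0.5   # orientation of the decision plane, and bias
print(discriminant([2.0, 1.0], w, w0) > 0)   # 1.5 > 0: one side of the plane
print(discriminant([0.0, 2.0], w, w0) > 0)   # -1.5 < 0: the other side
```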
Functions of this form can be generalized by using a (monotonic) linear or nonlinear activation function g, which acts on the weighted sum as
y(x) = g( Σ (i = 0 to d) wi·xi )
where g(v) is usually chosen to be a threshold function, a piecewise linear function, the logistic sigmoid, or the hyperbolic tangent function (tanh). The first neuron model was of this type and was proposed as early as the 1940s by McCulloch and Pitts.
Threshold function (step function): g(v) = 1 if v ≥ 0, and g(v) = 0 if v < 0.
Piecewise linear function (pseudolinear): g(v) = 0 for v ≤ 0, g(v) = v for 0 < v < 1, and g(v) = 1 for v ≥ 1.
Logistic sigmoid: g(v) = 1 / (1 + e^(-v))
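These three activation functions can be sketched in a few lines; this is a minimal illustration, not tied to any particular library:

```python
import math

def threshold(v):            # step function
    return 1.0 if v >= 0 else 0.0

def piecewise_linear(v):     # pseudolinear: identity clipped between 0 and 1
    return min(1.0, max(0.0, v))

def logistic(v):             # logistic sigmoid
    return 1.0 / (1.0 + math.exp(-v))

for v in (-2.0, 0.5, 2.0):
    print(v, threshold(v), piecewise_linear(v), round(logistic(v), 3))
```

Note how the logistic sigmoid is a smooth, differentiable version of the step function, which is what makes gradient-based training possible later in this unit.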
Multilayer Feed forward Network
The clear distinction between a single-layer and a multilayer feed forward network is the introduction of hidden units. In a single-layer network there is an input layer of source nodes and an output layer of neurons. A multilayer network has, in addition, one or more hidden layers of hidden neurons. Some standard three-layer feed-forward networks are used widely.
The objective of the hidden unit is to intervene between the input and output layer,
enabling the Network to extract higher-order statistics. Figure illustrates the Architecture of the
Multilayer Feed forward Network. The data processing between the input layer and the summer
is similar to the single layer feed forward Network. Apart from formation of a weight vector
between the two layers, another vector between the hidden units and the output layer must be
formed.
A representative feed-forward neural network consists of a three-layer structure: input layer, output layer and hidden layer. Each layer is composed of a variable number of nodes. The number of nodes in the hidden layers is selected to make the network more efficient and to interpret the data more accurately. The relationship between the input and output can be non-linear or linear, and
its characteristics are determined by the weights assigned to the connections between the nodes
in the two adjacent layers. Changing the weight will change the input-to-output behavior of the
network.
Fig. A fully connected feed-forward network with one hidden layer and one output layer
Figure 3.10 The multilayer perceptron network
The summing junction of hidden unit j is obtained by the following weighted linear combination:
aj = Σ (i = 1 to d) wji·xi + wj0
where wji is a weight in the first layer (from input unit i to hidden unit j) and wj0 is the bias for hidden unit j. The activation (output) of hidden unit j is then obtained by
zj = g(aj)
For the output of the whole network the following activation is constructed:
yk = g( Σ (j = 1 to M) wkj·zj + wk0 )
where M is the number of hidden units; the output-layer activation function may differ from that of the hidden layer. The two-layered multilayer perceptron in Fig. can be represented as a function by combining the previous expressions in the form
yk(x) = g( Σ (j = 1 to M) wkj·g( Σ (i = 1 to d) wji·xi + wj0 ) + wk0 )
The activation function for the output unit can be linear. In that case the network becomes a special case of a linear model in which the basis functions are the hidden unit outputs.
If the activation functions in the hidden layer are linear, then such a network can be converted into an equivalent network without hidden units, because two successive linear transformations collapse into a single linear transformation. Networks having non-linear hidden unit activation functions are therefore preferred.
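A minimal forward pass matching the layer equations above might look like this; the weights are illustrative values, and the sigmoid is one possible choice of g (with a linear output unit):

```python
import math

# Two-layer MLP forward pass:
# aj = sum_i wji*xi + wj0 ; zj = g(aj) ; yk = sum_j wkj*zj + wk0 (linear output)
def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def mlp_forward(x, hidden_w, hidden_b, out_w, out_b):
    z = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
         for ws, b in zip(hidden_w, hidden_b)]          # hidden activations zj
    return [sum(w * zj for w, zj in zip(ws, z)) + b     # linear outputs yk
            for ws, b in zip(out_w, out_b)]

x = [1.0, 0.5]
hidden_w, hidden_b = [[0.2, -0.4], [0.7, 0.1]], [0.0, -0.3]
out_w, out_b = [[1.0, -1.0]], [0.1]
print(mlp_forward(x, hidden_w, hidden_b, out_w, out_b))
```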
Note: In the literature the equation for the output of a neuron can also be seen written as
yk(x) = g( Σ (i = 1 to d) wki·xi + uk )
where uk is the bias parameter. Mathematically this is equivalent to the former equations, where the bias was included in the summation: we can always set wk0 = uk and x0 = 1. The sign of the bias term can, of course, be included in the weight parameter rather than in the input.
MLPs are mainly used for functional approximation rather than classification problems. They are generally unsuitable for modeling functions with significant local variations. The universal approximation theorem states that an MLP can approximate any continuous function arbitrarily well, although it gives no indication of the required complexity of the MLP. The Vapnik-Chervonenkis dimension dVC gives a rough approximation of the complexity. According to this principle, the amount of training data should be approximately ten times the dVC, or the number of weights in the MLP.
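This rule of thumb amounts to simple arithmetic; as an illustration (the layer sizes below are made up):

```python
# Count the weights of a fully connected MLP (including biases); by the rule
# of thumb above, the training set should be roughly ten times this number.
def num_weights(layer_sizes):
    # each pair of adjacent layers contributes (inputs + 1 bias) * outputs
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

layers = [8, 10, 1]                 # 8 inputs, 10 hidden units, 1 output
w = num_weights(layers)             # (8+1)*10 + (10+1)*1 = 101
print(w, "weights -> about", 10 * w, "training samples")
```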
MLPs are suitable for high-dimensional function approximation if the desired function can be approximated by a low number of ridge functions (MLPs employ ridge functions in the hidden layer). They may perform well even when the training data have redundant inputs.
Back-Propagation Algorithm
Multilayer perceptrons have been applied successfully to solve some diverse, difficult problems by training them in a supervised manner with a highly popular algorithm known as the error back-propagation algorithm. This algorithm is based on the error-correction learning rule. It may be viewed as a generalization of an equally popular adaptive filtering algorithm, the least mean square (LMS) algorithm.
Error back-propagation learning consists of two passes through the different layers of the
network: a forward pass and a backward pass. In the forward pass, an input vector is applied to
the nodes of the network, and its effect propagates through the network layer by layer. Finally, a
set of outputs is produced as the actual response of the network. During the forward pass the
weights of the networks are all fixed. During the backward pass, the weights are all adjusted in
accordance with an error correction rule. The actual response of the network is subtracted from a
desired response to produce an error signal. This error signal is then propagated backward
through the network, against the direction of synaptic connections. The weights are adjusted to
make the actual response of the network move closer to the desired response.
Fig.3.11 Multiple layer perceptrons with back-propagation algorithm
A multilayer perceptron has three distinctive characteristics:
1. The model of each neuron in the network includes a nonlinear activation function. The sigmoid function is commonly used, defined by the logistic function
y = 1 / (1 + e^(-v))
Another commonly used function is the hyperbolic tangent, y = tanh(v). The presence of nonlinearities is important because otherwise the input-output relation of the network could be reduced to that of a single layer perceptron.
2. The network contains one or more layers of hidden neurons that are not part of the input or
output of the network. These hidden neurons enable the network to learn complex tasks.
3. The network exhibits a high degree of connectivity. A change in the connectivity of the network requires a change in the population of its synaptic weights.
Learning Process
To illustrate the process, a three layer neural network with two inputs and one output, shown in the picture below, is used.
Each neuron is composed of two units. The first unit adds the products of the weight coefficients and the input signals. The second unit realises a nonlinear function, called the neuron activation function. Signal e is the adder output signal, and y = f(e) is the output signal of the nonlinear element, which is also the output signal of the neuron.
Three layer neural network with two inputs and single output
The training data set consists of input signals (x1 and x2) assigned with the corresponding target (desired output) y. The network training is an iterative process. In each iteration the weight coefficients of the nodes are modified using new data from the training data set. Symbol wmn represents the weight of the connection between the output of neuron m and the input of neuron n in the next layer. Symbol yn represents the output signal of neuron n.
Propagation of signals through the output layer.
In the next algorithm step the output signal of the network, y, is compared with the desired output value (the target), which is found in the training data set. The difference is called the error signal of the output layer neuron.
It is impossible to compute the error signals for internal neurons directly, because the output values of these neurons are unknown. For many years an effective method for training multilayer networks was unknown; only in the mid-eighties was the backpropagation algorithm worked out. The idea is to propagate the error signal (computed in a single teaching step) back to all neurons whose output signals were inputs for the neuron in question.
When the error signal for each neuron has been computed, the weight coefficients of each neuron input node may be modified. In the formulas below, df(e)/de represents the derivative of the activation function of the neuron whose weights are modified.
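The update formulas referred to here appear as figures in the original. The general shape of one weight update, sketched here under the assumption of a logistic activation function (all numeric values are illustrative), is w_new = w_old + η·δ·(df(e)/de)·x, where δ is the error signal propagated back to the neuron:

```python
import math

def sigmoid(e):
    return 1.0 / (1.0 + math.exp(-e))

def dsigmoid(e):                       # df(e)/de for the logistic function
    s = sigmoid(e)
    return s * (1.0 - s)

# One weight update: w_new = w + eta * delta * df(e)/de * x
def update_weight(w, x, e, delta, eta):
    return w + eta * delta * dsigmoid(e) * x

w, x, e, delta, eta = 0.5, 1.0, 0.2, 0.1, 0.7
print(update_weight(w, x, e, delta, eta))
```

Each weight of each neuron is updated this way, using that neuron's own error signal δ and adder output e.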
The learning-rate coefficient η affects the network teaching speed. There are a few techniques for selecting this parameter. The first method is to start the teaching process with a large value of the parameter; while the weight coefficients are being established, the parameter is gradually decreased. The second, more complicated, method starts teaching with a small parameter value; during the teaching process the parameter is increased as the teaching advances and then decreased again in the final stage. Starting the teaching process with a low parameter value makes it possible to determine the signs of the weight coefficients.
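The first technique (start large, decrease gradually) can be sketched as a simple geometric decay; the starting value and decay factor below are illustrative choices, not prescribed by the text:

```python
# Geometric learning-rate decay: eta at a given step = eta0 * decay^step.
def decreasing_eta(eta0, decay, step):
    return eta0 * (decay ** step)

for step in range(5):
    print(step, round(decreasing_eta(0.5, 0.8, step), 4))
```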
Fig 3.2 Flowchart showing working of BPA
Recurrent Neural Network (RNN)
A feed forward architecture does not maintain a short-term memory; any memory effects are due to the way past inputs are re-presented to the network.
Fig. 3.11 A simple recurrent network
A simple recurrent network has activation feedback, which embodies short-term memory. A state layer is updated not only with the external input of the network but also with activation from the previous forward propagation. The feedback is modified by a set of weights so as to enable automatic adaptation through learning (e.g. backpropagation).
Fig. 3.12 A simple recurrent network
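A minimal sketch of such a state-layer update (Elman-style) follows; all weights, sizes and values are illustrative assumptions:

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

# Each state unit sees the external inputs AND the state activations
# from the previous forward propagation (the short-term memory).
def step_state(x, prev_state, w_in, w_rec):
    return [sigmoid(sum(w * xi for w, xi in zip(wi, x)) +
                    sum(w * s for w, s in zip(wr, prev_state)))
            for wi, wr in zip(w_in, w_rec)]

w_in = [[0.5, -0.5], [0.3, 0.3]]     # input -> state weights
w_rec = [[0.2, 0.0], [0.0, 0.2]]     # state -> state (feedback) weights
state = [0.0, 0.0]
for x in ([1.0, 0.0], [0.0, 1.0]):   # a short input sequence
    state = step_state(x, state, w_in, w_rec)
print(state)
```

Because the previous state enters each update, the same input can produce different states depending on what came before, which is the short-term memory the text describes.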
Neural networks with closed paths in their topology are known as recurrent neural
networks (RNNs). RNNs are an improvement on MLPNs, and are characterized by cyclic paths
between neurons. RNNs can propagate data from later processing stages to earlier stages. In
RNNs, the present activation state is a function of the previous activation state as well as the
present inputs. In essence, the recurrent connections allow storing information from the past
input and the past state of the network. Adding feedback from the prior activation step introduces
a form of memory to the process. This enhances the network's ability to learn temporal sequences without fundamentally changing the training process. Therefore, RNNs have the capability of dealing with spatio-temporal problems, which have been found to be difficult for feedforward networks.
A recurrent neural network differs from a feedforward neural network in that there are no restrictions on the placement of synapses in a recurrent network. This makes all kinds of feedback and connections possible and achieves the full computational power of neural
networks. With such a general architecture, recurrent neural networks have important capabilities
not found in feedforward networks, such as attractor dynamics and the ability to identify a time-
varying system.
Various learning algorithms for recurrent neural networks have been proposed. Algorithms for associative memory networks, which are recurrent networks settling to stable states, have been proposed by Hopfield and Pineda, while Jordan, Gallant and King, and Pearlmutter developed algorithms to train recurrent networks to handle time-varying systems. The algorithm considered here is the real-time recurrent learning algorithm for completely recurrent networks running in continually sampled time, devised by R. J. Williams and D. Zipser.
The real-time recurrent learning algorithm exhibits the generality of the backpropagation-through-time approach without the memory requirement that grows with arbitrarily long training sequences. With feedback from the output layer, a small recurrent neural network can simulate a time-varying, nonlinear system well.
A typical real-time recurrent neural network is shown in the figure. It consists of two layers: an output layer and an input layer. The output layer includes output and hidden neurons. Some or all of the output/hidden neurons are delayed and fed back to the input layer. Therefore, the input layer consists of the delayed outputs and the external inputs. The algorithm proceeds as follows:
Fig. A Recurrent Neural Network
1. Forward process: compute the output yj for all j ∈ C.
2. Backward process: compute the error gradient.
3. Weight updates.
where
A : external input neurons
B : feedback output/hidden neurons
O : desired output neurons
C : all output/hidden neurons
Ui : neurons of the input layer, where i ∈ A ∪ B