NEURAL NETWORKS
A PROJECT REPORT
BACHELOR OF TECHNOLOGY
IN
ELECTRONICS AND COMMUNICATION ENGINEERING
BY
This is to certify that this project work entitled IMAGE COMPRESSION AND
DECOMPRESSION USING NEURAL NETWORKS is being submitted in
INDEX
1. ABSTRACT
2. INTRODUCTION
3. THEORY
3.1 NEURAL NETWORKS
Artificial Neural Networks
The Analogy to the Brain
The Biological Neuron
The Artificial Neuron
Design
Layers
Communication and Types of Connections
Learning Laws
Applications of Neural Networks
3.2 IMAGE PROCESSING
Image Compression
Principles of Image Compression
Performance Measurement of Image Compression
Compression Standards
4. IMAGE COMPRESSION WITH NEURAL NETWORKS
Back-Propagation Image Compression
Hierarchical Back-Propagation Neural Network
Adaptive Back-Propagation Neural Network
Hebbian Learning Based Image Compression
Vector Quantization Neural Networks
Predictive Coding Neural Networks
INTRODUCTION
Neural networks are inherently adaptive systems, which makes them suitable for handling the nonstationarities in image data. Artificial neural networks can be employed successfully for image compression. The greatest potential of neural networks is the high-speed processing provided through massively parallel VLSI implementations, and the choice to build a neural network in digital hardware follows from several advantages that are typical of digital systems.
The crucial problems of neural network hardware are fast multiplication, building a large number of connections between neurons, and fast memory access for weight storage or nonlinear-function look-up tables.
The most important part of a neuron is the multiplier, which performs high-speed pipelined multiplication of synaptic signals with weights. As each neuron has only one multiplier, the degree of parallelism is node parallelism. Each neuron has a local weight ROM (as it performs the feed-forward phase of the back-propagation algorithm) that stores as many values as there are connections to the previous layer. An accumulator adds the signals from the pipeline to the neuron's bias value, which is stored in its own register.
The aim is to design and implement image compression using a neural network to achieve better SNR and compression levels. The compression is first obtained by modeling the neural network in MATLAB; this provides offline training.
3. THEORY
Other advantages include:
Adaptive learning: An ability to learn how to do tasks based on the data given for training
or initial experience.
Self-Organization: An ANN can create its own organization or representation of the
information it receives during learning time.
Real Time Operation: ANN computations may be carried out in parallel, and special
hardware devices are being designed and manufactured which take advantage of this
capability.
Fault Tolerance via Redundant Information Coding: Partial destruction of a network leads to a corresponding degradation of performance. However, some network capabilities may be retained even with major network damage.
The most basic components of neural networks are modeled after the structure of the brain. Some neural network structures do not closely follow the brain, and some have no biological counterpart in the brain. However, neural networks have a strong similarity to the brain, and therefore a great deal of the terminology is borrowed from neuroscience.
All natural neurons have four basic components: dendrites, soma, axon, and synapses. Basically, a biological neuron receives inputs from other sources, combines them in some way, performs a generally nonlinear operation on the result, and then outputs the final result. The figure below shows a simplified biological neuron and the relationship of its four components. In the human brain, a typical neuron collects signals from others through a host of fine structures called dendrites. The neuron sends out spikes of electrical activity through a long, thin strand known as an axon, which splits into thousands of branches. At the end of each branch, a structure called a synapse converts the activity from the axon into electrical effects that inhibit or excite activity in the connected neurons. When a neuron receives excitatory input that is sufficiently large compared with its inhibitory input, it sends a spike of electrical activity down its axon. Learning occurs by changing the effectiveness of the synapses so that the influence of one neuron on another changes.
Fig 3.1 BIOLOGICAL NEURON
The various inputs to the network are represented by the mathematical symbol x(n). Each of these inputs is multiplied by a connection weight, represented by w(n). In the simplest case, these products are summed, fed through a transfer function to generate a result, and then output.
Even though all artificial neural networks are constructed from this basic building block, the details of the building blocks vary, and this accounts for the differences between network types.
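The single-neuron computation described above can be sketched in a few lines of Python; the sigmoid transfer function, the sample inputs, and the weight values are illustrative assumptions, not values taken from this report.

```python
import math

def neuron_output(x, w, bias=0.0):
    """Weighted sum of inputs passed through a sigmoid transfer function.

    x, w: equal-length sequences of inputs and connection weights.
    """
    s = sum(xi * wi for xi, wi in zip(x, w)) + bias  # summation step
    return 1.0 / (1.0 + math.exp(-s))                # transfer function

# Example: two inputs multiplied by their connection weights, then summed
y = neuron_output([1.0, 0.5], [0.4, 0.8], bias=-0.2)
```

The sigmoid squashes the weighted sum into (0, 1), which matches the normalized pixel range used later in this report.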
3.1.5 Design
The developer goes through a period of trial and error in making design decisions before coming up with a satisfactory design. The design issues in neural networks are complex and are a major concern of system developers.
3.1.6 Layers
The neurons within a network are grouped into layers, which are then connected to one another. How these layers connect may also vary. Basically, all artificial neural networks have a similar topology. Some of the neurons interface with the real world to receive its inputs, and other neurons provide the real world with the network's outputs. All the rest of the neurons are hidden from view.
As the figure above shows, the neurons are grouped into layers. The input layer consists of neurons that receive input from the external environment. The output layer consists of neurons that communicate the output of the system to the user or the external environment. There are usually a number of hidden layers between these two layers; the figure above shows a simple structure with only one hidden layer.
When the input layer receives the input, its neurons produce output, which becomes input to the other layers of the system. The process continues until a certain condition is satisfied or until the output layer is invoked and fires its output to the external environment.
To determine the number of hidden neurons the network needs to perform at its best, one is often left with trial and error. If the number of hidden neurons is increased too much, overfitting occurs; that is, the net will have problems generalizing. The training set of data will be memorized, making the network useless on new data sets.
Fully connected
Each neuron on the first layer is connected to every neuron on the second layer.
Partially connected.
A neuron of the first layer does not have to be connected to all neurons on the
second layer.
Feed forward.
The neurons on the first layer send their output to the neurons on the second layer, but they do not receive any input back from the neurons on the second layer.
Bi-directional.
There is another set of connections carrying the output of the neurons of the second
layer into the neurons of the first layer.
Feed-forward and bi-directional connections can be fully or partially connected.
Hierarchical.
If a neural network has a hierarchical structure, the neurons of a lower layer may only communicate with neurons on the next layer up.
Resonance.
The layers have bi-directional connections, and they can continue sending messages
across the connections a number of times until a certain condition is achieved.
Recurrent.
The neurons within a layer are fully or partially connected to one another. After these neurons receive input from another layer, they communicate their outputs to one another a number of times before they are allowed to send their outputs to another layer. Generally, some condition among the neurons of the layer should be achieved before they communicate their outputs to another layer.
On-center/off-surround.
A neuron within a layer has excitatory connections to itself and its immediate neighbors, and inhibitory connections to other neurons. One can imagine this type of connection as a competitive gang of neurons. Each gang excites itself and its gang members and inhibits all members of other gangs. After a few rounds of signal interchange, the neurons with an active output value will win and are allowed to update their own and their gang members' weights. (There are two types of connections between two neurons, excitatory or inhibitory. In an excitatory connection, the output of one neuron increases the action potential of the neuron to which it is connected. When the connection type between two neurons is inhibitory, the output of the neuron sending a message reduces the activity or action potential of the receiving neuron. One causes the summing mechanism of the next neuron to add while the other causes it to subtract. One excites while the other inhibits.)
3.1.8 Learning.
The brain basically learns from experience. Neural networks are sometimes called machine-learning algorithms, because changing their connection weights (training) causes the network to learn the solution to a problem. The strength of the connection between neurons is stored as a weight value for the specific connection. The system learns new knowledge by adjusting these connection weights.
The learning ability of a neural network is determined by its architecture and by the
algorithmic method chosen for training.
1. Unsupervised learning.
This uses no external teacher and is based only upon local information. It is also referred to as self-organization, in the sense that the network self-organizes the data presented to it and detects their emergent collective properties. Paradigms of unsupervised learning are Hebbian learning and competitive learning. Another aspect of learning concerns whether there is a separate phase during which the network is trained, followed by a subsequent operation phase. We say that a neural network learns off-line if the learning phase and the operation phase are distinct. A neural network learns on-line if it learns and operates at the same time. Usually, supervised learning is performed off-line, whereas unsupervised learning is performed on-line. In unsupervised learning, the hidden neurons must find a way to organize themselves without help from the outside. No sample outputs are provided to the network against which it can measure its predictive performance for a given vector of inputs. This is learning by doing.
2. Reinforcement learning
Supervised learning incorporates an external teacher, so that each output unit is told what its desired response to input signals ought to be. During the learning process, global information may be required. Paradigms of supervised learning include error-correction learning, reinforcement learning, and stochastic learning. An important issue concerning supervised learning is the problem of error convergence, i.e. the minimization of the error between the desired and computed unit values. The aim is to determine a set of weights which minimizes the error. One well-known method, common to many learning paradigms, is least mean square (LMS) convergence.
Reinforcement learning works on reinforcement from the outside. The connections among the neurons in the hidden layer are randomly arranged, then reshuffled as the network is told how close it is to solving the problem. Reinforcement learning is sometimes grouped with supervised learning, because it requires a teacher. The teacher may be a training set of data or an observer who grades the performance of the network's results.
Both unsupervised and reinforcement learning suffer from relative slowness and inefficiency, relying on random shuffling to find the proper connection weights.
3. Back propagation
This method has proven highly successful in training multilayered neural nets. The network is not just given reinforcement for how it is doing on a task; information about errors is also filtered back through the system and used to adjust the connections between the layers, thus improving performance. It is a form of supervised learning.
Off-line or On-line
One can categorize the learning methods into yet another group, off-line or on-line.
When the system uses input data to change its weights to learn the domain knowledge, the
system could be in training mode or learning mode. When the system is being used as a
decision aid to make recommendations, it is in the operation mode, this is also sometimes
called recall.
Off-line
In the off-line learning methods, once the system enters the operation mode, its weights are fixed and do not change any more. Most networks are of the off-line learning type.
On-line
In on-line or real time learning, when the system is in operating mode (recall), it
continues to learn while being used as a decision tool. This type of learning has a more
complex design structure.
Hebb Rule
The first and best-known learning rule was introduced by Donald Hebb; the description appeared in his book The Organization of Behavior in 1949. The basic rule is: if a neuron receives an input from another neuron, and if both are highly active (mathematically, have the same sign), the weight between the neurons should be strengthened.
Hopfield Law
This law is similar to Hebb's rule, with the exception that it specifies the magnitude of the strengthening or weakening. It states: if the desired output and the input are both active or both inactive, increment the connection weight by the learning rate; otherwise, decrement the weight by the learning rate. (Most learning functions have some provision for a learning rate, or learning constant. Usually this term is positive and between zero and one.)
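Both rules can be sketched in Python as single-weight update functions; the function names, the numeric learning rate, and the sign convention for "active" are illustrative assumptions, not specifics from this report.

```python
def hebb_update(w, x, y, rate=0.1):
    """Hebb's rule: strengthen the weight when the input and the output
    of the connected neurons are simultaneously active (same sign)."""
    return w + rate * x * y

def hopfield_update(w, x, desired, rate=0.1):
    """Hopfield's variant: increment the weight by the learning rate when
    the input and desired output are both active or both inactive;
    otherwise decrement it by the learning rate."""
    both_same = (x > 0) == (desired > 0)
    return w + rate if both_same else w - rate
```

Note how Hopfield's law fixes the step size at the learning rate, whereas Hebb's rule scales it by the activations themselves.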
The Delta Rule
This rule continuously modifies the connection weights so that the mean squared error of the network is minimized. The error is back-propagated into previous layers one layer at a time, and the process of back-propagating the network errors continues until the first layer is reached. The network type called feed-forward back-propagation derives its name from this method of computing the error term. This rule is also referred to as the Widrow-Hoff Learning Rule and the Least Mean Square Learning Rule.
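The Widrow-Hoff (LMS) rule just described can be sketched as a gradient step on the squared error of a single linear unit; the training data and learning rate below are illustrative assumptions.

```python
def delta_rule_step(w, x, target, rate=0.05):
    """One Widrow-Hoff (LMS) step: adjust each weight in proportion to
    the error (delta) between the desired and the actual linear output."""
    y = sum(wi * xi for wi, xi in zip(w, x))             # actual output
    err = target - y                                     # the "delta"
    w_new = [wi + rate * err * xi for wi, xi in zip(w, x)]
    return w_new, err

# Repeated steps drive the squared error toward zero
w = [0.0, 0.0]
for _ in range(200):
    w, err = delta_rule_step(w, [1.0, 2.0], target=1.0)
```

Each step shrinks the error by a constant factor (for a fixed input), which is the mean-squared-error minimization the rule is named for.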
The Kohonen rule does not require a desired output; therefore it is implemented in unsupervised methods of learning. Kohonen used this rule, combined with the on-center/off-surround intra-layer connection, to create the self-organizing neural network, which has an unsupervised learning method.
The most common use for neural networks is to project what will most likely happen. There are many areas where prediction can help in setting priorities. For example, the emergency room at a hospital can be a hectic place; knowing who needs the most critical help can enable a more successful operation. Basically, all organizations must establish priorities, which govern the allocation of their resources. Neural networks have been used as a mechanism of knowledge acquisition for expert systems in stock market forecasting with astonishingly accurate results. Neural networks have also been used for bankruptcy prediction for credit card institutions.
Although one may apply neural network systems for interpretation, prediction, diagnosis, planning, monitoring, debugging, repair, instruction, and control, the most successful applications of neural networks are in categorization and pattern recognition. Such a system classifies the object under investigation (e.g. an illness, a pattern, a picture, a chemical compound, a word, or the financial profile of a customer) as one of numerous possible categories that, in return, may trigger the recommendation of an action (such as a treatment plan or a financial plan).
A company called Nestor has used neural networks for financial risk assessment for mortgage insurance decisions, categorizing the risk of loans as good or bad. Neural networks have also been applied to convert text to speech; NETtalk is one of the systems developed for this purpose. Image processing and pattern recognition form an important area of neural networks, probably one of the most actively researched areas of neural networks.
One of the best-known applications is the bomb detector installed in some U.S. airports. This device, called SNOOPE, determines the presence of certain compounds from the chemical configurations of their components.
In a document from the International Joint Conference, one can find reports on using neural networks in areas ranging from robotics, speech, signal processing, vision, and character recognition to musical composition, detection of heart malfunction and epilepsy, fish detection and classification, optimization, and scheduling. Basically, most applications of neural networks fall into the following five categories:
Prediction
Uses input values to predict some output, e.g. pick the best stocks in the market, predict the weather, or identify people with a cancer risk.
Classification
Uses input values to determine the classification, e.g. is the input the letter A; is a blob of video data a plane, and what kind of plane is it.
Data association
Like classification, but it also recognizes data that contains errors, e.g. not only identify the characters that were scanned but identify when the scanner is not working properly.
Data Conceptualization
Analyzes the inputs so that grouping relationships can be inferred, e.g. extract from a database the names of those most likely to buy a particular product.
Data Filtering
Smooths an input signal, e.g. takes the noise out of a telephone signal.
The importance of visual communication has increased tremendously in the last few decades. The progress in microelectronics and computer technology, together with the creation of networks operating with various channel capacities, is the basis of an infrastructure for a new era of telecommunications. New applications are preparing a revolution in the everyday life of our modern society. Communication-based applications include ISDN surveillance. Storage-based audiovisual applications include training, education, entertainment, advertising, video mail, and document annotation. Essential for the introduction of new communication services is low cost. Visual information is one of the richest and most bandwidth-consuming modes of communication.
The digital representation of raw video requires a large amount of data, and the transmission of this raw video data requires a large transmission bandwidth. To reduce the transmission and storage requirements, the video must be handled in compressed formats.
To meet the requirements of these new applications, powerful data compression techniques are needed to reduce the global bit rate drastically, even in the presence of growing communication channels offering increased bandwidth. The issue of quality is of prime importance in most applications of compression. In fact, although most applications require high compression ratios, this requirement is in general in contradiction with the desire for high quality in the resulting pictures.
The standardization of video coding techniques has become a high priority, because only a standard can reduce the high cost of video compression codecs and resolve the critical problem of interoperability of equipment from different manufacturers. The existence of standards is often the trigger for the volume production of the integrated circuits (VLSI) necessary for significant cost reductions. Such standards are developed by bodies such as the International Standards Organization (ISO) and the International Telecommunication Union (ITU).
I. Image Enhancement,
II. Image Restoration,
III. Image Coding,
IV. Image Understanding.
Image enhancement is the use of image processing algorithms to remove certain types of distortion in an image: removing noise, making the edge structures in the image stand out, or any other operation that makes the image look better.
The most widely used algorithms for enhancement are based on pixel functions that are known as window operations. A window operation performed on an image is nothing more than the process of examining the pixels in a certain region of the image, called the window region, and computing some type of mathematical function derived from the pixels in the window.
In image restoration, an image has been degraded in some manner and the objective
is to reduce or eliminate the degradation. The development of an image restoration system
depends on the type of degradation.
The objective of image coding is to represent an image with as few bits as possible while preserving the level of image quality and intelligibility acceptable for a given application. Image coding can be used to reduce the bandwidth of a communication channel, or to reduce the storage required when an image needs to be retrieved.
Image understanding differs from the other three areas in one major respect. In image enhancement, restoration, and coding, both the input and the output are images, and signal processing has been the backbone of many successful systems in these areas. In image understanding, the input is an image, but the output is a symbolic representation of the contents of the image. Successful development of systems in this area involves not only signal processing but also other disciplines such as artificial intelligence.
Also, most of the information in a given frame may be present in adjacent frames. This temporal redundancy can also be removed, in addition to the within-frame redundancy, by interframe coding.
The principles of image compression are based on information theory. The amount of information that a source produces is its entropy. The amount of information one receives from a source is equivalent to the amount of uncertainty that has been removed.
A source produces a sequence of variables from a given symbol set. For each symbol, there is a product of the symbol probability and its logarithm; the entropy is the negative summation of these products over all the symbols in the symbol set:

H = - Σ_s p(s) log2 p(s)

where p(s) is the probability of symbol s.
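As a sketch of this definition, the entropy of a symbol sequence can be estimated from symbol frequencies; using a base-2 logarithm gives the result in bits per symbol.

```python
import math
from collections import Counter

def entropy(symbols):
    """Shannon entropy H = -sum(p * log2(p)), with the probabilities
    estimated from the frequency of each symbol in the sequence."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Two equally likely symbols carry exactly 1 bit per symbol
h = entropy([0, 1, 0, 1])
```

A perfectly predictable source (one symbol only) has zero entropy, which is why uniform image regions compress so well.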
Compression algorithms are methods that reduce the number of symbols used to represent source information, therefore reducing the amount of space needed to store the source information or the amount of time necessary to transmit it for a given channel capacity. The mapping from the source symbols into fewer target symbols is referred to as compression, and the reverse mapping as decompression.
Image compression refers to the task of reducing the amount of data required to store or transmit an image. At the system input, the image is encoded into its compressed form by the image coder. The compressed image may then be subjected to further digital processing, such as error control coding, encryption, or multiplexing with other data sources, before being used to modulate the analog signal that is actually transmitted through the channel or stored in a storage medium. At the system output, the image is processed step by step to undo each of the operations that were performed on it at the system input. At the final step, the image is decoded into its original uncompressed form by the image decoder. If the reconstructed image is identical to the original image, the compression is said to be lossless; otherwise, it is lossy.
1. Compression Efficiency
It is measured by the compression ratio, which is defined as the ratio of the size (number of bits) of the original image data to the size of the compressed image data.
2. Complexity
The complexity of an image compression algorithm is measured by the number of data operations required to perform the encoding and decoding processes. The data operations include additions, subtractions, multiplications, divisions, and shift operations.
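As a small worked example of the compression-ratio definition above (the image dimensions are illustrative, not taken from the report):

```python
def compression_ratio(original_bits, compressed_bits):
    """Size of the original image data divided by the size of the
    compressed data; a larger ratio means stronger compression."""
    return original_bits / compressed_bits

# A 256 x 256, 8-bit grayscale image squeezed into 65,536 bits -> 8:1
cr = compression_ratio(256 * 256 * 8, 65_536)
```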
Digital images and digital video are normally compressed in order to save space on hard disks and to speed up transmission. There are presently several compression standards used for network transmission of digital signals. Data sent by a camera using video standards contain still images mixed with data describing changes, so that unchanged data (for instance the background) are not sent in every image. Consequently, the frame rate measured in frames per second (fps) is much greater.
Still images are simple and easy to send. However, it is difficult to obtain single images from a compressed video signal. The video signal uses less data to send or store a video image, and it is not possible to reduce the frame rate using video compression. Sending single images is easier when using a modem connection, or in any case over a narrow bandwidth.
Main compression standards      Main compression standards
for still images                for video signals
----------------------------    ----------------------------
JPEG                            M-JPEG (Motion JPEG)
Wavelet                         H.261, H.263, etc.
JPEG 2000                       MPEG-1
GIF                             MPEG-2
                                MPEG-3
                                MPEG-4
JPEG
A popular compression standard used exclusively for still images. Each image is divided into 8 x 8 pixel blocks; each block is then individually compressed. When using very high compression, the 8 x 8 blocks can actually be seen in the image. Due to the compression mechanism, the decompressed image is not the same image that was compressed; this is because this standard has been designed considering the performance limits of the human eye. The degree of detail loss can be varied by adjusting compression parameters. It can store up to 16 million colors.
Wavelet
Wavelets are functions used in representing data or other functions. They analyze the signal at different frequencies with different resolutions. The standard is optimized for images with sharp discontinuities. Wavelet compression transforms the entire image, unlike JPEG, and is more natural as it follows the shape of the objects in the picture. It is necessary to use special software for viewing, since this is a non-standardized compression method.
JPEG2000
Based on Wavelet technology. Rarely used.
GIF
A graphic format used widely for Web images. It is limited to 256 colors and is a good standard for images which are not too complex. It is not recommended for network cameras, since its compression ratio is too limited.
M-JPEG
This is not a separate standard but rather a rapid flow of JPEG images that can be viewed at a rate sufficient to give the illusion of motion. Each frame within the video is stored as a complete image in JPEG format. Single images do not interact among themselves. Images are then displayed sequentially at a high frame rate. This method produces high quality video, but at the cost of large files.
X(u) = C(u) sqrt(2/N) Σ_{n=0}^{N-1} x(n) cos[ (2n + 1) u π / 2N ]

where C(u) = 0.707 for u = 0 and C(u) = 1 otherwise.
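The formula above can be implemented directly, as in this O(N^2) Python reference sketch (a real codec would use a fast factorization instead):

```python
import math

def dct_1d(x):
    """1-D DCT: X(u) = C(u) * sqrt(2/N) * sum_n x(n) cos((2n+1)u*pi / 2N),
    with C(0) = 1/sqrt(2) (about 0.707) and C(u) = 1 otherwise."""
    N = len(x)
    def C(u):
        return 1.0 / math.sqrt(2.0) if u == 0 else 1.0
    return [C(u) * math.sqrt(2.0 / N)
            * sum(x[n] * math.cos((2 * n + 1) * u * math.pi / (2 * N))
                  for n in range(N))
            for u in range(N)]

# A constant signal concentrates all its energy in the DC coefficient
coeffs = dct_1d([1.0, 1.0, 1.0, 1.0])
```

The 8 x 8 2-D DCT used by JPEG applies this transform first to each row and then to each column of a block.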
In 1992, JPEG established the first international standard for still image compression, in which the encoders and decoders are DCT-based. The JPEG standard specifies three modes, namely sequential, progressive, and hierarchical, for lossy encoding, and one mode of lossless encoding. The baseline JPEG coder, which is the sequential encoding in its simplest form, is briefly discussed here. Figs. 3.1 and 3.2 show the key processing steps in such an encoder and decoder for grayscale images. Color image compression can be approximately regarded as compression of multiple grayscale images, which are either compressed entirely one at a time, or are compressed by alternately interleaving 8 x 8 sample blocks from each in turn. Here, we focus on grayscale images only.
After output from the FDCT, each of the 64 DCT coefficients is uniformly quantized in conjunction with a carefully designed 64-element Quantization Table (QT). At the decoder, the quantized values are multiplied by the corresponding QT elements to recover the original unquantized values. After quantization, all of the quantized coefficients are ordered into the zigzag sequence. This ordering helps to facilitate entropy encoding by placing low-frequency non-zero coefficients before high-frequency coefficients. The DC coefficient, which contains a significant fraction of the total image energy, is differentially encoded.
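The zigzag ordering can be generated by walking the anti-diagonals of the block and alternating direction, as in this sketch (shown on a 4 x 4 block for brevity; JPEG uses 8 x 8):

```python
def zigzag_order(block):
    """Read a square coefficient block in zigzag order, so low-frequency
    coefficients come before high-frequency ones."""
    n = len(block)
    order = []
    for d in range(2 * n - 1):                 # one anti-diagonal at a time
        coords = [(i, d - i) for i in range(n) if 0 <= d - i < n]
        if d % 2 == 0:
            coords.reverse()                   # alternate the scan direction
        order.extend(coords)
    return [block[i][j] for i, j in order]

# The classic 4x4 zigzag index pattern reads back as 0, 1, 2, ..., 15
demo = [[0, 1, 5, 6],
        [2, 4, 7, 12],
        [3, 8, 11, 13],
        [9, 10, 14, 15]]
scan = zigzag_order(demo)
```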
For encoding,

h_j = Σ_{i=1}^{N} w_ji x_i,   1 ≤ j ≤ K        (1)

and for decoding,

x̂_i = Σ_{j=1}^{K} w'_ji h_j,   1 ≤ i ≤ N       (2)

where x_i ∈ [0, 1] denotes the normalized pixel values for grey-scale images with grey levels in [0, 255]. The reason for using normalized pixel values is that neural networks operate more efficiently when both their inputs and outputs are limited to the range [0, 1].
Figure 4.1 BACK PROPAGATION NEURAL NETWORK
The above linear network can also be made non-linear if a transfer function such as the sigmoid is added to the hidden layer and the output layer to scale down the summations in the above equations.
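The encoding and decoding equations above, with the sigmoid added, can be sketched as follows; the random weights are untrained placeholders, and the block size N = 16 with K = 4 hidden neurons (a 4:1 compression) is an illustrative choice, not a value from this report.

```python
import math
import random

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def encode(x, w_in):
    """Hidden state: h_j = f(sum_i w_ji * x_i) for each of K neurons."""
    return [sigmoid(sum(w_j[i] * x[i] for i in range(len(x)))) for w_j in w_in]

def decode(h, w_out):
    """Reconstruction: x_i = f(sum_j w'_ji * h_j) for each of N pixels."""
    return [sigmoid(sum(w_i[j] * h[j] for j in range(len(h)))) for w_i in w_out]

random.seed(0)
N, K = 16, 4                                 # 16-pixel block, 4 hidden neurons
w_in = [[random.uniform(-0.5, 0.5) for _ in range(N)] for _ in range(K)]
w_out = [[random.uniform(-0.5, 0.5) for _ in range(K)] for _ in range(N)]
block = [random.random() for _ in range(N)]  # normalized pixels in [0, 1]
x_hat = decode(encode(block, w_in), w_out)
```

Compression comes from transmitting only the K hidden values (plus the weights, agreed once between coder and decoder) instead of all N pixels.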
The second phase simply involves the entropy coding of the state vector h_j at the hidden layer. In cases where adaptive training is conducted, the entropy coding of the coupling weights is also required, in order to catch up with input characteristics that were not encountered at the training stage. The entropy coding is normally designed as simple fixed-length binary coding, although many advanced variable-length entropy coding algorithms are available.
This neural network development is, in fact, in the direction of K-L transform technology, which actually provides the optimum solution for all linear narrow-channel types of image compression neural networks. Equations (1) and (2) are represented in matrix form:

[h] = [W]^T [x]        (3)
[x̂] = [W'] [h]        (4)

The K-L transform maps input images into a new vector space where all the coefficients in the new space are de-correlated. This means that the covariance matrix of the new vectors is a diagonal matrix whose elements along the diagonal are the eigenvalues of the covariance matrix of the original input vectors. Let e_j and λ_j, j = 1, 2, …, n, be the eigenvectors and eigenvalues of [C_x], the covariance matrix for the input vector x, with the eigenvalues arranged in descending order so that

λ_i ≥ λ_{i+1},   for i = 1, 2, …, n-1.
To extract the principal components, the K eigenvectors corresponding to the K largest eigenvalues of [C_x] are used to form the transform matrix [A_K]. In addition, the eigenvectors in [A_K] are ordered in such a way that the first row of [A_K] is the eigenvector corresponding to the largest eigenvalue. Hence, the forward K-L transform, or encoding, can be defined as

[y] = [A_K] ([x] - [m_x])        (5)

and the inverse K-L transform, or decoding, as

[x̂] = [A_K]^T [y] + [m_x]        (6)

where [m_x] is the mean value of [x] and [x̂] represents the reconstructed vectors or image blocks. Thus the mean square error between x and x̂ is given by the following equation:
e_m = E{(x - x̂)^2} ≈ (1/M) Σ_{m=1}^{M} Σ_{i=1}^{n} (x_i - x̂_i)^2

where the statistical mean value E{·} is approximated by the average value over all M input vector samples which, in image coding, are all the non-overlapping blocks of 4 x 4 or 8 x 8 pixels.
From the comparison between the equation pair (3)-(4) and the equation pair (5)-(6), it can be concluded that the linear neural network reaches the optimum solution whenever the following condition is satisfied:

[W'] [W]^T = [A_K]^T [A_K]

Under this circumstance, the neuron weights from input to hidden layer and from hidden to output layer can be described respectively as

[W]^T = [U] [A_K]   and   [W'] = [A_K]^T [U]^{-1}

where [U] is an arbitrary K x K matrix such that [U] [U]^{-1} gives the K x K identity matrix. Hence, it can be seen that the linear neural network can achieve the same compression performance as the K-L transform without its weight matrices necessarily being equal to [A_K]^T and [A_K].
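A sketch of the K-L transform using NumPy's eigendecomposition (the report does its modeling in MATLAB; the NumPy dependency and the random demo data are assumptions made here for illustration):

```python
import numpy as np

def kl_transform(blocks, K):
    """Build [A_K] and [m_x] from sample block vectors: keep the K
    eigenvectors of the covariance matrix with the largest eigenvalues.

    blocks: (M, n) array of image-block vectors. Returns (A_K, m_x)."""
    m_x = blocks.mean(axis=0)
    cov = np.cov(blocks - m_x, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns ascending order
    idx = np.argsort(eigvals)[::-1][:K]        # K largest eigenvalues first
    return eigvecs[:, idx].T, m_x

def kl_encode(x, A_K, m_x):
    return A_K @ (x - m_x)                     # [y] = [A_K]([x] - [m_x])

def kl_decode(y, A_K, m_x):
    return A_K.T @ y + m_x                     # [x^] = [A_K]^T [y] + [m_x]

# Demo: 50 random 4-dimensional "blocks", keeping K = 2 coefficients
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))
A_K, m_x = kl_transform(X, K=2)
y = kl_encode(X[0], A_K, m_x)                  # 2 coefficients instead of 4
```

With K = n the transform is orthogonal and reconstruction is exact; compression comes from choosing K < n and accepting the mean square error given above.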
The hierarchical network adds inner and outer hidden layers, including a combiner layer and a decombiner (decomposer) layer. The structure is shown in Figure 4.2. The idea is to exploit correlation between pixels by the inner hidden layer and to exploit correlation between blocks of pixels by the outer hidden layers. From the input layer to the combiner layer, and from the decombiner layer to the output layer, local connections are designed, which have the same effect as M fully connected neural sub-networks.
Training such a neural network can be conducted in terms of: (i) Outer Loop Neural Network (OLNN) training; (ii) Inner Loop Neural Network (ILNN) training; and (iii) coupling weight allocation for the overall neural network.
the input image blocks into a few sub-sets with different features according to
their complexity measurement. A fine-tuned neural network then compresses each sub-set.
Training of such a neural network can be designed as: (a) parallel training; (b) serial
training; and (c) activity-based training.
The parallel training scheme applies the complete training set simultaneously to all
neural networks and uses the S/N (signal-to-noise) ratio to roughly classify the image blocks
into the same number of sub-sets as there are neural networks. After this initial coarse
classification is completed, each neural network is then further trained by its corresponding
refined sub-set of training blocks.
neurons, h_min, the neural network is roughly trained by all the image blocks. Guided by an
S/N ratio threshold, further training is started on the next neural network with the number of
hidden neurons increased and the corresponding threshold readjusted for further classification.
This process is repeated until the whole training set is classified into a maximum number of
sub-sets corresponding to the same number of neural networks established.
In the next two training schemes, two extra parameters, the activity A(P_j) and four
directions, are defined to classify the training set rather than using the neural networks.
Hence the back-propagation training of each neural network can be completed in one phase
by its appropriate sub-set.
A(P_i) = Σ_{i,j} A_p(P_i(i,j))
and
A_p(P_i(i,j)) = Σ_{r=−1..+1} Σ_{s=−1..+1} (P_i(i,j) − P_i(i+r, j+s))^2
where A_p(P_i(i,j)) is the activity of each pixel, which concerns its neighboring 8 pixels as r
and s vary from −1 to +1 in equation (11).
Prior to training, all image blocks are classified into four classes according to their
activity values, which are identified as very low, low, high and very high activity. Hence
four neural networks are designed with an increasing number of hidden neurons to compress
the four different sub-sets of input images after the training phase is completed.
On top of the activity parameter, a further feature extraction technique is applied
by considering four main directions present in the image details, i.e., horizontal, vertical and
the two diagonal directions. These preferential direction features can be evaluated by
calculating the values of mean squared differences among neighboring pixels along the four
directions.
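As a concrete illustration of the activity measure above (the function names and test blocks are mine, not the report's):

```python
import numpy as np

# Per-pixel activity sums squared differences to the 8 neighbours; block
# activity sums the per-pixel values over the block interior.
def pixel_activity(block, i, j):
    total = 0.0
    for r in (-1, 0, 1):
        for s in (-1, 0, 1):
            if r or s:   # skip the centre pixel itself
                total += (block[i, j] - block[i + r, j + s]) ** 2
    return total

def block_activity(block):
    h, w = block.shape
    return sum(pixel_activity(block, i, j)
               for i in range(1, h - 1) for j in range(1, w - 1))

flat = np.full((8, 8), 10.0)            # uniform block: zero activity
edges = np.tile([0.0, 255.0], (8, 4))   # vertical stripes: high activity
```

Thresholding block_activity into very low / low / high / very high bins then selects which of the four networks compresses a given block.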
For the image patterns classified as high activity, four further neural networks
corresponding to the above directions are added to refine their structures and tune their
learning processes to the preferential orientations of the input. Hence the overall neural
network system is designed to have six neural networks, among which two correspond to the
low activity and medium activity sub-sets and the other four correspond to the high
activity and four-direction classifications.
W_i(t+1) = (W_i(t) + η h_i(t) X(t)) / ||W_i(t) + η h_i(t) X(t)||
where W_i(t+1) = {w_i1, w_i2, ..., w_iN} is the ith new coupling weight vector in the next
cycle (t+1); 1 ≤ i ≤ M, and M is the number of output neurons; η is the learning rate;
h_i(t) is the ith output value; X(t) is the input vector, corresponding to each individual
image block; and ||·|| is the Euclidean norm used to normalize the updated weights and
make the learning stable.
From the basic learning rule, a number of variations have been developed in the
existing research.
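A minimal sketch of the normalized update rule above (a toy illustration; the single linear neuron and the repeated input are my assumptions):

```python
import numpy as np

# Normalised Hebbian update: move the weight vector toward the input in
# proportion to the output h_i(t), then renormalise to keep learning stable.
def hebbian_update(w, x, eta):
    h = float(w @ x)                      # linear output h_i(t)
    w_new = w + eta * h * x               # W_i(t) + eta * h_i(t) * X(t)
    return w_new / np.linalg.norm(w_new)  # Euclidean normalisation

w = np.array([1.0, 0.0])
x = np.array([0.6, 0.8])                  # repeated unit-norm "image block"
for _ in range(200):
    w = hebbian_update(w, x, eta=0.1)     # w converges to the direction of x
```

The normalization keeps the weight vector on the unit sphere, which is what makes plain Hebbian learning stable here.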
4.5 Vector Quantization Neural Networks
Since neural networks are capable of learning from input information and
optimizing themselves to obtain the appropriate environment for a wide range of tasks, a
family of learning algorithms has been developed for vector quantization. The input vector is
constructed from a K-dimensional space. M neurons are designed to compute the vector
quantization code-book, in which each neuron relates to one code-word via its coupling
weights. The coupling weight vector, {W_ij}, associated with the ith neuron
is eventually trained to represent the code-word c_i in the code-book. As the neural network is
being trained, all the coupling weights will be optimized to represent the best possible
partition of all the input vectors. To train the network, a group of image samples known to
both encoder and decoder is often designated as the training set, and the first M input
vectors of the training data set are normally used to initialize all the neurons. With this
general structure, various learning algorithms have been designed and developed, such as
Kohonen's self-organizing feature mapping, competitive learning, frequency-sensitive
competitive learning, fuzzy competitive learning, general learning, distortion-equalized
fuzzy competitive learning and PVQ (predictive VQ) neural networks.
Let W_i(t) be the weight vector of the ith neuron at the tth iteration; the basic
competitive learning algorithm can then be summarized as follows:
z_i = 1 if d(x, W_i(t)) ≤ d(x, W_j(t)) for all j, and z_i = 0 otherwise
W_i(t+1) = W_i(t) + α z_i (x − W_i(t))
where d(x, W_i(t)) is the distance in the L2 metric between the input vector x and the coupling
weight vector W_i(t) = {w_i1, w_i2, ..., w_iK}; K = p x p; α is the learning rate, and z_i is
the output of the ith neuron.
A so-called under-utilization problem occurs in competitive learning, which means
some of the neurons are left out of the learning process and never win the competition.
Various schemes have been developed to tackle this problem. The Kohonen self-organising
neural network overcomes the problem by updating the winning neuron as well as those in its
neighborhood. Frequency-sensitive competitive learning instead scales each neuron's
distance by its winning frequency:
d'(x, W_i(t)) = d(x, W_i(t)) × u_i(t)
where u_i(t) is the total number of winning times for neuron i up to the tth training
cycle. Hence, the more the ith neuron wins the competition, the greater its distance from
the next input vector becomes, and thus its chance of winning the competition diminishes.
This way of tackling the under-utilization problem does not, however, provide an optimum
solution in optimizing the code-book.
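The frequency-sensitive variant can be sketched as a toy Python illustration (the parameters and two-cluster data are assumptions; code-words are initialized from the first M inputs, as described above):

```python
import numpy as np

# Frequency-sensitive competitive learning: each neuron's distance is scaled
# by its win count u_i(t), so frequently winning code-words become harder to
# win and under-utilised neurons are pulled into the learning process.
def train_fscl(data, M, alpha=0.1, epochs=5):
    W = data[:M].astype(float).copy()   # code-words start at the first M inputs
    u = np.ones(M)                      # win counters u_i(t)
    for _ in range(epochs):
        for x in data:
            d = np.linalg.norm(W - x, axis=1) * u   # d'(x, W_i) = d(x, W_i) * u_i
            i = int(np.argmin(d))
            W[i] += alpha * (x - W[i])              # move the winner toward x
            u[i] += 1
    return W, u

data = np.vstack([np.zeros((50, 2)), np.ones((50, 2))])  # two obvious clusters
W, u = train_fscl(data, M=2)
```

Even though both code-words start at the same point, the win-count scaling forces the second neuron to capture the second cluster instead of being left out.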
4.6 Predictive Coding Neural Networks
Predictive coding has proved to be a powerful technique for de-correlating input data
in speech compression and image compression, where a high degree of correlation is
embedded among neighboring data samples. Although general predictive coding is
classified into various models such as AR and ARMA etc., the auto-regressive (AR) model has
been successfully applied to image compression. Hence, predictive coding in terms of
applications in image compression can be further classified into linear and non-linear AR
models. Conventional technology provides a mature environment and well-developed
theory for predictive coding, which is represented by LPC (linear predictive coding), PCM
(pulse code modulation), DPCM (delta PCM) or their modified variations. Non-linear
predictive coding, however, is very limited due to the difficulties involved in optimizing the
coefficients to obtain the best possible predictive values. Under this circumstance, neural
networks provide a very promising approach to optimizing non-linear predictive coding.
With the linear AR model, predictive coding can be described by the following equation:
X_n = Σ_{i=1..N} a_i X_{n−i} + v_n = p + v_n
where p = Σ_{i=1..N} a_i X_{n−i} represents the predictive value for the pixel X_n which is to
be encoded in the next step. Its neighboring pixels, X_{n−1}, X_{n−2}, ..., X_{n−N}, are used
by the linear model to produce the predictive value. v_n stands for the error between the input
pixel and its predictive value; v_n can also be modeled by a set of zero-mean independent and
identically distributed random variables.
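As a small illustration of the model above (the coefficients and ramp signal are chosen by hand, not taken from the report):

```python
# Linear AR prediction: p = sum_i a_i * X_{n-i}; the residual v_n = X_n - p
# is what a predictive coder would actually quantize and transmit.
def ar_predict(pixels, a):
    N = len(a)
    preds, residuals = [], []
    for n in range(N, len(pixels)):
        p = sum(a[i] * pixels[n - 1 - i] for i in range(N))
        preds.append(p)
        residuals.append(pixels[n] - p)   # v_n
    return preds, residuals

ramp = list(range(10, 30))                     # smoothly increasing "scanline"
preds, res = ar_predict(ramp, a=[2.0, -1.0])   # linear extrapolation predictor
```

On this perfectly linear ramp the extrapolating predictor is exact, so every residual is zero; on real images the residuals are small and concentrated near edges, which is what makes them cheap to encode.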
Based on the above linear AR model, a multi-layer perceptron neural network can be
constructed to achieve the design of its corresponding non-linear predictor, as shown in
Fig. 1.4. For the pixel X_n which is to be predicted, its N neighboring pixels obtained from its
predictive pattern are arranged into a one-dimensional input vector x = {X_{n−1}, X_{n−2}, ...,
X_{n−N}} for the neural network. A hidden layer is designed to carry out back-propagation
learning for training the neural network. The output of each neuron, say the jth neuron, can
be derived from the equation given below:
h_j = f( Σ_{i=1..N} w_ji X_{n−i} )
where f(v) = 1 / (1 + e^{−v}) is a sigmoid transfer function.
To predict those drastically changing features inside images such as edges, contours
etc., high-order terms are added to improve the predictive performance. This
corresponds to a non-linear AR model expressed as follows:
X_n = Σ_i a_i X_{n−i} + Σ_i Σ_j a_ij X_{n−i} X_{n−j} + Σ_i Σ_j Σ_k a_ijk X_{n−i} X_{n−j} X_{n−k} + ...
Hence, another so-called functional-link type neural network can be designed to
implement this type of non-linear AR model with high-order terms. The structure of the
network is illustrated in Fig. 1.5. It contains only two layers of neurons, one for input and the
other for output. Coupling weights, {w_i}, between the input layer and the output layer are
trained towards minimizing the residual energy, which is defined as:
RE = Σ_n e_n^2 = Σ_n (X_n − X̂_n)^2
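A hedged sketch of this idea: the fragment below builds first- and second-order terms of the two previous samples as functional-link features and fits the coupling weights by least squares, which minimizes exactly the residual energy RE above. The names, model order and test signal are my own choices for illustration.

```python
import numpy as np

# Functional-link style predictor: linear plus high-order (product) terms of
# the two previous samples; weights fit by minimising RE = sum_n (X_n - X^_n)^2.
rng = np.random.default_rng(1)
x = np.cumsum(rng.standard_normal(300))     # correlated 1-D test signal

def features(x, n):
    a, b = x[n - 1], x[n - 2]
    return [a, b, a * a, a * b, b * b]      # first- and second-order terms

Phi = np.array([features(x, n) for n in range(2, len(x))])
t = x[2:]                                   # targets X_n
w, *_ = np.linalg.lstsq(Phi, t, rcond=None) # least squares = minimum RE
re = float(np.sum((t - Phi @ w) ** 2))      # residual energy of the fit
```

In the neural-network formulation the same weights would be found iteratively by gradient descent on RE rather than by a direct least-squares solve.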
5. PROPOSED IMAGE COMPRESSION USING NEURAL NETWORK
A two-layer feed-forward neural network and the Levenberg-Marquardt algorithm
were considered. Image coding using a feed-forward neural network consists of the following
steps:
An image, F, is divided into r x c blocks of pixels. Each block is then scanned to form
an input vector x(n) of size p = r x c.
It is assumed that the hidden layer of the two-layer network consists of L neurons, each
with p synapses, and that it is characterized by an appropriately selected weight matrix Wh.
All N blocks of the original image are passed through the hidden layer to obtain the
hidden signals, h(n), which represent the encoded input image blocks, x(n). If L < p, such
coding delivers image compression.
It is assumed that the output layer consists of m = p = r x c neurons, each with L
synapses. Let Wy be an appropriately selected output weight matrix. All N hidden vectors
h(n), representing an encoded image H, are passed through the output layer to obtain the
output signals, y(n). The output signals are reassembled into p = r x c image blocks to obtain a
reconstructed image, Fr.
There are two error metrics that are used to compare the various image
compression techniques: the Mean Square Error (MSE) and the Peak Signal-to-Noise
Ratio (PSNR). The MSE is the cumulative squared error between the compressed and the
original image, whereas the PSNR is a measure of the peak error.
MSE = (1 / (M·N)) Σ_{y=1..M} Σ_{x=1..N} [ I(x,y) − I'(x,y) ]^2        5.1
The quality of image coding is typically assessed by the peak signal-to-noise ratio (PSNR),
which for 8-bit images is defined as
PSNR = 10 · log10( 255^2 / MSE )
Once the weight matrices have been appropriately selected, any image can be
quickly encoded using the Wh matrix, and then decoded (reconstructed) using the Wy matrix.
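Both measures can be sketched directly (the sketch assumes 8-bit images, so the peak value is 255; the test arrays are mine):

```python
import numpy as np

# MSE: mean squared difference between original and reconstructed images.
# PSNR: peak error measure in dB, referred to the 8-bit peak value 255.
def mse(orig, recon):
    return np.mean((orig.astype(float) - recon.astype(float)) ** 2)

def psnr(orig, recon):
    m = mse(orig, recon)
    return float('inf') if m == 0 else 10.0 * np.log10(255.0 ** 2 / m)

a = np.zeros((8, 8), dtype=np.uint8)
b = a.copy()
b[0, 0] = 16                     # a single wrong pixel
```

Note that a single large pixel error barely moves the MSE yet dominates the peak error, which is why both measures are reported.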
Basic Algorithm:
Consider the form of Newton's method where the performance index is a sum of
squares, F(x) = Σ_{i=1..N} v_i(x)^2. Newton's method for optimizing a performance index F(x) is
X_{k+1} = X_k − [∇²F(X_k)]^{−1} ∇F(X_k)
For the sum-of-squares case, the jth element of the gradient is
∂F(x)/∂x_j = 2 Σ_{i=1..N} v_i(x) · ∂v_i(x)/∂x_j        5.5
Next the Hessian matrix is considered. The (k, j) element of the Hessian matrix would be
[∇²F(x)]_{k,j} = ∂²F(x) / (∂x_k ∂x_j)
One problem with the Gauss-Newton method over the standard Newton's method is that the
matrix H = J^T J may not be invertible. This can be overcome by using the following
modification to the approximate Hessian matrix:
G = H + μI
or, in terms of the resulting weight update,
ΔX_k = −[ J^T(X_k) J(X_k) + μ_k I ]^{−1} J^T(X_k) v(X_k)
This algorithm has the very useful feature that as μ_k is increased it approaches the
steepest descent algorithm with a small learning rate.
The iterations of the Levenberg-Marquardt back-propagation algorithm (LMBP) can
be summarized as follows:
1. Present all inputs to the network and compute the corresponding network outputs
and the errors e_q = t_q − a^M_q. Compute the sum of squared errors over all inputs:
F(x) = Σ_q e_q^T e_q = Σ_{j,q} (e_{j,q})^2 = Σ_i (v_i)^2
2. Compute the Jacobian matrix. Calculate the sensitivities with the recurrence relation
and augment the individual matrices into the Marquardt sensitivities.
3. Solve for ΔX_k.
4. Recompute the sum of squared errors using X_k + ΔX_k. If this new sum of squares is
smaller than that computed in step 1, then divide μ by v, let X_{k+1} = X_k + ΔX_k and go
back to step 1. If the sum of squares is not reduced, then multiply μ by v and go back
to step 3.
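The μ-adjustment loop of steps 3-4 can be sketched on a toy least-squares problem. This is a generic Levenberg-Marquardt step, not the full LMBP with Marquardt sensitivities, and the residual function is invented purely for illustration:

```python
import numpy as np

# Toy Levenberg-Marquardt loop: accept a step and divide mu by v when the sum
# of squares decreases, otherwise reject the step and multiply mu by v.
def residuals(x):
    return np.array([x[0] - 3.0, 2.0 * x[1] + 1.0])

def jacobian(_x):
    return np.array([[1.0, 0.0], [0.0, 2.0]])

def levenberg_marquardt(x, mu=0.01, v=10.0, iters=50):
    for _ in range(iters):
        r = residuals(x)
        J = jacobian(x)
        step = -np.linalg.solve(J.T @ J + mu * np.eye(len(x)), J.T @ r)
        if residuals(x + step) @ residuals(x + step) < r @ r:
            x, mu = x + step, mu / v      # success: accept, reduce damping
        else:
            mu *= v                       # failure: increase damping, retry
    return x

x_opt = levenberg_marquardt(np.array([0.0, 0.0]))   # minimum at (3, -0.5)
```

Small μ makes each step nearly Gauss-Newton; large μ shrinks the step toward scaled steepest descent, which is the stabilizing behavior noted above.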
Training Procedure
1. The first step is to convert a block matrix F of size R x C into a matrix X of size P x N
containing training vectors, x(n), formed from image blocks. That is:
P = r·c and P·N = R·C
2. The target data is made equal to the input data, that is:
D=X
3. The network is then trained until the mean squared error, MSE, is sufficiently small.
The matrices Wh and Wy will be subsequently used in the image encoding and
decoding steps.
Image Encoding
The hidden-half of the two-layer network is used to encode images. The encoding
procedure can be described as follows:
F → X,   H = (Wh · X)
where H is the encoded version of the image F.
Image Decoding
The image is decoded (reconstructed) using the output-half of the two-layer network.
The decoding procedure is described as follows:
Y = (Wy · H),   Y → Fr
These steps were performed using MATLAB (Matrix Laboratory). The compression
so obtained was through offline learning. In offline learning methods, once the system
enters the operation mode, its weights are fixed and do not change any more.
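The complete offline pipeline can be summarized in a short sketch (random weights stand in for the trained Wh and Wy, purely to show the data flow; the shapes follow the description above):

```python
import numpy as np

# Offline encode/decode data flow: blocks in, L-dimensional codes, blocks out.
r = c = 4
P = r * c                                # synapses per hidden neuron
L = 4                                    # hidden neurons: L < P gives compression
rng = np.random.default_rng(0)

X = rng.random((P, 32))                  # 32 vectorized r x c blocks (columns)
Wh = 0.1 * rng.standard_normal((L, P))   # hidden (encoding) weight matrix
Wy = 0.1 * rng.standard_normal((P, L))   # output (decoding) weight matrix

H = np.tanh(Wh @ X)                      # encoded image: L values per block
Y = Wy @ H                               # reconstructed blocks
ratio = P / L                            # nominal 4:1 compression
```

With trained weights, only H (plus the fixed Wy) needs to be stored or transmitted, which is where the compression comes from.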
LEVENBERG-MARQUARDT ALGORITHM
6. IMPLEMENTATION OF IMAGE COMPRESSION USING
MATLAB
A sample image was taken as the input to be compressed. The (1:64, 1:64) block of
pixels was considered. Using the blkM2vc function the matrix was arranged column-wise.
The target was made equal to the input and the matrix was scaled down. The network
was developed using 4 neurons in the first layer (compression) and 16 neurons in the second
layer (decompression).
The first layer used the tangent sigmoid function and the second layer the linear function.
Training was then performed using the Levenberg-Marquardt algorithm. The training goal
was set to 1e-3 and 100 epochs were used, with the following parameters:
net.trainParam.goal = 1e-3;
net.trainParam.epochs = 100;
After this the network was simulated and its output was plotted against the target.
Rearranging of the matrix was done using the function vc2blkM, followed by scaling up.
MATLAB CODE
comp.m
I = imread('J:\matlab\toolbox\images\imdemos\autumn.tif');
size(I)
image(I)
in1=I(1:64,1:64);
figure(1)
r=4;
imshow(in1)
in2=blkM2vc(in1,[r r]);
in3=in2/255;
in4=in3;
net_c=newff(minmax(in3),[4 16],{'tansig','purelin'},'trainlm');
net_c.trainParam.show=5;      % training parameters must be set on net_c
net_c.trainParam.epochs=300;
net_c.trainParam.goal=1e-5;
[net_s,tr]=train(net_c,in3,in4);
a=sim(net_s,in3);
fr=vc2blkM(a,r,64);
asc=fr*255;
az=uint8(asc);
figure(2)
imshow(az)
disp('training is achieved');
disp('consider a new image to be compressed')
II = imread('J:\matlab\toolbox\images\imdemos\fabric.png');
a1=II(1:64,1:64);
figure(5)
imshow(a1)
a2=blkM2vc(a1,[r r]);
a3=a2/255;
out=sim(net_s,a3);
a4=vc2blkM(out,r,64);
a5=a4*255;
a6=uint8(a5);
figure(6)
imshow(a6);
blkM2vc.m
vc2blkM.m
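The listings for blkM2vc.m and vc2blkM.m were not reproduced above. Based on how comp.m uses them, their behavior appears to be block-to-column rearrangement and its inverse; the following Python sketch is a reconstruction of that assumed behavior, not the original MATLAB code:

```python
import numpy as np

# blkM2vc analogue: scan each r x c block of M (column-major block order, as
# MATLAB would) into one column of an (r*c) x N matrix.
def blk_to_vc(M, r, c):
    rows, cols = M.shape
    columns = [M[i:i + r, j:j + c].reshape(-1, order='F')
               for j in range(0, cols, c) for i in range(0, rows, r)]
    return np.column_stack(columns)

# vc2blkM analogue: reassemble the columns into an image with `rows` rows.
def vc_to_blk(V, r, rows):
    c = V.shape[0] // r
    per_col = rows // r                  # blocks stacked in one image column
    M = np.zeros((rows, V.shape[1] // per_col * c))
    for k in range(V.shape[1]):
        i, j = (k % per_col) * r, (k // per_col) * c
        M[i:i + r, j:j + c] = V[:, k].reshape((r, c), order='F')
    return M

img = np.arange(64.0).reshape(8, 8)
V = blk_to_vc(img, 4, 4)                 # 16 x 4: four 4x4 blocks as columns
back = vc_to_blk(V, 4, 8)                # round trip restores the image
```

The round trip being lossless is what lets comp.m attribute all reconstruction error to the network itself rather than to the rearrangement.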
Functions used in MATLAB program:
newff
Syntax
net = newff(PR,[S1 S2...SNl],{TF1 TF2...TFNl},BTF,BLF,PF)
Description
The transfer functions TFi can be any differentiable transfer function such as tansig, logsig,
or purelin.
The training function BTF can be any of the backprop training functions such as trainlm,
trainbfg, trainrp, traingd, etc. Caution: trainlm is the default training function because it is
very fast, but it requires a lot of memory to run. If you get an "out-of-memory" error when
training try doing one of these: Slow trainlm training, but reduce memory requirements by
setting net.trainParam.mem_reduc to 2 or more. (See help trainlm.) Use trainbfg, which is
slower but more memory-efficient than trainlm. Use trainrp, which is slower but more
memory-efficient than trainbfg.
The learning function BLF can be either of the backpropagation learning functions such as
learngd or learngdm.
The performance function can be any of the differentiable performance functions such as
mse or msereg.
Algorithm
Feed-forward networks consist of Nl layers using the dotprod weight function, netsum net
input function, and the specified transfer functions.
The first layer has weights coming from the input. Each subsequent layer has a weight
coming from the previous layer. All layers have biases. The last layer is the network output.
Adaption is done with trains, which updates weights with the specified learning function.
Training is done with the specified training function. Performance is measured according to
the specified performance function.
trainParam
This property defines the parameters and values of the current training function.
net.trainParam
The fields of this property depend on the current training function (net.trainFcn). Evaluate
the above reference to see the fields of the current training function.
train
Syntax
[net,tr,Y,E,Pf,Af] = train(net,P,T,Pi,Ai,VV,TV)
Description
train(NET,P,T,Pi,Ai,VV,TV) takes,
P -- Network inputs
and returns,
Y -- Network outputs
E -- Network errors.
sim
Syntax
[Y,Pf,Af,E,perf] = sim(net,P,Pi,Ai,T)
Description
sim simulates a neural network. It takes the network net and the network inputs P
(optionally together with initial input delay conditions Pi, initial layer delay conditions Ai
and targets T) and returns the network outputs Y, along with final delay conditions and the
network performance. In this project it is used to pass the vectorized image blocks through
the trained network, as in a = sim(net_s, in3).
uint8
Convert a value to the unsigned 8-bit integer type
Syntax
I = uint8(X)
Description
uint8(X) converts the elements of the array X into unsigned 8-bit integers. Fractional
values are rounded and values outside the range [0, 255] saturate at the nearest endpoint. In
this program it converts the rescaled output matrix back into an 8-bit image for display
with imshow.
MATLAB Results:
6.1 Training procedure till the MSE becomes less than 1e-5
ORIGINAL IMAGE
COMPRESSED IMAGE
DECOMPRESSED IMAGE
7. CONCLUSION
The computing world has a lot to gain from neural networks. Their ability to learn
by example makes them very flexible and powerful. Furthermore, there is no need to devise
an algorithm in order to perform a specific task; that is, there is no need to understand the
internal mechanisms of that task. They are also very well suited for real-time systems
because of their fast response and computational times, which are due to their parallel
architecture. Neural networks also contribute to other areas of research such as neurology
and psychology. They are regularly used to model parts of living organisms and to
investigate the internal mechanisms of the brain. Perhaps the most exciting aspect of neural
networks is the possibility that some day 'conscious' networks might be produced. A number
of scientists argue that consciousness is a 'mechanical' property and that 'conscious' neural
networks are a realistic possibility.
Even though neural networks have huge potential, we will only get the best out of them
when they are integrated with computing, AI, fuzzy logic and related subjects. Neural
networks perform successfully where other methods do not, recognizing and matching
complicated, vague, or incomplete patterns.
FUTURE SCOPE:
BIBLIOGRAPHY
1. H. Demuth and M. Beale. Neural Network Toolbox User's Guide. For use with
MATLAB. The MathWorks Inc. (1998)
2. H. Ossoinig, E. Reisinger, R. Weiss. Design and FPGA-Implementation of a
Neural Network
3. Kiamal Z. Pekmestzi. Multiplexer-Based Array Multipliers. IEEE Transactions on
Computers, Vol. 48, January (1998)
4. Hennessy, J.L. and Patterson, D.A. Computer Architecture: A Quantitative Approach.
Morgan Kaufmann, (1990)
5. J. Jiang. Image compression with neural networks. Signal Processing: Image
Communication 14 (1999) 737-760