International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367 (Print),
ISSN 0976-6375 (Online), Volume 6, Issue 2, February (2015), pp. 54-74, IAEME
ABSTRACT
Handwritten character recognition has long been an active and difficult task in the field of pattern
recognition. Because of its applications in many fields, a great deal of work has been done, and
continues to be done, to improve the results through various methods. In this paper we propose a
system for individual handwritten character recognition using multilayer feed-forward neural networks.
For the experiments we have taken 15 samples of each lower and upper case handwritten English
alphabet in scanned image format, i.e. 780 different handwritten character samples. Two methods of
feature extraction are used to construct the pattern vectors for the training set. This training set
is presented to six different feed-forward neural network models, namely newff, newfit, newpr,
newgrnn, newrb and newrbe. The test pattern set is used to evaluate the performance of these neural
network models. The results are compared to find the recognition accuracy of the respective models.
The number of hidden layers, the number of neurons in each hidden layer, the validation checks and
the gradient factors of the neural network models are taken into consideration during training.
Keywords: Character Recognition, Multilayer Feed-forward Artificial Neural Network,
Backpropagation, Handwriting Recognition, Pattern Classification
1. INTRODUCTION
These days computers have penetrated every field, and work is being done at a higher speed and with
greater accuracy. Pattern recognition by computer is a challenging task, and it becomes more
difficult when the pattern is in the form of handwritten curve script. Pattern
recognition, as a subject, spans a number of scientific disciplines, uniting them in the search for a
solution to the common problem of recognizing the pattern of a given class and assigning the name
of the identified class. Pattern recognition is the categorization of input data into identifiable
classes through the extraction of significant attributes of the data from irrelevant background
detail. A pattern class is a category determined by some common attributes. Although older
handwritten documents are being digitized, 100% automation of this work cannot yet be achieved.
Handwriting recognition has contributed a great deal to the advancement of the automation process
[1]. Handwriting recognition systems are broadly classified into two types, namely online and
offline handwriting recognition. In the online approach, the two-dimensional coordinates of
consecutive points are represented as a function of time, and the sequence of strokes made by the
writer is also available. In the offline approach, the written script is captured with a device such
as a scanner and the whole script is available as an image [2]. When the two approaches are
compared, the online approach has been found superior to the offline approach because of the
temporal information available to it [3]. On the other hand, in offline systems neural networks have
been used successfully to yield comparably high recognition accuracy [1]. A number of applications,
such as document analysis, mailing address interpretation and bank processing, require offline
handwriting recognition systems [1, 4]. Thus, offline handwriting recognition remains the first
choice of many researchers investigating novel methods that would improve recognition accuracy. It
is widely used in image processing, pattern recognition and artificial intelligence.
During the last few years researchers have proposed many mathematical approaches to solve pattern
recognition problems. Recognition strategies depend heavily on the nature of the data to be
recognized. In the cursive case, the problem is made complex by the fact that the writing is
fundamentally ambiguous: the letters in a word are generally linked together, poorly written and
may even be missing. In contrast, hand-printed word recognition is closer to printed word
recognition, since the individual letters composing the word are usually much easier to isolate and
identify. As a consequence, methods working on a letter basis (i.e., based on character
segmentation and recognition) are well suited to hand-printed word recognition, while cursive
scripts require more specific and/or sophisticated techniques. The inherent ambiguity must then be
compensated by the use of contextual information.
Neural network computing is expected to play a significant role in computer-based systems for
recognizing handwritten characters. This is because a neural network can be trained quite readily to
recognize several instances of a written letter or word, and can then generalize to recognize other,
different instances of that same letter or word. This capability is vital to robust recognition of
handwritten characters or scripts, since characters are rarely written twice in exactly the same
form. There have been reports of successful use of neural networks for the recognition of
handwritten characters [11, 12], but we are not aware of any general investigation that sheds light
on a systematic approach to a complete neural network system for the automatic recognition of
cursive characters. Artificial neural network techniques are widely preferred over conventional
approaches for pattern recognition tasks of this type for the following reasons:
1. The same character written by the same person can vary in shape, size and style.
2. The shape, size and style of the same character can also vary from person to person.
3. A character image scanned for offline recognition might be of poor quality because of noise.
4. As there are no predefined rules about the appearance of a visual character, the rules must be
heuristically deduced from the set of sample data. The human brain by its very nature does the
same thing, using the features discussed in the following two points.
5. The human brain can read the handwriting of various people with different styles of writing
because it adapts to slight variations and errors in the pattern.
6. It can pick up new styles present in a character because of its ability to learn from
experience in very little time.
J. Pradeep, E. Srinivasan and S. Himavathi [1] have proposed a handwritten character
recognition system using a neural network with a diagonal-based feature extraction method.
They start with binarization of the image; the resulting binary image then undergoes edge
detection, dilation and segmentation. In the segmentation step a series of characters is decomposed
into sub-images of individual characters, each of which is resized to 90 x 60 pixels for
classification and recognition. Each character image is divided into 54 equal zones, each of size
10 x 10 pixels, and features are extracted from the pixels of each zone by moving along its
diagonals. This yields 54 features for each character. Another feature extraction method gives 69
features by additionally averaging the zone values row-wise and column-wise. A feed-forward
Backpropagation neural network with two hidden layers and a 54-100-100-38 architecture is used to
perform the classification with both feature sets. For the vertical, horizontal and diagonal
orientations they report accuracies of 92.69, 93.68 and 97.80 percent with the first feature set and
92.69, 94.73 and 98.54 percent with the second, respectively.
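The zoning scheme described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the image is taken to be 90 rows by 60 columns of 0/1 pixels, and each zone feature is assumed to be the average of the mean pixel values along the 19 diagonals of a 10 x 10 zone. It is not the implementation of [1].

```python
def zone_diagonal_feature(zone):
    """Average, over the 2n-1 diagonals of a square zone, of each diagonal's mean pixel value."""
    n = len(zone)
    diagonal_means = []
    for d in range(2 * n - 1):
        # Cells (r, c) with r + c == d lie on the same diagonal.
        cells = [zone[r][d - r] for r in range(n) if 0 <= d - r < n]
        diagonal_means.append(sum(cells) / len(cells))
    return sum(diagonal_means) / len(diagonal_means)


def diagonal_zone_features(image, zone_size=10):
    """Return (zone_features, row_averages, column_averages) for a binary image."""
    rows, cols = len(image), len(image[0])
    zone_rows, zone_cols = rows // zone_size, cols // zone_size
    grid = []
    for zr in range(zone_rows):
        grid_row = []
        for zc in range(zone_cols):
            zone = [image[zr * zone_size + r][zc * zone_size:(zc + 1) * zone_size]
                    for r in range(zone_size)]
            grid_row.append(zone_diagonal_feature(zone))
        grid.append(grid_row)
    zone_features = [value for grid_row in grid for value in grid_row]
    # Averages of the zone features along each row and each column of zones.
    row_averages = [sum(grid_row) / zone_cols for grid_row in grid]
    column_averages = [sum(grid[zr][zc] for zr in range(zone_rows)) / zone_rows
                       for zc in range(zone_cols)]
    return zone_features, row_averages, column_averages


# Hypothetical 90 x 60 binary character image: a vertical stroke in columns 20..29.
image = [[1 if 20 <= c < 30 else 0 for c in range(60)] for r in range(90)]
zones, row_avg, col_avg = diagonal_zone_features(image)
feature_vector = zones + row_avg + col_avg
print(len(zones), len(feature_vector))  # 54 69
```

With 9 x 6 zones, the 54 zone features plus 9 row averages and 6 column averages give the 69-element vector mentioned in the text.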
Kauleshwar Prasad, Devvrat C. Nigam, Ashmika Lakhotiya and Dheeren Umre [3] convert the
character image into a binary image and then apply a character extraction algorithm that starts
with an empty traverse list. Each row is scanned pixel by pixel; when a black pixel is found, it is
checked against the traverse list. If it is already in the list it is ignored, otherwise it is
added to the traverse list using an edge detection algorithm. They report good results using a
feed-forward Backpropagation neural network and also state that a poorly chosen feature extraction
method gives poor results.
Ankit Sharma and Dipti R Chaudhary [4] have achieved an accuracy of 85% using a feed-forward
neural network. A special form of reduction, which includes noise removal and edge detection, is
used for feature extraction from grayscale images.
Chirag I Patel, Ripal Patel and Palak Patel [5] have achieved accuracies of 91%, 889%, 91%,
91%, 94% and 94% using different models of Backpropagation neural networks. After character
extraction and edge detection from the document, the images undergo normalization, in which
images of various sizes are normalized to a uniform size. Line fitting, a skew detection technique,
is then applied to the resulting image, which is rotated by the detected angle to correct the skew.
The pattern constructed by this method is then used for training feed-forward multilayer neural
networks with the Backpropagation algorithm.
Anita Pal and Dayashankar [7] have used a multilayer perceptron with one hidden layer to
recognize handwritten English characters. Boundary tracing along with Fourier descriptors is used to
extract the features from the handwritten character. A character is identified by analyzing its
shape and comparing its features. The test results showed a recognition accuracy of 94% for
handwritten English characters with less training time.
A genetic algorithm has been used with a feed-forward neural network architecture as a hybrid
evolutionary algorithm [27] for the recognition of handwritten English alphabets. In that paper each
character is treated as a gray-level image and divided into sixteen parts. The mean of each part
is taken as one feature of the pattern. Thus, sixteen real-valued features are used as the pattern
vector for each image. The trained network performed well in classifying the test patterns.
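The sixteen-part mean feature can be illustrated as follows. The 4 x 4 arrangement of the sixteen parts and the function name are assumptions made for this sketch, since the text only states that the image is divided into sixteen parts.

```python
def block_mean_features(image, grid=4):
    """Split a gray-level image into grid x grid blocks and return the mean of each block."""
    rows, cols = len(image), len(image[0])
    block_h, block_w = rows // grid, cols // grid
    features = []
    for br in range(grid):
        for bc in range(grid):
            block = [image[r][c]
                     for r in range(br * block_h, (br + 1) * block_h)
                     for c in range(bc * block_w, (bc + 1) * block_w)]
            features.append(sum(block) / len(block))
    return features


# Hypothetical 8 x 8 gray-level image whose intensity equals its row index.
image = [[float(r)] * 8 for r in range(8)]
features = block_mean_features(image)
print(len(features), features[0], features[-1])  # 16 0.5 6.5
```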
In this paper we consider two approaches to feature extraction from images of handwritten
capital and small letters of the English alphabet. The first method uses the row-wise mean value of
the pixels of a processed image of size n x n. The second method considers each pixel value of the
dilated image of size n x n. These features are used to construct the pattern vectors, and two
training sets are formed from these pattern vectors. Six different feed-forward neural network
models are used with six different learning methods. The performance of these neural networks with
different learning rules is analyzed, and the rate of recognition for patterns from the test
pattern set is also evaluated. The performance evaluation indicates that the Radial Basis Function
(RBF) neural network architecture performs better than the other neural network models for both
methods of feature extraction. The rate of recognition for the test pattern set is also found to be
better for RBF than for the other neural network models.
The rest of the paper is organized as follows. Section 2 describes the feature extraction
methods for handwritten English characters. Section 3 discusses feed-forward neural networks,
Backpropagation learning and the Radial Basis Function. Section 4 describes the experiment and
simulation design. Section 5 presents the simulated results and discussion. Section 6 presents the
conclusion, followed by the references.
2. FEATURE EXTRACTION
Feature extraction and selection can be defined as extracting the most representative
information from the raw data, which minimizes the within-class pattern variability while
enhancing the between-class pattern variability. A set of features is extracted from each class
that helps to distinguish it from other classes while remaining invariant to characteristic
differences within the class. Here we consider feature extraction from the input stimuli with two
methods, namely the row-wise mean of the pixels of a scanned image and each pixel value of the
image. The input data consist of fifteen different sets of each handwritten capital and small
English character, written by five different people. Naturally, the five people have different
handwriting and a different writing style for every character. In this way we have 780 samples in
total. Of these 780 samples, 520 are used for training and the remaining 260 are used in the test
pattern set. To prepare the training set of input-output pattern pairs, we take each scanned
handwritten character as a color bitmap image. This color bitmap image of a character is converted
into a gray-level image and then into a binary image, as shown in figure 1.
Next we obtain the images after edge detection and dilation for both methods of feature
extraction. The edged and dilated images are shown in figure 2.
Fig 2 (a): edged image
To obtain a uniform pattern vector for every input stimulus, we make the dilated images equal
in size by resizing them to 30 x 30, as shown in figure 3.
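The two feature extraction methods can be sketched as below. This is a simplified illustration, not the Matlab pipeline used in the experiments: the threshold of 128, the ink polarity, the nearest-neighbour resizing and all function names are assumptions, and the edge detection and dilation steps are omitted.

```python
def to_binary(gray, threshold=128):
    """Map a gray-level image (0..255) to 0/1: dark ink becomes 1, background 0."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


def resize_nearest(image, size):
    """Resize an image to size x size by nearest-neighbour sampling."""
    rows, cols = len(image), len(image[0])
    return [[image[r * rows // size][c * cols // size] for c in range(size)]
            for r in range(size)]


def row_mean_features(image):
    """Method 1: one feature per row, the mean of the pixel values in that row."""
    return [sum(row) / len(row) for row in image]


def pixel_features(image):
    """Method 2: every pixel value of the image, row by row."""
    return [pixel for row in image for pixel in row]


# Hypothetical 60 x 60 gray scan: a dark horizontal bar (rows 20..39) on a white page.
gray = [[0 if 20 <= r < 40 else 255 for c in range(60)] for r in range(60)]
binary = resize_nearest(to_binary(gray), 30)
method1 = row_mean_features(binary)
method2 = pixel_features(binary)
print(len(method1), len(method2))  # 30 900
```

The row-wise mean gives one value per image row, so a 30 x 30 image yields a 30-element pattern vector, while the pixel-wise method keeps every pixel of the resized image.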
The neural approach applies biological concepts to machines for pattern recognition. The
outcome of this effort is the invention of artificial neural networks. Neural networks can be viewed
as massively parallel computing systems consisting of an extremely large number of simple processors
with many interconnections. Neural network models attempt to use some organizational principles
(such as learning, generalization, adaptivity, fault tolerance, distributed representation, and
computation) in a network of weighted directed graphs in which the nodes are artificial neurons and
directed edges (with weights) are connections between neuron outputs and neuron inputs. The main
characteristics of neural networks are that they have the ability to learn complex nonlinear
input-output relationships, use sequential training procedures, and adapt themselves to the data. The most
commonly used family of neural networks for pattern classification tasks [13] is the feed-forward
network, which includes multilayer perceptron and Radial-Basis Function (RBF) networks. These
networks are organized into layers and have unidirectional connections between the layers. The
learning process involves updating network architecture and connection weights so that a network
can efficiently perform a specific pattern recognition task. The increasing popularity of neural
network models to solve pattern recognition problems has been primarily due to their seemingly low
dependence on domain-specific knowledge (relative to model-based and rule-based approaches) and
due to the availability of efficient learning algorithms. Neural networks provide a new suite of
nonlinear algorithms for feature extraction (using hidden layers) and classification (e.g., multilayer
perceptron). In spite of the seemingly different underlying principles, most of the well known neural
network models are implicitly equivalent or similar to classical statistical pattern recognition
methods. Ripley [14] and Anderson et al. [15] also discuss the relationship between neural networks
and statistical pattern recognition. Despite these similarities, neural networks do offer several
advantages, such as unified approaches for feature extraction and classification and flexible
procedures for finding good, moderately nonlinear solutions. Further advantages of neural networks
are their adaptive-learning, self-organization and fault-tolerance capabilities. Because of these
capabilities, neural networks are used for pattern recognition applications. The goal in pattern
recognition is to use a set of example solutions to some problem to infer an underlying regularity
which can subsequently be used to solve new instances of the problem. In the case of feed-forward
networks, the set of example solutions (called a training set), comprises sets of input values together
with corresponding sets of desired output values. The training set is used to determine an error
function in terms of the discrepancy between the predictions of the network, for given inputs, and the
desired values of the outputs given by the training set. A common example of an error function
would be the squared difference between desired and actual output, summed over all outputs and
summed over all patterns in the training set. The learning process then involves adjusting the values
of the parameters to minimize the value of the error function. Propagating this error backwards
through the network (error Backpropagation) to adjust the weights reduces the output error and
thereby improves the performance of the neural network. Historically, however, effective learning
algorithms were known only for networks in which at most one of the layers comprised adaptive
interconnections. Such networks were known variously as the perceptron [16] and Adaline [17], and
were seriously limited in their capabilities [18].
The feed-forward neural network consists of an input layer of units, one or more hidden
layers, and an output layer. Each node in a layer is connected to the nodes in the next layer,
thus creating the stacking effect. The input layer nodes have output functions that deliver data to
the first hidden layer nodes. The hidden layers are the processing layers, where the actual
computation takes place. Each node in a hidden layer computes a sum based on its input from the
previous layer (either the input layer or another hidden layer). The sum is then compressed by a
sigmoid function (a logistic transfer function), which maps the sum to a limited and manageable
range. The output from the hidden layers is passed on to the output layer, which produces the
final network output. A feed-forward network may contain any number of hidden layers, but a network
with a single hidden layer can learn any set of training data that a network with multiple layers
can learn, although the number of hidden units required depends upon the complexity of the problem
[19]. The input to a feed-forward neural network may be either a raw or a preprocessed signal or
image. Alternatively, some specific features can be used. If specific features are used as input,
their number and selection are crucial and application dependent. Weights connect an input to a
summing node and affect the summing operation. The bias or threshold value is treated as a weight
with constant input 1, i.e. x0 = 1 and w0 = b; the weights are usually randomized at the beginning
[20, 21].
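The neuron is the basic information processing unit of a neural network. It consists of a set of
links describing the neuron inputs, with weights $w_1, w_2, w_3, \ldots, w_m$, and an adder function
(linear combiner) for computing the weighted sum as:

$$v = \sum_{j=1}^{m} w_j x_j \tag{3.1}$$

and an activation function (squashing function) $\varphi$ for limiting the amplitude of the neuron
output, as shown in figure 3.1:

$$y = \varphi(v + b) \tag{3.2}$$

where $b$ is the bias. Treating the bias as the weight $b = w_0$ on the constant input $x_0 = 1$, the
argument of $\varphi$ in (3.2) can be written as the single sum

$$v = \sum_{j=0}^{m} w_j x_j, \qquad b = w_0 \tag{3.3}$$

The output at every node can finally be calculated by using the sigmoid function:

$$y = f(x) = \frac{1}{1 + e^{-Kx}} \tag{3.4}$$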
Figure 3.1: neuron model (labels: bias; inputs x1, x2, ..., xm; weights w1, ..., wm; summing
function; local field; activation function; output)
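A single neuron of this kind, a weighted sum followed by a sigmoid, can be sketched as follows. The function names, the example inputs and the default gain of 1 are assumptions for illustration only. The second function shows the equivalent form in which the bias is folded in as the weight w0 on a constant input x0 = 1.

```python
import math


def sigmoid(v, gain=1.0):
    """Logistic squashing function 1 / (1 + exp(-gain * v))."""
    return 1.0 / (1.0 + math.exp(-gain * v))


def neuron_output(inputs, weights, bias, gain=1.0):
    """Weighted sum of the inputs plus bias, passed through the sigmoid."""
    v = sum(w * x for w, x in zip(weights, inputs))
    return sigmoid(v + bias, gain)


def neuron_output_folded(inputs, weights_with_bias, gain=1.0):
    """Same neuron with the bias stored as w0 and applied to the constant input x0 = 1."""
    extended = [1.0] + list(inputs)
    v = sum(w * x for w, x in zip(weights_with_bias, extended))
    return sigmoid(v, gain)


x = [0.5, -1.0, 2.0]
w = [0.4, 0.3, -0.2]
b = 0.1
y1 = neuron_output(x, w, b)
y2 = neuron_output_folded(x, [b] + w)
print(y1, y2)
```

Both calls give the same output, which is why the bias is often handled as just another weight during training.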
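The error for a presented pattern is measured as

$$E = \frac{1}{2} \sum_{j=1}^{J} (T_j - S_j)^2 \tag{3.5}$$

where $(T_j - S_j)^2$ is the squared difference between the actual output $S_j$ of the network on the
output layer for the presented input pattern P and the target output $T_j$ for the pattern P. All the
network parameters $W^{(m-1)}$ and $\theta^{(m)}$, $m = 2, \ldots, M$, can be combined and represented by
the matrix $W = [w_{ij}]$, so that the error function E can be minimized by applying the
gradient-descent procedure as:

$$\Delta W = -\eta \frac{\partial E}{\partial W} \tag{3.6}$$

where $\eta$ is a learning rate or step size, provided that it is a sufficiently small positive number.
Applying the chain rule, equation (3.6) can be expressed as

$$\frac{\partial E}{\partial w_{ij}^{(m)}} = \frac{\partial E}{\partial u_j^{(m+1)}} \, \frac{\partial u_j^{(m+1)}}{\partial w_{ij}^{(m)}} \tag{3.7}$$

while

$$\frac{\partial u_j^{(m+1)}}{\partial w_{ij}^{(m)}} = \frac{\partial}{\partial w_{ij}^{(m)}} \left( \sum_{\omega} w_{\omega j}^{(m)} o_{\omega}^{(m)} + \theta_j^{(m+1)} \right) = o_i^{(m)} \tag{3.8}$$

and

$$\frac{\partial E}{\partial u_j^{(m+1)}} = \frac{\partial E}{\partial o_j^{(m+1)}} \, \frac{\partial o_j^{(m+1)}}{\partial u_j^{(m+1)}} = \frac{\partial E}{\partial o_j^{(m+1)}} \, \varphi'\!\left(u_j^{(m+1)}\right) \tag{3.9}$$

For the output units, $m = M - 1$,

$$\frac{\partial E}{\partial o_j^{(m+1)}} = -e_j \tag{3.10}$$

with $e_j = T_j - S_j$, and for the hidden units, $m = 1, \ldots, M - 2$,

$$\frac{\partial E}{\partial o_j^{(m+1)}} = \sum_{\omega=1}^{J_{m+2}} \frac{\partial E}{\partial u_{\omega}^{(m+2)}} \, w_{j\omega}^{(m+1)} \tag{3.11}$$

Define the delta function by

$$\delta_p^{(m)} = -\frac{\partial E}{\partial u_p^{(m)}} \tag{3.12}$$

for $m = 2, 3, \ldots, M$. By substituting (3.7), (3.11), and (3.12) into (3.9), we finally obtain the
following equations:
For the output units, $m = M - 1$,

$$\delta_j^{(m+1)} = e_j \, \varphi'\!\left(u_j^{(m+1)}\right) \tag{3.13}$$

For the hidden units, $m = 1, \ldots, M - 2$,

$$\delta_j^{(m+1)} = \varphi'\!\left(u_j^{(m+1)}\right) \sum_{\omega=1}^{J_{m+2}} \delta_{\omega}^{(m+2)} \, w_{j\omega}^{(m+1)} \tag{3.14}$$

Equations (3.13) and (3.14) provide a recursive method to solve $\delta_j^{(m+1)}$ for the whole network.
Thus, W can be adjusted by

$$\frac{\partial E}{\partial w_{ij}^{(m)}} = -\delta_j^{(m+1)} \, o_i^{(m)} \tag{3.15}$$

For the activation transfer functions, we have the following relations: for the logistic function
with unit gain,

$$\varphi'(u) = \varphi(u)\left[1 - \varphi(u)\right] \tag{3.16}$$

and for the hyperbolic tangent function,

$$\varphi'(u) = 1 - \varphi^2(u) \tag{3.17}$$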
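The delta recursion of equations (3.13) to (3.15) can be checked numerically on a small network. The sketch below assumes a hypothetical 2-3-2 network with logistic units, random initial weights and a single training pattern. It compares one backpropagated derivative with a finite-difference estimate and then applies one gradient-descent step. All names and values are illustrative.

```python
import math
import random


def logistic(u):
    return 1.0 / (1.0 + math.exp(-u))


def forward(x, layers):
    """layers is a list of (W, theta); W[i][j] links unit i of one layer to unit j of the next."""
    outputs = [list(x)]
    for W, theta in layers:
        prev = outputs[-1]
        u = [sum(W[i][j] * prev[i] for i in range(len(prev))) + theta[j]
             for j in range(len(theta))]
        outputs.append([logistic(value) for value in u])
    return outputs


def error(x, target, layers):
    """E = 1/2 * sum_j (T_j - S_j)^2 for one pattern."""
    out = forward(x, layers)[-1]
    return 0.5 * sum((t - s) ** 2 for t, s in zip(target, out))


def backprop_gradients(x, target, layers):
    """Return dE/dW for every layer using the delta recursion."""
    outputs = forward(x, layers)
    grads = [None] * len(layers)
    # Output units: delta_j = e_j * phi'(u_j), with phi' = phi * (1 - phi).
    delta = [(t - o) * o * (1.0 - o) for t, o in zip(target, outputs[-1])]
    for m in range(len(layers) - 1, -1, -1):
        prev = outputs[m]
        # dE/dw_ij = -delta_j * o_i
        grads[m] = [[-delta[j] * prev[i] for j in range(len(delta))]
                    for i in range(len(prev))]
        if m > 0:
            W = layers[m][0]
            # Hidden units: delta_i = phi'(u_i) * sum_j delta_j * w_ij
            delta = [prev[i] * (1.0 - prev[i]) *
                     sum(delta[j] * W[i][j] for j in range(len(delta)))
                     for i in range(len(prev))]
    return grads


random.seed(0)
sizes = [2, 3, 2]
layers = [([[random.uniform(-1, 1) for _ in range(n_out)] for _ in range(n_in)],
           [random.uniform(-1, 1) for _ in range(n_out)])
          for n_in, n_out in zip(sizes, sizes[1:])]
x, target = [0.2, -0.7], [1.0, 0.0]
grads = backprop_gradients(x, target, layers)

# Finite-difference check of one first-layer weight.
eps = 1e-6
W0 = layers[0][0]
W0[1][2] += eps
e_plus = error(x, target, layers)
W0[1][2] -= 2 * eps
e_minus = error(x, target, layers)
W0[1][2] += eps
numeric = (e_plus - e_minus) / (2 * eps)
print(grads[0][1][2], numeric)

# One gradient-descent step on the weights, W <- W - eta * dE/dW.
eta = 0.1
before = error(x, target, layers)
for (W, _), g in zip(layers, grads):
    for i in range(len(W)):
        for j in range(len(W[i])):
            W[i][j] -= eta * g[i][j]
after = error(x, target, layers)
print(before, after)
```

The biases are left fixed here for brevity. They could be updated in the same way by treating them as weights on a constant input of 1.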
The update for the biases can be done in two ways. The biases in the (m+1)th layer, i.e.
$\theta^{(m+1)}$, can be expressed as an expansion of the weight $W^{(m)}$, that is,
$\theta^{(m+1)} = \left(w_{0,1}^{(m)}, \ldots, w_{0,J_{m+1}}^{(m)}\right)$. Accordingly, the output
$o^{(m)}$ is expanded into $o^{(m)} = \left(1, o_1^{(m)}, \ldots, o_{J_m}^{(m)}\right)$. Another way is to
use a gradient-descent method with regard to $\theta^{(m)}$, by following the above procedure. Since
the biases can be treated as special weights, these are usually omitted in practical applications.
The algorithm is convergent in the mean if $0 < \eta < 2/\lambda_{\max}$, where $\lambda_{\max}$ is the
largest eigenvalue of the autocorrelation matrix of the vector x, denoted as C [23]. When $\eta$ is
too small, the possibility of getting stuck at a local minimum of the error function is increased.
In contrast, the possibility of falling into oscillatory traps is high when $\eta$ is too large. By
statistically preprocessing the input patterns, namely de-correlating the input patterns,
excessively large eigenvalues of C can be avoided, and thus increasing $\eta$ can effectively speed
up the convergence. PCA preconditioning speeds up BP in most cases, except when the pattern set
consists of sparse vectors. In practice, $\eta$ is usually chosen so that $0 < \eta < 1$, so that
successive weight changes do not overshoot the minimum of the error surface. The BP algorithm can
be extended or improved by adding a momentum term [24]; the result is known as gradient descent
with momentum. Under this learning rule, the weight update between the output layer and the hidden
layer is given by the following weight updating equation:
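$$\Delta w_{ho}(s+1) = -\eta \sum_{i=1}^{H} \frac{\partial E}{\partial w_{ho}} + \alpha\, \Delta w_{ho}(s) + \frac{1}{1 - \left(\alpha\, \Delta w_{ho}(s)\right)} \tag{3.18}$$

Whereas the weight update between the hidden layer and the input layer can be represented as:

$$\Delta w_{ih}(s+1) = -\eta \sum_{i=1}^{N} \frac{\partial E}{\partial w_{ih}} + \alpha\, \Delta w_{ih}(s) + \frac{1}{1 - \left(\alpha\, \Delta w_{ho}(s)\right)} \tag{3.19}$$

where $\eta$ is the learning rate and $\alpha$ is the momentum coefficient.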
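The effect of a momentum term can be seen on a toy problem. The sketch below uses only the standard momentum form, in which the new weight change is the negative gradient step plus a fraction of the previous change, and it leaves out the additional last term of (3.18) and (3.19). The quadratic error surface, the names eta and alpha and all numeric values are assumptions chosen for illustration.

```python
def descend(grad, w, eta, alpha, steps):
    """Gradient descent with momentum: dw(s+1) = -eta * dE/dw + alpha * dw(s)."""
    w = list(w)
    dw = [0.0] * len(w)
    for _ in range(steps):
        g = grad(w)
        dw = [-eta * gi + alpha * di for gi, di in zip(g, dw)]
        w = [wi + di for wi, di in zip(w, dw)]
    return w


# Hypothetical quadratic error surface E(w) = 0.5 * sum(c_i * w_i^2) with curvatures
# c = (2, 0.02). Its largest eigenvalue is 2, so eta must stay below 2 / 2 = 1.
curvature = [2.0, 0.02]


def error(w):
    return 0.5 * sum(c * wi * wi for c, wi in zip(curvature, w))


def grad(w):
    return [c * wi for c, wi in zip(curvature, w)]


start = [1.0, 1.0]
plain = descend(grad, start, eta=0.5, alpha=0.0, steps=200)
momentum = descend(grad, start, eta=0.5, alpha=0.9, steps=200)
print(error(plain), error(momentum))
```

On this poorly conditioned surface the run with momentum ends at a much lower error than plain gradient descent after the same number of steps, because the accumulated weight change speeds up progress along the flat direction.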
$$\phi_i(x) = G\!\left(\frac{\lvert x - \mu_i \rvert}{\sigma_i}\right) \tag{3.1.1}$$

where G is the Gaussian function, x is the input to neuron i, $\mu_i$ is the basis of neuron i and
$\sigma_i$ is the amplitude of neuron i. The input layer has i nodes, and the hidden and output layers
have k and j neurons, respectively. Each input neuron corresponds to a component of an input vector
x. Each node in the hidden layer uses an RBF as its nonlinear activation function and performs a
nonlinear transform of the input. The output layer is a linear combiner, mapping the nonlinearity
into a new space. The RBF-MLP can achieve a globally optimal solution for the adjustable weights in
the minimum MSE sense by using the linear optimization method. Therefore, for an input pattern x,
the output of the jth node of the output layer can be defined as:
$$y_j(x) = \sum_{k=1}^{K} w_{kj}\, \phi_k\!\left(\lVert x_i - \mu_k \rVert\right) + w_{0j} \tag{3.1.2}$$

where $y_j(x)$ is the jth output of the RBF-MLP, $w_{kj}$ is the connection weight from the kth hidden
unit to the jth output unit, $w_{0j}$ is the threshold or network bias term, and $\mu_k$ is the
prototype or centre of the kth hidden unit.
The RBF $\phi(x)$ is typically selected as the Gaussian function:

$$\phi_k(x) = \exp\!\left(-\frac{\lVert x_i - \mu_k \rVert^2}{2\sigma_k^2}\right) \tag{3.1.3}$$

for k = 1, 2, ..., K, where $\sigma_k$ represents the width of the neuron, x is the N-dimensional
input vector and $\mu_k$ is the vector determining the centre of the radial basis function $\phi_k$.
The weight vector between the input layer and the kth hidden layer neuron can be interpreted as the
centre $\mu_k$. Therefore, for an input pattern x, the error of the network can be defined as in
equation (3.5).
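The forward pass of such a network, Gaussian hidden units followed by a linear output layer, can be sketched as below. The function names, the two hidden units and all parameter values are hypothetical and serve only to show how equations (3.1.2) and (3.1.3) fit together.

```python
import math


def gaussian_rbf(x, centre, width):
    """phi_k(x) = exp(-||x - mu_k||^2 / (2 * sigma_k^2))."""
    squared_distance = sum((xi - ci) ** 2 for xi, ci in zip(x, centre))
    return math.exp(-squared_distance / (2.0 * width ** 2))


def rbf_output(x, centres, widths, weights, biases):
    """y_j(x) = sum_k w_kj * phi_k(x) + w_0j for every output unit j."""
    phi = [gaussian_rbf(x, c, s) for c, s in zip(centres, widths)]
    return [sum(weights[k][j] * phi[k] for k in range(len(phi))) + biases[j]
            for j in range(len(biases))]


centres = [[0.0, 0.0], [1.0, 1.0]]     # mu_k, one centre per hidden unit
widths = [1.0, 0.5]                    # sigma_k
weights = [[1.0, -1.0], [0.5, 2.0]]    # weights[k][j]: hidden unit k to output j
biases = [0.1, 0.0]                    # w_0j
y = rbf_output([0.0, 0.0], centres, widths, weights, biases)
print(y)
```

An input that coincides with a centre activates that hidden unit fully (value 1), while units whose centres are far away contribute almost nothing, which is the localized behaviour the text refers to.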
The error function considered in equation (3.5) is the least mean square (LMS) error. This
error is minimized along the descent gradient of the error surface in the weight space between the
hidden layer and the output layer. The same error is also minimized with respect to the parameters
of the Gaussian radial basis function defined in equation (3.1.3). We now obtain the expressions for
the derivatives of the error function with respect to the weights and the radial basis function
parameters for the set of P pattern pairs $(x^p, y^p)$, where p = 1 to P:

$$\Delta w_{kj} = -\eta_1 \frac{\partial E^p}{\partial w_{kj}} \tag{3.1.4}$$

$$\Delta \mu_k = -\eta_2 \frac{\partial E^p}{\partial \mu_k} \tag{3.1.5}$$

and

$$\Delta \sigma_k = -\eta_3 \frac{\partial E^p}{\partial \sigma_k} \tag{3.1.6}$$

In the update equation (3.1.7), $W_{kj}(t)$ is the state of the weight matrix at iteration t,
$W_{kj}(t+1)$ is the state of the weight matrix at the next iteration, $W_{kj}(t-1)$ is the state of the
weight matrix at the previous iteration, $\Delta W_{kj}(t)$ is the current change in the weight matrix,
$\alpha$ is the standard momentum variable used to accelerate the learning process and $\eta$ is the
learning rate of the network.
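Since the network output is the outcome of the radial basis functions used, the gradient for the
network is given by partial differentiation of the error with respect to the different parameters.
Hence, from equation (3.5), with the network output $y_j(x)$ of (3.1.2) as the actual output $S_j$, we
have

$$\Delta w_{kj} = \eta_1 \left(T_j - y_j(x)\right) \phi_k(x) \tag{3.1.8}$$

$$\Delta \mu_k = \eta_2\, \phi_k(x)\, \frac{x_i - \mu_k}{\sigma_k^2} \sum_{j=1}^{J} \left(T_j - y_j(x)\right) w_{kj} \tag{3.1.9}$$

$$\Delta \sigma_k = \eta_3\, \phi_k(x)\, \frac{\lVert x_i - \mu_k \rVert^2}{\sigma_k^3} \sum_{j=1}^{J} \left(T_j - y_j(x)\right) w_{kj} \tag{3.1.20}$$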
Equations (3.1.8), (3.1.9) and (3.1.20) give the expressions for the change in the weight vector and
the radial basis function parameters needed to accomplish learning in a supervised way. Setting the
radial basis function parameters by supervised learning is a nonlinear optimization problem, which
is typically computationally intensive and may be prone to finding local minima of the error
function. However, for reasonably well localized RBFs, an input generates a significant activation
only in a small region, so the chance of getting stuck at a local minimum is small. Hence, the
training of the network for L pattern pairs $(x^l, y^l)$ is accomplished iteratively by modifying the
weight vector.
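One supervised update of the weights, centres and widths can be sketched for a single-output network. This is a simplified illustration, assuming a Gaussian hidden layer, the squared error of equation (3.5), plain gradient descent without momentum, and hypothetical names and values. The update terms are the negative partial derivatives of the error scaled by a learning rate.

```python
import math


def phi(x, mu, sigma):
    """Gaussian basis function exp(-||x - mu||^2 / (2 * sigma^2))."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, mu)) / (2.0 * sigma ** 2))


def predict(x, params):
    mus, sigmas, w, w0 = params
    return sum(wk * phi(x, m, s) for wk, m, s in zip(w, mus, sigmas)) + w0


def error(x, t, params):
    return 0.5 * (t - predict(x, params)) ** 2


def gradient_step(x, t, params, eta_w=0.1, eta_mu=0.1, eta_sigma=0.1):
    """One update of weights, centres and widths for a single-output RBF network."""
    mus, sigmas, w, w0 = params
    e = t - predict(x, params)
    new_mus, new_sigmas, new_w = [], [], []
    for wk, mu, sigma in zip(w, mus, sigmas):
        p = phi(x, mu, sigma)
        dist2 = sum((a - b) ** 2 for a, b in zip(x, mu))
        # Weight: -dE/dw_k = e * phi_k
        new_w.append(wk + eta_w * e * p)
        # Centre: -dE/dmu_k = e * w_k * phi_k * (x - mu_k) / sigma_k^2
        new_mus.append([b + eta_mu * e * wk * p * (a - b) / sigma ** 2
                        for a, b in zip(x, mu)])
        # Width: -dE/dsigma_k = e * w_k * phi_k * ||x - mu_k||^2 / sigma_k^3
        new_sigmas.append(sigma + eta_sigma * e * wk * p * dist2 / sigma ** 3)
    return new_mus, new_sigmas, new_w, w0 + eta_w * e


mus = [[0.0, 0.0], [1.0, 1.0]]
sigmas = [1.0, 1.0]
w = [0.2, -0.3]
w0 = 0.0
params = (mus, sigmas, w, w0)
x, t = [0.5, 0.2], 1.0
before = error(x, t, params)
updated = gradient_step(x, t, params)
after = error(x, t, updated)
print(before, after)
```

Repeating this step over all pattern pairs gives the iterative training described above. In practice the three learning rates would be tuned separately.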
4. EXPERIMENT AND SIMULATION DESIGN
In this paper we have implemented two feature extraction methods on six different artificial
neural network models in Matlab, namely the feed-forward network (newff), fitting network (newfit),
generalized regression network (newgrnn), pattern recognition network (newpr), radial basis network
(newrb) and exact radial basis network (newrbe), trained with Levenberg-Marquardt Backpropagation
and Radial Basis functions. In this simulation design we have created two networks for each neural
network model, one for lower case and one for upper case characters, which take their input from
the first feature extraction method. Similarly, another two networks are created for the same
models using the data generated by the second feature extraction method. Thus, four neural networks
are created for each neural network model. The architectural details of each model are presented in
tables 1, 2, 3, 4, 5 and 6, respectively.
(1) Network 2: 2 | 21-11 | 5 | 30 | tansig-tansig | trainlm | 1.0000e-003 | 1000 | 0 | 10 | 5
(2) Network 4: 2 | 21-11 | 5 | 30 | tansig-tansig | trainlm | 1.0000e-003 | 1000 | 0 | 10 | 5
Therefore, six neural network models are used with eight neural network architectures. Two
different supervised learning methods are used, i.e. Levenberg-Marquardt learning and Radial
Basis function approximation. The simulation results are obtained from all these networks for both
feature extraction methods.
5. RESULT AND DISCUSSION
The simulated results are obtained for both methods of feature extraction with all six neural
network models, using Levenberg-Marquardt Backpropagation learning and Radial Basis
approximation. The training set consists of handwritten English capital and small alphabets.
The performance of each neural network model in training and testing is presented with the
regression value and regression line for the simulated output values of the model. The performance
of all six neural network models in training and testing is presented in tables 7, 8, 9, 10, 11 and
12 and figures 5, 6, 7, 8, 9, 10, 11 and 12.
Table 7: Simulated Results for Newff model with Levenberg-Marquardt learning rule
Description | Network 1 using Feature Extraction method 1 | Network 1 using Feature Extraction method 2 | Network 2 using Feature Extraction method 1 | Network 2 using Feature Extraction method 2
0.33743, 0.211826, 0.562268, 0.201676, 0.50037, 0.20738, 0.24005, 0.000335
0.44335, 0.215392, 0.20738, 0.198132, 0.48689, 0.211249, 0.07361, 0.00471
0.556283, 0.408253, 0.72463, 0.485805, 0.343696, 0.846857, 0.396131
Figure 10: Performance of Network6 for both the feature extraction methods
Table 11: Simulated Results for Newrbe model with Radial Basis Function Approximation
Network 7 using Feature Extraction method 1: 0.403733
Network 7 using Feature Extraction method 2: 0.112004
Figure 11: Performance of Network7 for both the feature extraction methods
Table 12: Simulated Results for Newrb model with Radial Basis Function Approximation
Average of regression value test data samples: 0.303037, 0.112487
Figure 12: Performance of Network 8 for both the feature extraction methods
The simulation results of training indicate that the network models with Radial Basis function
approximation perform better than the network models with the Levenberg-Marquardt
Backpropagation learning technique for the second feature extraction method, i.e. each pixel value
of the resized and processed image. We now evaluate the performance of these trained neural network
models for recognition of handwritten English capital and small alphabets that were not presented
during training. The performance of these networks is presented in table 13 and table 14. Table 13
presents the performance of all six neural network models for the prototype input patterns
processed with the first method of feature extraction, whereas table 14 presents the performance of
all six neural network models for the same input patterns processed with the second method of
feature extraction. The first row of both tables gives the rate of correct recognition for the
presented input patterns. The second row of both tables gives the number of correctly recognized
patterns among the presented arbitrary patterns.
Table 13: Performance of all the six models for pattern recognition of presented prototype input
patterns using first method of feature extraction
Models: newff, newfit, newgrnn, newpr, newrbe, newrb
Rows: % of characters recognized; Total no. of characters recognized
Presented prototype patterns: e, j, k, m, n, p, q, t, u, v, B, E, H, J, K, L, R, X, Y, Z
Values: 10, 10, 20, 25, 30
From table 13 it can be observed that the Radial Basis function neural network performs better
than the other neural network models. Its performance is even better than that of the exact
radial basis function network. It correctly recognized 6 out of 20 arbitrary prototype input
patterns of handwritten English alphabets. These patterns were not used in the training set and were
selected as samples of test patterns.
Table 14: Performance of all the six models for pattern recognition of presented prototype input
patterns using second method of feature extraction
Models: newff, newfit, newgrnn, newpr, newrbe, newrb
Rows: % of characters recognized; Total no. of characters recognized
Presented prototype patterns: e, j, k, m, n, p, q, t, u, v, B, E, H, J, K, L, R, X, Y, Z
Values: 85, 15, 15, 17
From table 14 it can be observed that the generalized regression neural network model trained
with Radial Basis function approximation performs better than the other neural network models. Its
performance is even better than that of the exact radial basis function network and the radial
basis network. It correctly recognized 17 out of 20 arbitrary prototype input patterns of
handwritten English alphabets. It is quite noticeable that the second method of feature extraction,
i.e. each pixel value of the resized image, gives better performance only for the generalized
regression neural network with radial basis function approximation, whereas the other neural
network models perform better with the first method of feature extraction, i.e. the mean value of
the pixels of the processed image.
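The recognition rates quoted for the two best models follow directly from the counts of correctly recognized patterns. A minimal check, with a hypothetical helper name:

```python
def recognition_rate(correct, presented):
    """Percentage of presented patterns that were recognized correctly."""
    return 100.0 * correct / presented


print(recognition_rate(6, 20))   # 30.0, radial basis network with the first method
print(recognition_rate(17, 20))  # 85.0, generalized regression network with the second method
```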
6. CONCLUSION
This paper presented a performance evaluation of six different feed-forward neural network
models trained with the Levenberg-Marquardt Backpropagation learning technique and Radial Basis
function approximation for handwritten curve script of capital and small English alphabets. Two
feature extraction methods are used. In the first method the row-wise mean of the processed image
of each alphabet is considered, and in the second method each pixel value of the resized and
processed image is considered. The simulated results indicate that the generalized regression
neural network trained with radial basis function approximation for the second method of feature
extraction yields the highest rate of recognition, i.e. 85% for 10 lower case and 10 upper case
randomly chosen characters. The remaining neural network models show poor performance irrespective
of the feature extraction method. The following observations are drawn from the simulation of the
performance evaluation:
1. The first method of feature extraction uses 30 features for each character, whereas the second
method uses 100 features for each character. Thus, it seems that the more features there are, the
higher the accuracy, as far as the generalized regression neural network model is concerned.
2. In the training process the regression value for the radial basis network is found to be
perfect, but during validation on the test patterns the performance degrades rapidly. Thus the
network is well tuned to the training set but is not able to generalize. It is a good approximator
but a bad generalizer.
3. The second method of feature extraction provides more feature values in the pattern
information than the first method. Therefore, the performance of each neural network model is found
better for the second feature extraction method.
7. REFERENCES