
WRITER IDENTIFICATION OF HANDWRITTEN ORIYA SCRIPT AND

HANDWRITTEN CHARACTER RECOGNITION








Barid Baran Nayak
111ei0250
B.Tech, 2011-15













ELECTRONICS AND INSTRUMENTATION
NATIONAL INSTITUTE OF TECHNOLOGY
ROURKELA-769008, ODISHA, INDIA
Abstract
In handwritten writer identification and character recognition we perform an
image-based analysis: a scanned digital image containing handwritten script is
taken as input, and the system translates it into a machine-editable, readable
digital text format. The Oriya language presents great challenges due to the
large number of letters in its alphabet, the sophisticated ways in which they
combine, and the fact that many letters are roundish and similar in appearance.
In this project an attempt is made to recognize Oriya characters using
Histogram of Gradient features of the character image. The features so
obtained are passed through the HMM code, which gives out the
identification result.

Keywords: character recognition, writer identification, histogram of
gradients, Hidden Markov Model (HMM)








OUTLINE:-
1. Abstract
2. Objective
3. Introduction
4. Proposed approach
5. Pre-processing
Otsu Binarization
Line segmentation
Word segmentation
Zone segmentation of words
Character segmentation
6. Feature extraction
Local Gradient Histogram Feature (HOG)
7. Identification
Hidden Markov Model (HMM)
8. Results and outputs
9. Discussion
10. Conclusion
11. Bibliography
OBJECTIVE:-

1. Identification of the writer by scanning handwritten Oriya documents.
Compare the results of writer identification with zone
segmentation and without zone segmentation.
2. Recognition of each character written in the document.
Identify which Oriya character is written and then convert it to the
corresponding English letter based on a dictionary.

INTRODUCTION:-
Oriya, officially spelled Odia, is an Indian language belonging to
the Indo-Aryan branch of the Indo-European language family. It is the
predominant language of the Indian states of Odisha, where native
speakers comprise 80% of the population, and it is spoken in parts of West
Bengal, Jharkhand, Chhattisgarh and Andhra Pradesh. Oriya is one of the
many official languages in India; it is the official language of Odisha and
the second official language of Jharkhand.
Since it is an old language, there are various old documents whose writers
are unknown. This project deals with that problem: its main aim is to
identify the writer. The other part of the project is to identify each
character written.
Due to the presence of complex features such as headline, vowels,
modifiers, etc., character segmentation in Oriya script is not easy. Also, the
position of vowels and compound characters make the segmentation task
of words into characters very complex. To take care of this problem we
tried a novel method considering a zone wise break up of words and next
HMM based recognition. In particular, the word image is segmented into 3
zones, upper, middle and lower, respectively. The components in middle
zone are modelled using HMM. By this zone segmentation approach we
reduce the number of distinct component classes compared to total
number of classes in Oriya character set. Once the middle zone portion is
recognized, HMM based forced alignment is applied in this zone to mark
the boundaries of individual components. The segmentation paths are
extended later to other zones. Next, the residue components, if any, in
upper and lower zones in their respective boundary are combined to
achieve the final word level recognition.
Earlier, a template-based approach was followed for recognition. In this
approach an unknown pattern was superimposed on an ideal template, and the
degree of correlation between the two was used for classification. This
approach became ineffective because of noise and variation in handwriting,
so nowadays a feature-based approach is used.











PROPOSED APPROACH:-



Pre-Processing
(Otsu binarization, line words &
character segmentation)
Feature Extraction
(Histogram Of gradient{HOG})
Recognition
(Hidden Markov Model{HMM})
Pre-Processing :
(Binarization/Thresholding)
Binarization is a process in which grayscale or colour (e.g. RGB, BMP)
images are converted into a binary image.
Let's consider a grayscale image. A grayscale image consists of pixels
each of which has an 8-bit depth, i.e. 256 possible intensity levels.
The thresholding method of binarization is basically to determine a
threshold. Once this threshold is obtained, we divide the image
into two classes: intensities above the threshold fall in the white class,
and those below it in the black class.

For binarization a single threshold is selected. There are various methods;
two of them are mostly preferred:
Automatic thresholding
Otsu binarization
Otsu binarization is the optimal thresholding technique. It is mostly
preferred because it finds the threshold based on the inter-class variance.
The problem with automatic thresholding is that whenever the valley
between the two classes is small, the threshold obtained is erroneous.
Hence Otsu binarization is used.
Otsu's Thresholding Method
Based on a very simple idea: find the threshold that minimizes the
weighted within-class variance.
This turns out to be the same as maximizing the between-class variance.
It operates directly on the gray-level histogram (e.g. 256 numbers, P(i)), so
it is fast (once the histogram is computed).
Method
In Otsu's method we exhaustively search for the threshold t that minimizes
the intra-class variance (the variance within the classes), defined as a
weighted sum of variances of the two classes:

sigma_w^2(t) = q1(t) sigma1^2(t) + q2(t) sigma2^2(t)

The weights q1(t) and q2(t) are the probabilities of the two classes separated
by the threshold t, and sigma1^2(t), sigma2^2(t) are the variances of these classes.
Otsu shows that minimizing the intra-class variance is the same as
maximizing the inter-class variance:

sigma_b^2(t) = sigma^2 - sigma_w^2(t) = q1(t) q2(t) [mu1(t) - mu2(t)]^2

which is expressed in terms of the class probabilities q_i(t) and class means mu_i(t).
The class probability q1(t) is computed from the histogram as:

q1(t) = sum over i = 1..t of P(i)

while the class mean mu1(t) is:

mu1(t) = [sum over i = 1..t of i P(i)] / q1(t)

where i is the value at the centre of the ith histogram bin. Similarly, you
can compute q2(t) and mu2(t) on the right-hand side of the histogram for bins greater than t.
The class probabilities and class means can be computed iteratively. This
idea yields an effective algorithm.
Algorithm
1. Compute the histogram and probabilities P(i) of each intensity level
2. Set up initial q1(0) and mu1(0)
3. Step through all possible thresholds t = 1, ..., maximum intensity
1. Update q1(t) and mu1(t)
2. Compute sigma_b^2(t)
4. The desired threshold corresponds to the maximum of sigma_b^2(t)
5. You can compute two maxima (and two corresponding thresholds):
the greater maximum and the greater-or-equal maximum of sigma_b^2(t)
6. Desired threshold = average of the two corresponding thresholds
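The exhaustive search above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code; the helper name otsu_threshold and the tiny sample image are chosen for demonstration.

```python
import numpy as np

def otsu_threshold(image):
    """Return the Otsu threshold for an 8-bit grayscale image.

    Searches all thresholds t and keeps the one maximizing the
    between-class variance q1*q2*(mu1 - mu2)^2, which is equivalent
    to minimizing the within-class variance.
    """
    hist = np.bincount(image.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()                      # P(i): probability of level i

    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        q1 = p[:t].sum()                       # class probability, left of t
        q2 = 1.0 - q1                          # class probability, right of t
        if q1 == 0 or q2 == 0:
            continue
        mu1 = (np.arange(t) * p[:t]).sum() / q1        # class mean, left
        mu2 = (np.arange(t, 256) * p[t:]).sum() / q2   # class mean, right
        var_between = q1 * q2 * (mu1 - mu2) ** 2       # between-class variance
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Binarize a toy image: pixels >= threshold become white (1), the rest black (0)
img = np.array([[10, 12, 200], [11, 210, 205], [9, 198, 202]], dtype=np.uint8)
t = otsu_threshold(img)
binary = (img >= t).astype(np.uint8)
```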

Line Segmentation:
For line segmentation, we divide the text into vertical stripes and
determine horizontal histogram projections of these stripes. The
relationship of the peak/valley points of the histograms is used to segment
text lines. Based on vertical projection profiles and structural features of
Oriya characters, lines are segmented into words.

The global horizontal projection method computes the sum of all black
pixels on every row and constructs the corresponding histogram. Based on
the peak/valley points of the histogram, individual lines are generally
segmented. Although this global horizontal projection method
is applicable for line segmentation of printed documents, it cannot be used
in unconstrained handwritten documents because the characters of two
consecutive text-lines may touch or overlap. For example, see the 4th and
5th text lines of the document shown in figure below. Here,



these two lines are mostly overlapping. To take care of unconstrained
handwritten documents, we use here a piece-wise projection method, as
follows. At first, we divide the text into vertical stripes of width W
(here we assume that a document page is in portrait mode). The width of the
last stripe may differ from W: if the text width is Z and the number of
stripes is N, the width of the last stripe is [Z - W(N - 1)].
Computation of W is discussed later.

Next, we compute piece-wise separating lines (PSLs) from each of these
stripes. We compute the row-wise sum of all black pixels of a stripe. A row
where this sum is zero is a PSL. We may get a few consecutive rows where the
sum of all black pixels is zero; then the first row of such consecutive rows
is the PSL. The PSLs of different stripes of a text are shown in figure 2a
by horizontal lines.

All these PSLs may not be useful for line segmentation, so we choose some
potential PSLs as follows. We compute the normal distances between two
consecutive PSLs in a stripe, so if there are n PSLs in a stripe we get
n - 1 distances. This is done for all stripes, and we compute the statistical
mode (MPSL) of such distances. If the distance between any two consecutive
PSLs of a stripe is less than MPSL, we remove the upper PSL of these two.
The PSLs obtained after this removal are the potential PSLs; those obtained
from the PSLs of figure 2a are shown in figure 2b. We note the left and
right co-ordinates of each potential PSL for future use. By properly joining
these potential PSLs, we get individual text lines.

It may be noted that sometimes, because of overlapping or touching of one
component of the upper line with a component of the lower line, we may not
get PSLs in some regions. Also, because of some modified characters of
Oriya (e.g. ikar, chandrabindu) we find some extra PSLs in a stripe. We take
care of both cases during PSL joining, as explained next. Joining of PSLs is
done in two steps. In the first step, we join PSLs from right to left; in
the second step, we first check whether line-wise PSL joining is complete or
not. If for a line it is not complete, joining from left to right is done to
obtain complete segmentation. We say PSL joining of a line is complete if
the length of the joined PSLs equals the column width of the document image.
This two-step approach gives good results even if two consecutive text lines
are overlapping or connected.

To join a PSL of the ith stripe, say Ki, to a PSL of the (i - 1)th stripe,
we check whether any PSL whose normal distance from Ki is less than MPSL
exists in the (i - 1)th stripe. If it exists, we join the left co-ordinate
of Ki with the right co-ordinate of that PSL in the (i - 1)th stripe. If it
does not exist, we extend Ki horizontally in the left direction until it
reaches the left boundary of the (i - 1)th stripe or intersects a black
pixel of a component in the (i - 1)th stripe. If the extended part
intersects the black pixel of a component of the (i - 1)th stripe, we decide
the belongingness of the component to the upper line or the lower line.
Based on this belongingness, we extend the line in such a way that the
component falls in its actual line.

Belongingness of a component is decided as follows. We compute the distances
from the intersecting point to the topmost and bottommost points of the
component. Let d1 be the top distance and d2 the bottom distance. If
d1 < d2 and d1 < (MPSL/2), the component belongs to the lower line. If
d2 <= d1 and d2 < (MPSL/2), the component belongs to the upper line. If
d1 > (MPSL/2) and d2 > (MPSL/2), we assume the component touches another
component of the lower line. If the component belongs to the upper line
(lower line), the line is extended following the contour of the lower part
(upper part) of the component so that the component is included in the upper
line (lower line). The line extension is done until it reaches the left
boundary of the (i - 1)th stripe. If the component is touching, we detect
possible touching points based on the structural shape of the touching
component; from experiment, we notice that in most touching cases there
exist junction/crossing shapes or some obstacle points in the middle.
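The stripe-wise PSL computation described above can be sketched as follows. This is a simplified illustration under the stated assumptions (equal-width stripes except the last; black pixels coded as 1); the function name is hypothetical and the potential-PSL filtering and joining steps are omitted.

```python
import numpy as np

def piecewise_separating_lines(binary, n_stripes=4):
    """Find piece-wise separating lines (PSLs) per vertical stripe.

    binary: 2-D array with 1 = black (ink) pixel.  For each stripe we
    take the row-wise sum of black pixels; the first row of every run
    of all-white rows is recorded as a PSL, as described in the text.
    """
    h, w = binary.shape
    stripe_w = w // n_stripes
    psls = []                                   # list of PSL rows per stripe
    for s in range(n_stripes):
        x0 = s * stripe_w
        x1 = w if s == n_stripes - 1 else x0 + stripe_w  # last stripe may be wider
        rowsum = binary[:, x0:x1].sum(axis=1)
        stripe_psls = []
        prev_blank = False
        for r in range(h):
            blank = rowsum[r] == 0
            if blank and not prev_blank:        # first row of a blank run
                stripe_psls.append(r)
            prev_blank = blank
        psls.append(stripe_psls)
    return psls

# Toy page: two "text lines" of ink at rows 1-2 and 5-6
page = np.zeros((8, 8), dtype=int)
page[1:3, :] = 1
page[5:7, :] = 1
psls = piecewise_separating_lines(page, n_stripes=2)
```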

Word segmentation:
For word segmentation from a line, we compute the vertical histogram of the
line. In general, the distance between two consecutive words of a line is
greater than the distance between two consecutive characters in a word.
Taking the vertical histogram of the line and using this distance
criterion, we segment words from lines. For example, see figure 3a.
A very simple algorithm can be followed. Vertical smoothing can be done
similarly to the horizontal smoothing explained in zone segmentation. A
clear valley of the histogram is then obtained between the words, and we
divide the words at these valley positions.
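The valley-based split can be sketched with a vertical projection profile. This is an illustrative sketch, not the project's code; the gap threshold that separates inter-word valleys from inter-character gaps is an assumed parameter.

```python
import numpy as np

def segment_words(line_img, gap_thresh=3):
    """Segment a binary text-line image into words via vertical projection.

    A run of all-white columns at least gap_thresh wide is treated as an
    inter-word valley (inter-character gaps are assumed to be narrower).
    Returns a list of (start, end) column spans, end exclusive.
    """
    proj = line_img.sum(axis=0)                # vertical histogram
    words, start, gap = [], None, 0
    for c, v in enumerate(proj):
        if v > 0:                              # column contains ink
            if start is None:
                start = c                      # word begins
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= gap_thresh:              # valley wide enough: word ends
                words.append((start, c - gap + 1))
                start, gap = None, 0
    if start is not None:                      # word running to the edge
        words.append((start, len(proj)))
    return words

# Toy line: two words of 2 columns each, separated by a 3-column valley
line = np.array([[1, 1, 0, 0, 0, 1, 1]])
spans = segment_words(line, gap_thresh=3)
```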




Zone Segmentation:
A word in Oriya can be divided into three zones: the upper zone, middle
zone and lower zone. The segmentation of a word into the corresponding
three regions is shown in figure 4.
Modifiers like the ekar are in the upper zone.
The vowels and consonants are in the middle zone.
Lastly, the ukar, rukar, etc. lie in the lower zone.



Figure 4: (a) Original Word. (b) Zone segmented word (upper,mid,lower).


Why do zone segmentation?
The modifiers are mostly written in the upper and lower regions. While
writing a modifier the writer in most cases makes touching characters and
irregular shapes. It is therefore found that writer identification and
character recognition with zone segmentation give better results than
without zone segmentation.

Algorithm
1. A window (w = length of a character) is traversed in the row direction.
2. For each window a smoothing of the character is done.
If the distance between two black pixels is less than the
threshold, all the horizontal pixels between them are made
black.
Near the start and end of the image, smoothing is not
obtained.
3. A horizontal histogram is plotted for each window.
4. Zone segmentation is obtained based on the valleys and mountains of
the histogram.
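The smoothing in step 2 is a horizontal run-length fill; a minimal sketch is below. The function name and the threshold value are illustrative assumptions, not the project's actual implementation.

```python
import numpy as np

def horizontal_smoothing(binary, thresh=4):
    """Horizontal run-length smoothing of a binary image (1 = black).

    In each row, if the number of white pixels between two consecutive
    black pixels is less than thresh, that run is filled with black,
    as described in step 2 of the zone-segmentation algorithm.
    """
    out = binary.copy()
    for row in out:                            # rows are views: edits stick
        black_cols = np.flatnonzero(row)       # columns containing ink
        for a, b in zip(black_cols[:-1], black_cols[1:]):
            if b - a - 1 < thresh:             # white gap shorter than threshold
                row[a:b + 1] = 1               # fill the run with black
    return out

# Toy row: the 2-pixel gap is filled, the 5-pixel gap is kept
img = np.array([[1, 0, 0, 1, 0, 0, 0, 0, 0, 1]])
smoothed = horizontal_smoothing(img, thresh=3)
```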




Character segmentation:
Writer identification is done word-wise while character recognition is done
character by character, so character segmentation is required.
The middle zone is taken, vertical smoothing is done, and then the
character segmentation is performed based on the valleys and mountains of
the histogram.

Various other methods are used for character segmentation, such as the
water reservoir method, which is very effective for Hindi and Bengali but
not effective for Oriya text.






Figure 5: character segmentation from words.

Feature extraction (HOG):
Feature Extraction: Local gradient histogram (LGH) [19] has been used
for feature extraction in our approach. Here, a sliding window traverses the
image from left to right in order to produce a sequence of overlapping sub-
images. Each window is sub-divided into 4 x 4 (4 rows and 4 columns)
regular cells, and from all pixels in each cell a histogram of gradient
orientations is calculated.

The gradient vector is divided into an L-bin histogram. Each bin covers
a particular octant of the angular space. Here we consider 8 bins
(360 degrees quantized in steps of 45 degrees) of angular information. The
histogram is formed by adding up m(x, y) to the bin indicated by the
quantized direction at (x, y). The concatenation of the 16 histograms of
8 bins provides a 128-dimensional feature vector for each sliding window
position.


Figure 6: image gradient pointing towards the highest rate of increase in intensity.

Two steps for finding the discrete gradient of a digital image:
Find the differences in the two directions:

g_x(x, y) = f(x + 1, y) - f(x - 1, y)
g_y(x, y) = f(x, y + 1) - f(x, y - 1)

Find the magnitude and direction of the gradient vector:

m(x, y) = sqrt(g_x^2 + g_y^2)
theta(x, y) = arctan(g_y / g_x)







Identification (Hidden Markov Model):
Hidden Markov Model: The feature vector sequence is processed using
left-to-right continuous density HMMs [11]. One of the important features
of HMMs is the capability to model sequential dependencies. An HMM can
be defined by initial state probabilities pi_i, a state transition matrix
A = [a_ij], i, j = 1, 2, ..., N, where a_ij denotes the transition
probability from state i to state j, and output probabilities b_j(x)
modelled with a continuous output probability density function, where x
represents a k-dimensional feature vector. A separate Gaussian mixture
model (GMM) is defined for each state of the model. Formally, the output
probability density of state j is defined as

b_j(x) = sum over k = 1..M_j of c_jk N(x; mu_jk, Sigma_jk)

where M_j is the number of Gaussians assigned to state j, N(x; mu_jk, Sigma_jk)
denotes a Gaussian with mean mu_jk and covariance matrix Sigma_jk, and c_jk
is the weight coefficient of Gaussian component k of state j. For a
model lambda, if O is an observation sequence O = (x_1, ..., x_T) which is
assumed to have been generated by a state sequence Q = (Q_1, Q_2, ..., Q_T) of
length T, we calculate the observation probability or likelihood as
follows:

P(O, Q | lambda) = pi_{Q_1} b_{Q_1}(x_1) product over t = 2..T of
a_{Q_{t-1} Q_t} b_{Q_t}(x_t)

where pi_{Q_1} is the initial probability of state Q_1.



In the training phase, the transcriptions of the middle zone of the word
images together with the feature vector sequences are used in order to train
the character models. The recognition is performed using the Viterbi
algorithm. For the HMM implementation, we used the HTK toolkit.
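The Viterbi recursion used for recognition can be sketched in log space as below. This is a generic illustration of the algorithm, not HTK's implementation; the per-frame log-probabilities log b_j(x_t) are assumed to be precomputed (e.g. from the GMMs above).

```python
import numpy as np

def viterbi_log_likelihood(log_pi, log_A, log_b):
    """Log-likelihood of the best state path through a discrete-time HMM.

    log_pi: (N,)   initial state log-probabilities
    log_A:  (N, N) state-transition log-probabilities a_ij
    log_b:  (T, N) per-frame output log-probabilities b_j(x_t)
    Returns max over state sequences Q of log P(O, Q | model).
    """
    T, N = log_b.shape
    delta = log_pi + log_b[0]                  # best score ending in each state
    for t in range(1, T):
        # delta[i] + log_A[i, j], maximized over predecessor state i
        delta = np.max(delta[:, None] + log_A, axis=0) + log_b[t]
    return delta.max()

# Tiny 2-state example with random per-frame output probabilities
rng = np.random.default_rng(0)
log_pi = np.log([0.6, 0.4])
log_A = np.log([[0.7, 0.3], [0.4, 0.6]])
log_b = np.log(rng.uniform(0.1, 1.0, size=(3, 2)))
best = viterbi_log_likelihood(log_pi, log_A, log_b)
```

Brute-force enumeration over all state sequences gives the same maximum, which is a convenient correctness check for small models.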




Results and output:


Writer identification without zone segmentation
Sample results:

#!MLF!#
"?/w1wd7.rec"
0 500000 w1 -14658.187500
.
"?/w1wd8.rec"
0 600000 w1 -14241.120117
.
"?/w1wd9.rec"
0 1100000 w2 -29309.158203
.
"?/w2wd7.rec"
0 800000 w2 -12532.899414
.
"?/w2wd8.rec"
0 900000 w2 -18292.097656
.
"?/w2wd9.rec"
0 500000 w1 -16671.017578
.
"?/w3wd7.rec"
0 800000 w3 -15888.551758
.
"?/w3wd8.rec"
0 1500000 w3 -29638.150391
.
"?/w3wd9.rec"
0 1100000 w3 -22744.312500
.
"?/w4wd7.rec"
0 600000 w4 -20115.078125
.
"?/w4wd8.rec"
0 600000 w1 -16832.888672
.
"?/w4wd9.rec"
0 700000 w3 -17952.589844
.
"?/w5wd7.rec"
0 1100000 w3 -21285.880859
.
"?/w5wd8.rec"
0 1100000 w4 -23600.558594
.
"?/w5wd9.rec"
0 500000 w2 -16885.208984
.



Writer identification with zone segmentation
Sample results:

#!MLF!#
"?/w2wd10.rec"
0 800000 w2 5093.161621
.
"?/w2wd11.rec"
0 600000 w2 3443.149170
.
"?/w2wd12.rec"
0 700000 w2 4561.872559
.
"?/w3wd10.rec"
0 1000000 w3 6964.748535
.
"?/w3wd11.rec"
0 700000 w3 3812.657227
.
"?/w3wd12.rec"
0 700000 w3 4499.316406
.
"?/w4wd10.rec"
0 400000 w4 2198.743408
.
"?/w4wd12.rec"
0 600000 w4 3866.389648
.
"?/w5wd10.rec"
0 700000 w5 4680.817383
.
"?/w5wd11.rec"
0 500000 w4 3532.066162
.
"?/w5wd12.rec"
0 900000 w5 5889.884766
.






Discussion:
From these results we can clearly see that writer identification performs
better on zone-segmented images than on images without zone
segmentation.

Undergoing work:
The writer identification part is complete but the recognition of oriya text
is under process. The recognition of Oriya text document in continuing.
The completion of this will require more 3 months of continuous work.
Conclusion:
Writer identification was successfully carried out and
significant results were obtained. A scheme for segmentation of
unconstrained Oriya handwritten text into lines, words and characters is
proposed in this paper. Here, at first, the text image is segmented into
lines, and then lines are segmented into individual words. Next, for
character segmentation from words, initially, isolated and connected
(touching) characters in a word are detected. Using structural, topological
and water reservoir concept-based features, touching characters of the word
are then segmented into isolated characters. To the best of our knowledge,
this is the first work of its kind on Oriya text. The proposed water reservoir-
based approach can also be used for other Indian scripts where touching
patterns show similar behavior.








Bibliography:
[1] U. Pal, B. B. Chaudhuri, "OCR in Bangla: an Indo-Bangladeshi Language", Proceedings of the 12th IAPR
International Conference on Pattern Recognition B:ComputerVision & Image Processing, 1994.
[2] Sukalpa Chanda, Katrin Franke, Umapada Pal and Tetsushi Wakabayashi, "Text Independent Writer
Identification for Bengali Script", Proc. 20th International Conference on Pattern Recognition, 2010,
pp.2005-2008.
[3] U. Pal, A. Belaid, and C. Choisy, "Touching numeral segmentation using water reservoir concept,"
Pattern Recognition Letters, pp. 261-272, 2003.
[4] J. M. White and G. D. Rohrer, "Image thresholding for optical character recognition and other
applications requiring character image extraction," IBM J. of Res. and Dev., vol. 27, pp. 400-411, 1983.
[5] O. Tuzel, F. Porikli, and P. Meer, "Pedestrian detection via classification on riemannian manifolds, "
IEEE Trans. Pattern Anal. Mach. Intell., vol. 30, no. 10, pp. 1713-1727, 2008.
[6] L. R. Rabiner, "A Tutorial on HMM and Selected Applications in Speech Recognition", IEEE
Proceedings, vol. 77, pp. 257-286, 1989.
[7] M. Chen, A. Kundu and S. N. Srihari, "Variable Duration HMM and Morphological Segmentation for
Handwritten Word Recognition", IEEE Trans. on Image Proc., vol. 4, no. 12, pp. 1675-1688, 1995.
[8] A. Mohan, C. Papageorgiou, and T Poggio, "Example-based object detection in images by components, "
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, pp. 349-361, 2001.
[9] D. G. Lowe, "Distinctive image features from scale-invariant keypoints, " International Journal of
Computer Vision, vol. 60, no. 2, pp. 91-110, 2004.
[10] J. Yen, F. Chang, and S. Chang, "A new criterion for automatic multilevel thresholding," IEEE Trans.
Image Processing, vol. 4, no. 3, pp. 370-378, 1995.
[11] B. B. Chaudhuri, U. Pal and M. Mitra, "Automatic recognition of printed Oriya script", Sadhana, Vol.27,
part 1. pp.23-34, February 2002
[12] U. Pal, N. Sharma, and F. Kimura, "Oriya offline handwritten character recognition", In Proc.
International Conference on Advances in Pattern Recognition, pp. 123-128, 2007.
[13] U. Pal and B. B. Chaudhuri, "Indian Script Character Recognition: A Survey", Pattern Recognition,
Vol.37, pp. 1887-1899, 2004.
