Hand-Written Digit Recognition Using Deep Learning Algorithms

Introduction:
Machine learning is a rapidly growing field. Thanks to rapid advances in image processing and
machine learning algorithms, the extraction of features from images, such as images of people's
faces, has improved greatly. Identifying handwritten digits in images has long been a complex task
because visual pattern recognition is difficult, and image recognition programs are very hard to
write by hand.

Handwritten digit recognition is a computer's ability to recognize handwritten input, such as
characters and digits, from a variety of sources such as images, newspapers, and emails. A lot of
research has already been conducted in this area for applications such as bank check processing,
signature verification, and postal address interpretation from envelopes. Many classification
techniques have been developed for this kind of pattern recognition, such as k-nearest neighbors
(KNN) and support vector machines (SVM), but these classifiers did not achieve the highest accuracy
in practice. To achieve higher accuracy, the machine must learn a deeper representation of the task
assigned to it, and this is where deep learning comes in.

Over the past few years, deep learning has attracted a lot of attention for applications such as
object detection, image processing, and handwritten digit recognition. Tools such as scikit-learn,
scikit-image, Keras, and TensorFlow support building accurate and robust models.

Problem Statement:
To achieve high accuracy in a handwritten digit recognition application, the system must have
deep knowledge of the task. Problems arise when a character or digit is recognized incorrectly. For
example, when a letter is sent to a person named “Maria” and the system reads the name as “Saria”,
the letter may be misdelivered. Accuracy therefore plays a very important role in this
classification task. Deep learning algorithms are a promising way to improve the accuracy of
handwritten digit recognition.

The purpose of the study:


The project aims to implement three models of increasing depth, namely logistic regression, a
shallow neural network, and a deep neural network, and to compare all three to see which is best
suited to handwritten digit recognition and how much accuracy improves with deep learning
algorithms. Although the goal is just to create a model that can recognize digits, it can be
extended to letters and then to a person's handwriting. Through this work, we aim to learn and
practically apply the concepts of machine learning and neural networks. Moreover, digit recognition
is an excellent prototype problem for learning about neural networks, and it provides a good
starting point for developing more advanced techniques such as deep learning.

Literature Review:

Machine Learning Techniques:

There are a variety of machine learning techniques, such as logistic regression, k-nearest
neighbors, the perceptron, support vector machines, k-means, and neural networks. Because this is a
classification problem, we chose a few algorithms to compare. To narrow our focus to a few methods,
we explored the existing literature on handwritten digit recognition.

Deep Learning:
Deep learning is a subdomain of machine learning. One of its central ideas is unsupervised
layer-wise pre-training, which learns a hierarchy of features one level at a time. Each layer is
trained with an unsupervised learning algorithm: it takes the output of the previous level as its
input and learns a representation from which that input can be reconstructed. Each iteration of
unsupervised feature learning thus adds one layer of weights to a deep neural network. Finally, the
layers with learned weights can be stacked to initialize a deep supervised predictor, such as a
neural network classifier, or a deep generative model, such as a Deep Boltzmann Machine.

Neural Networks:
Artificial neural networks are usually called simply neural networks. They consist of simple
processing elements, called neurons, that are connected together to form a network. The neurons
work in parallel, and the function they compute is determined by the network structure, the
strength of the connections, and the processing performed at the computing elements. There is a
wide variety of neural networks, but they are mainly categorized by their learning process as
supervised or unsupervised.

• Supervised learning: the network is trained on existing examples together with their desired
outputs.
• Unsupervised learning: the network organizes the input data in a more useful way without any
external feedback.

A neural network takes a large number of known handwritten digits as training examples and learns
from them. In effect, the network uses these examples to infer rules for recognizing handwritten
digits. Increasing the number of training examples generally improves accuracy.

Architecture of Neural Network:


Neural networks are generally organized in layers. Each layer is made up of a number of
interconnected nodes, and each node contains an activation function that determines its output.

• Input layer: the first layer of the neural network (NN). Patterns are fed into the NN through
this layer.
• Hidden layer: the input layer communicates with one or more hidden layers, depending on the
depth of the NN. Hidden layers have weighted connections, and this is where the actual processing
takes place.
• Output layer: the final layer of the NN. The classified patterns are output through this layer,
and the answer is read from it.
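
The layer structure described above can be illustrated with a minimal forward pass. This is only a sketch under stated assumptions, not the implementation used in this project: the hidden layer size (32), the sigmoid activation, the random weights, and the random input standing in for a flattened 28 x 28 image are all illustrative choices.

```python
import numpy as np


def sigmoid(z):
    """Activation function applied at each hidden node."""
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    """Convert output-layer scores into probabilities (one row per example)."""
    shifted = z - np.max(z, axis=-1, keepdims=True)  # numerical stability
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Input layer -> one hidden layer -> output layer."""
    hidden = sigmoid(x @ w_hidden + b_hidden)  # weighted connections + activation
    return softmax(hidden @ w_out + b_out)     # one probability per class


# Assumed sizes: 784 inputs (28 x 28 pixels), 32 hidden nodes, 10 digit classes.
rng = np.random.default_rng(0)
n_inputs, n_hidden, n_classes = 784, 32, 10
w_hidden = rng.normal(0.0, 0.01, size=(n_inputs, n_hidden))
b_hidden = np.zeros(n_hidden)
w_out = rng.normal(0.0, 0.01, size=(n_hidden, n_classes))
b_out = np.zeros(n_classes)

x = rng.random((1, n_inputs))  # random stand-in for one flattened image
probs = forward(x, w_hidden, b_hidden, w_out, b_out)
predicted_class = int(np.argmax(probs, axis=1)[0])
```

With untrained random weights the prediction is meaningless; the point is only how a pattern flows from the input layer through the hidden layer to the output layer.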

Figure 1: Neural Network Schematic.

Brief overview of previous work:

Many methods have been proposed to date to recognize and predict handwritten digits. Some of the
most interesting are briefly described below. A wide range of research has been performed on the
MNIST database to explore the potential and drawbacks of the best recommended approaches. The best
methodology to date offers a training accuracy of 99.81%, using a convolutional neural network for
feature extraction and an RBF network model for prediction of the handwritten digits [2]. In [3],
an extended study on identifying and predicting handwritten digits from the Concordia University
database used the Mexican hat wavelet transform to preprocess the input data. This input was used
to train a multilayer feed-forward neural network with the backpropagation algorithm, attaining a
training accuracy of 99.17%. Although this is higher than the accuracy obtained for the same
architecture without data preprocessing, the testing accuracy for isolated digits was only 90.20%.
A novel approach based on the Radon transform for handwritten digit recognition is reported in [4].
The Radon transform is applied over a range of theta from -45 to 45 degrees. This transformation
represents an image as a collection of projections in various directions, resulting in a feature
vector, which is then fed as input to the classification phase. The authors of that paper used a
nearest neighbor classifier for digit recognition. An overall accuracy of 96.6% was achieved for
English handwritten digits, whereas 91.2% was obtained for Kannada digits.

Proposed Approach:
To implement handwritten digit recognition (HDR) with deep learning algorithms, softmax
regression is chosen as the base framework. Softmax regression is a generalization of logistic
regression. It is used as the top layer of each classifier, where it converts the classifier's
outputs into class probabilities, with the aim of reducing errors and maximizing accuracy. It can be
used for multi-class classification under the assumption that the classes are mutually exclusive,
whereas logistic regression is used for binary classification tasks.
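
The relationship between the two can be checked numerically. The sketch below uses only the Python standard library and shows that, with two classes and the second score fixed at 0, the softmax probability of the first class equals the logistic sigmoid. The function names and the made-up scores are illustrative assumptions.

```python
import math


def sigmoid(z):
    """Logistic function used by binary logistic regression."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(scores):
    """Turn a list of K real-valued scores into K probabilities that sum to 1."""
    largest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# With K = 2 and the second score fixed at 0, the first softmax
# probability equals the logistic sigmoid of the first score.
z = 2.0
two_class = softmax([z, 0.0])
assert abs(two_class[0] - sigmoid(z)) < 1e-12

# With K = 10 (one score per digit), the outputs still form a distribution,
# and the predicted class is the one with the largest probability.
digit_scores = [0.1 * k for k in range(10)]  # made-up scores for illustration
digit_probs = softmax(digit_scores)
predicted_digit = digit_probs.index(max(digit_probs))
```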

Figure 2: SoftMax Regression


In softmax regression, our target is to do multi-class classification instead of binary
classification, and so K different values can be assigned to the label y. Thus, in our training set
$\{(x^{(1)}, y^{(1)}), \ldots, (x^{(m)}, y^{(m)})\}$, we now have that $y^{(i)} \in \{1, 2, \ldots, K\}$.

Note that our convention will be to index the classes starting from 1, rather than from 0. For example,
in the MNIST digit recognition task, we would have K=10 different classes.
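
For reference, the standard softmax regression model can be written with the notation above. The parameter vectors $\theta_1, \ldots, \theta_K$ (one per class) are assumed symbols that the text does not name, and the cost shown is the usual average cross-entropy, given here as the conventional choice rather than as a formula confirmed by this proposal.

```latex
% Probability that example i belongs to class k
P\left(y^{(i)} = k \mid x^{(i)}; \theta\right)
  = \frac{\exp\left(\theta_k^{\top} x^{(i)}\right)}
         {\sum_{j=1}^{K} \exp\left(\theta_j^{\top} x^{(i)}\right)},
  \qquad k = 1, \ldots, K

% Average cross-entropy cost over the m training examples
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K}
  \mathbf{1}\left\{ y^{(i)} = k \right\}
  \log P\left(y^{(i)} = k \mid x^{(i)}; \theta\right)
```

Here $\mathbf{1}\{\cdot\}$ is the indicator function, so each example contributes only the log-probability of its true class.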

Following the method stated above, implementations of logistic regression, shallow neural networks,
and deep neural networks will be carried out using tools such as TensorFlow and Keras.
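
As an illustration of the simplest of the three models, the sketch below trains softmax regression with plain gradient descent in NumPy. It is not the planned TensorFlow/Keras implementation: the synthetic three-class data, the learning rate, and the number of epochs are assumptions chosen so that the example runs quickly without downloading MNIST.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=1, keepdims=True)
    exps = np.exp(z)
    return exps / exps.sum(axis=1, keepdims=True)


def train_softmax_regression(x, y, n_classes, lr=0.1, epochs=200):
    """Fit weights and biases by gradient descent on the cross-entropy cost."""
    n_samples, n_features = x.shape
    weights = np.zeros((n_features, n_classes))
    bias = np.zeros(n_classes)
    one_hot = np.eye(n_classes)[y]  # one row per example, 1 at the true class
    for _ in range(epochs):
        probs = softmax(x @ weights + bias)
        grad = (probs - one_hot) / n_samples  # gradient w.r.t. the scores
        weights -= lr * (x.T @ grad)
        bias -= lr * grad.sum(axis=0)
    return weights, bias


def predict(x, weights, bias):
    """Pick the class with the highest score for each example."""
    return np.argmax(x @ weights + bias, axis=1)


# Synthetic stand-in for image features: three well-separated 2-D clusters.
rng = np.random.default_rng(0)
centers = np.array([[3.0, 0.0], [-1.5, 2.6], [-1.5, -2.6]])
y = rng.integers(0, 3, size=300)
x = centers[y] + rng.normal(0.0, 0.5, size=(300, 2))

weights, bias = train_softmax_regression(x, y, n_classes=3)
accuracy = float(np.mean(predict(x, weights, bias) == y))
```

The shallow and deep networks differ from this model only in inserting one or more hidden layers between the input and the softmax output layer. Note that the code indexes classes from 0, as is usual in Python, rather than from 1 as in the text.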

Data Set Used:


The dataset used in this research is the MNIST dataset, a database of handwritten digits
that is commonly used for training image processing systems. Each image in the dataset is
28 × 28 pixels. A standard split of the dataset is used to evaluate and compare models: 60,000
images are used to train a model, and a separate set of 10,000 images is used to test it. Because
this is a digit recognition task covering the digits 0-9, there are 10 classes to predict. Results
are reported as inverted classification accuracy, that is, the classification error. Testing will
also be done on live data captured through a webcam, for which OpenCV will be used.
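
The evaluation metric can be made concrete with a short sketch. It assumes that "inverted classification accuracy" means one minus the fraction of correctly classified test images; the function name and the sample labels are hypothetical.

```python
def classification_error(true_labels, predicted_labels):
    """Return 1 - accuracy, the fraction of examples classified incorrectly."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one label is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    accuracy = correct / len(true_labels)
    return 1.0 - accuracy


# Hypothetical labels for five test images: one of the five is wrong,
# so the error is about 0.2 (up to floating-point rounding).
true_digits = [7, 2, 1, 0, 4]
predicted_digits = [7, 2, 1, 0, 9]
error = classification_error(true_digits, predicted_digits)
```

On the standard MNIST split, the same function would be applied to the 10,000 test labels and the model's predictions for them.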

Figure 3: MNIST Dataset
