A Project Report on
Recognition of human face, object, animal from live
video stream
Submitted to
Amity University Uttar Pradesh
CERTIFICATE
On the basis of the Project Report submitted by Sarah Bal, student of B.Tech (CSE), I hereby
certify that the project report Recognition of human face, object, animal from live video
stream, which is submitted to the Department of Computer Science and Engineering, Amity
School of Engineering and Technology, Amity University Uttar Pradesh, Noida in partial
fulfillment of the requirement for the award of the degree of Bachelor of Technology in
Computer Science and Engineering, is an original contribution to existing
knowledge and a faithful record of the work carried out by her under my guidance and
supervision.
To the best of my knowledge this work has not been submitted in part or full for any
Degree or Diploma to this University or elsewhere.
Noida
18th March, 2014
Rishi Kumar
Assistant Professor
Department of Computer Science and Engineering
Amity School of Engineering & Technology, Noida
DECLARATION
I, Sarah Bal, student of B.Tech (CSE), hereby declare that the Dissertation titled
Recognition of human face, object, animal from live video stream, which is submitted
by me to the Department of Computer Science and Engineering, Amity School of Engineering
and Technology, Amity University Uttar Pradesh, Noida in partial fulfillment of the requirement
for the award of the degree of Bachelor of Technology in Computer Science and Engineering,
has not previously formed the basis for the award of any degree, diploma or other
similar title or recognition.
Noida
18th March, 2014
Sarah Bal
ABSTRACT
The project deals with research on biometric authentication systems, employing
video surveillance techniques to develop a working authentication system in
MATLAB for the identification of a human face, an animal or a blacklisted object. The report
gives an overview of the theory behind Facial Recognition, the Edge
Detection Method, Video Surveillance, Frame Extraction, Graphical User Interface
Development and Fingerprint Recognition. The program's working principle depends
on the live feed generated by the surveillance camera together with a previously developed
database. This initial database is used to train the program so that whenever
the visual feed encounters a match from the database, the program and the user are
alerted. To achieve this, a Graphical User Interface has been developed which
provides a user-friendly environment, so that the system can be used by a person without
knowledge of the programming language. Initially, an application was created for
the recognition of a human face, object or animal in a live video feed. Furthermore,
the inclusion of a fingerprint authentication technique adds to the reliability of the
program and raises the biometric authentication system to a higher level. As
surveillance has become an innate need for every industry, the program thus
developed meets expectations in providing the required level of validation of the
object or living being of interest.
CONTENTS
Declaration
Certificate
Acknowledgment
Abstract
Contents
List of Figures
1 Introduction
2 Literature Review
  2.1 Biometrics
    2.1.1 How Biometric Technologies Work
    2.1.2 Enrollment
    2.1.3 Verification
    2.1.4 Identification
    2.1.5 Matches Are Based on Threshold Settings
  2.2 Facial Recognition
  2.3 Edge Detection
3 Project Design and Implementation
  3.1 Software Used
  3.2 Methodology
    3.2.1 Video Surveillance
    3.2.2 Face Recognition
    3.2.3 Edge Detection
    3.2.4 Fingerprint Recognition
4 Conclusion
5 Future Scope
6 References
INTRODUCTION
The aim of a video surveillance system is the safety of the public: to detect and deter criminal
activities. [1] These systems are being installed everywhere (in elevators, hallways, shops, etc.). [2]
Detecting objects and recognizing faces from a live video stream is the major step in
video analysis. [1] Motion detection is generally a software-based monitoring algorithm
which signals the surveillance camera to begin capturing the event when it detects
motion, be it a face or any object. [2] The video captured by the camera is processed by
MATLAB. [2]
Research on video-based recognition and detection of faces or objects is
contemplated as a continuation and extension of recognition in still images, which has been
researched widely and for which many good results have been obtained. [3]
The present report focuses initially on how face recognition can be done on a live video
stream (using a webcam) and then moves to object or animal detection in live video
streaming. The live video is first checked for any human face. If a human face is detected,
a rectangular box is formed around the face. If no results are obtained, the video is
checked for the presence of any object or animal.
For face recognition the most important task is face detection. Face color
information is also an important feature in face detection. [3] The eye is also one
of the main features used for face detection. [3]
The rest of the report is organized as follows: Section II describes face detection
on a live webcam, Section III is about the recognition of objects and animals using the edge
detection method, Section IV describes the importance of video surveillance and how
tracking is useful, and Section V includes the conclusion and future work.
LITERATURE REVIEW
2.1 Biometrics
In the world of computer security, biometrics refers to authentication techniques that rely
on measurable physiological and individual characteristics that can be automatically
verified. In other words, we all have unique personal attributes that can be used for
distinctive identification purposes, including a fingerprint, the pattern of a retina, and
voice characteristics. Strong or two-factor authentication, i.e. identifying oneself by two of
the three factors of something you know (for example, a password), something you have
(for example, a swipe card), or something you are (for example, a fingerprint), is becoming
more of a genuine standard in secure computing environments. Some personal computers today
include a fingerprint scanner where you place your index finger to provide authentication.
Biometrics is the automated recognition of a person based on a physiological or
behavioral characteristic. The history of biometrics includes the identification of people by
distinctive body features, scars or a grouping of other physiological criteria, such as
height, eye color and complexion. Present-day features are face recognition, fingerprints,
handwriting, hand geometry, iris, vein, voice and retinal scan. Biometric techniques are now
becoming the foundation of a wide array of highly secure identification and personal
verification methods. As the level of security breaches and transaction fraud increases, the need for
highly secure identification and personal verification technologies is becoming apparent.
Recent world events have led to an increased interest in security that will impel biometrics
into mainstream use. Areas of future use include Internet transactions, workstation and
network access, telephone transactions, and travel and tourism. There are different
types of biometrics: some are old, while others use the latest technology. The most recognized
biometric technologies are fingerprinting, retinal scanning, hand geometry, signature
verification, voice recognition, iris scanning and facial recognition. [19]
A biometric system can be either an 'identification' system or a 'verification'
(authentication) system, which are defined below.
Identification (1:n), One-to-Many: Biometrics can be used to determine a person's
identity even without his awareness or approval, for example by scanning a crowd with
a camera and using face recognition technology to find matches that are
already stored in a database.
Verification (1:1), One-to-One: Biometrics can also be used to verify a person's identity.
For example, one can allow physical access to a secure area in a building by using finger scans,
or grant access to a bank account at an ATM by using a retina scan. [19]
The method used to extract features and to encode and store information in the template is based on the system
vendor's proprietary algorithms. Template size varies depending on the vendor and the
technology. Templates can be stored remotely in a central database or within a biometric
reader device itself; their small size also allows for storage on smart cards or tokens.
Minute changes in positioning, distance, pressure, environment, and other factors
influence the generation of a template. Consequently, each time an individual's biometric
data are captured, the new template is likely to be unique. Depending on the biometric
system, a person may need to present biometric data several times in order to enroll.
Either the reference template may then represent an amalgam of the captured data, or
several enrollment templates may be stored. The quality of the template or templates is
critical to the overall success of the biometric application. Because biometric features can
change over time, people may have to re-enroll to update their reference template.
2.1.3 Verification
In verification systems, the step after enrollment is to verify that a person is who he or
she claims to be (i.e., the person who enrolled). After the individual provides an
identifier, the biometric is presented, which the biometric system captures, generating a
trial template based on the vendor's algorithm. The system then compares the trial
biometric template with this person's reference template, which was stored in the system
during enrollment, to determine whether the individual's trial and stored templates match.
Verification is often referred to as 1:1 (one-to-one) matching. Verification systems can
contain databases ranging from dozens to millions of enrolled templates but are always
predicated on matching an individual's presented biometric against his or her reference
template. Nearly all verification systems can render a match/no-match decision in less
than a second.
One of the most common applications of verification is a system that requires employees
to authenticate their claimed identities before granting them access to secure buildings or
to computers.
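Although the project itself is implemented in MATLAB, the match/no-match decision described above can be sketched in Python. This is an illustrative sketch, not a vendor algorithm: templates are modeled as plain feature vectors, and the similarity measure and threshold are hypothetical.

```python
def match_score(trial, reference):
    """Similarity in (0, 1] between a trial template and a reference
    template, both modeled here as plain feature vectors."""
    dist = sum((t - r) ** 2 for t, r in zip(trial, reference)) ** 0.5
    return 1.0 / (1.0 + dist)   # distance 0 -> similarity 1

def verify(trial, reference, threshold=0.8):
    """1:1 verification: render a match/no-match decision against a
    single stored reference template."""
    return match_score(trial, reference) >= threshold

stored = [0.21, 0.55, 0.13]    # reference template captured at enrollment
sample = [0.20, 0.56, 0.14]    # trial template captured at verification
print(verify(sample, stored))  # small capture-to-capture differences still match
```

The threshold setting embodies the trade-off mentioned in section 2.1.5: lowering it admits more genuine users but also more impostors.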
2.1.4 Identification
In identification systems, the step after enrollment is to identify who the person is. Unlike
verification systems, no identifier is provided. To find a match, instead of locating and
comparing the person's reference template against his or her presented biometric, the trial
template is compared against the stored reference templates of all individuals enrolled in
the system. Identification systems are referred to as 1:N (one-to-N, or one-to-many)
matching because an individual's biometric is compared against multiple biometric
templates in the system's database. There are two types of identification systems: positive
and negative. Positive identification systems are designed to ensure that an individual's
biometric is enrolled in the database.
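The one-to-many search described above can also be sketched in Python. Again a hedged illustration: the enrolled dictionary, the scoring function and the threshold are all hypothetical stand-ins for a vendor's proprietary matcher.

```python
def similarity(a, b):
    """Stand-in for a vendor's proprietary matching score."""
    dist = sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return 1.0 / (1.0 + dist)

def identify(trial, enrolled, threshold=0.8):
    """1:N identification: compare the trial template against every
    enrolled reference template; return the best match above threshold,
    or None when no enrolled template is close enough."""
    best_id, best = None, 0.0
    for person_id, reference in enrolled.items():
        score = similarity(trial, reference)
        if score > best:
            best_id, best = person_id, score
    return best_id if best >= threshold else None

enrolled = {"person1": [0.2, 0.5, 0.1], "person2": [0.9, 0.1, 0.7]}
print(identify([0.21, 0.49, 0.11], enrolled))  # person1
```

Returning None when no template clears the threshold corresponds to the negative outcome of a positive identification system: the person is not found in the database.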
Figure 2.1
"Biometrics" means "life measurement", but the term is generally associated with the use of
unique physiological characteristics to identify a person. The desirable characteristics of a
biometric trait are:
Universality: Every person must possess the characteristic. The trait must be one that is universal and seldom lost to accident or disease.
Invariance of properties: The trait should be constant over a long time and should not be subject to considerable differences based on age or on episodic or chronic disease.
Measurability: The trait should be suitable for capture without waiting time, and it must be easy to gather the attribute data passively.
Singularity: Each expression of the trait must be distinctive to the person. The characteristic should have sufficient distinctive properties to distinguish one person from another. Height, weight, hair and eye color are all elements that are unique assuming a reasonably accurate measure, but they do not offer enough points of separation to be useful for more than categorizing.
Acceptance: The capturing should be possible in a manner acceptable to a large fraction of the population. Excluded are particularly invasive technologies, i.e. technologies which require a part of the human body to be taken or which (apparently) impair the human body.
Reducibility: The captured data should be capable of being reduced to a file which is easy to handle.
Reliability and tamper-resistance: The trait should be impractical to mask or modify. The process should ensure high reliability and reproducibility.
Privacy: The process should not violate the privacy of the individual.
Comparability: It should be possible to reduce the trait to a state that makes it digitally comparable to others. The lower the probability of similarity, the more dependable the identification.
Inimitability: The trait must be irreproducible by other means. The less reproducible the trait, the more reliable it will be.
The basic gradient operator works by forming a matrix centered on a pixel chosen as the center of the matrix area. If the value
of this matrix area is above a given threshold, then the middle pixel is classified as an
edge. Examples of gradient-based edge detectors are the Roberts, Prewitt, and Sobel
operators. All the gradient-based algorithms have kernel operators that calculate the
strength of the slope in directions which are orthogonal to each other, commonly vertical
and horizontal. The contributions of the different components of the slope are then
combined to give the total value of the edge strength. [24]
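The scheme just described (two orthogonal kernels, combined slope strength, threshold) can be sketched in Python with NumPy. This is an illustrative sketch using the standard Sobel kernels; the image and the threshold value are made up for the demonstration.

```python
import numpy as np

# Sobel kernels: orthogonal slope estimates (horizontal and vertical).
KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
KY = KX.T

def sobel_edges(img, threshold):
    """Classify each interior pixel as edge/non-edge by combining the
    two orthogonal gradient components into a total edge strength."""
    h, w = img.shape
    mag = np.zeros((h, w))
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            patch = img[i - 1:i + 2, j - 1:j + 2]
            gx = np.sum(KX * patch)        # horizontal slope component
            gy = np.sum(KY * patch)        # vertical slope component
            mag[i, j] = np.hypot(gx, gy)   # combined edge strength
    return mag > threshold

# A vertical step edge: dark left half, bright right half.
img = np.zeros((5, 6))
img[:, 3:] = 255.0
edges = sobel_edges(img, threshold=100.0)
print(edges[2])   # the step boundary columns are flagged as edges
```

Pixels well inside either flat region produce zero gradient and fall below the threshold, so only the boundary survives.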
3.2 METHODOLOGY
3.2.1 VIDEO SURVEILLANCE
The appearance of vastly improved image processing technology and the increase in
network bandwidth have ushered in the rapid development of video surveillance.
Video surveillance systems have been used widely for security monitoring purposes. It
is crucial to distinguish one object from another for the purpose of tracking and
analyzing the actions of these objects. Two of the approaches for moving object
classification are shape-based and motion-based methods. The shape-based methodology
utilizes 2D spatial details whereas the motion-based one uses the temporal characteristics
of objects for the grouping results. [1] CCTV cameras need continuous
monitoring, whereas modern surveillance systems automatically detect the faces of
humans, objects and even animals and check their authenticity too. [31] In general,
video surveillance systems monitor various activities, such as the detection and
authentication of threat-posing objects, by analyzing a recorded video, but this turns out to
be a very tiring task for the people doing it. [31] Surveillance systems nowadays
are used for the management and monitoring of public places for the obvious reasons of
security and safety. [31] For video surveillance, tracking is a significant problem that has
aroused interest among various researchers. The notion behind tracking is the association of
objects that correspond to object parts between consecutive video frames. Tracking
provides temporal data regarding the mobile objects, which may improve lower-level
processing such as segmentation or may even enable recognition. [1]
The given figure shows the methods to be followed in the video surveillance
process. [1]
Images are captured from a live video stream (screenshot or frame extraction) using a camera,
and preprocessing may or may not be done on them for enhancement.
distances and angles between the eyes, nose and mouth, or facial templates such as nose width,
nose length, position of the mouth and chin type. All these features are then used for
recognition of an unknown face by matching it to the nearest neighbor in the database
that has been stored previously. [14]
The statistical features are calculated by algebraic methods, such as Principal
Component Analysis (PCA) and Linear Discriminant Analysis (LDA). These find a mapping
from the original face feature space to a lower-dimensional feature space. [14]
eigenface. The image of an eigenface looks very little like a face. Each face is depicted as a
linear combination of eigenvectors. [15][14]
The notion of using eigenfaces was developed by Sirovich and Kirby (1987) and Kirby
and Sirovich (1990) for the effective representation of faces using principal component
analysis. They argued that any group of face images can be roughly
reconstructed by storing a small collection of weights for each face and a small set of
standard pictures. The weights that describe an image are found by projecting
the face image onto each eigenpicture. [15]
Before we start with the process of recognition, we first need a database of images with
which the input image will be matched.
Thus this eigenface approach to face recognition involves the following initial steps:
1. Obtain the initial set of images (the training set of face images). These images should
have been taken under the same lighting conditions and the same camera
conditions for better results. The resolutions must be matched too.
2. Compute the eigenfaces from the given training set, keeping just the images
which correspond to the highest eigenvalues. Here the average face
is calculated, and each face's difference from the average face is then calculated.
3. These images now define the face space. The eigenfaces can be further updated or
reconstructed.
4. Compute the corresponding dimensional weight-space distribution for each of the
individuals by projecting each face image onto the face space. [15][14]
Once the initialization is done, the following steps are used for recognition of new face
images:
1. Compute the set of weights based on the input image and the eigenfaces by
projecting the input image onto the eigenfaces.
2. Check whether the image is that of a face or not.
3. If the image is that of a face, categorize the weight pattern as either a known face or
an unknown face.
4. Update the weight pattern as well as the eigenfaces. If an unknown face
comes up several times, incorporate its weight into the total weight along
with the other images. [15]
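The initialization and recognition steps listed above can be sketched in Python with NumPy (the project's own implementation is in MATLAB). This is a minimal illustration on synthetic data: the "faces" are random vectors standing in for flattened images, and the SVD of the mean-subtracted data is one standard way to obtain the eigenvectors of the covariance matrix.

```python
import numpy as np

def train_eigenfaces(faces, k):
    """Initialization steps 1-4: mean face, difference faces, top-k
    eigenfaces, and each training face's weights in the face space.
    `faces` is an (n_images, n_pixels) array of flattened images."""
    mean = faces.mean(axis=0)
    diffs = faces - mean                     # difference from the average face
    # Right singular vectors of the difference matrix are the
    # eigenvectors of the covariance matrix (the eigenfaces).
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    eigenfaces = vt[:k]                      # keep the top-k components
    weights = diffs @ eigenfaces.T           # project onto the face space
    return mean, eigenfaces, weights

def recognize(image, mean, eigenfaces, weights, labels):
    """Recognition steps 1 and 3: project the new image onto the
    eigenfaces and return the label of the nearest stored weight pattern."""
    w = (image - mean) @ eigenfaces.T
    dists = np.linalg.norm(weights - w, axis=1)
    return labels[int(np.argmin(dists))]

# Tiny synthetic "faces": 4 images of 8 pixels each, two per person.
rng = np.random.default_rng(0)
base_a, base_b = rng.normal(size=8), rng.normal(size=8)
faces = np.stack([base_a, base_a + 0.05, base_b, base_b - 0.05])
labels = ["personA", "personA", "personB", "personB"]
mean, efs, ws = train_eigenfaces(faces, k=2)
print(recognize(base_a + 0.03, mean, efs, ws, labels))  # personA
```

As the text notes, images of the same person map close together in the face space, so nearest-neighbor matching on the weights suffices for recognition.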
Eigenfaces represent the principal components of the face set. They are very useful for
simplifying the recognition of a set of data. First take the mean-subtracted images
in the database and then project them onto the face space. Logically, the faces of the
same person will map close to one another in the face space. Recognition can then be
described as finding the closest image. [14]
Now, considering the situation where a new image is entered into the system, it can be
recognized in three ways: first, by checking whether it is a known image, i.e., an image
already present in the database; second, the image is a face but that of an unknown
person; and third, the image is not a face at all. [14]
The recognition method in a video is done by creating an application in which the face is first
detected by creating a cascade detector. The inbuilt function
CascadeObjectDetector is used, which detects the face. The input device used is the
webcam. When a person sits in front of the camera, the application looks for a human
face. If a face-like structure is detected by the application, the program goes forward;
otherwise a message that no face was found is shown.
The next step is to form a rectangular box around the detected face. The coordinates of the
tracked face are found. The hue and saturation for the image are also calculated; the
skin tone of the person is used for this. The RGB and HSV skin-tone values are
measured at the nose. The images thus obtained are then saved in a database for
further use and detection. The running program can be seen in figure 1.
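The hue and saturation computation mentioned above can be illustrated with Python's standard colorsys module; the RGB sample below is a hypothetical skin-tone pixel, not data from the project.

```python
import colorsys

def hue_saturation(r, g, b):
    """Convert an 8-bit RGB sample (e.g. a pixel taken from the nose
    region of a detected face) to hue and saturation in [0, 1]."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h, s

# A hypothetical skin-tone pixel sampled from the detected face region.
h, s = hue_saturation(224, 172, 138)
print(round(h, 3), round(s, 3))  # 0.066 0.384
```

Skin tones cluster in a narrow band of hue regardless of brightness, which is why the hue channel is useful for verifying that a detected region really is skin.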
Figure 2
The figure shows the detection of a face from a live video stream using the webcam. The
first segment shows the hue channel data, the second segment shows the live
streaming, and the third segment shows the detected face, with a
rectangular box highlighting the face. The program detects only the face by measuring
facial features such as the eyes, nose and mouth. If some object other than a face is shown,
the program does not run.
The current application works on the principle of the cascade face detection method, where
the inbuilt function CascadeObjectDetector( ) is used. Here three modules of cascade
face detection are presented: the face skin verification, face symmetry
verification and eye template verification modules. [3]
These three modules eliminate tilted faces, background noise, backs of heads and other
non-face objects (i.e. objects that are not a human face). The images that contain just
the frontal face of a human are sent to the engine that does the task of recognition. The
threshold can be set in the threshold modules, and this can further help in better face
detection. In one way it can be said that video-based detection is a continuation of
still-image face recognition. [3]
information and reduces data. [21] Edge detection turns out to be difficult in the
case of noisy images, as noise and edges both contain high-frequency content. [21]
Detection of edges can be used in various methods such as data compression, image
segmentation, and image reconstruction. [22] An edge-detection filter can also be used to
improve the appearance of blurred and/or anti-aliased video streams. The basic operator
is the matrix-area gradient operation that finds the level of variance between different
pixels. The edge-detection matrix is calculated by creating a matrix which is centered on
a pixel that is chosen as the center of the matrix area. If the value of this area turns out to
be greater than the given threshold value, then the middle pixel is declared an edge. [20]
The different gradient-based detectors are the Prewitt, Roberts and Sobel operators. These
gradient-based algorithms have kernel operators that calculate the strength of the slope in
directions that are orthogonal to one another, i.e. vertical and horizontal. [20]
These kernels are formed in a way such that they respond maximally to edges running
in the horizontal and vertical directions relative to the pixel grid, with one kernel for
each of the two perpendicular orientations. The kernels can be applied separately to the
input image, producing a separate measurement of the gradient component in each orientation.
Edge detection using this operator gives a fairly good result on still images.
The design of these kernels is in accordance with responding maximally to edges running
at 45 degrees to the pixel grid, one kernel for each of the two perpendicular orientations. The
kernel outputs Gx and Gy can be combined to give the absolute magnitude of the gradient at
each point and the orientation of that gradient. The magnitude is given as
|G| = sqrt(Gx^2 + Gy^2)
Prewitt's Operator
The Prewitt Operator is almost the same as the Sobel Operator and is used for detecting the
vertical and horizontal edges in an image. [21]
The masks of this operator are
Gx = [-1 0 +1; -1 0 +1; -1 0 +1],  Gy = [-1 -1 -1; 0 0 0; +1 +1 +1]
Log Operator
The Log (Laplacian of Gaussian) operator is another edge detection operator that is linear
and shift invariant. It finds the edge points by finding the spots where the second-order
differential coefficient of the image grey levels is zero. The Log operator computes and
filters differential coefficients of the image: it finds the zero-crossing positions of the
filter result by means of the convolution of a rotationally symmetric Log template with
the image.
In the Log detection methodology, pre-smoothing of the image is done
using a Gaussian low-pass filter, and after that the steep edges in the image are found using
the Log operator. After this, binarization of the image is carried out, which results in a
closed, connected outline and removes the internal spots of the image. The Log operator is
mostly employed to determine whether edge pixels lie in the bright section or the dark
section of the image. [24]
Canny Operator
The Canny edge detection operator is comparatively newer in the vast sea of operators.
It finds wide application because of its good performance. Canny edge
detection searches for the local maximum value of the image gradient. The Canny
detector uses two thresholds for finding the strong and the weak edges respectively. A
weak edge is shown in the output only if it is connected to a strong edge. [24]
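The double-threshold behaviour just described (weak edges survive only when connected to strong edges) can be sketched in Python with NumPy; the gradient-magnitude array and both thresholds are made up for illustration.

```python
import numpy as np

def hysteresis(mag, low, high):
    """Canny-style double thresholding: strong edges (>= high) are kept,
    weak edges (>= low) survive only if connected to a strong edge."""
    strong = mag >= high
    weak = mag >= low
    keep = strong.copy()
    changed = True
    while changed:                  # grow strong edges through weak pixels
        changed = False
        h, w = mag.shape
        for i in range(h):
            for j in range(w):
                if weak[i, j] and not keep[i, j]:
                    # 8-connected neighbourhood check
                    nb = keep[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2]
                    if nb.any():
                        keep[i, j] = True
                        changed = True
    return keep

# One strong pixel, one weak pixel touching it, one isolated weak pixel.
mag = np.array([[0.9, 0.4, 0.0, 0.3],
                [0.0, 0.0, 0.0, 0.0]])
print(hysteresis(mag, low=0.25, high=0.8).astype(int))
```

The weak pixel adjacent to the strong one is kept, while the isolated weak pixel at the far end is discarded: exactly the "connected to a strong edge" rule stated above.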
Preprocessing of the image needs to be carried out by way of enhancement; before
evaluation, we apply the binarization process to the fingerprint image. [18] Fingerprint
recognition technology extracts features from the impressions left by the ridges on the
fingertips. The fingerprint impressions can be of rolled type or flat type. A flat one
captures just an impression of the central area between the fingertip and the first knuckle,
whereas a rolled one captures the ridges on both sides of the finger. A fingerprint can be
defined as the flow of ridge patterns on the tip of the finger. The ridge flow contains
abnormalities in local regions of the fingertip, and the orientation and position of these
abnormalities are in turn used for depicting and matching fingerprints. [18]
Histogram Equalization
Histogram equalization means expanding the pixel-value distribution of an image so as
to increase its perceptual information. The original histogram of a fingerprint
image and the histogram after histogram equalization are shown in the given figure.
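Histogram equalization as described above can be sketched in Python with NumPy, assuming an 8-bit grayscale image; the classic CDF-based mapping is used, and the tiny low-contrast patch is synthetic.

```python
import numpy as np

def equalize(img):
    """Histogram equalization for an 8-bit grayscale image: spread the
    pixel-value distribution via the cumulative histogram (CDF)."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # first occupied intensity bin
    n = img.size
    # Classic mapping: rescale the CDF to cover the full 0..255 range.
    lut = np.clip(np.round((cdf - cdf_min) / (n - cdf_min) * 255),
                  0, 255).astype(np.uint8)
    return lut[img]

# A low-contrast "fingerprint" patch with values squeezed into 100..103.
img = np.array([[100, 100, 101, 101],
                [102, 102, 103, 103]], dtype=np.uint8)
out = equalize(img)
print(out.min(), out.max())   # contrast stretched to cover 0..255
```

The four crowded grey levels are spread across the whole intensity range, which is what makes the ridge structure of a washed-out fingerprint image visible.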
Match
The matching of the given fingerprint is done against the fingerprint data previously
stored in the computer's database.
CONCLUSION
The Eigenface approach to face recognition is fast and simple and works
well in a constrained environment. It is one of the best practical solutions to the
problem of face recognition. Many applications which require face recognition do not
require perfect identification, just a low error rate. So instead of searching a large
database of faces, it is better to give a small set of likely matches. By using the Eigenface
approach, this small set of likely matches for given images can be easily obtained. For a
given set of images, due to the high dimensionality of the images, the space spanned is very
large. But in reality, all these images are closely related and actually span a lower
dimensional space. By using the eigenface approach, we try to reduce this dimensionality.
The eigenfaces are the eigenvectors of the covariance matrix representing the image space.
The lower the dimensionality of this image space, the easier face recognition becomes.
Any new image can be expressed as a linear combination of these eigenfaces,
which makes it easier to match any two images and thus recognize faces.
FUTURE PROSPECTS
The project consists of facial recognition and the edge detection method, which in the future can
be integrated with other biometric techniques as well to get a stronger authentication
system. Edge detection works within the system to identify animals or match
humans against a given database. On a larger scale this program can be incorporated by
defense forces in order to identify threat-posing objects, for example in missile detection or
enemy warfare. Moreover, for places where something valuable must be guarded,
these authentication techniques are very useful, but due to the non-availability of any Indian
development firm, the use of this technique remains limited.
REFERENCES
[1]. Study of Moving Object Detection and Tracking for Video Surveillance, International Journal of Advanced Research in Computer Science and Software Engineering
[2]. Real Time Motion Detection in Surveillance Camera Using MATLAB, International Journal of Advanced Research in Computer Science and Software Engineering, Iraqi National Cancer Research Center, Baghdad University, Iraq
[3]. A Video-based Face Detection and Recognition System using Cascade Face Verification Modules, Ping Zhang, Department of Mathematics and Computer Science, Alcorn State University, USA
[4]. A Surveillance System based on Audio and Video Sensory Agents cooperating with a Mobile Robot, The University of Padua, Italy
[5]. Face Recognition using Eigenfaces, Matthew A. Turk and Alex P. Pentland, Vision and Modeling Group, The Media Laboratory, Massachusetts Institute of Technology
[6]. Performance evaluation of object detection algorithms for video surveillance, Jacinto Nascimento, Member, IEEE, and Jorge Marques
[7]. Face recognition using multiple eigenface subspaces, P. Aishwarya and Karnan Marcus, Journal of Engineering and Technology Research, Vol. 2(8), pp. 139-143, August 2010
[8]. Development of a real-time face recognition system for access control, Desmond E. van Wyk, James Connan, Department of Computer Science, University of the Western Cape, South Africa
[9]. Face Recognition and Retrieval in Video, Caifeng Shan
[10]. Face Recognition: A Literature Survey, W. Zhao, R. Chellappa, P. J. Phillips and A. Rosenfeld
[11]. Biometrics and Face Recognition Techniques, Renu Bhatia, International Journal of Advanced Research in Computer Science and Software Engineering
[12]. Image-based Face Detection and Recognition: State of the Art, Faizan Ahmad, Aaima Najam and Zeeshan Ahmed
[13]. OBJCUT for Face Detection, Jonathan Rihan, Pushmeet Kohli, and Philip H. S. Torr, Oxford Brookes University, UK
[14].
[16].
[20]. Research Laboratory, 2800 Powder Mill Road, Adelphi, MD 20783-1197, University of Maryland Institute for Advanced Computer
[23].