
IAETSD JOURNAL FOR ADVANCED RESEARCH IN APPLIED SCIENCES, VOLUME 4, ISSUE 1, JAN-JUNE /2017

ISSN (ONLINE): 2394-8442

SMART VAULT SECURITY SYSTEM USING SPEAKER RECOGNITION AND IRIS DETECTION
Akshay Titkare [1], Shreeya Undale [2], Prof. Prajakta Thakare [3]
[1, 2, 3]
Department of Information Technology
Sinhgad Institute of Technology and Science, Narhe, Pune, Maharashtra, India
akshaytitkare28@gmail.com, shreeyaundale@gmail.com, prthakare_sits@sinhgad.edu

ABSTRACT.

With an increasing emphasis on security, automated personal identification based on biometrics has
received extensive attention over the past decade. Biometric technology recognizes an individual from
physical or behavioural characteristics such as voice patterns, iris or retina patterns, handwriting
and fingerprints. A good biometric is one that uses a highly distinctive feature. Biometric identification
provides a valid alternative to traditional authentication mechanisms and overcomes many of their
shortfalls. This paper gives an overview of the Speaker Recognition and Iris Recognition technologies used
at different stages of the biometric identification process and evaluates their performance. It also describes
different techniques that can be used in applications of speaker and iris recognition systems.

Keywords: Biometrics, Smart Vault, Speaker Recognition, Iris Recognition, Image processing, MFCC, RED.

I. INTRODUCTION
Security and the authentication of individuals are necessary in many areas of our lives, and most people have to authenticate their
identity on a daily basis. Reliable automatic recognition of persons has long been an attractive goal [5]. In the realm of computer security,
biometrics refers to authentication techniques that rely on measurable physiological or behavioural characteristics that can be automatically
verified. In other words, we all have unique personal attributes that can be used for identification, including a fingerprint,
the pattern of a retina, and voice characteristics. Although the field of biometrics is still in its infancy, it is inevitable that biometric systems will
play a critical role in the future of security. Biometric identifiers are the distinctive and measurable features that are used to label and describe
individuals [6]. There are two categories of biometric identifiers, namely physiological and behavioural characteristics. A biometric system
usually functions by first capturing a sample of the feature, such as a digital colour image of a face to be used in facial recognition or a
digitized sound signal to be used in voice recognition.

A. SMART VAULT

Smart Vault is a first-of-its-kind advanced locker service in India. Users can safeguard their valuables while enjoying the convenience of
accessing them at any time, on any day. To provide convenience and safety, the Smart Vault is designed with state-of-the-art robotic
technology and high-end, intelligent security systems.

B. SPEAKER RECOGNITION

Speaker Recognition is one of the most useful biometric recognition techniques in a world where insecurity is a major threat. Speaker
Recognition is the process of automatically recognizing who is speaking on the basis of the individual information contained in speech waves.
A speech signal carries several levels of information and can also be used for speech recognition. Speaker recognition and speech
recognition are closely related but distinct: speech recognition is the process of recognizing what is being
said, whereas speaker recognition is the process of recognizing who is speaking.

To Cite This Article: Akshay Titkare, Shreeya Undale and Prof. Prajakta Thakare. SMART VAULT
SECURITY SYSTEM USING SPEAKER RECOGNITION AND IRIS DETECTION. Journal for
Advanced Research in Applied Sciences; Pages: 496-507

Speaker Recognition
    Speaker Verification: Text Dependent, Text Independent
    Speaker Identification: Text Dependent, Text Independent

Fig 1: The Scope of Speaker Recognition

Speaker Recognition mainly involves two modules, namely feature extraction and feature matching. Feature extraction is the process that extracts
a small amount of data from the speaker's voice signal that can later be used to represent that speaker. Feature matching is the actual
procedure of identifying the unknown speaker by comparing the features extracted from his or her voice input with the ones already stored in
the speech database [1]. Speaker recognition is classified into speaker identification and speaker verification.

The main aim of speaker recognition is to identify the speaker by extraction, characterization and recognition of the information contained in the
speech signal. Speaker recognition methods can be divided into text-independent and text-dependent methods. In a text-independent system,
speaker models capture characteristics of somebody's speech that show up irrespective of what is being said. In a text-dependent system, the
recognition of the speaker's identity is based on his or her speaking one or more specific phrases or words [2].
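
The difference between identification and verification can be illustrated with a minimal Python sketch. It is not the authors' implementation: the feature vectors, the speaker names, the Euclidean distance measure and the threshold are all assumptions chosen for illustration.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(features, database):
    """Identification: return the enrolled speaker whose template is closest."""
    return min(database, key=lambda name: distance(features, database[name]))

def verify(features, claimed_name, database, threshold):
    """Verification: accept only if the claimed speaker's template is close enough."""
    return distance(features, database[claimed_name]) <= threshold

# Hypothetical enrolment database of 3-dimensional feature vectors.
database = {
    "alice": [1.0, 2.0, 3.0],
    "bob": [4.0, 0.0, 1.0],
}
sample = [1.1, 2.1, 2.9]

print(identify(sample, database))                        # closest enrolled speaker
print(verify(sample, "alice", database, threshold=0.5))  # claim accepted or rejected
print(verify(sample, "bob", database, threshold=0.5))
```

Identification searches the whole database, whereas verification checks a single claimed identity against a threshold, which is why the two tasks are usually evaluated separately.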

C. IRIS RECOGNITION

1) BIOLOGICAL DESCRIPTION:

The iris is a thin circular structure in the eye. Its function is to control the diameter and size of the pupil and hence it controls the amount of light
that progresses to the retina. To control the amount of light entering the eye, the muscles associated with the iris (sphincter and dilator) either
expand or contract the centre aperture of the iris known as the pupil [7].

Fig 2: Diagram of eye.

Iris recognition process includes various tasks like:


Image acquisition
Image Segmentation
Image Localization
Image Normalization
Encoding
Template Matching

Image Acquisition:

Acquisition means getting the information from the source. The iris image of the person is acquired using an optical lens, illuminators, image
sensors, etc.

Image Segmentation:

Image Segmentation is the process of separating the different regions of the eye image, such as the pupil, eyelashes and eyelids.

Iris Localization:

Localization focuses on obtaining a biometric template for the various coordinates of the image, which can be obtained using a number of
transformation functions.

Image Normalization:

It basically deals with obtaining the basic feature vector of different parts of the iris. It deals with obtaining the gray scale image parameters.

Image Encoding:

It deals with encoding the unique iris patterns as a bit code by various means such as filters and wavelets.

Image matching:

It deals with matching of the iris pattern code encoded with previously stored patterns in the database in the form of biometric templates.

II. EXISTING METHODS


There are several methods already present in the domain of speaker recognition and iris recognition. Each method has its own pros and cons.

A. FOR SPEAKER RECOGNITION

1) Linear Predictive Coding (LPC)

LPC is used to estimate the spectrum of the signal. The technique starts from the assumption that a speech signal is produced by a buzzer at the
end of a tube. LPC analyzes the speech signal by estimating the formants, removing their effects from the speech signal, and
estimating the intensity and frequency of the remaining buzz. The procedure used for removing the formants is called inverse filtering, and the
remaining signal is called the residue. A drawback of this technique is that its performance degrades in the presence of noise [4].
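
As a rough illustration of inverse filtering, the Python sketch below estimates the predictor coefficients with the autocorrelation method and the Levinson-Durbin recursion, then applies the inverse filter to obtain the residue. The paper does not say which LPC estimation method was used, so this choice, the function names and the toy signal are assumptions.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite signal."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns (a, error) with a[0] = 1, so that the residue is
    e[n] = sum_k a[k] * x[n - k].
    """
    a = [1.0] + [0.0] * order
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / error                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        error *= 1.0 - k * k                  # remaining prediction error
    return a, error

def inverse_filter(x, a):
    """Apply the inverse filter A(z) to remove the modelled resonances."""
    return [sum(a[k] * x[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(x))]

# Toy signal: impulse response of a one-pole resonator, x[n] = 0.9 ** n.
signal = [0.9 ** n for n in range(200)]
r = autocorrelation(signal, 2)
a, error = levinson_durbin(r, 2)
residue = inverse_filter(signal, a)
print([round(c, 4) for c in a])
```

For this toy signal the estimated filter is close to 1 - 0.9 z^-1, and the residue is essentially the single exciting impulse, which is the "buzz" left after the resonance has been removed.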

2) Linear Predictive Cepstral Coefficients (LPCC)

This technique is widely used to extract features from the speech signal. LPC parameters can effectively describe the energy and frequency spectrum
of sound frames. LPCC gives a smoother spectral envelope and a more stable representation compared to LPC. A drawback of this technique is that
it uses linearly spaced frequency bands.

B. FOR IRIS RECOGNITION

1) Gabor Wavelet Method

The Gabor Wavelet Method is typically used for analysing human iris patterns and extracting feature points from them. Phase
information is extracted using quadrature 2D Gabor wavelets. When a given area of the iris is projected onto complex-valued 2D Gabor
wavelets, identifying the quadrant of the complex plane in which each resultant phasor lies amounts to a patch-wise phase quantization of the
iris pattern. Only phase information is used for recognizing irises because amplitude information is not very discriminating and depends upon
extraneous factors such as imaging contrast, illumination, and camera gain [8].
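
The quadrant-based phase quantization can be sketched in a few lines of Python. The complex responses below are hypothetical values standing in for the projections of iris patches onto Gabor wavelets; computing those projections is outside the scope of this sketch.

```python
def phase_bits(coefficient):
    """Map one complex filter response to 2 bits according to its quadrant."""
    return (1 if coefficient.real >= 0 else 0,
            1 if coefficient.imag >= 0 else 0)

def encode(responses):
    """Concatenate the 2-bit quadrant codes of all responses into one bit list."""
    bits = []
    for c in responses:
        bits.extend(phase_bits(c))
    return bits

# Hypothetical complex responses, one per iris patch.
responses = [complex(0.8, 0.3), complex(-0.2, 0.9),
             complex(-0.5, -0.1), complex(0.4, -0.7)]
code = encode(responses)
print(code)

# Scaling every response by a positive gain leaves the code unchanged.
brighter = [2.5 * c for c in responses]
print(encode(brighter) == code)
```

Because only the signs of the real and imaginary parts are kept, a uniform change in contrast or camera gain does not alter the code, which matches the motivation for discarding amplitude.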

2) Linde, Buzo & Gray Algorithm

In this algorithm, iris colour feature extraction is performed by vector quantization of colour images of the iris. Each code vector consists of 3
components (R, G, B), and the code vectors of a region are obtained by selecting the local iris region. The average mean squared error (MSE)
obtained between all the code vectors present in the codebook of the acquired image is used for comparison [10].
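
A minimal Python sketch of codebook generation by splitting, in the spirit of the Linde, Buzo and Gray algorithm, is given below. The pixel values, the split factor, the iteration count and the way two codebooks are paired for comparison are assumptions made for illustration, not details taken from [10].

```python
def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(v, codebook):
    """Index of the code vector closest to v."""
    return min(range(len(codebook)), key=lambda i: sq_dist(v, codebook[i]))

def lbg(vectors, size, eps=0.01, iterations=20):
    """Grow a codebook by repeated splitting and refinement (size: a power of two)."""
    codebook = [mean_vector(vectors)]
    while len(codebook) < size:
        # Split every code vector into two slightly perturbed copies.
        codebook = [[c * (1 + s * eps) for c in cv]
                    for cv in codebook for s in (1, -1)]
        for _ in range(iterations):
            clusters = [[] for _ in codebook]
            for v in vectors:
                clusters[nearest(v, codebook)].append(v)
            # An empty cluster keeps its previous code vector.
            codebook = [mean_vector(c) if c else cv
                        for c, cv in zip(clusters, codebook)]
    return codebook

def codebook_mse(book_a, book_b):
    """Average squared error between two codebooks, paired after sorting (assumption)."""
    pairs = zip(sorted(book_a), sorted(book_b))
    return sum(sq_dist(a, b) for a, b in pairs) / len(book_a)

# Hypothetical (R, G, B) pixels from a dark and a bright iris region.
pixels = [[20, 30, 25], [30, 20, 35], [25, 25, 20],
          [200, 210, 190], [210, 200, 205], [190, 205, 200]]
codebook = sorted(lbg(pixels, 2))
print(codebook)
```

Two images can then be compared through `codebook_mse` of their codebooks, with a smaller value indicating more similar colour content.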

III. PROPOSED SYSTEM


In this project we develop a Smart Vault Security System based on a two-level security mechanism, i.e. Speaker recognition and Iris
recognition.

Fig 3: System Architecture



WORKING:

The system is proposed for secured bank locker opening. It consists of a Microcontroller 89S52, LCD display, RF module, MAX 232
IC, relay driver, relay and LEDs. Whenever a user wants to open his or her locker, speaker recognition is first performed with the help of MATLAB
software on the PC. After successful recognition of the speaker, iris recognition of that particular user is performed by the software
on the PC side. After identification of that particular user, the signal is transmitted to the microcontroller wirelessly via the RF module. After
receiving the identification signal from the PC, the locker of that particular user is opened through the relay. The locker opening action is
indicated by an LED in the proposed Smart Vault system.
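
The two-level decision described above can be summarised in a short Python sketch. The user ids and the one-byte commands are hypothetical; the paper does not specify the message format sent over the RF link, and the authors' PC-side software is written in MATLAB.

```python
def authenticate(speaker_id, iris_id):
    """Grant access only if both stages recognised the same enrolled user.

    Each argument is a user id, or None when that stage found no match.
    """
    if speaker_id is None:
        return False              # stage 1 failed, so the iris result is ignored
    return iris_id == speaker_id  # stage 2 must confirm the same user

def command_for(granted):
    """Hypothetical one-byte command for the microcontroller (assumed format)."""
    return b"V" if granted else b"I"

print(command_for(authenticate("user7", "user7")))  # both stages agree
print(command_for(authenticate("user7", "user3")))  # iris belongs to someone else
print(command_for(authenticate(None, "user7")))     # speaker not recognised
```

Requiring the iris stage to confirm the identity found by the speaker stage, rather than merely succeed, is what makes the second level add security beyond the first.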

Proposed Algorithms:

A. Mel Frequency Cepstral Coefficient (MFCC)

The MFCC algorithm is based on the known variation of the human ear's critical bandwidths with frequency. The technique makes use of two types
of filter spacing, namely linearly spaced filters and logarithmically spaced filters. To capture the phonetically important characteristics of
speech, the signal is expressed on the Mel frequency scale. This scale has linear frequency spacing below 1000 Hz and logarithmic spacing above 1000 Hz.
A normal speech waveform may vary from time to time depending on the physical condition of the speaker's vocal cords. MFCCs are less susceptible
to these variations [11].
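
The paper does not give a formula for the Mel scale. A commonly used approximation, assumed here, is mel = 2595 * log10(1 + f / 700), which is roughly linear below 1000 Hz and logarithmic above it. A small Python sketch:

```python
import math

def hz_to_mel(f_hz):
    """Convert frequency in Hz to Mel using a common approximation (assumed)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

for f in (250, 500, 1000, 2000, 4000):
    print(f, round(hz_to_mel(f), 1))
```

With this approximation 1000 Hz maps to about 1000 Mel, while doubling the frequency above that point adds progressively less on the Mel axis.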

Fig 4: MFCC Block diagram

MFCC Algorithm steps:

Frame Blocking: The continuous speech signal is blocked into frames of N samples, with adjacent frames separated by M samples (M < N).

Windowing: Each individual frame is windowed so as to minimize the signal discontinuities at the beginning and end of the frame.

Fast Fourier Transform (FFT): It converts each frame of N samples from the time domain into the frequency domain.

Mel-Frequency Wrapping: The Mel-frequency scale has linear frequency spacing below 1000 Hz and logarithmic spacing above 1000 Hz. Each filter
of the Mel filter bank can be viewed as a histogram bin (where bins overlap) in the frequency domain.

Cepstrum: The cepstrum is obtained by taking the logarithm of the spectrum and transforming it again (in MFCC, typically with a discrete cosine
transform of the log Mel spectrum). It is thus the spectrum of the log spectrum and has certain properties
that make it useful in many types of signal analysis.
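
The five steps above can be sketched end to end in plain Python. This is an illustrative sketch, not the authors' MATLAB code: the Hamming window, the triangular filter shape, the Mel formula, the DCT as the final transform and all parameter values are assumptions, and a naive DFT stands in for the FFT to keep the example dependency-free.

```python
import cmath
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def frame_blocking(signal, n, m):
    """Frames of n samples whose starting points are m samples apart (m < n)."""
    return [signal[s:s + n] for s in range(0, len(signal) - n + 1, m)]

def hamming(n):
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def power_spectrum(frame):
    """Naive DFT (an FFT gives the same values faster); keeps bins 0..n/2."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        spectrum.append(abs(s) ** 2 / n)
    return spectrum

def mel_filterbank(num_filters, n, sample_rate):
    """Overlapping triangular filters with centres equally spaced on the Mel scale."""
    top = hz_to_mel(sample_rate / 2)
    points = [mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)]
    bins = [int(math.floor((n + 1) * f / sample_rate)) for f in points]
    bank = []
    for j in range(1, num_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n // 2 + 1)
        for k in range(lo, mid):
            row[k] = (k - lo) / (mid - lo)      # rising edge
        for k in range(mid, hi):
            row[k] = (hi - k) / (hi - mid)      # falling edge
        bank.append(row)
    return bank

def dct(values, num_coeffs):
    """DCT-II of the log filter-bank energies (the cepstrum step)."""
    n = len(values)
    return [sum(v * math.cos(math.pi * c * (i + 0.5) / n) for i, v in enumerate(values))
            for c in range(num_coeffs)]

def mfcc(signal, sample_rate, n=64, m=32, num_filters=10, num_coeffs=6):
    window = hamming(n)
    bank = mel_filterbank(num_filters, n, sample_rate)
    features = []
    for frame in frame_blocking(signal, n, m):
        windowed = [x * w for x, w in zip(frame, window)]
        power = power_spectrum(windowed)
        energies = [sum(p * h for p, h in zip(power, row)) for row in bank]
        log_energies = [math.log(e + 1e-10) for e in energies]  # avoid log(0)
        features.append(dct(log_energies, num_coeffs))
    return features

# Toy input: a 440 Hz tone sampled at 8 kHz, standing in for recorded speech.
rate = 8000
tone = [math.sin(2 * math.pi * 440 * t / rate) for t in range(256)]
coeffs = mfcc(tone, rate)
print(len(coeffs), "frames,", len(coeffs[0]), "coefficients per frame")
```

Each frame yields one short coefficient vector; the sequence of vectors is what would be stored as the speaker's template and compared during recognition.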

B. Ridge Energy Direction (RED)

The Ridge Energy Direction algorithm is a fast and accurate method for detecting iris features. The RED algorithm is applied to
the rectangular iris image generated by the normalization process. It constructs a template containing the features of the iris by using two
types of filters (horizontal and vertical). Two rectangular iris templates are considered: the first is generated in the way common in this field,
while the other is a novel rectangular iris template. The RED algorithm is used to compare the accuracy and time between them. The first
step is to collect the iris images. Various pre-processing steps are then carried out on these images, including conversion of the colour image to
gray scale, histogram equalization and segmentation. Polar to rectangular conversion is applied, and the RED algorithm
is applied to the resulting rectangular image to generate the template. The template is matched with the stored ones using Hamming distance,
and the match ID is displayed. The flow of the process is given below:

Fig 5: RED Block Diagram

Capture Image:

Generally, iris images are captured using a 3CCD camera working in near-infrared (NIR) light. We collect different colour images of the iris from
different sources, to which the RED algorithm is applied.

Pre-processing:

Gray Images: The colour image is converted into a gray scale image.

Histogram Equalization: Here the intensity values of the image are redistributed more uniformly to enhance the contrast and quality of the image.

Segmentation: Here the pupil is separated using the Canny edge detector, which detects the boundaries of the pupil and iris from changes in the gradient.

Polar to rectangular conversion: After separating the pupil, polar to rectangular conversion is applied, which generates the rectangular
template.
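
The first two pre-processing steps can be sketched in plain Python as follows. The luma weights (ITU-R BT.601) and the cumulative-histogram mapping are common choices assumed here; the paper does not state which variants were used, and segmentation and polar to rectangular conversion are not covered by this sketch.

```python
def to_gray(rgb_image):
    """Convert rows of (R, G, B) pixels to 8-bit gray levels (BT.601 weights)."""
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
            for row in rgb_image]

def equalize(gray_image, levels=256):
    """Histogram equalization via the normalised cumulative histogram."""
    pixels = [p for row in gray_image for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    span = len(pixels) - cdf_min
    if span == 0:
        return [row[:] for row in gray_image]   # flat image: nothing to stretch
    lut = [max(0, round((c - cdf_min) * (levels - 1) / span)) for c in cdf]
    return [[lut[p] for p in row] for row in gray_image]

# Hypothetical 2x2 colour patch with low contrast.
patch = [[(100, 100, 100), (100, 100, 100)],
         [(110, 110, 110), (120, 120, 120)]]
gray = to_gray(patch)
print(gray)
print(equalize(gray))
```

After equalization the narrow range of gray levels in the patch is stretched across the full 0 to 255 range, which makes the iris texture easier to detect in later stages.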

Feature Extraction: Feature extraction is based on the prominent direction of the ridges that appear in the image. The polar coordinates are
converted into rectangular coordinates, and the image is transformed into an energy image.

Template Matching: The templates are compared with the stored templates using Hamming distance/Euclidean distance as the measure of
closeness.
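
Feature extraction and template matching can be illustrated together with a simplified Python sketch. The 3x3 directional kernels, the stripe test images, the user ids and the acceptance threshold are hypothetical choices for illustration; the paper does not list the filter coefficients or threshold used with the RED algorithm.

```python
# Hypothetical 3x3 ridge filters; the actual RED filters are not given in the paper.
HORIZONTAL = [[-1, -1, -1], [2, 2, 2], [-1, -1, -1]]
VERTICAL = [[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]]

def filter_at(image, kernel, y, x):
    """Response of a 3x3 kernel centred on pixel (y, x)."""
    return sum(kernel[dy][dx] * image[y + dy - 1][x + dx - 1]
               for dy in range(3) for dx in range(3))

def red_template(rect_iris):
    """One bit per interior pixel: 1 if the horizontal response dominates, else 0."""
    energy = [[p * p for p in row] for row in rect_iris]   # energy image
    h, w = len(energy), len(energy[0])
    return [1 if filter_at(energy, HORIZONTAL, y, x) >= filter_at(energy, VERTICAL, y, x) else 0
            for y in range(1, h - 1) for x in range(1, w - 1)]

def hamming_distance(a, b):
    """Fraction of bit positions in which two templates differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def best_match(template, database, threshold=0.3):
    """Closest enrolled user, or None if even the best match is too far away."""
    uid = min(database, key=lambda u: hamming_distance(template, database[u]))
    return uid if hamming_distance(template, database[uid]) <= threshold else None

# Toy rectangular "iris" images with horizontal and vertical stripes.
horizontal_stripes = [[0] * 5, [9] * 5, [0] * 5, [9] * 5, [0] * 5]
vertical_stripes = [[0, 9, 0, 9, 0] for _ in range(5)]
t_h = red_template(horizontal_stripes)
t_v = red_template(vertical_stripes)

print(best_match(t_h, {"user1": t_h, "user2": t_v}))
print(best_match(t_h, {"user2": t_v}))
```

A Hamming distance of 0 means identical templates; the threshold decides how many differing bits are tolerated before the user is rejected.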

IV. IMPLEMENTATION
Whenever a user wants to open his or her locker, speaker recognition is first performed with the help of MATLAB software on the PC. After
successful recognition of the speaker, iris recognition of that particular user is performed on the PC. After identification of that particular user,
the signal is transmitted to the microcontroller wirelessly via the RF module. After receiving the identification signal from the PC, the locker of that
particular user is opened through the relay.

We have provided the following features in our system:

a. Record input speech sample.
b. Template generation for the recorded sample.
c. Read input sample from the database.
d. Recognize and compare the input sample with the database.
e. Read and recognize input image.
f. Template generation for iris.

This system is divided into the following three modules:
(a) GUI
(b) Speaker Recognition
(c) Iris Recognition

A. GUI
The graphical user interface allows the user to interact with the system and authenticates the user's identity through a username and password
mechanism.

Fig 5(a): Login page

Fig 5(b): User Authentication

Fig 5(c): Login Successful

Fig 5(d): Login Failed

Fig 5(e): Multilevel Security GUI



B. SPEAKER RECOGNITION

1.) Record Audio

The input speech sample is recorded and saved in the database.

2.) Template Generation

Templates are generated for the corresponding audio sample. The method includes generating an interim template, generating a time alignment
path between the interim template and a token, and mapping frames from the interim template.

Fig 6(a): Template Generation

3.) Reading Audio File

Reading the audio file stored in the corresponding audio database.

Fig 6(b): Reading Audio file

Fig 6(c): Audio Wave Generation of Corresponding Sample

4.) Recognising Audio

Comparing the input audio sample with the stored database samples and recognising the user.

Fig 6(d): Recognising User



C. IRIS RECOGNITION

1.) Read input Image

The input iris image is read from the standard CASIA iris database.

Fig 7(a): Reading Iris Image From Database


Image Preprocessing Stages

Fig 7(b): RGB Image

Fig 7(c): RGB to GRAY Image

Fig 7(d): Enhanced Gray-scale Image



Fig 7(e): Enhanced Gray-scale Image

Fig 7(f): Edge Detection Of Iris

Fig 7(g): Center Detection of Iris

Fig 7(h): Iris and Pupil Separation

Fig 7(i): Extracted Iris Pattern



V. RESULTS
The developed software application was executed on the system and found to operate as expected. The GUI and the Smart Vault hardware kit are
shown in Fig 8(a) and Fig 8(b) respectively. The GUI enables the user to select the input audio file from the database or record it in real time; the
input iris image is taken from the iris database. First, speaker identification is performed using the MFCC algorithm and the results are displayed
on the GUI with the corresponding speaker ID. The second phase is iris detection, where the iris of the corresponding user is recognized using the
RED algorithm. If the user passes both security mechanisms, the validity of the user is displayed on both the Smart Vault hardware kit
and the GUI of the system.

Fig 8(a): GUI

Fig 8(b): Smart Vault Hardware kit


1). For Speaker Recognition

The input is taken from the database or in real time, the templates are generated, and finally the user is recognised by applying the MFCC
algorithm.

2). For Iris Detection

The input is taken from the database, the templates are generated, and finally the user is identified by applying the RED algorithm.

3). Hardware Kit

The signal is transmitted serially and wirelessly from the GUI to the hardware.

Fig 8(c): For Valid User



Fig 8(c) displays the recognised user ID and also displays the validity of the user in the GUI.

Fig 8(d): LED Glows If Valid User

Fig 8(d) displays the validity of the user (valid user) on the LCD display mounted on the smart vault hardware kit. The LED glows if the user is
recognized as a valid user of the system.

Fig 8(e): Invalid User

Fig 8(e) shows that if the user is not recognized successfully, the system displays "invalid user" on the GUI.

Fig 8(f): A Buzzer Gets Activated if Invalid User

Fig 8(f) displays the validity of the user (invalid user) on the LCD display mounted on the smart vault hardware kit. A buzzer is activated if
the user is identified as an invalid user of the system.

VI. CONCLUSION
After surveying various security management systems and implementing this project, we conclude that security
management using image processing is suitable and effective. Speaker Recognition and Iris Recognition techniques represent some of the
major biometric tools for identification of a person. We reviewed feature extraction techniques for speaker recognition and found that
Mel-Frequency Cepstral Coefficients (MFCC) are the most widely used technique for speaker recognition. The iris was found to be a very suitable
biometric for authentication purposes as it is highly distinctive and stable. Iris recognition using the RED algorithm proved to be an efficient and
effective technique, giving accurate and reliable results. Each of these techniques has its advantages and drawbacks, and the drawbacks can be
overcome by adding features from other technologies. To reduce the execution time and obtain the best possible matches, optimized techniques can be used.

REFERENCES
[1] Cemal Hanilçi, Tomi Kinnunen, Figen Ertaş, Rahim Saeidi, Jouni Pohjalainen, and Paavo Alku, Regularized All-Pole Models for Speaker
Verification Under Noisy Environments, IEEE Signal Processing Letters, Vol. 19, No. 3, March 2012.

[2] R. Shantha Selva Kumari, S. Selva Nidhyananthan, Anand G., Fused Mel Feature sets based Text-Independent Speaker Identification using
Gaussian Mixture Model, International Conference on Communication Technology and System Design 2011, Volume 30, 13 March 2012,
Pages 319-326.

[3] Campbell, J.P., Jr.; Speaker recognition: a tutorial; Proceedings of the IEEE, Volume 85, Issue 9, Sept. 1997, Page(s): 1437-1462.

[4] Seddik, H.; Rahmouni, A.; Sayadi, ; Text independent speaker recognition using the Mel frequency cepstral coefficients and a
neural network classifier; First International Symposium on Control, Communications and Signal Processing, Proceedings of IEEE 2004,
Page(s): 631-634.

[5] Dr. Ekta Walia, "Analysis of various biometric techniques", (IJCSIT) International Journal of Computer Science and Information
Technologies, Vol. 2 (4), 2011.

[6] D. Lauber, "Biometrics: A Brief Overview" ,SANS Institute 2003.

[7] Azizi, A. and H. Reza, 2009. Efficient IRIS Recognition through Improvement of Feature Extraction and Subset Selection.
(IJCSIS) International Journal of Computer Science and Information Security, 2: 1.

[8] Mohamad-Ramli, N.A., M.S. Kamarudin and A. 2008 Iris Recognition for Personal Identification.

[9] A. Revathi, R. Ganapathy and Y. Venkataramani, Text Independent Speaker Recognition and Speaker Independent Speech Recognition
Using Iterative Clustering Approach, International Journal of Computer Science & Information Technology.

[10] C.R. Prashanth, D.R. Shashikumar, K.B. Raja, K.R. Venugopal, L.M. Patnaik, "High Security Human Recognition System using Iris
Images," ACEEE International Journal on Signal and Image Processing Vol 1, No. 1, Jan 2010.

[11] Vibha Tiwari, MFCC and its applications in speaker recognition, International Journal on Emerging Technologies, vol. 1, issue., pp. 19-22,
February 2010.