RESEARCH ARTICLE
OPEN ACCESS
[1],
Ms.V.Suganya [2]
[1]
ABSTRACT
Speech recognition has recently become a practical technology. It is used in real-world human language applications, such as
information retrieval. Speech is the most common means of communication because it plays a fundamental role
in conversation. A speech recognizer converts an acoustic signal, captured by a microphone or a telephone, into
a set of words. These words can be the final result, rendered as text (speech-to-text), or they can
serve as the input to further linguistic processing to achieve speech
understanding. This paper analyzes the types and algorithms of speech recognition.
Keywords:- Speech Recognition; Feature Extraction; MFCC; LPC; Hidden Markov Model; Neural Network; Dynamic Time
Warping.
I.
INTRODUCTION
II.
ISSN: 2347-8578
Speaker Dependence
Speaker dependent systems are designed around a
specific speaker. They generally are more accurate for the
correct speaker, but much less accurate for other speakers.
They assume the speaker will speak in a consistent voice and
tempo. Speaker independent systems are designed for a variety
of speakers. Adaptive systems usually start as speaker
independent systems and use training techniques to adapt to
the speaker to increase their recognition accuracy.
Vocabularies
www.ijcstjournal.org
Page 350
International Journal of Computer Science Trends and Technology (IJCST) Volume 4 Issue 2, Mar - Apr 2016
Vocabularies (or dictionaries) are lists of words or
utterances that can be recognized by the SR system. Generally,
smaller vocabularies are easier for a computer to recognize,
while larger vocabularies are more difficult. Unlike normal
dictionaries, each entry doesn't have to be a single word. An entry
can be as long as a sentence or two.
Accuracy
The ability of a recognizer can be examined by
measuring its accuracy or how well it recognizes utterances.
This includes not only correctly identifying an utterance but
also identifying if the spoken utterance is not in its vocabulary.
Good ASR systems have an accuracy of 98% or more. The
acceptable accuracy of a system depends on the
application.
Training
Some speech recognizers have the ability to adapt to
a speaker. When the system has this ability, it may allow
training to take place. An ASR system is trained by having the
speaker repeat standard or common phrases and adjusting its
comparison algorithms to match that particular speaker.
Training a recognizer usually improves its accuracy.
Training can also be used by speakers who have difficulty
speaking or pronouncing certain words. As long as the
speaker can consistently repeat an utterance, ASR systems
with training should be able to adapt.
III.
IV.
V.
Acoustic modeling and language modeling are
important parts of modern statistically-based speech
recognition algorithms. Hidden Markov models (HMMs) are
widely used in many systems. Language modeling is also used in
natural language processing applications such as document
classification or statistical machine translation.
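As a small illustration of what a statistical language model contributes, the sketch below estimates bigram probabilities from a toy corpus. It is a minimal example, not the method of any particular system: the corpus, the function names `train_bigram` and `bigram_prob`, and the add-alpha smoothing are all assumptions made for illustration.

```python
from collections import Counter

def train_bigram(sentences):
    """Count word histories and bigrams over whitespace-tokenised sentences."""
    histories, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        histories.update(words[:-1])                 # every word that has a successor
        bigrams.update(zip(words[:-1], words[1:]))   # (previous word, next word) pairs
    return histories, bigrams

def bigram_prob(histories, bigrams, prev, word, vocab_size, alpha=1.0):
    """Add-alpha smoothed estimate of P(word | prev)."""
    return (bigrams[(prev, word)] + alpha) / (histories[prev] + alpha * vocab_size)

# Hypothetical toy corpus, chosen only to make the numbers easy to follow.
corpus = ["recognize speech", "recognize speech now", "wreck a nice beach"]
histories, bigrams = train_bigram(corpus)
vocab = {w for s in corpus for w in s.split()} | {"</s>"}

p_speech = bigram_prob(histories, bigrams, "recognize", "speech", len(vocab))
p_beach = bigram_prob(histories, bigrams, "recognize", "beach", len(vocab))
print(p_speech, p_beach)
```

In a recognizer, scores of this kind would typically be combined with acoustic model scores so that acoustically similar word sequences are ranked by how plausible they are as language.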
i)
Figure 5: The results of the cross-correlation (summation of
multiplications)
viii) AUTO CORRELATION ALGORITHM:
Auto-correlation measures how strongly a signal
correlates with time-shifted copies of itself.
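The auto-correlation described above can be sketched with NumPy as follows. This is an illustrative example under stated assumptions: the 200 Hz test tone, the 8 kHz sampling rate, and the use of the strongest peak to estimate a period are choices made here, not details taken from the paper.

```python
import numpy as np

def autocorrelation(x):
    """Normalised auto-correlation r[k] for lags k = 0 .. len(x) - 1."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                          # remove any DC offset
    full = np.correlate(x, x, mode="full")    # lags -(N-1) .. (N-1)
    r = full[len(x) - 1:]                     # keep the non-negative lags
    return r / r[0]                           # so that r[0] == 1

fs = 8000                                     # assumed sampling rate in Hz
n = np.arange(320)                            # one 40 ms frame
frame = np.sin(2 * np.pi * 200 * n / fs)      # synthetic 200 Hz tone

r = autocorrelation(frame)
min_lag = fs // 400                           # skip lags shorter than 2.5 ms
peak_lag = min_lag + int(np.argmax(r[min_lag:]))
pitch_hz = fs / peak_lag
print(peak_lag, pitch_hz)
```

The lag of the strongest peak away from zero corresponds to the period of the signal, which is one reason auto-correlation is often used for pitch estimation.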
The FIR Wiener Filter:
The FIR Wiener filter is used to estimate the desired
signal d(n) from the observation process x(n), giving the
estimated signal $\hat{d}(n)$. It is assumed that d(n) and x(n) are
correlated and jointly wide-sense stationary. The error of
estimation is $e(n) = d(n) - \hat{d}(n)$. The purpose of the Wiener filter
is to choose a suitable filter order and find the filter
coefficients with which the system can get the best estimation.
In other words, with the proper coefficients the system can
minimize the mean-square error.
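A minimal sketch of this idea is given below, assuming the coefficients are obtained by solving the Wiener-Hopf equations R w = r_dx with sample estimates of the correlations. The function name `fir_wiener`, the filter order of 8, and the synthetic sine-plus-noise signal are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def fir_wiener(x, d, order):
    """Estimate FIR Wiener coefficients from samples of x(n) and d(n)."""
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)
    n = len(x)
    # Sample estimates of r_x(k) = E[x(n) x(n-k)] and r_dx(k) = E[d(n) x(n-k)]
    r_x = np.array([np.dot(x[k:], x[:n - k]) / n for k in range(order)])
    r_dx = np.array([np.dot(d[k:], x[:n - k]) / n for k in range(order)])
    # Toeplitz auto-correlation matrix R[i, j] = r_x(|i - j|)
    lags = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    R = r_x[lags]
    return np.linalg.solve(R, r_dx)           # Wiener-Hopf: R w = r_dx

rng = np.random.default_rng(0)
n = np.arange(4000)
d = np.sin(2 * np.pi * 0.05 * n)              # desired signal d(n)
x = d + 0.5 * rng.standard_normal(len(n))     # noisy observation x(n)

w = fir_wiener(x, d, order=8)
d_hat = np.convolve(x, w)[:len(x)]            # estimate: sum_k w[k] x(n-k)

mse_raw = np.mean((d - x) ** 2)               # error if x(n) is used directly
mse_filtered = np.mean((d - d_hat) ** 2)      # error after Wiener filtering
print(mse_raw, mse_filtered)
```

In practice d(n) is not available at run time, so the correlations would have to come from training data or from a model of the signal and noise; the sketch uses the clean signal only to show the reduction in mean-square error.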
VI.
PERFORMANCE EVALUATION OF
ASR TECHNIQUES
VII.
$$\mathrm{WER} = \frac{S + D + I}{N} \qquad (1)$$
where S is the number of substitutions, D is the number of
deletions, I is the number of insertions and N is the number
of words in the reference.
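Assuming formula (1) is the standard word error rate, (S + D + I) / N, the total S + D + I can be obtained as the word-level edit distance between the reference and the recognizer output. The sketch below is a minimal illustration; the function name `word_error_rate` and the example sentences are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Return (S + D + I) / N using a word-level edit distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # dist[i][j] = minimum number of edits turning ref[:i] into hyp[:j]
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i                        # i deletions
    for j in range(len(hyp) + 1):
        dist[0][j] = j                        # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = dist[i - 1][j] + 1
            insertion = dist[i][j - 1] + 1
            dist[i][j] = min(substitution, deletion, insertion)
    return dist[-1][-1] / len(ref)

# One substitution (sat -> sit) and one deletion (the) over N = 6 words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(wer)
```

Because insertions are counted against the reference length, the value can exceed 1 when the hypothesis is much longer than the reference.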
The speed of a speech recognition system is
commonly measured in terms of the Real Time Factor (RTF). If the system
takes time P to process an input of duration I, the RTF is defined by
formula (2):
$$\mathrm{RTF} = \frac{P}{I} \qquad (2)$$
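Assuming formula (2) is the usual ratio RTF = P / I, the measurement can be sketched as follows. The `fake_recognizer` function is a stand-in that only sleeps, and the one-second audio duration is an assumed value; a real recognizer would be passed in its place.

```python
import time

def real_time_factor(recognize, audio, audio_duration_s):
    """Measure RTF = P / I for one call to `recognize`."""
    start = time.perf_counter()
    recognize(audio)
    processing_time_s = time.perf_counter() - start   # P
    return processing_time_s / audio_duration_s       # divided by I

def fake_recognizer(audio):
    """Hypothetical placeholder that simulates 50 ms of processing."""
    time.sleep(0.05)
    return "hello world"

rtf = real_time_factor(fake_recognizer, audio=None, audio_duration_s=1.0)
print(rtf)
```

An RTF below 1 means the system processes audio faster than real time.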
A comparison of various speech recognition
studies, by dataset, feature vectors, and speech
recognition technique adopted for the particular language, is
given in Table 1.
CONCLUSION
create increasingly powerful systems, deployable on a
worldwide basis in future.