Optimum Data Length To Train Isolated Speaker Dependent Indonesian Digit Recognizer
ISSN : 2302-9579
VOLUME 6, NUMBER 1, June 2016
Person in Charge
Dr. Sammy Saptenno, SE., M.Si
Editor-in-Chief
Vicky Salamena, SST., MT
Editor
Aleksander A Patty, ST., MT
Managing Editors
Luwis H. Laisina, ST., MT
Paulus F. Picauly, ST., M.Eng
Graciadiana I. Huka, ST., MT
Reynold P. J. V. Nikijuluw, S.Pd., M.Ed
Graphic Design
Ridolf Kermite, ST
Administration
Wa Hauli
JURNAL SIMETRIK VOL 6, NO. 1, JUNE 2016, ISSN: 2302-9579
Abstract
This paper measures the performance of isolated digit recognition for the Indonesian language spoken with a local accent.
The recognizer is built with the Hidden Markov Model Toolkit (HTK). We measure how short the training audio can be
while still giving acceptable recognition. The result is a plot of training data length against word error rate.
[...] mark the boundaries of sequences of phonemes, as shown in figure 1. Phonemes are the subword units used by HTK and can [...]

Figure 1: Labelling sound files

2. The recognition process, or testing process, will examine [...]

HTK is a set of ready-to-use command-line tools and a programming library for training and testing ASR systems. To run it, a few configuration files must be written explicitly. These configurations specify the format or method of feature extraction, the HMM-GMM [...]

HTK Grammar
This is the language grammar used to perform isolated digit recognition, as can be seen in figure 2:
$NUMS=(NUM_0 [...]
NUM_0 k o s o ng
NUM_1 satu
NUM_3 tiga
NUM_4 ampat
NUM_5 lima
NUM_6 anam
NUM_7 tujuh
NUM_8 lapan
NUM_9 sembilan
NUM_2 duwa

2. Speech Feature Model: MFCC_E_D_A
3. Phone context: monophonic
4. Phones: \k\, \o\, \s\, \ng\, \sil\
5. HMM-GMM: Diagonal covariances, 5 states with 3 emitting states plus non-emitting entry and exit states.
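The phone lists used in the experiments can be cross-checked against the digit words NUM_0 to NUM_9. The following is a minimal Python sketch, not part of the original experiment. It assumes that every word is pronounced one letter per phone, with the digraph "ng" treated as a single phone (as in the "k o s o ng" entry), and that the silence model \sil\ is always included. The names WORDS, to_phones and phone_inventory are hypothetical helpers introduced only for illustration.

```python
# Digit words as listed in the paper (local-accent spellings kept as-is).
WORDS = {
    "NUM_0": "kosong",
    "NUM_1": "satu",
    "NUM_2": "duwa",
    "NUM_3": "tiga",
    "NUM_4": "ampat",
    "NUM_5": "lima",
    "NUM_6": "anam",
    "NUM_7": "tujuh",
    "NUM_8": "lapan",
    "NUM_9": "sembilan",
}


def to_phones(word):
    """Split a word into phones.

    Assumption: one letter per phone, except that "ng" is one phone.
    """
    phones = []
    i = 0
    while i < len(word):
        if word.startswith("ng", i):
            phones.append("ng")
            i += 2
        else:
            phones.append(word[i])
            i += 1
    return phones


def phone_inventory(labels):
    """Return the sorted monophone set needed for the given word labels."""
    inventory = {"sil"}  # silence model is always part of the set
    for label in labels:
        inventory.update(to_phones(WORDS[label]))
    return sorted(inventory)


if __name__ == "__main__":
    # Print dictionary-style lines, e.g. "NUM_0 k o s o ng".
    for label, word in WORDS.items():
        print(label, " ".join(to_phones(word)))
    # Phone set for the full ten-digit system.
    print(phone_inventory(WORDS))
```

Under these assumptions the single-digit system (NUM_0 only) needs the phones k, ng, o, s and sil, and the full ten-digit system needs 20 phones, which agrees with the phone lists given in the paper.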
Figure 4: Single Digit Recognition Results

Two Digits Recognition
The next system we test recognizes the two Indonesian digits “kosong” and “satu”, plus a silence boundary.
1. Data format: PCM 16 bit, mono, 16000 Hz
2. Speech Feature Model: MFCC_E_D_A
3. Phone context: monophonic
4. Phones: \k\, \o\, \s\, \ng\, \sil\, \s\, \a\, \t\, \u\
5. HMM-GMM: Diagonal covariances, 5 states with 3 emitting states plus non-emitting entry and exit states.
6. Words: silence, 1 (“satu”) and 0 (“kosong”).

The result is shown in figure 5:
LENGTH           Accuracy
13.8 seconds     100%
Figure 5: Two Digits Recognition Results

Figure 6 shows the three-digit result under these criteria:
1. Phones: \k\, \o\, \s\, \ng\, \sil\, \s\, \a\, \t\, \u\, \i\, \g\
2. Words: silence, 0, 1 and 3.

LENGTH           Accuracy
21.0 seconds     97.59%
29.0 seconds     98.80%
38.1 seconds     100%
Figure 6: Three Digits Recognition Results

Figure 7 shows the four-digit result under these criteria:
1. Phones: \k\, \o\, \s\, \ng\, \sil\, \s\, \a\, \t\, \u\, \i\, \g\, \m\, \p\
2. Words: silence, 0, 1, 3 and 4.

Figure 7: Four Digits Recognition Results

Figure 8 shows the five-digit result under these criteria:
1. Phones: \k\, \o\, \s\, \ng\, \sil\, \s\, \a\, \t\, \u\, \i\, \g\, \m\, \p\, \l\
2. Words: silence, 0, 1, 3, 5 and 4.

LENGTH           Accuracy
35.69 seconds    100%
Figure 8: Five Digits Recognition Results

Figure 9 shows the six-digit result under these criteria:
1. Phones: \k\, \o\, \s\, \ng\, \sil\, \s\, \a\, \t\, \u\, \i\, \g\, \m\, \p\, \l\, \n\
2. Words: silence, 0, 1, 3, 5, 6 and 4.

LENGTH           Accuracy
41.8 seconds     100%
Figure 9: Six Digits Recognition Results

Figure 10 shows the result for all Indonesian digits under these constraints:
1. Phones: \a\, \b\, \d\, \e\, \g\, \h\, \i\, \j\, \k\, \l\, \m\, \n\, \ng\, \o\, \p\, \s\, \sil\, \t\, \u\, \w\
2. Words: silence, 0, 1, 3, 5, 6, 7, 8, 9, 2 and 4.

LENGTH           Accuracy
82.35 seconds    95.59%
Figure 10: Indonesian Digits Recognition Results

If we summarize the number of digits to recognize against the data length required for a given level of accuracy, we get the table in figure 11 and figure 12.
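The kind of summary described here can be sketched in a few lines of Python. This is only an illustration built from the length and accuracy pairs reported for figures 5, 6, 8, 9 and 10. The single-digit and four-digit measurements are not available in this text, so they are left out rather than guessed. The sketch also assumes that the reported accuracy is word accuracy, so that word error rate is 100 minus accuracy. The names RESULTS, word_error_rate and shortest_length_for are hypothetical.

```python
# (number of digit words, training length in seconds, accuracy in percent),
# copied from the results reported for figures 5, 6, 8, 9 and 10.
RESULTS = [
    (2, 13.8, 100.0),
    (3, 21.0, 97.59),
    (3, 29.0, 98.80),
    (3, 38.1, 100.0),
    (5, 35.69, 100.0),
    (6, 41.8, 100.0),
    (10, 82.35, 95.59),
]


def word_error_rate(accuracy_percent):
    """Assumption: accuracy is word accuracy, so WER = 100 - accuracy."""
    return round(100.0 - accuracy_percent, 2)


def shortest_length_for(results, min_accuracy):
    """Map digit count to the shortest length reaching min_accuracy."""
    best = {}
    for digits, seconds, accuracy in results:
        if accuracy < min_accuracy:
            continue
        if digits not in best or seconds < best[digits]:
            best[digits] = seconds
    return best


if __name__ == "__main__":
    print("digits  length (s)  WER (%)")
    for digits, seconds, accuracy in RESULTS:
        print(f"{digits:6d}  {seconds:10.2f}  {word_error_rate(accuracy):7.2f}")
    # Shortest training length per digit count that reached 100% accuracy.
    print(shortest_length_for(RESULTS, 100.0))
```

With the reported numbers, only the two-, three-, five- and six-digit systems reach 100% accuracy. The ten-digit system appears in the summary only when the accuracy threshold is lowered to 95.59% or below.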
[Plot axis label: data length (s)]