In this post, I am going to explain speech-to-text conversion using the Java Speech API.
Here is the agenda of this post:
A Brief Idea
Speech Recognition is the process of converting spoken input to digital output, such as text.
Speech recognition systems provide computers with the ability to listen to user speech
and determine what is said.
The Speech Recognition process can be divided into these four steps:
I am going to use a third-party Java speech recognition engine, the TalkingJava SDK. It is a full
implementation of Sun's Java Speech API, providing both Text-To-Speech and Speech-Recognition engines.
Setup
Verify Setup
I have written a small Java class to verify that the speech recognition engine has
been installed successfully.
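package com.sarf.talkingjava;

import java.util.Locale;

import javax.speech.Central;
import javax.speech.EngineList;
import javax.speech.recognition.RecognizerModeDesc;

// The class name here is a placeholder; any name works.
public class VerifyRecognizer {
    public static void main(String[] args) {
        try {
            // Register the engine central class with the Java Speech API.
            Central.registerEngineCentral("com.cloudgarden.speech.CGEngineCentral");
            // Ask for a US English recognizer that supports dictation grammars.
            RecognizerModeDesc desc = new RecognizerModeDesc(Locale.US, Boolean.TRUE);
            EngineList el = Central.availableRecognizers(desc);
            if (el.size() < 1) {
                System.out.println("Recognition Engine is not available");
                System.exit(1);
            } else {
                System.out.println("Recognition Engine is available");
                // Exit with 0 here: finding an engine is the success case.
                System.exit(0);
            }
        } catch (Exception exception) {
            exception.printStackTrace();
        }
    }
}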
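If the verification class fails with an exception instead of printing either message, one likely cause is that the SDK classes are not on the classpath at all. The following is a minimal sketch of a check for that, using only the standard library. The names EngineClasspathCheck and isClassPresent are my own illustrative helpers, not part of the Java Speech API or the TalkingJava SDK; the engine class name is the one registered in the class above.

```java
// Sketch only: EngineClasspathCheck and isClassPresent are hypothetical names,
// not part of the Java Speech API or the TalkingJava SDK.
final class EngineClasspathCheck {

    // Engine central class name, as registered in the verification class.
    static final String ENGINE_CENTRAL = "com.cloudgarden.speech.CGEngineCentral";

    /** Returns true if the named class can be located on the classpath. */
    static boolean isClassPresent(String className) {
        try {
            // initialize=false: locate the class without running its static initializers.
            Class.forName(className, false, EngineClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        if (isClassPresent(ENGINE_CENTRAL)) {
            System.out.println("Engine classes found on the classpath");
        } else {
            System.out.println("Engine classes missing: check that the SDK jar is on the classpath");
        }
    }
}
```

This only tells you whether the class can be found. Whether a recognizer is actually usable is still decided by the Central.availableRecognizers call in the verification class.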
package com.sarf.talkingjava;

import java.io.FileReader;
import java.util.Locale;

import javax.speech.Central;
import javax.speech.recognition.*;

            recognizer.waitEngineState(Recognizer.FOCUS_ON);
            recognizer.forceFinalize(true);
            recognizer.waitEngineState(Recognizer.DEALLOCATED);
        } catch (Exception e) {
            e.printStackTrace();
            // Exit with a non-zero status so callers can tell that recognition failed.
            System.exit(1);
        }
    }
}
Little Grammar
#JSGF V1.0;
grammar com.sarf.talkingjava.example;
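A JSGF file starts with a self-identifying header line followed by a grammar declaration, as in the two lines above. Assuming the grammar is kept in a text file and read at runtime, a quick pre-check of those two lines can catch a malformed file early. Here is a rough sketch of such a check in plain Java. JsgfHeaderCheck and grammarName are hypothetical names of mine, and the check is deliberately simplified: it ignores comments and does not validate any rules.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: JsgfHeaderCheck and grammarName are hypothetical helper names.
final class JsgfHeaderCheck {

    // Header line, e.g. "#JSGF V1.0;" (an optional encoding and locale may follow the version).
    private static final Pattern HEADER =
            Pattern.compile("^#JSGF\\s+V\\d+\\.\\d+(\\s+\\S+){0,2}\\s*;\\s*$");

    // Grammar declaration, e.g. "grammar com.sarf.talkingjava.example;".
    private static final Pattern DECLARATION =
            Pattern.compile("^grammar\\s+([A-Za-z_][\\w.]*)\\s*;\\s*$");

    /**
     * Returns the declared grammar name if the first two non-blank lines are a
     * JSGF header and a grammar declaration; otherwise returns an empty Optional.
     */
    static Optional<String> grammarName(String jsgfText) {
        // Split on runs of line breaks so blank lines are skipped.
        String[] lines = jsgfText.trim().split("\\R+");
        if (lines.length < 2 || !HEADER.matcher(lines[0].trim()).matches()) {
            return Optional.empty();
        }
        Matcher m = DECLARATION.matcher(lines[1].trim());
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String text = "#JSGF V1.0;\ngrammar com.sarf.talkingjava.example;\n";
        System.out.println(grammarName(text).orElse("not a valid JSGF header"));
    }
}
```

The recognizer does its own full parsing when it loads the grammar, so this is only a convenience for producing a clearer error message.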
Limitation