Here is a complete example using C# and System.Speech.
The code can be divided into two main parts:
1. configuring the SpeechRecognitionEngine object (and its required elements)
2. handling the SpeechRecognized and SpeechHypothesized events
Step 1: Configuring the SpeechRecognitionEngine
_speechRecognitionEngine = new SpeechRecognitionEngine();
_speechRecognitionEngine.SetInputToDefaultAudioDevice();

// a DictationGrammar accepts free-form speech rather than a fixed command set
_dictationGrammar = new DictationGrammar();
_speechRecognitionEngine.LoadGrammar(_dictationGrammar);

// start continuous recognition; attach the Step 2 event handlers before this call
_speechRecognitionEngine.RecognizeAsync(RecognizeMode.Multiple);
At this point your object is ready to start transcribing audio from the microphone. You need to handle some events though, in order to actually get access to the results.
Step 2: Handling the SpeechRecognitionEngine Events
// detach first so the handlers are never registered twice
_speechRecognitionEngine.SpeechRecognized -= new EventHandler<SpeechRecognizedEventArgs>(SpeechRecognized);
_speechRecognitionEngine.SpeechHypothesized -= new EventHandler<SpeechHypothesizedEventArgs>(SpeechHypothesizing);
_speechRecognitionEngine.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(SpeechRecognized);
_speechRecognitionEngine.SpeechHypothesized += new EventHandler<SpeechHypothesizedEventArgs>(SpeechHypothesizing);
private void SpeechHypothesizing(object sender, SpeechHypothesizedEventArgs e)
{
    // interim, real-time results from the engine
    string realTimeResults = e.Result.Text;
}

private void SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
    // final answer from the engine
    string finalAnswer = e.Result.Text;
}
That’s it. If you want to use a pre-recorded .wav file instead of a microphone, you would use
_speechRecognitionEngine.SetInputToWaveFile(pathToTargetWavFile);
instead of
_speechRecognitionEngine.SetInputToDefaultAudioDevice();
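Putting the two steps together, a minimal console sketch might look like the following. The class name, the use of static fields, and the console output are illustrative choices of mine, not part of the original snippets; it assumes a project reference to the System.Speech assembly and, for microphone input, Windows audio hardware.

```csharp
using System;
using System.Speech.Recognition; // requires a reference to the System.Speech assembly

class Program
{
    private static SpeechRecognitionEngine _speechRecognitionEngine;
    private static DictationGrammar _dictationGrammar;

    static void Main()
    {
        // Step 1: configure the engine
        _speechRecognitionEngine = new SpeechRecognitionEngine();
        _speechRecognitionEngine.SetInputToDefaultAudioDevice();
        _dictationGrammar = new DictationGrammar();
        _speechRecognitionEngine.LoadGrammar(_dictationGrammar);

        // Step 2: hook the events, then start continuous recognition
        _speechRecognitionEngine.SpeechRecognized += SpeechRecognized;
        _speechRecognitionEngine.SpeechHypothesized += SpeechHypothesizing;
        _speechRecognitionEngine.RecognizeAsync(RecognizeMode.Multiple);

        Console.WriteLine("Listening... press Enter to stop.");
        Console.ReadLine();
        _speechRecognitionEngine.RecognizeAsyncStop();
    }

    private static void SpeechHypothesizing(object sender, SpeechHypothesizedEventArgs e)
    {
        // interim, real-time results from the engine
        Console.WriteLine("Hypothesis: " + e.Result.Text);
    }

    private static void SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        // final answer from the engine
        Console.WriteLine("Recognized: " + e.Result.Text);
    }
}
```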
These classes expose many more options, and they are worth exploring in more detail.
http://ellismis.com/2012/03/17/converting-or-transcribing-audio-to-text-using-c-and-net-system-speech/