To clarify my goal: I am using the CSCore library to capture background audio with the WasapiLoopbackCapture class, and I want to use that audio as real-time input for a System.Speech.Recognition recognition engine. The capture class outputs its data either to a .WAV file or to a Stream. I tried the following:
    private void startButton_Click(object sender, EventArgs e)
    {
        _recognitionEngine.UnloadAllGrammars();
        _recognitionEngine.LoadGrammar(new DictationGrammar());
        LoadTargetDevice();
        StartStreamCapture(); // Here I am starting the capture to _stream (MemoryStream type)
        _stream.Position = 0; // Without setting this, I get a stream format exception.
        _recognitionEngine.SetInputToWaveStream(_stream);
        _recognitionEngine.RecognizeAsync(RecognizeMode.Multiple);
    }
The result is that no exception is thrown, but the SpeechRecognized and SpeechDetected events never fire. I suspect this is because the System.Speech.Recognition assembly does not support real-time streams. I searched online and found someone who reports implementing a custom Stream type as a workaround, but the instructions in that post were unclear and I was unable to follow them (see Dexter Morgan's reply here).
I am aware this problem is best solved by using a different library or an alternate approach, but I would like to know how to do this makeshift implementation specifically, mostly for the sake of learning.
Thanks!