Speech recognition using a real time stream

First, to clarify my goal: I am using the CSCore library to capture background audio with the WasapiLoopbackCapture class, and I intend to use that as real-time input for a System.Speech.Recognition recognition engine. That class outputs the data either to a .WAV file or to a Stream. I then tried doing this:

    private void startButton_Click(object sender, EventArgs e)
    {
        _recognitionEngine.UnloadAllGrammars();
        _recognitionEngine.LoadGrammar(new DictationGrammar());

        LoadTargetDevice();
        StartStreamCapture(); // Here I am starting the capture to _stream (MemoryStream type)

        _stream.Position = 0; // Without setting this, I get a stream format exception.

        _recognitionEngine.SetInputToWaveStream(_stream);
        _recognitionEngine.RecognizeAsync(RecognizeMode.Multiple);
    }

The result is that I don't get an exception, but the SpeechRecognized and SpeechDetected events never fire either. I suspect this is because the System.Speech.Recognition assembly does not support real-time streams. I searched online and found someone who reports implementing a custom Stream type as a workaround, but the instructions in that post were unclear and I was unable to follow them (see Dexter Morgan's reply here).

I am aware this problem is best solved by using a different library or an alternate approach, but I would like to know how to do this makeshift implementation specifically, mostly for knowledge purposes.

Thanks!

Trondheim answered 22/1, 2018 at 14:47 Comment(6)
@justcarty, have you tried anything? - Insomuch
@Webruster From communicating with the OP, it seems as though the format they were using is unsupported. See this image and this one. - Lashundalasker
@JustCarty So what are you expecting, and how does your question differ from the OP's question? - Insomuch
@Webruster The OP created a program that listened to real-time audio and converted it to text. The program did not crash, but it neither picked up the audio nor fired the subsequent events. The OP wanted to know why that was happening. - Lashundalasker
@JustCarty Check my solution. - Insomuch
Possible duplicate of Streaming input to System.Speech.Recognition.SpeechRecognitionEngine - Chemoreceptor

@Justcarty Thanks for the clarification. Here is my explanation of why the OP's code won't work and what needs to be done to make it work.

In C#, the documentation for speech recognition and synthesis is easy to get confused by, because there are two speech DLLs:
1. Microsoft Speech DLL (Microsoft.Speech.dll)
2. System Speech DLL (System.Speech.dll)

System.Speech.dll is part of the Windows OS. The two libraries are similar in the sense that the APIs are almost, but not quite, the same. So if you're searching online for speech examples, you may not be able to tell from a code snippet whether it refers to System.Speech or Microsoft.Speech.

So, to add speech to a C# application, you need to use the Microsoft.Speech library, not the System.Speech library.

Some of the key differences are summarized below:

|-------------------------|---------------------------------|
| Microsoft.Speech.dll    | System.Speech.dll               |
|-------------------------|---------------------------------|
| Must install separately | Part of the OS (Windows Vista+) |
|-------------------------|---------------------------------|
| Must construct Grammars | Uses Grammars or free dictation |
|-------------------------|---------------------------------|

For more, read the following article, which explains the correct way to implement this.
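To illustrate the "must construct Grammars" row, here is a minimal sketch of building a small command grammar with Microsoft.Speech. It assumes the Microsoft Speech Platform runtime and an en-US recognizer language pack are installed separately; the phrases and the class name are made up for the example, and this is not taken from the linked article.

```csharp
using System;
using System.Globalization;
using Microsoft.Speech.Recognition; // separately installed library, not part of the OS

class GrammarSketch
{
    static void Main()
    {
        // Assumption: an en-US recognizer for Microsoft.Speech is installed.
        var culture = new CultureInfo("en-US");

        using (var engine = new SpeechRecognitionEngine(culture))
        {
            // Microsoft.Speech has no free dictation, so the phrases to
            // recognize are listed explicitly (hypothetical phrases).
            var phrases = new Choices("start capture", "stop capture");
            var builder = new GrammarBuilder(phrases) { Culture = culture };
            engine.LoadGrammar(new Grammar(builder));

            engine.SpeechRecognized += (sender, e) =>
                Console.WriteLine(e.Result.Text + " (" + e.Result.Confidence + ")");

            engine.SetInputToDefaultAudioDevice();
            engine.RecognizeAsync(RecognizeMode.Multiple);

            Console.WriteLine("Listening, press Enter to quit.");
            Console.ReadLine();
        }
    }
}
```

The same code compiles against System.Speech by swapping the using directive, which is part of why snippets for the two libraries are easy to mix up.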

Insomuch answered 1/2, 2018 at 10:0 Comment(1)
Is there no way to do so without using another DLL? - Lashundalasker
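One option that stays with System.Speech is the custom Stream idea mentioned in the question. A likely reason the MemoryStream attempt goes quiet is that MemoryStream.Read returns 0 as soon as the reader catches up with the writer, and a consumer treats 0 as end of stream. The sketch below is a hypothetical BlockingAudioStream (the name and the CompleteWriting method are invented here, they are not part of CSCore or System.Speech) whose Read blocks until data arrives instead of returning 0. Whether the recognizer queries Length or Position is an assumption, so they return harmless values rather than throwing.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical helper: a bounded in-memory pipe whose Read blocks
// instead of signalling end of stream while the capture is still running.
public sealed class BlockingAudioStream : Stream
{
    private readonly object _gate = new object();
    private readonly byte[] _ring;
    private int _readPos, _writePos, _count;
    private bool _completed;

    public BlockingAudioStream(int capacity = 1 << 20) { _ring = new byte[capacity]; }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => true;
    public override long Length => -1;                      // unknown length (assumption: tolerated by the consumer)
    public override long Position { get => 0; set { } }     // position is ignored
    public override long Seek(long offset, SeekOrigin origin) => 0;
    public override void SetLength(long value) { }
    public override void Flush() { }

    // Blocks until 'count' bytes were copied, or until writing has been
    // completed and the buffer is drained (only then can it return 0).
    public override int Read(byte[] buffer, int offset, int count)
    {
        int total = 0;
        lock (_gate)
        {
            while (total < count)
            {
                while (_count == 0 && !_completed) Monitor.Wait(_gate);
                if (_count == 0) break;                      // completed and drained
                int n = Math.Min(count - total, _count);
                for (int i = 0; i < n; i++)
                {
                    buffer[offset + total + i] = _ring[_readPos];
                    _readPos = (_readPos + 1) % _ring.Length;
                }
                total += n;
                _count -= n;
                Monitor.PulseAll(_gate);                     // wake a writer waiting for space
            }
        }
        return total;
    }

    // Called from the capture callback; blocks while the buffer is full.
    public override void Write(byte[] buffer, int offset, int count)
    {
        int written = 0;
        lock (_gate)
        {
            while (written < count && !_completed)
            {
                if (_count == _ring.Length) { Monitor.Wait(_gate); continue; }
                int n = Math.Min(count - written, _ring.Length - _count);
                for (int i = 0; i < n; i++)
                {
                    _ring[_writePos] = buffer[offset + written + i];
                    _writePos = (_writePos + 1) % _ring.Length;
                }
                written += n;
                _count += n;
                Monitor.PulseAll(_gate);                     // wake a reader waiting for data
            }
        }
    }

    // Signals that no more audio will arrive, releasing any blocked reader.
    public void CompleteWriting()
    {
        lock (_gate) { _completed = true; Monitor.PulseAll(_gate); }
    }
}
```

In use, the capture's data callback would call Write with each captured buffer, and the engine would read from the same instance. Since the comments above suggest the captured format was unsupported, the audio would probably need converting to something like 16-bit PCM first and be passed with SetInputToAudioStream and a matching SpeechAudioFormatInfo rather than SetInputToWaveStream, which expects a WAV header. That part is untested here.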
