webkitSpeechRecognition is "lagging" behind when gathering results

I wanted to try out the Web Speech API. I copied the code exactly from the article, but I'm having an issue: when I speak, nothing happens until I speak AGAIN.

[Fiddle: http://jsfiddle.net/w75v2tm5/]

JS:

if (!('webkitSpeechRecognition' in window)) {
    //handle error stuff here...
} else {
    var recognition = new webkitSpeechRecognition();
    recognition.continuous = true;
    recognition.interimResults = false;

    recognition.start();

    var final_transcript = '';

    recognition.onresult = function (event) {
        var interim_transcript = '';
        if (typeof (event.results) == 'undefined') {
            recognition.onend = null;
            recognition.stop();
            upgrade();
            return;
        }
        for (var i = event.resultIndex; i < event.results.length; ++i) {
            if (event.results[i].isFinal) {
                final_transcript += event.results[i][0].transcript;
            } else {
                interim_transcript += event.results[i][0].transcript;
            }
        }
        document.getElementsByTagName('div')[0].innerText = final_transcript;
    };

}

For example, if I were to say "Hello world", the <div> I have set up to display the results would not display "Hello world" until I said something else, or made a sound. But if I said something else, THAT would not be displayed until I said something else AGAIN.

The variable "final_transcript" is holding the PREVIOUS result, and not what I just said. It's off by just 1.

To give you a better idea...

Me: "Hello world"

final_transcript = '';

[Wait...]

Me: "Test"

final_transcript = 'Hello world'

And this just continues. The code is failing to transcribe what I am saying AS I am saying it. Very weird.

Any thoughts as to why this could be?

Clathrate answered 10/8, 2014 at 3:0 Comment(0)

There is some kind of built-in timeout after which you will get the result even if there is no more input (it seems to be around 5-10 seconds).

In this case you will get the final onresult event, as well as the onend event. You will have to call recognition.start() again if you wish to keep accepting input.

Also, if you set

recognition.interimResults = true;

you will get onresult events with non-final results, and you can decide whether to display them before the final ones arrive.
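As a rough sketch of that idea, the handler below separates final from interim text so both can be shown as they arrive. The helper name splitTranscripts and the choice to append the interim text after the final text are illustrative assumptions, not part of the original code; the browser wiring is guarded so the helper can also be exercised outside a browser.

```javascript
// Hypothetical helper: split one onresult event into final and interim text.
function splitTranscripts(event) {
    let finalText = '';
    let interimText = '';
    for (let i = event.resultIndex; i < event.results.length; ++i) {
        const text = event.results[i][0].transcript;
        if (event.results[i].isFinal) {
            finalText += text;
        } else {
            interimText += text;
        }
    }
    return { finalText: finalText, interimText: interimText };
}

// Browser-only wiring (assumes a <div> exists on the page, as in the question).
if (typeof window !== 'undefined' && 'webkitSpeechRecognition' in window) {
    const recognition = new window.webkitSpeechRecognition();
    recognition.continuous = true;
    recognition.interimResults = true;
    let finalTranscript = '';
    recognition.onresult = function (event) {
        const parts = splitTranscripts(event);
        finalTranscript += parts.finalText;
        // Show what is settled so far plus the current guess.
        document.getElementsByTagName('div')[0].innerText =
            finalTranscript + parts.interimText;
    };
    recognition.start();
}
```

With this, the text on the page updates while you are still speaking, and the interim part is replaced once the final result comes in.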

The other option is to turn off continuous with

recognition.continuous = false;

Then you will get a result shortly after the input (audio) has stopped. You will also get the onend event.
If you wish to continue the recognition, you will have to call

recognition.start();

in the onend event handler.
On a non-HTTPS page, this will cause the permission bar to pop up again.
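A minimal sketch of the non-continuous approach with a restart in onend might look like the following. The function name keepListening, the onFinal callback, and the returned stop() handle are assumptions made for illustration; the stop flag is there so the restart loop can be ended deliberately rather than restarting forever.

```javascript
// Hypothetical wrapper: non-continuous recognition that restarts itself on end.
function keepListening(recognition, onFinal) {
    let stopped = false;
    recognition.continuous = false;
    recognition.interimResults = false;

    recognition.onresult = function (event) {
        for (let i = event.resultIndex; i < event.results.length; ++i) {
            if (event.results[i].isFinal) {
                onFinal(event.results[i][0].transcript);
            }
        }
    };

    // Each session ends shortly after the audio stops; start a new one.
    recognition.onend = function () {
        if (!stopped) {
            recognition.start();
        }
    };

    recognition.start();

    return {
        stop: function () {
            stopped = true;
            recognition.stop();
        }
    };
}

// Browser-only usage (assumes a <div> exists on the page, as in the question).
if (typeof window !== 'undefined' && 'webkitSpeechRecognition' in window) {
    keepListening(new window.webkitSpeechRecognition(), function (text) {
        document.getElementsByTagName('div')[0].innerText += text + ' ';
    });
}
```

Passing the recognition object in, instead of creating it inside the function, keeps the restart logic separate from the browser check.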


Taught answered 20/8, 2014 at 21:14 Comment(1)
I have the same experience as Ron Harlev: non-lagging results are best achieved by setting recognition.continuous = false and restarting the recognition when the onend event occurs. - Shrieval
