Issues with Web Speech API in Android Chrome
I'm trying to make use of the SpeechRecognition interface of the Web Speech API. It works fine on the desktop version of Chrome but I can't get it to detect any audio on the Android version. After failing to get my own code to work I tested this demo as well as this other demo on two different Android devices (one running LineageOS Nougat, one running LineageOS Pie, both with Chrome 79) but neither demo worked on either device.

I'm not sure what's wrong here... can anyone else get these demos working on Android? I am serving my test page over HTTPS, and I can record audio from the microphone on these devices just fine using navigator.mediaDevices.getUserMedia, so it doesn't seem to be a hardware, permission, or security issue.
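For reference, the getUserMedia check I used to rule out hardware and permission problems looks roughly like this (a sketch; the exact constraints and error handling are not in the original post):

```javascript
// Sketch (not the exact code from the post) of a getUserMedia check used
// to rule out hardware/permission problems: if this resolves, the
// microphone and its permission prompt are working.
async function checkMicrophone() {
  try {
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
    // Release the microphone immediately; we only wanted to confirm access.
    stream.getTracks().forEach(track => track.stop());
    return true;
  } catch (err) {
    console.error('Microphone unavailable:', err.name);
    return false;
  }
}
```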

The specific symptoms I'm seeing are as follows:

Here's some test code based on the example from MDN.

<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <title> Web Speech API Test </title>
    <style>
      * { box-sizing: border-box; }

      html {
        height: 100%;
        width: 100%;
      }

      body {
        height: 100%;
        width: 100%;
        padding: 0;
        margin: 0;
        display: grid;
        grid-template-columns: 1fr;
        grid-template-rows: 1fr 10fr 1fr;
        font-family: sans-serif;
      }

      h1 {
        margin: 0;
        padding: 0.5rem;
        background-color: dodgerblue;
        text-align: center;
      }

      #output {
        margin: 0;
        padding: 0.5em;
        border: 0;
        background-color: transparent;
      }

      #start {
        display: block;
        background-color: dodgerblue;
        border: 0;
        color: navy;
        font-weight: bold;
        font-size: 1.2em;
      }
    </style>
  </head>
  <body>
    <h1> Web Speech API Test </h1>
    <textarea id="output"></textarea>
    <button id="start"> START </button>
    <script>
      let SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
      let SpeechGrammarList = window.SpeechGrammarList || window.webkitSpeechGrammarList;
      let SpeechRecognitionEvent = window.SpeechRecognitionEvent || window.webkitSpeechRecognitionEvent;

      let grammar = '#JSGF V1.0; grammar colors; public <color> = aqua | azure | beige | bisque | black | blue | brown | chocolate | coral | crimson | cyan | fuchsia | ghostwhite | gold | goldenrod | gray | green | indigo | ivory | khaki | lavender | lime | linen | magenta | maroon | moccasin | navy | olive | orange | orchid | peru | pink | plum | purple | red | salmon | sienna | silver | snow | tan | teal | thistle | tomato | turquoise | violet | white | yellow ;';

      let recognition = new SpeechRecognition();
      let speechRecognitionList = new SpeechGrammarList();
      speechRecognitionList.addFromString(grammar, 1);

      recognition.grammars = speechRecognitionList;
      recognition.continuous = false;
      recognition.lang = 'en-US';
      recognition.interimResults = false;
      recognition.maxAlternatives = 1;

      let startButton = document.getElementById('start');
      let output = document.getElementById('output');
      output.value += 'Initializing...';

      let listening = false;

      startButton.addEventListener('click', event => {
        if (listening == false) {
          recognition.start();
          startButton.innerHTML = 'STOP';
          listening = true;
        } else {
      //    recognition.stop();
          recognition.abort();
        }
      });

      console.dir(recognition);
      output.value += 'ready.';

      recognition.onstart = event => {
        output.value += '\nRecognition started';
      };

      recognition.onaudiostart = event => {
        output.value += '\nAudio started';
      };

      recognition.onsoundstart = event => {
        output.value += '\nSound started';
      };

      recognition.onspeechstart = event => {
        output.value += '\nSpeech started';
      };

      recognition.onspeechend = event => {
        output.value += '\nSpeech ended';
        recognition.stop();
      };

      recognition.onsoundend = event => {
        output.value += '\nSound ended';
      };

      recognition.onaudioend = event => {
        output.value += '\nAudio ended';
      };

      recognition.onend = event => {
        output.value += '\nRecognition stopped';
        startButton.innerHTML = 'START';
        listening = false;
      };

      recognition.onresult = event => {
        let color = event.results[0][0].transcript;
        let confidence = event.results[0][0].confidence;
        document.body.style.backgroundColor = color;
        output.value += '\nResult received: ' + color;
        output.value += '\nConfidence: ' + confidence;
      };

      recognition.onnomatch = event => {
        output.value += '\nColor not recognised';
      };

      recognition.onerror = event => {
        output.value += '\nERROR: ' + event.error;
      };
    </script>
  </body>
</html>

Any ideas as to what the problem could be would be appreciated.

UPDATE 2021-01-08:

I modified the example code so that it outputs log messages to a textarea element instead of the console, eliminating the need for remote debugging. I also published a live version on my domain. I then tested it using Chrome Canary 89 on LineageOS Oreo and found that it still did not work there. However, I then found that this example DOES work perfectly on a Razer Phone running its official version of Android Pie and Chrome 87! So it would seem that my Web Speech implementation is fine, and possibly there is some other issue with LineageOS that has persisted across multiple versions.

This question has received a fair number of views, so I imagine others must be having similar issues. To those people, I suggest you try the live test on a few different devices and report your findings back here. Maybe we can narrow down the conditions that cause it to fail on some devices but not others. Possibly this has nothing to do with LineageOS at all and is another issue altogether.

Bertie answered 28/2, 2020 at 4:46 Comment(2)
Just want to add that I also tried the above test code in Chrome Canary 82, with the same failing results. – Bertie
Hi! The official SpeechRecognition demos don't give accurate results, but I can see the live demo on your domain works perfectly. Can you share its code please? – Willingham
The Web Speech API on Android relies on a third-party recognition service, usually provided by Google (Play Services) and/or the device manufacturer (e.g. Samsung). Most likely this service is missing or disabled in LineageOS, since it normally connects to a cloud server for transcription.
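If the recognition service really is missing, the API should surface that through recognition.onerror rather than failing silently. Here is a sketch of interpreting the error codes defined in the Web Speech API spec; which code (if any) LineageOS actually reports is an assumption that would need to be verified on a real device:

```javascript
// Sketch: map Web Speech API error codes (from the spec) to likely causes.
// Which code LineageOS would actually report is an assumption, not verified.
function describeSpeechError(code) {
  switch (code) {
    case 'service-not-allowed':
      return 'The recognition service is disabled or unavailable';
    case 'network':
      return 'Could not reach the cloud transcription service';
    case 'not-allowed':
      return 'Microphone permission was denied';
    case 'audio-capture':
      return 'No usable microphone was found';
    default:
      return 'Unhandled error code: ' + code;
  }
}

// Wire it into the existing handler from the question's example:
// recognition.onerror = event => {
//   output.value += '\nERROR: ' + describeSpeechError(event.error);
// };
```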

Graham answered 15/7, 2021 at 22:48 Comment(3)
That does seem like a very likely scenario. Makes me curious if installing a larger package from opengapps.org would help. I usually install the pico package on my devices, which is pretty minimal. – Bertie
Unfortunately I have no experience yet with that but would be interested in the result ^^. I think I've seen people with LineageOS that had it working. – Graham
Well, right now I'm too busy with work to be messing with my phone, but I'll definitely post the results if/when I get around to trying it. These comments alone might spur somebody else to give it a shot before me, though. I'll definitely accept your answer if we can prove that this is the issue. – Bertie

You control the recognition object via the "listening" variable, so set "listening" back to false after recognition.stop():

recognition.onspeechend = event => {
    console.log('speechend');
    recognition.stop();
    listening = false;
};
Mofette answered 4/1, 2021 at 11:2 Comment(1)
This is a helpful tweak to the example code, but it does not solve the root issue here, which is that the speech recognition never successfully starts. I have updated the original post with some new example code and test results. – Bertie
