How do I play audio returned from an XMLHttpRequest using the HTML5 Audio API

I'm unable to play audio when making an "AJAX" request to my server-side API.

I have backend Node.js code that's using IBM's Watson Text-to-Speech service to serve audio from text:

var render = function(request, response) {
    var options = {
        text: request.params.text,
        voice: 'VoiceEnUsMichael',
        accept: 'audio/ogg; codecs=opus'
    };

    synthesizeAndRender(options, request, response);
};

var synthesizeAndRender = function(options, request, response) {
    var synthesizedSpeech = textToSpeech.synthesize(options);

    synthesizedSpeech.on('response', function(eventResponse) {
        // If a download was requested, serve the audio as an attachment
        if(request.params.text.download) {
            var contentDisposition = 'attachment; filename=transcript.ogg';

            eventResponse.headers['content-disposition'] = contentDisposition;
        }
    });

    // Stream the synthesized audio straight to the HTTP response
    synthesizedSpeech.pipe(response);
};

I have client side code to handle that:

var xhr = new XMLHttpRequest(),
    audioContext = new AudioContext(),
    source = audioContext.createBufferSource();

module.controllers.TextToSpeechController = {
    fetch: function() {
        xhr.onload = function() {
            var playAudio = function(buffer) {
                source.buffer = buffer;
                source.connect(audioContext.destination);

                source.start(0);
            };

            // TODO: Handle properly (exiquio)
            // NOTE: error is being received
            var handleError = function(error) {
                console.log('An audio decoding error occurred');
            };

            audioContext
                .decodeAudioData(xhr.response, playAudio, handleError);
        };
        xhr.onerror = function() { console.log('An error occurred'); };

        var urlBase = 'http://localhost:3001/api/v1/text_to_speech/';
        var url = [
            urlBase,
            'test',
        ].join('');

        xhr.open('GET', encodeURI(url), true);
        xhr.setRequestHeader('x-access-token', Application.token);
        // decodeAudioData needs the raw bytes, so request an ArrayBuffer
        xhr.responseType = 'arraybuffer';
        xhr.send();
    }
};

The backend returns the audio that I expect, but my success method, playAudio, is never called. Instead, handleError is always called and the error object is always null.

Could anyone explain what I'm doing wrong and how to correct this? It would be greatly appreciated.

Thanks.

NOTE: The string "test" in the URL becomes a text param on the backend and ends up in the options variable in synthesizeAndRender.
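
In other words, the routing works along these lines (an Express-style sketch, simplified):

// ':text' is what arrives as request.params.text in render above
router.get('/api/v1/text_to_speech/:text', render);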

Monies asked 19/5, 2015 at 16:11. Comments (4):
Are you sure the audio format is supported? - Debarath
I believe it must be. I originally tested the same backend code directly in the same Chrome browser via a URL and it played fine. - Monies
Actually, the test was done with Chromium on GNU/Linux. I believe it should be the same with Chrome on OS X, where I am writing this code now, but I am not certain. - Monies
UPDATE: I've run the following query in the same browser I'm using to develop this code: localhost:3001/api/v1/text_to_speech/this%20is%20a%20test <-- This was done with my authentication code commented out, and it rendered a built-in audio player and played the expected audio. Now I can say with certainty that the audio type is accepted. My only guess at my problem is how I'm setting the headers on the server side above. The attachment part strikes me as a potential issue. - Monies

Unfortunately, unlike Chrome's HTML5 Audio implementation, Chrome's Web Audio doesn't support audio/ogg;codecs=opus, which is what your request uses here. You need to set the format to audio/wav for this to work. To be sure it's passed through to the server request, I suggest putting it in the query string (accept=audio/wav, urlencoded).
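
For example, the change could look roughly like this (a sketch: the accept query parameter name and the request.query lookup in render are my assumptions about how you'd pass the format through to synthesizeAndRender):

// Client: request WAV so Web Audio's decodeAudioData can handle it in Chrome.
// Pass this url to xhr.open directly (running it through encodeURI again
// would double-encode the query value).
var url = 'http://localhost:3001/api/v1/text_to_speech/test?accept=' +
    encodeURIComponent('audio/wav');

// Server (in render): let the query string override the default format
var options = {
    text: request.params.text,
    voice: 'VoiceEnUsMichael',
    accept: request.query.accept || 'audio/ogg; codecs=opus'
};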

Are you just looking to play the audio, or do you need access to the Web Audio API for audio transformation? If you just need to play the audio, I can show you how to easily play this with the HTML5 Audio API (not the Web Audio one). And with HTML5 Audio, you can stream it using the technique below, and you can use the optimal audio/ogg;codecs=opus format.

It's as simple as dynamically setting the source of an audio element, queried from the DOM (or created on the fly) like this:

(in HTML)

<audio id="myAudioElement"></audio>

(in your JS)

var audio = document.getElementById('myAudioElement') || new Audio();
audio.src = yourUrl;

You can also set the audio element's source via an XMLHttpRequest, though then you won't get streaming. The upside is that you can use a POST request, so you're not limited by the text length of a GET request (for this API, ~6KB). To set the source from the xhr, create an object URL from the blob response:

xhr.open('POST', encodeURI(url), true);
xhr.setRequestHeader('Content-Type', 'application/json');
xhr.responseType = 'blob';
xhr.onload = function(evt) {
  var blob = new Blob([xhr.response], {type: 'audio/ogg'});
  var objectUrl = URL.createObjectURL(blob);
  audio.src = objectUrl;
  // Release the object URL once the audio can play through
  // (media elements fire 'canplaythrough', not 'load')
  audio.oncanplaythrough = function(evt) {
    URL.revokeObjectURL(objectUrl);
  };
  audio.play();
};
var data = JSON.stringify({text: yourTextToSynthesize});
xhr.send(data);

As you can see, with XMLHttpRequest, you have to wait until the data are fully loaded to play. There may be a way to stream from XMLHttpRequest using the very new Media Source Extensions API, which is currently available only in Chrome and IE (no Firefox or Safari). This is an approach I'm currently experimenting with. I'll update here if I'm successful.
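
If you want to experiment with that approach yourself, the general shape is something like the sketch below. It's rough and untested: it appends the whole response in one go rather than streaming chunks as they arrive, and it assumes the service can return the audio in a container Chrome's MSE accepts (Opus in WebM, for example, rather than the Ogg container you're requesting now).

var audio = new Audio();
var mediaSource = new MediaSource();
audio.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener('sourceopen', function() {
  // The MIME type must be one the browser's MSE implementation supports
  var sourceBuffer = mediaSource.addSourceBuffer('audio/webm; codecs="opus"');

  var xhr = new XMLHttpRequest();
  xhr.open('GET', url, true); // the same text-to-speech endpoint as above
  xhr.responseType = 'arraybuffer';
  xhr.onload = function() {
    sourceBuffer.addEventListener('updateend', function() {
      if (mediaSource.readyState === 'open') {
        mediaSource.endOfStream();
      }
      audio.play();
    });
    sourceBuffer.appendBuffer(xhr.response);
  };
  xhr.send();
});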

Frugal answered 25/5, 2015 at 20:35. Comments (3):
Eric answered my question with the statement about compatibility and a link to the Chromium issue, and elaborated on a possible workaround, which is greatly appreciated. - Monies
I've been struggling with this for the past two days. Could you please take a look at stackoverflow.com/questions/32163749 - Rennarennane
The AAC format will work in all browsers, BTW. You're not limited to using WAV (which is huge). - Kwh
