Speex split audio data - WebAudio - VoIP

Asked 2/3, 2015 at 4:7 Answered 5/3, 2015 at 2:45

javascript audio voip web-audio-api speex

I'm running a small app that encodes and decodes an audio array with the Speex codec in JavaScript: https://github.com/dbieber/audiorecorder

With a small array filled with a sine waveform,

for(var i=0;i<16384;i++)
    data.push(Math.sin(i/10));

this works. But I want to build a VoIP application, which means handling more than one array. If I split my array into two parts and then encode, decode, and merge them, the result doesn't sound the same as before.

Take a look at this:

fiddle: http://jsfiddle.net/exh63zqL/

Both buttons should give the same audio output.

How can I get the same output both ways? Is there a special mode in speex.js for split audio data?

Collett answered 2/3, 2015 at 4:7 Comment(1)

On your "Original" button, you're playing from verg, not decV, both have the same distortion after being encoded-decoded, though there is a noticeable break in the middle of the merged audio. You're also not looping all of your arrays properly, but I don't believe fixing this stops the audio break in the middle. – Dappled 5/3, 2015 at 2:50

Note that Speex is a lossy codec, so by definition it can't reproduce the input buffer exactly. Besides, it is designed as a codec for voice, so it is most efficient in the 1-2 kHz range, where it expects a specific form of signal. In a way, it can be compared to JPEG for raster images.

I've slightly modified your jsfiddle example so you can play with different parameters and compare the results. Feeding in a simple sinusoid of unknown frequency is not a proper way to test a codec. However, in the example you can see how the codec affects the initial signal differently at different frequencies.

buffer1.push(Math.sin(2*Math.PI*i*frequency/sampleRate));
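To make the test frequency explicit, the line above can be wrapped in a small helper. This is only a sketch: `makeSine` is a hypothetical name, and the sample rate you pass should match whatever rate the codec and the playback buffer actually use.

```javascript
// Hypothetical helper: fill an array with a sine wave of a known frequency.
// frequency and sampleRate are in Hz, length is the number of samples.
function makeSine(frequency, sampleRate, length) {
    var out = [];
    for (var i = 0; i < length; i++) {
        out.push(Math.sin(2 * Math.PI * i * frequency / sampleRate));
    }
    return out;
}

// Example: a 1 kHz tone, assuming an 8 kHz sample rate.
var testTone = makeSine(1000, 8000, 16384);
```

With a known frequency you can sweep a few values and listen to how the codec treats each one.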

I think you should build an example with a recorded voice and compare the results in that case. That would be a more appropriate test.

In general, to understand this in detail you would have to study digital signal processing. I can't even provide a proper link, since it is a whole science and it is mathematically intensive (the only good book I know on the subject is in Russian). If anyone here with a strong mathematics background can share suitable literature, I would appreciate it.

EDIT: as mentioned by Kuroi Neko, there is a problem at the boundaries of the buffer. It also seems impossible to save the decoder state as mentioned in this post, because the library in use doesn't support it. If you look at the source code, you can see that it uses a third-party Speex codec and does not provide full access to its features. I think the best approach would be to find a decent Speex library that supports state recovery, similar to this

Unstriped answered 5/3, 2015 at 2:38 Comment(5)

Thanks for your answer. I have googled around a lot, and also used the original library before posting this question. I think this is the only lib for Speex in JavaScript. If this lib doesn't support this, I have to build a solution myself, which is hard. – Collett 5/3, 2015 at 13:40

The one I posted the link to seems to have init accepting speex_bits. However, I haven't tried it myself. Speex is an open codec; I can't believe there are so few JS libraries. There must be at least one with full support – Unstriped 5/3, 2015 at 13:44

Yes. The Coder class, whatever it is called in the library in use, is supposed to have something similar to Coder.init(speex_bits) called before decoding the next buffer, where speex_bits is the previous chunk of Speex data. The library you use to record the microphone simply doesn't expose the advanced features of the third-party Speex codec it uses. I failed to figure out which one it uses, so I guess it would be simpler to find a good Speex-only library – Unstriped 6/3, 2015 at 1:22

I changed the fiddle a little and used the original Speex directly, without the recorder.js lib, as you said: jsfiddle.net/exh63zqL/5, so I can call init before decode. (Look at the new decode/encode function.) But it is still not working properly. Do you have an idea? – Collett 8/3, 2015 at 15:12

I talked to a friend of mine who has experience in VoIP, and he said a click at the end of a buffer occurs when the codec process priority is not high (not our case in the browser) and when the buffer size doesn't fit the chunk of data the codec processes at a time. In his case it was 120 ms. It is hard to see the chunk size in this JS codec. I tried to play with the buffer size, but didn't get any improvement :( – Unstriped 12/3, 2015 at 3:6
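As a rough illustration of the buffer-size point in the last comment, the sketch below splits a sample array into chunks that each hold a whole number of codec frames. Speex narrowband works on 20 ms frames, which is 160 samples at 8 kHz (320 in wideband at 16 kHz). Whether the JS port in the fiddle is configured that way is an assumption here, and `splitIntoFrameAlignedChunks` is a hypothetical helper, not part of any library.

```javascript
// Assumption: 160 samples per frame (Speex narrowband, 20 ms at 8 kHz).
var FRAME_SIZE = 160;

// Hypothetical helper: split samples into chunks whose length is a whole
// number of frames. The last chunk is zero-padded up to the chunk length.
function splitIntoFrameAlignedChunks(samples, framesPerChunk, frameSize) {
    var chunkLen = framesPerChunk * frameSize;
    var chunks = [];
    for (var start = 0; start < samples.length; start += chunkLen) {
        var chunk = [];
        for (var i = 0; i < chunkLen; i++) {
            var idx = start + i;
            chunk.push(idx < samples.length ? samples[idx] : 0);
        }
        chunks.push(chunk);
    }
    return chunks;
}
```

For comparison, the 8192-sample halves used in the question are not a multiple of 160 (8192 / 160 = 51.2), so under this assumption each half ends partway through a frame.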

Speex is a lossy codec, so the output is only an approximation of your initial sine wave.

Your sine frequency is about 7 kHz, which is near the codec's upper 8 kHz bandwidth limit and as such even more likely to be altered.

What the codec outputs looks like a comb of Dirac pulses that will sound like your initial sinusoid as heard through a phone, which is certainly different from the original.

See this fiddle, where you can listen to what the codec makes of your original sine waves, whether they are split in half or not.

//Generate a continuous sine wave, both whole and split across 2 arrays
var len = 16384;
var buffer1 = [];
var buffer2 = [];
var buffer = [];
for(var i=0;i<len;i++){
    buffer.push(Math.sin(i/10));
    if(i < len/2)
        buffer1.push(Math.sin(i/10));
    else
        buffer2.push(Math.sin(i/10));
}
//Encode and decode both arrays separately
var en = Codec.encode(buffer1);
var dec1 = Codec.decode(en);

var en = Codec.encode(buffer2);
var dec2 = Codec.decode(en);

//Merge the arrays into 1 output array
//(indexed loops, so only the sample values are copied)
var merge = [];
for(var i=0;i<dec1.length;i++)
    merge.push(dec1[i]);

for(var i=0;i<dec2.length;i++)
    merge.push(dec2[i]);

//Encode and decode the whole array
var en = Codec.encode(buffer);
var dec = Codec.decode(en);

//-----------------
//The code below only plays back the different arrays
//-------------------
//Pick whichever constructor exists first, then instantiate it
var audioCtx = new (window.AudioContext || window.webkitAudioContext)();
function play (sound)
{
    var audioBuffer = audioCtx.createBuffer(1, sound.length, 44100);
    var bufferData = audioBuffer.getChannelData(0);
    bufferData.set(sound);

    var source = audioCtx.createBufferSource();
    source.buffer = audioBuffer;
    source.connect(audioCtx.destination);
    source.start();
}

$("#o").click(function() { play(dec); });
$("#c1").click(function() { play(dec1); });
$("#c2").click(function() { play(dec2); });
$("#m").click(function() { play(merge); });

If you merge the decoder outputs of the two half signals, you will hear an additional click due to the abrupt transition from one signal to the other, sounding basically like a relay switching.
To avoid that, you would have to smooth the values around the merging point of your two buffers.
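One simple way to smooth the joint, sketched here as an assumption rather than anything built into speex.js, is to fade the end of the first buffer down to zero and fade the start of the second one back up. `mergeWithFade` and the fade length are hypothetical; the total length is unchanged, and the click is traded for a very short dip in volume.

```javascript
// Hypothetical helper: concatenate two decoded buffers, applying a linear
// fade-out to the last fadeLen samples of a and a linear fade-in to the
// first fadeLen samples of b. Works on plain arrays or typed arrays.
function mergeWithFade(a, b, fadeLen) {
    var n = Math.min(fadeLen, a.length, b.length);
    var out = [];
    for (var i = 0; i < a.length; i++) {
        var fromEnd = a.length - 1 - i;            // 0 for the last sample of a
        var gainOut = fromEnd < n ? fromEnd / n : 1; // ramps down to 0 at the joint
        out.push(a[i] * gainOut);
    }
    for (var j = 0; j < b.length; j++) {
        var gainIn = j < n ? j / n : 1;            // ramps up from 0 after the joint
        out.push(b[j] * gainIn);
    }
    return out;
}
```

In the fiddle code above this would replace the two merge loops, for example `var merge = mergeWithFade(dec1, dec2, 64);`, where 64 samples is an arbitrary starting value to tune by ear.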

Gifted answered 5/3, 2015 at 2:45 Comment(5)

Yeah, good catch. Also mind that the chunks are being decoded separately. You need to keep the decoder state, as described in the comments here – Unstriped 5/3, 2015 at 2:50

Indeed, but I assume the idea was to stitch together two independently encoded signals. – Gifted 5/3, 2015 at 3:47

The idea is to transfer voice data in real time, so I have to split the signal into parts. Thanks for your answer @kuroineko, I will try to smooth it if there is no built-in solution in the Speex lib. – Collett 5/3, 2015 at 13:44

Well, the built-in solution is to keep the codec context alive and feed it a continuous stream. If you're using it to transmit a single source point to point, that should be doable. – Gifted 5/3, 2015 at 15:47

@kuroineko Where is the Speex context, and how can I keep it alive? – Collett 5/3, 2015 at 18:57
