Note that Speex is a lossy codec. So, by definition, it can't give same result as the encoded buffer. Besides, it designed to be a codec for voice. So the 1-2 kHz range will be the most efficient as it expects a specific form of signal. In some way, it can be compared to JPEG technology for raster images.
I've modified slightly your jsfiddle example so you can play with different parameters and compare results. Just providing a simple sinusoid with an unknown frequency is not a proper way to check a codec. However, in the example you can see different impact on the initial signal at different frequency.
buffer1.push(Math.sin(2*Math.PI*i*frequency/sampleRate));
I think you should build an example with a recorded voice and compare results in this case. It would be more proper.
In general to get the idea in detail you would have to examine digital signal processing. I can't even provide a proper link since it is a whole science and it is mathematically intensive. (the only proper book for reading I know is in Russian). If anyone here with strong mathematics background can share proper literature for this case I would appreciate.
EDIT: as mentioned by Kuroi Neko, there is a trouble with the boundaries of the buffer. And seems like it is impossible to save decoder state as mentioned in this post, because the library in use doesn't support it. If you look at the source code you see that they use a third party speex codec and do not provide full access to it's features. I think the best approach would be to find a decent library for speex that supports state recovery similar to this
verg
, notdecV
, both have the same distortion after being encoded-decoded, though there is a noticeable break in the middle of the merged audio. You're also not looping all of your arrays properly, but I don't believe fixing this stops the audio break in the middle. – Dappled