Why does encoding wav file to base64 with python and online webapp give different results?

I am making a simple web app that needs to play some audio files using howler.js. howler.js accepts a base64 data URI as input, so I wanted to try that out. To test it, I took a sample audio file and used an online audio-to-base64 encoder to get the base64 string. I added the data URI prefix ("data:audio/wav;base64,") to the front of the base64 string and copied and pasted the result into the following JS function:

function playSound() {
    var data = "";
    var sound = new Howl({
      src: [data],
      loop: false
    });
    sound.play();
}

...and it worked perfectly. Since I would be dealing with a fair number of audio files, I figured I'd use a short Python script to convert them all to base64. To test, I converted the same audio file to a base64 string with the following Python code:

import base64
with open("0.wav", "rb") as f1,open("b64.txt", "w") as f2:
    encoded_f1 = base64.b64encode(f1.read())
    f2.write("data:audio/wav;base64,")
    f2.write(str(encoded_f1))

I noticed the base64 string was different from the one I got from the website earlier. I pasted it into the JS function shown earlier, but when I attempted to play the sound, I got the following error:

Uncaught DOMException: Failed to execute 'atob' on 'Window': The string to be decoded is not correctly encoded.

There seems to be some sort of difference in the way Python encodes to base64. What could the cause of this be?

Parament answered 7/10, 2017 at 17:19 Comment(5)
Where is atob() called? – Arnst
It's called in the howler.js code itself. Check it out here: github.com/goldfire/howler.js/blob/master/dist/… (the code is on a single line, but a CTRL+F will show you where atob() is called). – Parament
Can you create a jsfiddle (jsfiddle.net) or plnkr (plnkr.co) that includes the data URI created in Python? – Arnst
Note: modified Base64 variants for URLs exist, where the '+' and '/' characters of standard Base64 are replaced by '-' and '_' respectively. – Caresse
Can't reproduce. Upon closer inspection: both the service you cited and Python's base64 encoder use the same (default) alphabet (+/ for the last 2 chars). The only difference is that the service wraps its output to 76 characters per line, but that shouldn't matter. There's something you're not telling us. – Caresse

Came back to this after a while and the problem became apparent. It was just a problem with the block of code I mentioned in OP (the second block) that I used to write the base64 encoding to a file.

base64.b64encode(f1.read()) returns a bytes object, and calling str() on a bytes object gives its printed representation: b'string goes here'. So the issue was just that the b' ' was wrapped around my actual base64 string, and I was writing that whole thing to the file. All I had to do was get rid of the b' ', which I did by decoding the bytes to an ASCII string like this: str(encoded_f1, 'ascii', 'ignore').
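
To make the difference concrete, here is a minimal sketch comparing the two conversions. The helper name wav_bytes_to_data_uri and the four-byte sample are assumptions made for illustration (they are not from the original script); the sample simply stands in for f1.read(). It uses .decode("ascii"), which should be equivalent to the str(..., 'ascii', 'ignore') call above for base64 output.

```python
import base64

PREFIX = "data:audio/wav;base64,"

def wav_bytes_to_data_uri(raw: bytes) -> str:
    """Build a data URI from raw WAV bytes (hypothetical helper for illustration)."""
    encoded = base64.b64encode(raw)           # bytes object, not str
    return PREFIX + encoded.decode("ascii")   # decode drops the b'...' wrapper

# Stand-in for f1.read(); a real WAV file also begins with these four bytes.
sample = b"RIFF"

# What the original script wrote: str() on bytes yields the repr, b'...'.
buggy = PREFIX + str(base64.b64encode(sample))
fixed = wav_bytes_to_data_uri(sample)

print(buggy)  # data:audio/wav;base64,b'UklGRg=='
print(fixed)  # data:audio/wav;base64,UklGRg==
```

The buggy string contains the characters b, ' and ', and the apostrophe is not in the base64 alphabet, which would explain why atob() rejects it.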

Really silly mistake, but hopefully it helps someone out.

Parament answered 10/10, 2017 at 23:40 Comment(0)
