Python 3 and base64 encoding of a binary file
I'm new to Python and have an issue that is bothering me.

I use the following code to get a base64 string representation of my zip file.

import base64

with open("C:\\Users\\Mario\\Downloads\\exportTest1.zip", 'rb') as file:
    zipContents = file.read()
    encodedZip = base64.encodestring(zipContents)

Now, if I output the string, it is wrapped in a b'' representation. I don't need this and would like to avoid it. It also adds a newline character every 76 characters, which is another issue. Is there a way to get the binary content and represent it without the newline characters and without the leading and trailing b''?

Just for comparison, if I do the following in PowerShell:

$fileName = "C:\Users\Mario\Downloads\exportTest1.zip"
$fileContentBytes = [System.IO.File]::ReadAllBytes($fileName)
$fileContentEncoded = [System.Convert]::ToBase64String($fileContentBytes) 

I do get the exact string I'm looking for, no b'' and no \n every 76 chars.

Debug answered 21/6, 2016 at 12:44 Comment(2)
@PadraicCunningham thus? Isn't that what is suggested to be done while managing binary files? Do you have an example on how to correctly encode a binary file and represent it as base64 string? - Debug
docs.python.org/3.5/library/base64.html#base64.encodebytes, if you don't want bytes you will have to decode - Muro

From the base64 package doc:

base64.encodestring (a deprecated alias of base64.encodebytes, removed in Python 3.9):

"Encode the bytes-like object s, which can contain arbitrary binary data, and return bytes containing the base64-encoded data, with newlines (b"\n") inserted after every 76 bytes of output, and ensuring that there is a trailing newline, as per RFC 2045 (MIME)."

You want to use

base64.b64encode:

"Encode the bytes-like object s using Base64 and return the encoded bytes."

Example:

import base64

with open("test.zip", "rb") as f:
    encodedZip = base64.b64encode(f.read())
    print(encodedZip.decode())

decode() converts the bytes object returned by b64encode into a str, which is why the b'' wrapper disappears when it is printed.
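To make the difference concrete, here is a small sketch comparing the two functions on an in-memory payload. The `data` bytes are made up for illustration, and `encodebytes` is used as the current name for what the question calls `encodestring`.

```python
import base64

# Made-up payload standing in for the zip file's contents (100 bytes).
data = bytes(range(100))

mime_style = base64.encodebytes(data)  # MIME style: newline after every 76 output bytes
plain = base64.b64encode(data)         # one unbroken run of base64 bytes

# Both results are bytes objects, which is why print() shows them as b'...'.
text = plain.decode("ascii")           # str: no b'' wrapper, no newlines

print(type(plain).__name__, type(text).__name__)  # bytes str
print(b"\n" in mime_style, "\n" in text)          # True False
```

Slicing characters off the printed representation is not needed: the b'' is only how Python displays a bytes object, not part of the data.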

Typesetting answered 21/6, 2016 at 12:54 Comment(1)
This solves the problem with \n but I still see the leading b' character. Should I just remove it manually with [:-1] or is there a cleaner way? - Debug

Use b64encode to encode without the newlines, then decode the resulting bytes object with .decode('ascii') to get a regular str.

encodedZip = base64.b64encode(zipContents).decode('ascii')
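
As a rough end-to-end sketch, the one-liner can be wrapped in a small helper that reads a file and returns a single-line string, similar to what the PowerShell snippet in the question produces. The name `file_to_base64` and the temporary stand-in file are assumptions for illustration only; substitute the real zip path.

```python
import base64
import os
import tempfile


def file_to_base64(path):
    """Return the file's contents as a single-line base64 str (hypothetical helper)."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")


# Stand-in file so the sketch runs anywhere; it is not a real zip archive.
with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
    tmp.write(b"PK\x03\x04 example bytes")
    tmp_path = tmp.name

encoded = file_to_base64(tmp_path)
os.remove(tmp_path)  # clean up the stand-in file

print(encoded)
```

Decoding with 'ascii' is safe here because base64 output only ever contains ASCII characters.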
Lichee answered 21/6, 2016 at 12:55 Comment(0)
