Encoding and decoding binary data for inclusion into JSON with Python 3

I need to decide on a schema for including binary elements in a message object so that it can be decoded again on the receiving end (in my situation, a consumer on a RabbitMQ / AMQP queue).

I decided against multipart MIME encoding over JSON mostly because it seems like using Thor's hammer to push in a thumb tack. I decided against manually joining parts (binary and JSON concatenated together) mostly because every time a new requirement arises it is a whole re-design. JSON with the binary encoded in one of the fields seems like an elegant solution.

My seemingly working (confirmed by comparing MD5-sum of sent and received data) solution is doing the following:

import base64
import json


def json_serialiser(byte_obj):
    if isinstance(byte_obj, (bytes, bytearray)):
        # File bytes to Base64 bytes, then to a string
        return base64.b64encode(byte_obj).decode('utf-8')
    # json.dumps expects a TypeError for types the handler cannot serialise
    raise TypeError(f'No encoding handler for data type {type(byte_obj).__name__}')


def make_msg(filename, filedata):
    d = {"filename": filename,
         "datalen": len(filedata),
         "data": filedata}
    return json.dumps(d, default=json_serialiser)

On the receiving end I simply do:

def parse_json(msg):
    d = json.loads(msg)
    data = d.pop('data')
    return base64.b64decode(data), d


def file_callback(ch, method, properties, body):
    filedata, fileinfo = parse_json(body)
    print('File Name:', fileinfo.get("filename"))
    print('Received File Size', len(filedata))
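
The checksum comparison mentioned earlier can be sketched as a self-contained round trip. This is only an illustration under a few assumptions: round_trip_ok is a hypothetical helper name, SHA-256 stands in for MD5 (either works for an equality check), and the message body is assumed to arrive as UTF-8 bytes, which json.loads accepts directly on Python 3.6 and later.

```python
import base64
import hashlib
import json


def json_serialiser(obj):
    # Base64 output is pure ASCII, so decoding as 'ascii' is sufficient
    if isinstance(obj, (bytes, bytearray)):
        return base64.b64encode(obj).decode('ascii')
    raise TypeError(f'No encoding handler for data type {type(obj).__name__}')


def make_msg(filename, filedata):
    d = {"filename": filename, "datalen": len(filedata), "data": filedata}
    return json.dumps(d, default=json_serialiser)


def parse_json(msg):
    d = json.loads(msg)
    data = d.pop('data')
    return base64.b64decode(data), d


def round_trip_ok(payload, filename='example.bin'):
    # Hypothetical helper: encode, simulate the wire format, decode, compare
    body = make_msg(filename, payload).encode('utf-8')
    received, info = parse_json(body)
    same_digest = hashlib.sha256(payload).digest() == hashlib.sha256(received).digest()
    return (same_digest
            and info['datalen'] == len(received)
            and info['filename'] == filename)
```

Calling round_trip_ok(bytes(range(256))) exercises every possible byte value, which is a slightly stronger check than a single sample file.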

My google-fu left me unable to confirm whether what I am doing is in fact valid. In particular, I am concerned whether the line that produces the string from the binary data for inclusion into JSON is correct, i.e. the line return base64.b64encode(byte_obj).decode('utf-8')

It also seems that I am able to take a shortcut when decoding back to binary data, because base64.b64decode() handles the UTF-8 string as if it were ASCII, as one would expect given that it comes from the output of base64.b64encode(). But is this a valid assumption in all cases?
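
The assumption in question can be probed with a minimal sketch. The function names below are hypothetical; the only library behaviour relied on is that base64.b64encode emits bytes from the Base64 alphabet (A-Z, a-z, 0-9, +, / and = padding), that base64.b64decode accepts either bytes or an ASCII-only str, and that validate=True makes it reject characters outside that alphabet rather than discard them.

```python
import base64


def encoded_is_ascii(raw):
    # The encoded bytes are pure ASCII, so 'ascii' and 'utf-8'
    # decoding produce the same string
    encoded = base64.b64encode(raw)
    return encoded.isascii() and encoded.decode('ascii') == encoded.decode('utf-8')


def decode_both_ways(raw):
    # b64decode gives the same result for the str and the bytes form
    encoded_bytes = base64.b64encode(raw)
    encoded_text = encoded_bytes.decode('ascii')
    return base64.b64decode(encoded_text), base64.b64decode(encoded_bytes)


def strict_decode(text):
    # validate=True raises binascii.Error on characters outside the
    # Base64 alphabet instead of silently discarding them
    return base64.b64decode(text, validate=True)
```

A str containing non-ASCII characters is a separate case: b64decode raises ValueError for it, so the shortcut only holds for strings that really are Base64 output.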

Mostly I'm surprised at not being able to find any examples of this online. Perhaps my google patience is still on holiday!
