How to convert a byte array to string?

I just finished writing a Huffman compression algorithm. I converted my compressed text from a string to a byte array with bytearray(), and I am now trying to decompress it. My only problem is that I cannot convert the byte array back into a string. Is there a built-in function I could use to convert my byte array (stored in a variable) back into a string? If not, is there a better way to convert my compressed string to something else? I tried byte_array.decode() and got this:

print("Index: ", Index) # The Index


# Substitute each character with its code from the compressed index

for x in range(len(TextTest)):

    TextTest[x]=Index[TextTest[x]]


NewText=''.join(TextTest)

# print(NewText)
# NewText=int(NewText)


byte_array = bytearray() # Converts the compressed string text to bytes
for i in range(0, len(NewText), 8):
    byte_array.append(int(NewText[i:i + 8], 2))


NewSize = ("Compressed file Size:",sys.getsizeof(byte_array),'bytes')

print(byte_array)

print(NewSize)

x=bytes(byte_array)
x.decode()

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x88 in position 0: invalid start byte

Stratovision answered 21/11, 2018 at 6:45 Comment(3)
You can convert it to a string by calling the bytearray.decode() method and supplying an encoding. For example: byte_array.decode('ascii'). If you leave the encoding argument out, it defaults to 'utf-8'. - Signorino
Hey, I got this when I added your code: byte_array.decode('ascii') UnicodeDecodeError: 'ascii' codec can't decode byte 0x88 in position 0: ordinal not in range(128). When I removed the 'ascii' part I got: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x88 in position 0: invalid start byte - Stratovision
That means the data in your byte array isn't valid text in those encodings. You need to find an encoding that accepts it. The documentation lists some; 'hex' might be good. You can also use 'latin1', which maps the code points 0-255 to the bytes 0x00-0xff. Doing so lets you convert the result back to bytes later with the_string.encode('latin1'). I first heard about this in an answer to an unrelated question (where it solved a different problem). - Signorino
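A minimal sketch of the two round trips suggested in the comment above, assuming Python 3. The sample bytes are made up for illustration. Note that in Python 3 the hex form is usually obtained with the hex() method rather than through decode(), so that is what this sketch uses.

```python
# Arbitrary sample bytes (hypothetical data); 0x88 is the byte that
# neither the 'ascii' nor the 'utf-8' codec could decode in the question.
data = bytearray([0x88, 0x00, 0x41, 0xFF])

# latin1 maps every byte value 0x00-0xFF to the code point with the same
# number, so decoding cannot fail and encoding restores the same bytes.
as_text = data.decode('latin1')
restored = bytearray(as_text.encode('latin1'))

# Alternative: a hexadecimal text form, two characters per byte.
as_hex = data.hex()
restored_from_hex = bytearray.fromhex(as_hex)

print(restored == data)           # True
print(as_hex)                     # 880041ff
print(restored_from_hex == data)  # True
```

The latin1 string has one character per byte but is not meant to be readable; the hex string is readable but twice as long. Either can serve as a reversible string form of compressed data.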

You can use .decode('ascii') (leave the argument empty for UTF-8).

>>> print(bytearray("abcd", 'utf-8').decode())
abcd

Source: Convert bytes to a string?

Itol answered 21/11, 2018 at 7:18 Comment(0)
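For the decompression step in the question, what is usually needed is the original string of '0' and '1' characters rather than decoded text. Here is a rough sketch of one way to get it back, assuming Python 3. The helper names bits_to_bytes and bytes_to_bits are hypothetical, and storing the padding count in the first byte is a convention chosen here for illustration, not something from the question's code. Some such bookkeeping is needed because a final chunk shorter than 8 bits (for example '101') is stored as one byte and would otherwise come back as '00000101'.

```python
def bits_to_bytes(bit_string):
    """Pack a string of '0'/'1' characters into a bytearray.

    Assumed convention: the first byte records how many '0' padding bits
    (0-7) were appended to fill the last byte.
    """
    padding = (8 - len(bit_string) % 8) % 8
    padded = bit_string + '0' * padding
    packed = bytearray([padding])
    for i in range(0, len(padded), 8):
        packed.append(int(padded[i:i + 8], 2))
    return packed


def bytes_to_bits(packed):
    """Reverse bits_to_bytes: rebuild the exact original bit string."""
    padding = packed[0]
    # format(byte, '08b') keeps leading zeros, giving 8 characters per byte.
    bit_string = ''.join(format(byte, '08b') for byte in packed[1:])
    return bit_string[:len(bit_string) - padding]


bits = '1000100000101'            # 13 bits, so 3 padding bits are needed
packed = bits_to_bytes(bits)
print(bytes_to_bits(packed) == bits)  # True
```

The recovered bit string can then be walked through the Huffman code table in reverse to rebuild the original text.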
