Switching endianness in the middle of a struct.unpack format string

I have a bunch of binary data (the contents of a video game save-file, as it happens) where a part of the data contains both little-endian and big-endian integer values. Naively, without reading much of the docs, I tried to unpack it this way...

struct.unpack(
    '3sB<H<H<H<H4s<I<I32s>IbBbBbBbB12s20sBB4s',
    string_data
)

...and of course I got this cryptic error message:

struct.error: bad char in struct format

The problem is that a struct format string accepts a byte-order character (<, >, =, ! or @) only as its first character, where it applies to every field; individual fields cannot be marked with their own endianness. A valid format string here would be something like

struct.unpack(
    '<3sBHHHH4sII32sIbBbBbBbB12s20sBB4s',
    string_data
)

except that this will flip the endianness of the third I field (parsing it as little-endian, when I really want to parse it as big-endian).
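To make that rule concrete, here is a minimal sketch on four made-up bytes (the `data` value and the variable names are illustrative assumptions, not part of the save file). It shows that a leading byte-order character governs every field, and that a second one in mid-format is rejected:

```python
import struct

data = bytes([0x01, 0x02, 0x03, 0x04])  # made-up sample bytes

# The leading character sets the byte order for all fields that follow.
little = struct.unpack('<HH', data)
big = struct.unpack('>HH', data)

# A byte-order character anywhere else is rejected as a bad format char.
try:
    struct.unpack('<H>H', data)
    mid_format_ok = True
except struct.error:
    mid_format_ok = False
```

Here `little` comes out as `(0x0201, 0x0403)`, `big` as `(0x0102, 0x0304)`, and `mid_format_ok` ends up `False`.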

Is there an easy and/or "Pythonic" solution to my problem? I have already thought of three possible solutions, but none of them is particularly elegant. In the absence of better ideas I'll probably go with number 3:

1. I could extract a substring and parse it separately:

(my.f1, my.f2, ...) = struct.unpack('<3sBHHHH4sII32sIbBbBbBbB12s20sBB4s', string_data)
(my.f11,) = struct.unpack('>I', string_data[56:60])

2. I could swap the bytes of the field after the fact:

(my.f1, my.f2, ...) = struct.unpack('<3sBHHHH4sII32sIbBbBbBbB12s20sBB4s', string_data)
my.f11 = swap32(my.f11)

3. I could just change my downstream code to expect this field to be represented differently. It's actually a bitmask, not an arithmetic integer, so it wouldn't be too hard to flip around all the bitmasks I'm using with it; but the big-endian versions of these bitmasks are more mnemonically relevant than the little-endian versions.
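The first two options above can be sketched in a slightly more general form. This is only one possible arrangement, under stated assumptions: `unpack_mixed` and `SAVE_CHUNKS` are hypothetical names, the `swap32` body is a guess at what such a helper would do (the post does not define it), and the sample record is invented purely to exercise the code. The idea is to split the format wherever the byte order changes and walk the buffer with `struct.Struct.unpack_from`:

```python
import struct

# Hypothetical split of the format at each change of byte order.
SAVE_CHUNKS = (
    '<3sBHHHH4sII32s',       # little-endian header, 56 bytes
    '>I',                    # the one big-endian bitmask, 4 bytes
    '<bBbBbBbB12s20sBB4s',   # little-endian remainder, 46 bytes
)

def unpack_mixed(chunks, data):
    """Unpack data chunk by chunk, each chunk carrying its own byte order."""
    values = []
    offset = 0
    for fmt in chunks:
        layout = struct.Struct(fmt)
        values.extend(layout.unpack_from(data, offset))
        offset += layout.size
    if offset != len(data):
        raise struct.error('expected %d bytes, got %d' % (offset, len(data)))
    return tuple(values)

def swap32(value):
    """Reverse the byte order of an unsigned 32-bit integer (assumed helper)."""
    return struct.unpack('<I', struct.pack('>I', value))[0]

# Made-up sample record, built only to exercise the functions above.
sample = (
    struct.pack('<3sBHHHH4sII32s', b'SAV', 1, 2, 3, 4, 5, b'head', 6, 7, b'x' * 32)
    + struct.pack('>I', 0x80000001)
    + struct.pack('<bBbBbBbB12s20sBB4s', -1, 1, -2, 2, -3, 3, -4, 4,
                  b'a' * 12, b'b' * 20, 8, 9, b'tail')
)

fields = unpack_mixed(SAVE_CHUNKS, sample)
all_little = struct.unpack('<3sBHHHH4sII32sIbBbBbBbB12s20sBB4s', sample)
```

With this sample, `fields[10]` is the big-endian reading of the bitmask, and `swap32(all_little[10])` gives the same value when starting from the all-little-endian parse. Because the `<` and `>` modes use standard sizes with no padding, the chunk sizes add up to the size of the single combined format.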

import numpy as np

# A structured dtype can give each field its own byte order.
t = np.dtype('>u4,<u4')          # compound type: two 4-byte unsigned ints, big-endian then little-endian
a = np.zeros(shape=1, dtype=t)   # create an array of length one with the above type
a[0][0] = 1                      # assign the first (big-endian) uint
a[0][1] = 1                      # assign the second (little-endian) uint
data = a.tobytes()               # b'\x00\x00\x00\x01\x01\x00\x00\x00'
b = np.frombuffer(data, dtype=t)          # yields array([(1, 1)], dtype=[('f0', '>u4'), ('f1', '<u4')])
c = np.frombuffer(data, dtype=np.uint32)  # native byte order; on a little-endian machine: array([16777216, 1], dtype=uint32)
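Applied to something shaped more like a save-file record, the same structured-dtype idea might look like the sketch below. The field names, the four-field layout, and the sample bytes are all assumptions made for illustration; the real format in the question has many more fields.

```python
import struct
import numpy as np

# Hypothetical packed record: each field declares its own byte order.
record = np.dtype([
    ('magic', 'S3'),      # 3 raw bytes
    ('version', 'u1'),    # single byte, byte order irrelevant
    ('counter', '<u2'),   # little-endian 16-bit value
    ('flags', '>u4'),     # big-endian 32-bit bitmask
])

# Made-up bytes matching that layout.
raw = struct.pack('<3sBH', b'SAV', 1, 515) + struct.pack('>I', 0x80000001)

parsed = np.frombuffer(raw, dtype=record)[0]
magic = bytes(parsed['magic'])
counter = int(parsed['counter'])
flags = int(parsed['flags'])
```

One caveat worth knowing: NumPy's `S` fields drop trailing NUL bytes when read back, so `struct` may be the safer choice for fixed-width binary blobs.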
