Unpack format characters in Python

I need the Python equivalent of this Perl call:

unpack("nNccH*", string_val)

I need the "nNccH*" data format expressed in Python format characters.

In Perl, it unpacks binary data into five variables:

  • Unsigned 16-bit value in "network" (big-endian) byte order
  • Unsigned 32-bit value in "network" (big-endian) byte order
  • Signed char (8-bit integer) value
  • Signed char (8-bit integer) value
  • Hexadecimal string, high nibble first

But I can't work out how to do this in Python.

More:

bstring = ''
while DataByte = client[0].recv(1):
    bstring += DataByte
print len(bstring)
if len(bstring):
    a, b, c, d, e = unpack("nNccH*", bstring)

I have never written Perl or Python, but my current task is to rewrite in Python a multithreaded server that was originally written in Perl...

Dynode answered 7/2, 2012 at 12:34 Comment(6)
I can find the equivalent of everything except for H*, for which I would assume you would play with p or s.Stotts
You will need to calculate the string size; this answer could be helpful: https://mcmap.net/q/1922406/-python-struct-unpack Kagoshima
"while DataByte = client[0].recv(1):" is not Python. This can never work.Tolu
@SenthilKumaran: AFAIR * just means "as many elements as are left", so he can unpack everything before the H*, and then just grab the rest without unpacking it.Tiercel
By the way, Sir D, thanks for editing and clarifying the question. The last code snippet makes little sense though, as S.Lott noticedTiercel
See THIS question for how to repeatedly apply a format string!Vaivode
S
8

The Perl format "nNcc" is equivalent to the Python format "!HLbb". There is no direct equivalent in Python for Perl's "H*".

There are two problems.

  • Python's struct.unpack does not accept the wildcard character, *
  • Python's struct.unpack does not "hexlify" data strings
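
Both limitations can also be sidestepped without any wildcard handling. As a minimal sketch, assuming Python 3 (where `bytes.hex()` is available) and using a made-up function name `unpack_nNccH` and made-up sample bytes, the fixed-size header is unpacked with "!HLbb" and whatever remains is hex-encoded by hand:

```python
import struct

HEADER_FMT = "!HLbb"                       # Perl "nNcc": big-endian u16, u32, two signed chars
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 2 + 4 + 1 + 1 bytes

def unpack_nNccH(data):
    """Hypothetical helper mimicking Perl's unpack("nNccH*", data)."""
    a, b, c, d = struct.unpack(HEADER_FMT, data[:HEADER_SIZE])
    # Perl's H* has no struct equivalent: hex-encode the leftover bytes manually.
    e = data[HEADER_SIZE:].hex()
    return a, b, c, d, e

sample = b"\x00\x01\x00\x00\x00\x02\x0a\x04\x1f\xba"  # illustrative input only
print(unpack_nNccH(sample))  # (1, 2, 10, 4, '1fba')
```

This only covers the case where the variable-length part comes last, which is the case in the question.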

The first problem can be worked around with a helper function such as the unpack function defined below.

The second problem can be solved using binascii.hexlify:

import struct
import binascii

def unpack(fmt, data):
    """
    Return struct.unpack(fmt, data) with the optional single * in fmt replaced with
    the appropriate number, given the length of data.
    """
    # https://mcmap.net/q/1922407/-auto-repeat-flag-in-a-pack-format-string
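    try:
        return struct.unpack(fmt, data)
    except struct.error:
        idx = fmt.find('*')
        if idx < 1:
            raise  # no usable '*' in fmt, so the error is a genuine one
        # Size of the format with '*' dropped, i.e. counting before_char once.
        flen = struct.calcsize(fmt.replace('*', ''))
        alen = len(data)
        before_char = fmt[idx-1]
        # Repeat count that lets before_char consume all the leftover bytes.
        n = (alen-flen)//struct.calcsize(before_char)+1
        fmt = ''.join((fmt[:idx-1], str(n), before_char, fmt[idx+1:]))
        return struct.unpack(fmt, data)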

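# Read in binary mode so that Python 3 returns bytes rather than str.
with open('data', 'rb') as f:
    data = f.read()
x = list(unpack("!HLbbs*", data))
# x[-1].encode('hex') works in Python 2, but not in Python 3
# hexlify returns bytes in Python 3; decode to get a text string.
x[-1] = binascii.hexlify(x[-1]).decode('ascii')
print(x)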

When tested on data produced by this Perl script:

$line = pack("nNccH*", 1, 2, 10, 4, '1fba');
print "$line";

The Python script yields

[1, 2, 10, 4, '1fba']
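
If Perl is not at hand, the same test bytes can be built in Python. The following is only a sketch: it assumes Python 3, and it uses `bytes.fromhex` as a stand-in for Perl's H* on the packing side.

```python
import struct

# Python counterpart of: pack("nNccH*", 1, 2, 10, 4, '1fba')
line = struct.pack("!HLbb", 1, 2, 10, 4) + bytes.fromhex("1fba")
print(len(line))  # 8 header bytes + 2 payload bytes

# Reverse direction: fixed header first, then hex-encode the tail.
header = struct.unpack_from("!HLbb", line)
tail = line[struct.calcsize("!HLbb"):].hex()
print(list(header) + [tail])  # [1, 2, 10, 4, '1fba']
```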
Scorpaenid answered 7/2, 2012 at 13:31 Comment(3)
An alternative to binascii.hexlify() is str.encode("hex").Munsey
If you want Python 3 compatibility, you'll need // when calculating n, otherwise str(n) produces '16.0' and breaks the format string.Same
In Python 3.4 and newer, there is struct.iter_unpack. Here is a demonstration.Vaivode
T
7

The equivalent Python function you're looking for is struct.unpack. Documentation of the format string is here: http://docs.python.org/library/struct.html

You will have a better chance of getting help if you actually explain what kind of unpacking you need. Not everyone knows Perl.

Tiercel answered 7/2, 2012 at 12:37 Comment(4)
Thanks. I already read the perl and python unpack docs. But so far i don't understand some moments.Dynode
@Eli - there could be minor trouble in a direct translation. For example, how would one do H* in Python? I guess the user could have worded the question better.Stotts
@SenthilKumaran: note that the user has edited the question after my answer. Before the edit he didn't lay out the meanings of the format chars in PerlTiercel
@SirD: "so far i don't understand some moments". Please be specific on what you do not understand. Please update the question to say what you do not understand.Tolu
