How to hash int/long using hashlib in Python?

Asked 21/1, 2018 at 12:59 Answered 4/1, 2023 at 11:35

I'm developing a set of cryptographic algorithms / protocols for educational purposes. Specifically, I am currently working on OAEP encoding.

OAEP involves the use of cryptographic hash functions, so I wanted to use the hashlib module from the Python 3 standard library.

Let's say I have a 128-bit integer and I want its SHA-256 digest. How can I do this in Python? All I could find was how to hash strings (or byte strings) with hashlib.sha256().

Rodrickrodrigez answered 21/1, 2018 at 12:59 Comment(0)

Hashes work on bytes, sequences of integer values in the range 0-255; this is independent of the implementation language. You'd have to convert your 128-bit integer into a series of bytes representing that value. That's why the hashlib module only accepts bytes-like objects (such as b'...' literals) and not integers.

How you do this depends entirely on the use case; check how the OAEP standard you are following specifies the byte representation of such an integer.
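If the standard in question is RFC 8017 (PKCS #1 v2.2), integers are converted to byte strings with a primitive called I2OSP, which is big-endian and fixed-length, and OAEP's mask generation function MGF1 hashes a seed concatenated with a 4-byte counter encoded that way. Here is a minimal sketch under that assumption; the function names i2osp and mgf1 are my own, and SHA-256 is just one possible choice of hash:

```python
import hashlib


def i2osp(x: int, length: int) -> bytes:
    """Integer-to-Octet-String: big-endian, exactly `length` bytes."""
    if x < 0 or x >= 256 ** length:
        raise ValueError("integer too large")
    return x.to_bytes(length, "big")


def mgf1(seed: bytes, mask_len: int, hash_func=hashlib.sha256) -> bytes:
    """Sketch of MGF1: hash seed || I2OSP(counter, 4) until enough bytes exist."""
    output = b""
    counter = 0
    while len(output) < mask_len:
        output += hash_func(seed + i2osp(counter, 4)).digest()
        counter += 1
    return output[:mask_len]


mask = mgf1(b"seed", 40)
print(len(mask))  # 40
```

Note that the integer never reaches the hash function directly; it is always turned into bytes with an agreed length and byte order first.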

For example, you could take the decimal string representation of the integer, which is a sequence of ASCII digits. This is not a very efficient method, as it can take up to 39 bytes:

>>> import hashlib
>>> 2 ** 128 - 1  # largest 128-bit value
340282366920938463463374607431768211455
>>> len(str(2 ** 128 - 1))
39
>>> str(2 ** 128 - 1).encode('ASCII')  # ascii bytes
b'340282366920938463463374607431768211455'
>>> hashlib.sha256(str(2 ** 128 - 1).encode('ASCII')).hexdigest()
'f315ff319bf588e202110ab686fb8c3dbca12b4df9fbd844615b566a2fff3e75'

A much more efficient method would be to take those 128 bits, divide them into 16 bytes and hash those. You then need to decide on byte order (little or big endian). If you need to hash multiple integer values, then the Python struct module can help with producing the bytes, but only for integer values up to 8 bytes (you'd have to split up your larger numbers first). For single values, just use the int.to_bytes() method:

>>> (2 ** 128 - 1).to_bytes(16, 'little', signed=False)
b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
>>> hashlib.sha256((2 ** 128 - 1).to_bytes(16, 'little', signed=False)).hexdigest()
'5ac6a5945f16500911219129984ba8b387a06f24fe383ce4e81a73294065461b'
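To make the byte order and struct points concrete, here is a small sketch. The helper name sha256_of_int and the sample value are made up for illustration; the value is deliberately not symmetric, so the two byte orders produce different input bytes:

```python
import hashlib
import struct


def sha256_of_int(value: int, length: int = 16, byteorder: str = "big") -> str:
    """Hash a non-negative integer as a fixed-width byte string."""
    return hashlib.sha256(value.to_bytes(length, byteorder)).hexdigest()


value = 0x0123456789ABCDEF0011223344556677

# Byte order matters: the two encodings are different inputs to the hash.
big = sha256_of_int(value, byteorder="big")
little = sha256_of_int(value, byteorder="little")
print(big != little)  # True

# struct only packs integers up to 8 bytes, so split into two 64-bit halves.
high, low = value >> 64, value & (2 ** 64 - 1)
packed = struct.pack(">QQ", high, low)
print(packed == value.to_bytes(16, "big"))  # True
```

Both sides of a protocol have to agree on the width and the byte order, otherwise they will compute different digests for the same number.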

Timely answered 21/1, 2018 at 13:3 Comment(2)

I understand that; however, an integer is still a series of bytes. I.e. 3 = 00000011 or 16 = 00010000 (or longer integers are just longer series of bits which can be divided into blocks of 8). What I don't understand is, how I should feed this integer into Python's hashlib.sha256() implementation. I do not understand what it expects as argument. – Airwaves 21/1, 2018 at 13:9

@OranCanÖren: that's what my answer tries to convey to you. You represent the numeric value as bytes, and how you do that depends on your use case. The standard would detail what method to use. – Timely 21/1, 2018 at 13:11

-1

I used the keccak module from pycryptodome (https://pycryptodome.readthedocs.io/en/latest/src/hash/keccak.html) to deal with integers. Example:

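from Crypto.Hash import keccak

def keccak256_int(value):
    # Smallest byte count that holds value; counting hex digits breaks on odd lengths
    bytes_len = (value.bit_length() + 7) // 8
    keccak_hash = keccak.new(digest_bits=256)
    keccak_hash.update(value.to_bytes(bytes_len, byteorder='big'))
    return keccak_hash.digest()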

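Once the leaves (integers) are hashed, later calls to update() can be fed the previous hash output directly, since it is already bytes. By default pycryptodome does not allow update() on an object after digest() has been called, so create a new hash object with keccak.new() first: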

keccak_hash.update(<bytes>)
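Assuming the "leaves" are the leaves of a hash tree, the pattern could look like the sketch below. To keep it runnable with the standard library only, it swaps in hashlib.sha256 for Keccak, so the digests will not match what pycryptodome's keccak produces; the function names are hypothetical:

```python
import hashlib


def hash_leaf(value: int) -> bytes:
    """Hash an integer leaf as minimal-length big-endian bytes."""
    length = (value.bit_length() + 7) // 8
    return hashlib.sha256(value.to_bytes(length, "big")).digest()


def hash_pair(left: bytes, right: bytes) -> bytes:
    """Hash two child digests into a parent digest."""
    h = hashlib.sha256()  # fresh hash object for every node
    h.update(left)        # digests are already bytes, no conversion needed
    h.update(right)
    return h.digest()


leaves = [hash_leaf(v) for v in (1, 256, 2 ** 128 - 1, 42)]
root = hash_pair(hash_pair(leaves[0], leaves[1]),
                 hash_pair(leaves[2], leaves[3]))
print(root.hex())
```

Only the leaves need the integer-to-bytes step; every level above them works on digests, which are bytes already.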

Murky answered 4/1, 2023 at 11:35 Comment(0)
