Decrypting AES CBC in python from OpenSSL AES
Asked Answered
S

1

11

I need to decrypt a file encrypted on OpenSSL with python but I am not understanding the options of pycrypto.

Here what I do in OpenSSL

  1. openssl enc -aes-256-cbc -a -salt -pbkdf2 -iter 100000 -in "clear.txt" -out "crypt.txt" -pass pass:"mypassword"

  2. openssl enc -d -aes-256-cbc -a -pbkdf2 -iter 100000 -in "crypt.txt" -out "out.txt" -pass pass:"mypassword"

I tried (which obviously won't work)

obj2 = AES.new("mypassword", AES.MODE_CBC)
output = obj2.decrypt(text)

I just want to do the second step in python, but when looking at the sample:

https://pypi.org/project/pycrypto/

obj2 = AES.new('This is a key123', AES.MODE_CBC, 'This is an IV456')
obj2.decrypt(ciphertext)

I don't need IV, How do I specify the salt? the pbkdf2 hash? I am also looked at this thread

How to decrypt OpenSSL AES-encrypted files in Python?

but did not help.

Can someone show me how to do this using python?

thank you.

Salvation answered 29/9, 2020 at 19:48 Comment(0)
M
9

The OpenSSL statement uses PBKDF2 to create a 32 bytes key and a 16 bytes IV. For this, a random 8 bytes salt is implicitly generated and the specified password, iteration count and digest (default: SHA-256) are applied. The key/IV pair is used to encrypt the plaintext with AES-256 in CBC mode and PKCS7 padding, s. here. The result is returned in OpenSSL format, which starts with the 8 bytes ASCII encoding of Salted__, followed by the 8 bytes salt and the actual ciphertext, all Base64 encoded. The salt is needed for decryption, so that key and IV can be reconstructed.

Note that the password in the OpenSSL statement is actually passed without quotation marks, i.e. in the posted OpenSSL statement, the quotation marks are part of the password.

For the decryption in Python the salt and the actual ciphertext must first be determined from the encrypted data. With the salt the key/IV pair can be reconstructed. Finally, the key/IV pair can be used for decryption.

Example: With the posted OpenSSL statement, the plaintext

The quick brown fox jumps over the lazy dog

was encrypted into the ciphertext

U2FsdGVkX18A+AhjLZpfOq2HilY+8MyrXcz3lHMdUII2cud0DnnIcAtomToclwWOtUUnoyTY2qCQQXQfwDYotw== 

Decryption with Python is possible as follows (using PyCryptodome):

from Crypto.Protocol.KDF import PBKDF2
from Crypto.Hash import SHA256
from Crypto.Util.Padding import unpad
from Crypto.Cipher import AES
import base64

# Determine salt and ciphertext
encryptedDataB64 = 'U2FsdGVkX18A+AhjLZpfOq2HilY+8MyrXcz3lHMdUII2cud0DnnIcAtomToclwWOtUUnoyTY2qCQQXQfwDYotw=='
encryptedData = base64.b64decode(encryptedDataB64)
salt = encryptedData[8:16]
ciphertext = encryptedData[16:]

# Reconstruct Key/IV-pair
pbkdf2Hash = PBKDF2(b'"mypassword"', salt, 32 + 16, count=100000, hmac_hash_module=SHA256)
key = pbkdf2Hash[0:32]
iv = pbkdf2Hash[32:32 + 16]

# Decrypt with AES-256 / CBC / PKCS7 Padding
cipher = AES.new(key, AES.MODE_CBC, iv)
decrypted = unpad(cipher.decrypt(ciphertext), 16)

print(decrypted)

Edit - Regarding your comment: 16 MB should be possible, but for larger data the ciphertext would generally be read from a file and the decrypted data would be written to a file, in contrast to the example posted above.
Whether the data can be decrypted in one step ultimately depends on the available memory. If the memory is not sufficient, the data must be processed in chunks.
When using chunks it would make more sense not to Base64 encode the encrypted data but to store them directly in binary format. This is possible by omitting the -a option in the OpenSSL statement. Otherwise it must be ensured that always integer multiples of the block size (relative to the undecoded ciphertext) are loaded, where 3 bytes of the undecoded ciphertext correspond to 4 bytes of the Base64 encoded ciphertext.

In the case of the binary stored ciphertext: During decryption only the first block (16 bytes) should be (binary) read in the first step. From this, the salt can be determined (the bytes 8 to 16), then the key and IV (analogous to the posted code above).
The rest of the ciphertext can be (binary) read in chunks of suitable size ( = a multiple of the block size, e.g. 1024 bytes). Each chunk is encrypted/decrypted separately, see multiple encrypt/decrypt-calls. For reading/writing files in chunks with Python see e.g. here.
Further details are best answered within the scope of a separate question.

Melisandra answered 29/9, 2020 at 21:16 Comment(2)
thank you. But one thing I did not explain is that my encryptedData is quite large. It is a 16 MBytes files. On your solution, I would have to decrypt the whole thing. I am not sure if python can handle. Since it a block cypher, I believe we can decypher in blocks. Do you know how to do it? Assume your encryptedDataB64 is a very large file. thank you!Salvation
Thanks man, I really appreciate the clarification and code sample. The threads on the subject are not very good.Salvation

© 2022 - 2024 — McMap. All rights reserved.