Python Encrypting with PyCrypto AES

Asked 6/1, 2013 at 6:5 Answered 8/6, 2018 at 7:32

I just found pycrypto today, and I've been working on my AES encryption class. Unfortunately it only half-works. self.h.md5 outputs md5 hash in hex format, and is 32byte. This is the output. It seems to decrypt the message, but it puts random characters after decryption, in this case \n\n\n... I think I have a problem with block size of self.data, anyone know how to fix this?

Jans-MacBook-Pro:test2 jan$ ../../bin/python3 data.py b'RLfGmn5jf5WTJphnmW0hXG7IaIYcCRpjaTTqwXR6yiJCUytnDib+GQYlFORm+jIctest 1 2 3 4 5 endtest\n\n\n\n\n\n\n\n\n\n'

from Crypto.Cipher import AES
from base64 import b64encode, b64decode
from os import urandom

class Encryption():
    def __init__(self):
        self.h = Hash()

    def values(self, data, key):
        self.data = data
        self.key = key
        self.mode = AES.MODE_CBC
        self.iv = urandom(16)
        if not self.key:
            self.key = Cfg_Encrypt_Key
        self.key = self.h.md5(self.key, True)

    def encrypt(self, data, key):
        self.values(data, key)
        return b64encode(self.iv + AES.new(self.key, self.mode, self.iv).encrypt(self.data))

    def decrypt(self, data, key):
        self.values(data, key)
        self.iv = b64decode(self.data)[:16]
        return AES.new(self.key, self.mode, self.iv).decrypt(b64decode(self.data)[16:])

Protocol answered 6/1, 2013 at 6:5 Comment(1)

Where is Hash() defined here? – Onomatology 14/9, 2015 at 16:20

To be honest, the characters "\n\n\n\n\n\n\n\n\n\n" don't look that random to me. ;-)

You are using AES in CBC mode. That requires length of plaintext and ciphertext to be always a multiple of 16 bytes. With the code you show, you should actually see an exception being raised when data passed to encrypt() does not fulfill such condition. It looks like you added enough new line characters ('\n' to whatever the input is until the plaintext happened to be aligned.

Apart from that, there are two common ways to solve the alignment issue:

Switch from CBC (AES.MODE_CBC) to CFB (AES.MODE_CFB). With the default segment_size used by PyCrypto, you will not have any restriction on plaintext and ciphertext lengths.
Keep CBC and use a padding scheme like PKCS#7, that is:
- before encrypting a plaintext of X bytes, append to the back as many bytes you need to to reach the next 16 byte boundary. All padding bytes have the same value: the number of bytes that you are adding:
```
length = 16 - (len(data) % 16)
data += bytes([length])*length
```
  That's Python 3 style. In Python 2, you would have:
```
length = 16 - (len(data) % 16)
data += chr(length)*length
```
- after decrypting, remove from the back of the plaintext as many bytes as indicated by padding:
```
data = data[:-data[-1]]
```

Even though I understand in your case it is just a class exercise, I would like to point out that it is insecure to send data without any form of authentication (e.g. a MAC).

Ned answered 7/1, 2013 at 22:36 Comment(10)

Oh wow, your padding solution is really neat. I went with approach where I added $^End#Something^$ to end of body that is to be encrypted, and than also bunch of # signs till whole body is multiple of 16. Than when Im done decrypting, I rstrip #'s and remove my end string. Anyway, what do you mean with sending data without form of authentication? – Protocol 8/1, 2013 at 4:51

If you mean hmac signing, yes Ive builtin sha256 hmac signing into my encrypt function. Anyway, out of curiosity I tried your padding solution, and it doesn't work. I fiddled with it to the point I got padding to work, but unpadding just deletes everything. def pad2(self, data): if not isinstance(data, bytes): self.data = bytes(self.data, 'utf-8') size = (self.blockSize - len(self.data)) % self.blockSize self.data += bytes(size, 'ascii') return self.data def unpad2(self, data): return data[:-data[-1]] – Protocol 8/1, 2013 at 5:24

Yes, HMAC is OK. It's important to have it because serious attacks exist on the padding part (padding oracle attack). In Python/PyCrypto, all data you encrypt must always be a binary string, not a text string. That was not important in Python2, but in Python3 they are two different things now. – Ned 8/1, 2013 at 7:52

so you're saying I should never convert encrypted data to str, but keep it in bytes at all times? Why? – Protocol 10/1, 2013 at 4:27

Because plaintexts and ciphertexts are not text strings. They must be a sequence of exactly 16 bytes. If take a text strings, I can encode it into bytes in several different ways (UTF-8, UTF-16, ASCII, etc). That distinction is very important in Python3. – Ned 17/1, 2013 at 6:44

@Ned Really thanks for your answer!! – Rika 13/12, 2015 at 9:48

Thanks for this neat solution @Ned ! Just wanted to add that data[:-data[-1]] did not work out of the box for me with python 2. I had to use data[:-ord(data[-1])] instead. – Hadj 22/10, 2016 at 20:51

@mmuller, that's because in Python 2 the decoded data is a string, so the last position is still a string, hence ord() is required. In Python 3 the data is already a sequence of bytes, so any given position is an int. Try >>> b'ABC'[-1] in a python 3 and 2 terminal to see the difference – Navarrete 26/10, 2018 at 9:43

@SquareRootOfTwentyThree, question: what if the plaintext is already aligned with a 16-byte boundary? That would lead to 0 padding bytes, and your suggested data = data[:-data[-1]] will truncate the message itself, no? – Navarrete 26/10, 2018 at 9:46

Oh, nevermind... an aligned message will lead to 16 bytes of padding, not 0. Clever spec :-) I'll keep my comment anyway as it might help others who stumble on this – Navarrete 26/10, 2018 at 9:49

You can use a fix character as long as you remember the length of your initial payload, so you don't "throw" useful end bytes away. Try this:

import base64

from Crypto.Cipher import AES

def encrypt(payload, salt, key):
    return AES.new(key, AES.MODE_CBC, salt).encrypt(r_pad(payload))


def decrypt(payload, salt, key, length):
    return AES.new(key, AES.MODE_CBC, salt).decrypt(payload)[:length]


def r_pad(payload, block_size=16):
    length = block_size - (len(payload) % block_size)
    return payload + chr(length) * length


print(decrypt(encrypt("some cyphertext", "b" * 16, "b" * 16), "b" * 16, "b" * 16, len("some cyphertext")))

Hornbook answered 8/6, 2018 at 7:32 Comment(0)

from hashlib import md5
from Crypto.Cipher import AES
from Crypto import Random
import base64

def derive_key_and_iv(password, salt, key_length, iv_length):
    d = d_i = ''
    while len(d) < key_length + iv_length:
        d_i = md5(d_i + password + salt).digest()
        d += d_i
    return d[:key_length], d[key_length:key_length+iv_length]

def encrypt(in_file, out_file, password, key_length=32):
    bs = AES.block_size
    salt = Random.new().read(bs - len('Salted__'))
    key, iv = derive_key_and_iv(password, salt, key_length, bs)
    cipher = AES.new(key, AES.MODE_CBC, iv)
    #print in_file
    in_file = file(in_file, 'rb')
    out_file = file(out_file, 'wb')
    out_file.write('Salted__' + salt)
    finished = False
    while not finished:
        chunk = in_file.read(1024 * bs)
        if len(chunk) == 0 or len(chunk) % bs != 0:
            padding_length = bs - (len(chunk) % bs)
            chunk += padding_length * chr(padding_length)
            finished = True
        out_file.write(cipher.encrypt(chunk))
    in_file.close()
    out_file.close()

def decrypt(in_file, out_file, password, key_length=32):
    bs = AES.block_size

    in_file = file(in_file, 'rb')
    out_file = file(out_file, 'wb')
    salt = in_file.read(bs)[len('Salted__'):]
    key, iv = derive_key_and_iv(password, salt, key_length, bs)
    cipher = AES.new(key, AES.MODE_CBC, iv)
    next_chunk = ''
    finished = False
    while not finished:
        chunk, next_chunk = next_chunk, cipher.decrypt(in_file.read(1024 * bs))
        if len(next_chunk) == 0:
            padding_length = ord(chunk[-1])
            if padding_length < 1 or padding_length > bs:
               raise ValueError("bad decrypt pad (%d)" % padding_length)
            # all the pad-bytes must be the same
            if chunk[-padding_length:] != (padding_length * chr(padding_length)):
               # this is similar to the bad decrypt:evp_enc.c from openssl program
               raise ValueError("bad decrypt")
            chunk = chunk[:-padding_length]
            finished = True
        out_file.write(chunk)    
    in_file.close()
    out_file.close()

def encode(in_file, out_file):
    in_file = file(in_file, 'rb')
    out_file = file(out_file, 'wb')
    data = in_file.read()
    out_file.write(base64.b64encode(data))    
    in_file.close()
    out_file.close()

def decode(in_file, out_file):
    in_file = file(in_file, 'rb')
    out_file = file(out_file, 'wb')
    data = in_file.read()
    out_file.write(base64.b64decode(data))    
    in_file.close()
    out_file.close()

Rosemarie answered 28/7, 2014 at 13:7 Comment(3)

This is just code without any hints, maybe it is why nobody upvoted. Could worth adding some prose. – Epictetus 8/6, 2018 at 8:2

OMG, you made my day. Your (very spurious) comments did include the tip that the padding must the length of padding – Talca 1/9, 2020 at 13:23

Please note that padding_length should not be zero, if that would be the case make it the size of the key – Talca 2/9, 2020 at 12:39

AES.new().encrypt() and .decrypt() take as both input and output strings whose length is a multiple of 16. You have to fix it in one way or another. For example you can store the real length at the start, and use that to truncate the decrypted string.

Note also that while it's the only restriction for AES, other modules (notably in Crypto.PublicKey) have additional restrictions that comes from their mathematical implementation and that shouldn't (in my opinion) be visible to the end user, but are. For example Crypto.PublicKey.ElGamal will encrypt any short string, but if it starts with null characters, they are lost upon decryption.

Dropping answered 6/1, 2013 at 9:7 Comment(1)

Aha, I see. I've read somewhere that the length has to do something with 16byte multiplier, didn't really focus it. Thanks for making it clear. I suppose I will implant some sort of ending identifier in self.data and fill remaining to 16 multiplier with garbage data, than when decrypting delete everything from and including ending identifier on. This way I will eliminate the need to store additional length value. – Protocol 6/1, 2013 at 15:48

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags