Python Encrypting with PyCrypto AES
Asked Answered
P

4

20

I just found pycrypto today, and I've been working on my AES encryption class. Unfortunately it only half-works. self.h.md5 outputs md5 hash in hex format, and is 32byte. This is the output. It seems to decrypt the message, but it puts random characters after decryption, in this case \n\n\n... I think I have a problem with block size of self.data, anyone know how to fix this?

Jans-MacBook-Pro:test2 jan$ ../../bin/python3 data.py b'RLfGmn5jf5WTJphnmW0hXG7IaIYcCRpjaTTqwXR6yiJCUytnDib+GQYlFORm+jIctest 1 2 3 4 5 endtest\n\n\n\n\n\n\n\n\n\n'

from Crypto.Cipher import AES
from base64 import b64encode, b64decode
from os import urandom

class Encryption():
    def __init__(self):
        self.h = Hash()

    def values(self, data, key):
        self.data = data
        self.key = key
        self.mode = AES.MODE_CBC
        self.iv = urandom(16)
        if not self.key:
            self.key = Cfg_Encrypt_Key
        self.key = self.h.md5(self.key, True)

    def encrypt(self, data, key):
        self.values(data, key)
        return b64encode(self.iv + AES.new(self.key, self.mode, self.iv).encrypt(self.data))

    def decrypt(self, data, key):
        self.values(data, key)
        self.iv = b64decode(self.data)[:16]
        return AES.new(self.key, self.mode, self.iv).decrypt(b64decode(self.data)[16:])
Protocol answered 6/1, 2013 at 6:5 Comment(1)
Where is Hash() defined here?Onomatology
N
85

To be honest, the characters "\n\n\n\n\n\n\n\n\n\n" don't look that random to me. ;-)

You are using AES in CBC mode. That requires length of plaintext and ciphertext to be always a multiple of 16 bytes. With the code you show, you should actually see an exception being raised when data passed to encrypt() does not fulfill such condition. It looks like you added enough new line characters ('\n' to whatever the input is until the plaintext happened to be aligned.

Apart from that, there are two common ways to solve the alignment issue:

  1. Switch from CBC (AES.MODE_CBC) to CFB (AES.MODE_CFB). With the default segment_size used by PyCrypto, you will not have any restriction on plaintext and ciphertext lengths.

  2. Keep CBC and use a padding scheme like PKCS#7, that is:

    • before encrypting a plaintext of X bytes, append to the back as many bytes you need to to reach the next 16 byte boundary. All padding bytes have the same value: the number of bytes that you are adding:

      length = 16 - (len(data) % 16)
      data += bytes([length])*length
      

      That's Python 3 style. In Python 2, you would have:

      length = 16 - (len(data) % 16)
      data += chr(length)*length
      
    • after decrypting, remove from the back of the plaintext as many bytes as indicated by padding:

      data = data[:-data[-1]]
      

Even though I understand in your case it is just a class exercise, I would like to point out that it is insecure to send data without any form of authentication (e.g. a MAC).

Ned answered 7/1, 2013 at 22:36 Comment(10)
Oh wow, your padding solution is really neat. I went with approach where I added $^End#Something^$ to end of body that is to be encrypted, and than also bunch of # signs till whole body is multiple of 16. Than when Im done decrypting, I rstrip #'s and remove my end string. Anyway, what do you mean with sending data without form of authentication?Protocol
If you mean hmac signing, yes Ive builtin sha256 hmac signing into my encrypt function. Anyway, out of curiosity I tried your padding solution, and it doesn't work. I fiddled with it to the point I got padding to work, but unpadding just deletes everything. def pad2(self, data): if not isinstance(data, bytes): self.data = bytes(self.data, 'utf-8') size = (self.blockSize - len(self.data)) % self.blockSize self.data += bytes(size, 'ascii') return self.data def unpad2(self, data): return data[:-data[-1]]Protocol
Yes, HMAC is OK. It's important to have it because serious attacks exist on the padding part (padding oracle attack). In Python/PyCrypto, all data you encrypt must always be a binary string, not a text string. That was not important in Python2, but in Python3 they are two different things now.Ned
so you're saying I should never convert encrypted data to str, but keep it in bytes at all times? Why?Protocol
Because plaintexts and ciphertexts are not text strings. They must be a sequence of exactly 16 bytes. If take a text strings, I can encode it into bytes in several different ways (UTF-8, UTF-16, ASCII, etc). That distinction is very important in Python3.Ned
@Ned Really thanks for your answer!!Rika
Thanks for this neat solution @Ned ! Just wanted to add that data[:-data[-1]] did not work out of the box for me with python 2. I had to use data[:-ord(data[-1])] instead.Hadj
@mmuller, that's because in Python 2 the decoded data is a string, so the last position is still a string, hence ord() is required. In Python 3 the data is already a sequence of bytes, so any given position is an int. Try >>> b'ABC'[-1] in a python 3 and 2 terminal to see the differenceNavarrete
@SquareRootOfTwentyThree, question: what if the plaintext is already aligned with a 16-byte boundary? That would lead to 0 padding bytes, and your suggested data = data[:-data[-1]] will truncate the message itself, no?Navarrete
Oh, nevermind... an aligned message will lead to 16 bytes of padding, not 0. Clever spec :-) I'll keep my comment anyway as it might help others who stumble on thisNavarrete
H
4

You can use a fix character as long as you remember the length of your initial payload, so you don't "throw" useful end bytes away. Try this:

import base64

from Crypto.Cipher import AES

def encrypt(payload, salt, key):
    return AES.new(key, AES.MODE_CBC, salt).encrypt(r_pad(payload))


def decrypt(payload, salt, key, length):
    return AES.new(key, AES.MODE_CBC, salt).decrypt(payload)[:length]


def r_pad(payload, block_size=16):
    length = block_size - (len(payload) % block_size)
    return payload + chr(length) * length


print(decrypt(encrypt("some cyphertext", "b" * 16, "b" * 16), "b" * 16, "b" * 16, len("some cyphertext")))
Hornbook answered 8/6, 2018 at 7:32 Comment(0)
R
3
from hashlib import md5
from Crypto.Cipher import AES
from Crypto import Random
import base64

def derive_key_and_iv(password, salt, key_length, iv_length):
    d = d_i = ''
    while len(d) < key_length + iv_length:
        d_i = md5(d_i + password + salt).digest()
        d += d_i
    return d[:key_length], d[key_length:key_length+iv_length]

def encrypt(in_file, out_file, password, key_length=32):
    bs = AES.block_size
    salt = Random.new().read(bs - len('Salted__'))
    key, iv = derive_key_and_iv(password, salt, key_length, bs)
    cipher = AES.new(key, AES.MODE_CBC, iv)
    #print in_file
    in_file = file(in_file, 'rb')
    out_file = file(out_file, 'wb')
    out_file.write('Salted__' + salt)
    finished = False
    while not finished:
        chunk = in_file.read(1024 * bs)
        if len(chunk) == 0 or len(chunk) % bs != 0:
            padding_length = bs - (len(chunk) % bs)
            chunk += padding_length * chr(padding_length)
            finished = True
        out_file.write(cipher.encrypt(chunk))
    in_file.close()
    out_file.close()

def decrypt(in_file, out_file, password, key_length=32):
    bs = AES.block_size

    in_file = file(in_file, 'rb')
    out_file = file(out_file, 'wb')
    salt = in_file.read(bs)[len('Salted__'):]
    key, iv = derive_key_and_iv(password, salt, key_length, bs)
    cipher = AES.new(key, AES.MODE_CBC, iv)
    next_chunk = ''
    finished = False
    while not finished:
        chunk, next_chunk = next_chunk, cipher.decrypt(in_file.read(1024 * bs))
        if len(next_chunk) == 0:
            padding_length = ord(chunk[-1])
            if padding_length < 1 or padding_length > bs:
               raise ValueError("bad decrypt pad (%d)" % padding_length)
            # all the pad-bytes must be the same
            if chunk[-padding_length:] != (padding_length * chr(padding_length)):
               # this is similar to the bad decrypt:evp_enc.c from openssl program
               raise ValueError("bad decrypt")
            chunk = chunk[:-padding_length]
            finished = True
        out_file.write(chunk)    
    in_file.close()
    out_file.close()

def encode(in_file, out_file):
    in_file = file(in_file, 'rb')
    out_file = file(out_file, 'wb')
    data = in_file.read()
    out_file.write(base64.b64encode(data))    
    in_file.close()
    out_file.close()

def decode(in_file, out_file):
    in_file = file(in_file, 'rb')
    out_file = file(out_file, 'wb')
    data = in_file.read()
    out_file.write(base64.b64decode(data))    
    in_file.close()
    out_file.close()
Rosemarie answered 28/7, 2014 at 13:7 Comment(3)
This is just code without any hints, maybe it is why nobody upvoted. Could worth adding some prose.Epictetus
OMG, you made my day. Your (very spurious) comments did include the tip that the padding must the length of paddingTalca
Please note that padding_length should not be zero, if that would be the case make it the size of the keyTalca
D
0

AES.new().encrypt() and .decrypt() take as both input and output strings whose length is a multiple of 16. You have to fix it in one way or another. For example you can store the real length at the start, and use that to truncate the decrypted string.

Note also that while it's the only restriction for AES, other modules (notably in Crypto.PublicKey) have additional restrictions that comes from their mathematical implementation and that shouldn't (in my opinion) be visible to the end user, but are. For example Crypto.PublicKey.ElGamal will encrypt any short string, but if it starts with null characters, they are lost upon decryption.

Dropping answered 6/1, 2013 at 9:7 Comment(1)
Aha, I see. I've read somewhere that the length has to do something with 16byte multiplier, didn't really focus it. Thanks for making it clear. I suppose I will implant some sort of ending identifier in self.data and fill remaining to 16 multiplier with garbage data, than when decrypting delete everything from and including ending identifier on. This way I will eliminate the need to store additional length value.Protocol

© 2022 - 2024 — McMap. All rights reserved.