PyCrypto - How does the Initialization Vector work?

I'm trying to understand how PyCrypto works so I can use it in a project, but I don't fully understand the significance of the Initialization Vector (IV). I've found that I can use the wrong IV when decoding a string and still get the message back, except for the first 16 bytes (the block size). Am I simply using it wrong, or am I misunderstanding something?

Here's some sample code to demonstrate:

import Crypto
import Crypto.Random
from Crypto.Cipher import AES

def pad_data(data):
    # Always pad (0x80 followed by zero bytes), even when the input is already
    # block-aligned; otherwise unpad_data() could strip real trailing data.
    databytes = bytearray(data)
    padding_required = 15 - (len(databytes) % 16)
    databytes.extend(b'\x80')
    databytes.extend(b'\x00' * padding_required)
    return bytes(databytes)

def unpad_data(data):
    if not data:
        return data

    data = data.rstrip(b'\x00')
    if data[-1] == 128: # b'\x80'[0]:
        return data[:-1]
    else:
        return data


def generate_aes_key():
    # 16 random bytes from the OS; get_random_bytes() is portable, unlike OSRNG.posix
    return Crypto.Random.get_random_bytes(AES.block_size)

def encrypt(key, iv, data):
    aes = AES.new(key, AES.MODE_CBC, iv)
    data = pad_data(data)
    return aes.encrypt(data)

def decrypt(key, iv, data):
    aes = AES.new(key, AES.MODE_CBC, iv)
    data = aes.decrypt(data)
    return unpad_data(data)

def test_crypto():
    key = generate_aes_key()
    iv = generate_aes_key() # get some random value for IV
    msg = b"This is some super secret message.  Please don't tell anyone about it or I'll have to shoot you."
    code = encrypt(key, iv, msg)

    iv = generate_aes_key() # change the IV to something random

    decoded = decrypt(key, iv, code)

    print(decoded)

if __name__ == '__main__':
    test_crypto()

I'm using Python 3.3.

Output will vary on execution, but I get something like this: b"1^,Kp}Vl\x85\x8426M\xd2b\x1aer secret message. Please don't tell anyone about it or I'll have to shoot you."

Lineate answered 5/2, 2013 at 20:30 Comment(0)

The behavior you see is specific to CBC mode. With CBC, decryption can be visualized in the following way (from Wikipedia):

CBC decryption

You can see that the IV only contributes to the first 16 bytes of plaintext: every later block is XORed with the previous ciphertext block, not with the IV. If the IV is corrupted in transit to the receiver, CBC will still correctly decrypt every block except the first one. In CBC, the purpose of the IV is to let you encrypt the same message with the same key and still get a totally different ciphertext each time (even though the message length may give something away).

Other modes are less forgiving: if you get the IV wrong, the whole message is garbled on decryption. Take CTR mode, for instance, where the nonce plays almost the same role as the IV:
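
The chaining described above can be sketched with the standard library alone. This is only an illustration: `toy_encrypt_block` and `toy_decrypt_block` are hypothetical helpers (a small, insecure Feistel construction standing in for AES), and padding is left out by using a message that is exactly three blocks long.

```python
import hashlib
import hmac
import os

BLOCK = 16  # same block size as AES


def _round(key, i, half):
    # Round function: 8 bytes derived from the key, round number and half-block.
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[:BLOCK // 2]


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key, block):
    # 4-round Feistel network: a TOY stand-in for AES, not secure.
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def toy_decrypt_block(key, block):
    # Undo the Feistel rounds in reverse order.
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def cbc_encrypt(key, iv, plaintext):
    # Assumes len(plaintext) is a multiple of BLOCK (padding omitted).
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, _xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key, iv, ciphertext):
    prev, out = iv, []
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        # The IV is used for the first block only; later blocks are XORed
        # with the previous ciphertext block.
        out.append(_xor(toy_decrypt_block(key, block), prev))
        prev = block
    return b"".join(out)


key = os.urandom(BLOCK)
iv = os.urandom(BLOCK)
wrong_iv = bytes(b ^ 0xFF for b in iv)  # guaranteed to differ from iv
msg = b"0123456789abcdef" * 3  # exactly three blocks

ct = cbc_encrypt(key, iv, msg)
good = cbc_decrypt(key, iv, ct)
bad = cbc_decrypt(key, wrong_iv, ct)

print(good == msg)                 # True
print(bad[:BLOCK] == msg[:BLOCK])  # False: first block is garbled
print(bad[BLOCK:] == msg[BLOCK:])  # True: the rest decrypts correctly
```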

CTR mode
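
For comparison, here is a rough stdlib-only sketch of why a wrong nonce ruins every block in CTR mode. The keystream function is an assumption made for illustration: SHA-256 over key, nonce and counter stands in for the AES encryption of the counter block, and the 8-byte nonce size is arbitrary.

```python
import hashlib
import os

BLOCK = 16


def keystream_block(key, nonce, counter):
    # TOY stand-in for AES_K(nonce || counter); not a real cipher.
    return hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()[:BLOCK]


def ctr_crypt(key, nonce, data):
    # In CTR mode, encryption and decryption are the same XOR operation.
    out = bytearray()
    for i in range(0, len(data), BLOCK):
        ks = keystream_block(key, nonce, i // BLOCK)
        out.extend(b ^ k for b, k in zip(data[i:i + BLOCK], ks))
    return bytes(out)


key = os.urandom(BLOCK)
nonce = os.urandom(8)
wrong_nonce = bytes(b ^ 0xFF for b in nonce)  # guaranteed to differ from nonce
msg = b"0123456789abcdef" * 3

ct = ctr_crypt(key, nonce, msg)
good = ctr_crypt(key, nonce, ct)
bad = ctr_crypt(key, wrong_nonce, ct)

# Compare block by block: with the wrong nonce every keystream block changes.
blocks_ok = [bad[i:i + BLOCK] == msg[i:i + BLOCK] for i in range(0, len(msg), BLOCK)]
print(good == msg)  # True
print(blocks_ok)    # every entry is expected to be False
```

Because the nonce feeds into every keystream block, there is no block that survives a wrong nonce, unlike the CBC case above where only the first block depends on the IV.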

Continuator answered 6/2, 2013 at 7:52 Comment(1)
Okay, I think I understand... I thought that maybe the encrypted text would only be changed for the first block, but it seems to affect the whole chain of bytes. It seems like a common practice is to prepend the IV to the encrypted code before sending, so I think I'm going to be doing that. - Lineate

The developer for PyCrypto pulled the specification for AES CBC Mode from NIST:

AES Mode_CBC -> referencing NIST 800-38a (The Recommendation for Cipher Mode Operations)

From that, page 8:

5.3 Initialization Vectors

The input to the encryption processes of the CBC, CFB, and OFB modes includes, in addition to the plaintext, a data block called the initialization vector (IV), denoted IV. The IV is used in an initial step in the encryption of a message and in the corresponding decryption of the message. The IV need not be secret; however, for the CBC and CFB modes, the IV for any particular execution of the encryption process must be unpredictable, and, for the OFB mode, unique IVs must be used for each execution of the encryption process. The generation of IVs is discussed in Appendix C.


One thing to remember: use a fresh random IV every time you compose a message. The IV acts like a 'salt' that makes each ciphertext unique, and even though it travels in the open, it does not help anyone break the encryption as long as the AES key is unknown. If you do not randomize the IV (say, you reuse the same 16 bytes for every message), then any message you repeat will look identical on the wire, and you could be subject to frequency and/or replay attacks.
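
Since the IV does not need to be secret, one common convention (also mentioned in the comments on this page) is to generate it per message and send it in front of the ciphertext. A minimal sketch of that framing follows; `new_iv`, `pack_message` and `unpack_message` are hypothetical helper names, not part of PyCrypto, and the ciphertext here is just a placeholder.

```python
import os

IV_SIZE = 16  # AES block size in bytes


def new_iv():
    # Fresh, unpredictable IV for every message, from the OS random source.
    return os.urandom(IV_SIZE)


def pack_message(iv, ciphertext):
    # The IV is not secret, so it can travel in the clear before the ciphertext.
    if len(iv) != IV_SIZE:
        raise ValueError("IV must be %d bytes" % IV_SIZE)
    return iv + ciphertext


def unpack_message(blob):
    # Split a received message back into (iv, ciphertext).
    if len(blob) < IV_SIZE:
        raise ValueError("message too short to contain an IV")
    return blob[:IV_SIZE], blob[IV_SIZE:]


iv = new_iv()
ciphertext = b"\x00" * 32  # placeholder for the output of an encrypt() call
blob = pack_message(iv, ciphertext)
recovered_iv, recovered_ct = unpack_message(blob)
print(recovered_iv == iv and recovered_ct == ciphertext)  # True
```

With the question's functions, the sender would call something like `pack_message(iv, encrypt(key, iv, msg))`, and the receiver would split the blob with `unpack_message` before calling `decrypt`.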

A test for the results of random IVs vs static:

import binascii

# Reuses generate_aes_key(), encrypt() and decrypt() from the question.

def to_hex(data):
    # binascii.hexlify works on Python 2.7 and 3.x;
    # data.encode('hex') is Python 2 only.
    return binascii.hexlify(data).decode('ascii')

def test_crypto():
    print("Same IVs same key:")
    key = generate_aes_key()
    iv = b"1234567890123456"  # static IV: both ciphertexts below are identical
    msg = b"This is some super secret message.  Please don't tell anyone about it or I'll have to shoot you."
    code = encrypt(key, iv, msg)
    print(to_hex(code))
    decoded = decrypt(key, iv, code)
    print(decoded)

    code = encrypt(key, iv, msg)
    print(to_hex(code))
    decoded = decrypt(key, iv, code)
    print(decoded)

    print("Different IVs same key:")
    iv = generate_aes_key()  # random IV: the ciphertext changes every time
    code = encrypt(key, iv, msg)
    print(to_hex(code))
    decoded = decrypt(key, iv, code)
    print(decoded)

    iv = generate_aes_key()
    code = encrypt(key, iv, msg)
    print(to_hex(code))
    decoded = decrypt(key, iv, code)
    print(decoded)

Hope this helps!

Tenure answered 6/2, 2013 at 4:34 Comment(5)
That answer doesn't explain why the IV, if it would affect the decryption of every single block in CBC mode, only affected the decryption of the first block. - Expenditure
AttributeError: 'bytes' object has no attribute 'encode' is what I get when I try printing out the encrypted code with print(code.encode('hex')). - Lineate
Thanks for the note about repeat attacks... I'm going to use a random IV on each message and prepend it to the encrypted code. - Lineate
I used Python 2.7; however, just remove the encode('hex') portion and it should run fine. - Tenure
@Expenditure see the approved answer. - Tenure
