PHP: Warning mcrypt_generic_init(): Iv size is incorrect; supplied length: 12, needed: 8

Basic Facts:

$algorithm  = MCRYPT_BLOWFISH;
$mode       = MCRYPT_MODE_CBC;
$randSource = MCRYPT_DEV_URANDOM;

Note: This is not strictly a coding question.

Context:

CentOS 7, Apache 2.4.12, & PHP 5.6.20.

I am making an HTML email with a "verify your email address" link that allows a registration process to complete. Everything on my virtual private server is UTF-8, and all form and query string input is processed with multi-byte (mb) functions.

Background

As an experiment (I know about the age and state of the mcrypt library), I am attempting to decrypt Blowfish encrypted query string parameters. Assume that on the way up, the encryption sequence works perfectly and I am receiving email with the link.

On the way down, the hash_hmac() signing (SHA-512, just for this experiment) is working, and I am able to separate each independent message (32 characters) from its hash checksum (128 characters). Base64 decoding of the separated message portion is working. For each parameter, I am left with the composite cipher text, where composite cipher text equals the IV + base cipher text. Assume I use a version of substr() to obtain the IV and the base cipher text independently (which is par for the course).

Problem

PHP: Warning  mcrypt_generic_init(): Iv size is incorrect; supplied length: 12, needed: 8

Assume I have combed the PHP manual and Stack Overflow. Assume I have looked at other similar questions, but none exactly like this one. Assume I have searched the Internet to no avail. Assume I have enough experience to set up mbstring properly. Assume that I will take care of mcrypt padding when I get past this current problem.

Could multi-byte issues be interfering with decryption?

Could base64 encoding the IV + base cipher text corrupt the IV?

Could base64 padding be an issue?

Should I be specifying a more specific MCRYPT_BLOWFISH_*?

Why does the Blowfish IV size report 8 bytes, but rarely produce an 8-byte IV?

Which should I use, substr() or mb_substr(), for a setup that leans towards making everything UTF-8 and processes all other input as multi-byte UTF-8? I know that is an odd question, but all of the PHP manual's mcrypt decryption examples use substr(), and none use mb_substr(). Everything on my site works with mb_* functions when possible. I would not mind using substr() if it solved my problem, but it does not. When I use mb_substr(), I get the following warning.

PHP: Warning  mcrypt_generic_init(): Iv size is incorrect; supplied length: 11, needed: 8

Does anyone have any experience with this exact issue? Constructive answers will be rewarded!

Latest

Goal: To recreate a hash, like this, from a Blowfish encrypted query string.

Above is an example Blowfish hash that I am trying to reconstruct from an array, received via a SHA-512 HMACed, symmetrically Blowfish encrypted (CBC), URL-safe Base64 encoded, urlencoded query string (phew!).

Below is what the strings for the query string (having chopped up the Blowfish hash above) look like after encrypting, signing, and Base64 encoding, but before being urlencoded. Each one is 128 characters long (each string gets longer as you do more stuff).

[Image: the encrypted, signed, and Base64 encoded query string values]

Decrypted Array

Above is the Base64 decoded and Blowfish decrypted array derived from the query string. (Obviously, there are security steps in between this result, but I am just trying to show the latest state of things.) Something is not right. Encryption appears to work without any errors. Decryption does not produce any errors either. The plain text is just wrong. If I join/implode these elements, they will not match the Blowfish hash above.

Kennethkennett answered 15/6, 2016 at 2:45 Comment(20)
We all know what assuming does, but just go with it! ;-) – Kennethkennett
@Syon You seem pretty good with encryption. Any thoughts? – Kennethkennett
Had a similar problem where I used Rijndael 128. Spent hours looking for the cause, but found that mcrypt is not actively maintained and switched to OpenSSL encryption, which is working well so far. It is a guess, but I think the problem lies in the way the key is derived/recreated. – Limelight
@Limelight Thank you for looking at my problem. I gave you a reward just for taking a stab! ;-) Hey, the answer might be to just use the OpenSSL extension. But there is nothing I can find anywhere that suggests Blowfish won't work with mcrypt. All mcrypt methods want a string for the key, but maybe the key should be a binary string? – Kennethkennett
@Limelight I tried making sure the string I use for the key is at least 56 characters in length. Also, I integrated a URL-safe version of base64_encode. The only other thing I can think of is to return the key as binary data and use it that way. – Kennethkennett
@Limelight Apparently, the IV that is returned is ISO-8859-1. When I convert it to UTF-8, the IV always reports that it is 8 bytes. I may have solved this. – Kennethkennett
It sounds like you are very close. Another thing that crossed my mind was to limit the characters used in creating the IV, but that comes at the cost of security. – Limelight
Now, during the encryption phase I am getting the "IV size incorrect" error message. I think mcrypt is not multi-byte aware. I tried limiting the IV ($ivSize / 2), and it gave me a "you only have 7 of 8" bytes needed. It's definitely an mb issue. – Kennethkennett
@Limelight Perhaps I should force an ISO-8859-1 IV during encryption and convert the incoming IV (mb, UTF-8) to an ISO-8859-1 IV. That might be the ticket. – Kennethkennett
Ha! Forcing ISO-8859-1 during encryption gave me a "length 4, needed 8" error message. I'll just let it do normal IV creation during encryption and try forcing ISO-8859-1 during decryption. – Kennethkennett
The 12 bytes might be because you use mb_substr(), which takes characters, not bytes. With substr(), 8 will take 8 bytes. Encoding should never be the issue, because the output is either binary (use base64) or ASCII-safe (mb or not won't make a difference). – Ment
@Ment Hi Rudie. Thanks for your input. Consider, though, the path the IV takes: it is generated (ISO-8859-1), concatenated to the front of the cipher text (with an HMAC concatenated in front of all of that), base64_encoded, stored in a query string, urlencoded, output from the web server (UTF-8), brought in by clicking (filter_input_array(INPUT_GET)) on a link in email (where all inputs are assured to be UTF-8 via mb string functions in my filter framework, not ISO-8859-1), automatically urldecoded by PHP, separated from the HMAC, base64_decoded, and recovered from the composite cipher text. – Kennethkennett
@Ment In other words, encoding is very much an issue when you switch contexts. When you never leave PHP, it's not an issue, but the IV is traveling into many different contexts. I have tried both substr() and mb_substr(), and as you can see in the script I say "characters" for mb_substr(). No dice. Yes, my stack is UTF-8 everything, but mcrypt_create_iv() outputs ISO-8859-1. – Kennethkennett
So you should base64 it? Then encoding doesn't matter anymore. Encoding shouldn't be an issue anyway, because encryption artifacts are always binary (or made binary safe, like md5 output). What kind of strings are you substringing? – Ment
@Ment I have, in a URL-safe way, base64 encoded the composite cipher text (IV + cipher text), but you're missing the point that it's going out UTF-8 and coming in UTF-8. Thus, mcrypt complains because it wants to work with the ISO-8859-1 character set. – Kennethkennett
@AnthonyRutledge I know you're not really after code, but I just wanted to share this link in case it helps in any way. Look at CryptorService.php – Zeigler
@Zeigler Thank you for your contribution. It's funny. I've resorted to ensuring that all input to any mcrypt function is in ISO-8859-1 format and the IV problem has vanished. Encryption works, but decryption does not. I will definitely look at the code in your link. Thank you. – Kennethkennett
@Zeigler Hey, I looked over the encryption code. Unfortunately, no dice. That code sample (the encryption part), aside from it not being Blowfish or CBC mode, appears to be working in a world where they never have to consider multi-byte issues, or the fact that mcrypt outputs ISO-8859-1, but I am processing a multi-byte, UTF-8 query string. I do appreciate that you tried to help. Any more suggestions will be greatly appreciated. – Kennethkennett
@AnthonyRutledge In terms of Blowfish and CBC, I haven't tested it myself, but as far as I can see, if you set $algorithm = 'blowfish' and $mode = 'cbc', the class should handle it. I don't think this would sort out your particular issue, though. As you said, it would only work if you had used the whole thing from the very beginning in a certain manner, which unfortunately doesn't really apply to you. It sounds like you might end up with a working solution where a bit of patchy and hacky code is involved. – Zeigler
@Zeigler It's like the movie The Golden Child: "Oh, there's a bottom Monty!" – Kennethkennett

I would guess that the issue hides somewhere in how UTF-8 encoding is handled: you are applying it in contexts where it does not belong. It could also be that your framework does some magic for all use cases. That can be too much, and it generally ends in security holes or bugs like this one, because you are not doing exactly what needs to be done at the point where it needs to be done.

Strings in PHP are just collections of bytes. You can store text in them, in an encoding of your choosing, or you can store binary data, like images. PHP knows neither what kind of data a string holds nor what encoding it uses. It is up to the developer to track this information.

When working with encryption, you get binary data when generating random strings or encrypting payloads. It is stored in strings, but it has no UTF-8 encoding, as it's just bytes. I wouldn't even say that its encoding is ISO-8859-1, as that would mean that byte 77 (0x4D) stands for the letter "M". In reality, it's just numbers: 77 does not stand for any letter at all.
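
To illustrate why treating random bytes as ISO-8859-1 text and converting them to UTF-8 breaks things, here is a minimal sketch. The eight byte values in $iv are made up for the example (a real IV comes from a random source), and it assumes the mbstring extension is loaded.

```php
<?php
// Hypothetical 8-byte IV; two of the bytes are >= 0x80.
$iv = "\x01\x80\xFF\x41\x42\x43\x44\x45";

// Treating the bytes as ISO-8859-1 text and converting to UTF-8
// turns every byte >= 0x80 into a two-byte sequence.
$converted = mb_convert_encoding($iv, 'UTF-8', 'ISO-8859-1');

echo strlen($iv), "\n";         // 8
echo strlen($converted), "\n";  // 10
echo bin2hex($converted), "\n"; // 01c280c3bf4142434445

// base64 works at the byte level, so the round trip is lossless.
$roundTrip = base64_decode(base64_encode($iv), true);
var_dump($roundTrip === $iv);   // bool(true)
```

The converted string is no longer the IV that was generated, which would explain length complaints that vary from one request to the next.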

One more thing to add: ASCII symbols (Latin letters, digits etc., byte values 0-127) take one byte each in UTF-8, the same as in ISO-8859. So as long as you pass base64-encoded data, you shouldn't worry too much about it; mb_substr will work the same way as substr. But for binary data you cannot use the mb_* functions, as they work with characters. For example, if the encrypted data is the two bytes 0xC5 0xA1, that is only a single symbol in UTF-8. Encryption works with bytes (up until the final result, which could be anything, even binary files), not characters.
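
The difference is easy to see with a made-up payload. In this sketch the IV and cipher text bytes are invented for illustration (nothing is actually encrypted), and mbstring is assumed to be loaded. substr() cuts after 8 bytes, while mb_substr() cuts after 8 characters, which can be more than 8 bytes when some byte pairs happen to form valid UTF-8 sequences.

```php
<?php
// Hypothetical payload: an 8-byte "IV" followed by 8 bytes of "cipher text".
// 0xC5 0xA1, 0xC4 0x98 and 0xC4 0x96 happen to be valid two-byte UTF-8 sequences.
$iv         = "\xC5\xA1\x10\x20\xC4\x98\x30\x40";
$cipherText = "\xC4\x96\x50\x60\x70\x11\x22\x33";
$payload    = $iv . $cipherText;

$ivSize = 8; // Blowfish IV size in bytes

$byteCut = substr($payload, 0, $ivSize);             // first 8 bytes
$charCut = mb_substr($payload, 0, $ivSize, 'UTF-8'); // first 8 characters

var_dump($byteCut === $iv);  // bool(true)
echo strlen($charCut), "\n"; // 11 (bytes): too long for an 8-byte IV
```

With truly random bytes the number of accidental multi-byte sequences changes every time, so the "supplied length" in the warning changes too.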

As you've not provided any code, I've put some together for you. I hope it helps with your issue (if it's still relevant at all).

To show passing parameters in a URL, there are two files: encrypt.php and decrypt.php. Save them to a directory, run php -S localhost:8000 in it, and go to http://localhost:8000/encrypt.php

encrypt.php:

<?php
// mcrypt_enc_get_key_size($td) gives 56, so that is the longest this key can be
$key = 'LedsoilgarvEwAbDavVenpirabUfjaiktavKekjeajUmshamEsyenvoa';
$data = 'This is very important data, with some š UTF-8 ĘĖ symbols';

$td = mcrypt_module_open(MCRYPT_BLOWFISH, '', MCRYPT_MODE_CBC, '');

// create random IV - it's just 8 random bytes. Use random_bytes() instead if available (PHP 7+)
$ivSize = mcrypt_enc_get_iv_size($td);
$iv = mcrypt_create_iv($ivSize, MCRYPT_DEV_URANDOM);

mcrypt_generic_init($td, $key, $iv);

$encrypted = mcrypt_generic($td, $data);

mcrypt_generic_deinit($td);
mcrypt_module_close($td);

// payload that you want to send - binary. It's neither UTF-8 nor ISO-8859-1 - it's just bytes
$payload = $iv . $encrypted;

// base64 to pass safely
$base64EncodedPayload = base64_encode($payload);
// URL encode for URL. No need to do both URL-safe base64 *and* base64 + urlencode
$link = 'http://localhost:8000/decrypt.php?encryptedBase64=' . urlencode($base64EncodedPayload);

// in fact, just for reference, you don't even need base64_encode - urlencode also works at the byte level
// base64_encode takes about 1.33 times the original space, but urlencode takes 3 bytes for every non-safe byte, so base64 will probably be shorter
$link2 = 'http://localhost:8000/decrypt.php?encrypted=' . urlencode($payload);

?>
<!doctype html>
<html>
    <head>
        <meta charset="utf-8">
    </head>
    <body>
        <pre><?php
            var_dump('Data:', $data);
            var_dump('Data size in bytes:', strlen($data));
            var_dump('Data size in characters - smaller, as 3 of the characters take 2 bytes:', mb_strlen($data, 'UTF-8'));
            var_dump('Encrypted data size in bytes - original size zero-padded up to a multiple of the 8-byte block size:', strlen($encrypted));
            var_dump('Encrypted data size in characters - will be pseudo-random each time:', mb_strlen($encrypted, 'UTF-8'));

            var_dump('IV base64 encoded:', base64_encode($iv));
            var_dump('Encrypted string base64 encoded:', base64_encode($encrypted));
        ?></pre>
        <!-- Link will not contain any special characters, so htmlentities should not make any difference -->
        <!-- In any case, I would still recommend using the right encoding in the right context to avoid issues if something changes -->
        <a href="<?php echo htmlentities($link, ENT_QUOTES, 'UTF-8');?>">Link to decrypt</a><br/>
        <a href="<?php echo htmlentities($link2, ENT_QUOTES, 'UTF-8');?>">Link to decrypt2</a>
    </body>
</html>

decrypt.php:

<?php
$key = 'LedsoilgarvEwAbDavVenpirabUfjaiktavKekjeajUmshamEsyenvoa';

if (isset($_GET['encryptedBase64'])) {
    // just get base64_encoded symbols (will be ASCII - same in UTF-8 or other encodings)
    $base64EncodedPayload = $_GET['encryptedBase64'];
    $payload = base64_decode($base64EncodedPayload);
} else {
    // just get binary string from URL
    $payload = $_GET['encrypted'];
}

$td = mcrypt_module_open(MCRYPT_BLOWFISH, '', MCRYPT_MODE_CBC, '');

$ivSize = mcrypt_enc_get_iv_size($td);

$iv = substr($payload, 0, $ivSize);
$encrypted = substr($payload, $ivSize);

mcrypt_generic_init($td, $key, $iv);

/* Decrypt encrypted string; mcrypt zero-pads to the block size, so strip the trailing NUL bytes */
$decrypted = rtrim(mdecrypt_generic($td, $encrypted), "\0");

/* Terminate decryption handle and close module */
mcrypt_generic_deinit($td);
mcrypt_module_close($td);

?>
<!doctype html>
<html>
    <head>
        <meta charset="utf-8">
    </head>
    <body>
        <pre><?php
            var_dump('IV base64 encoded:', base64_encode($iv));
            var_dump('Encrypted string base64 encoded:', base64_encode($encrypted));
            var_dump('Result:', $decrypted);
        ?></pre>
    </body>
</html>
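
The question also mentions SHA-512 signing and URL-safe base64, which the two files above leave out. Here is a rough sketch of how those steps could wrap the binary payload. The function names (base64UrlEncode, base64UrlDecode, signPayload, verifyPayload), the key, and the payload bytes are all hypothetical, and the layout (hex HMAC in front of the encoded payload) is one possible choice, not necessarily the one used in the question. Every split is done with substr(), by bytes.

```php
<?php
// Hypothetical helpers; names and layout are assumptions for illustration.

function base64UrlEncode($binary)
{
    // Swap the two URL-unfriendly characters and drop the '=' padding.
    return rtrim(strtr(base64_encode($binary), '+/', '-_'), '=');
}

function base64UrlDecode($text)
{
    // Restore the standard alphabet and padding before decoding.
    $text = strtr($text, '-_', '+/');
    $pad  = (4 - strlen($text) % 4) % 4;
    return base64_decode($text . str_repeat('=', $pad), true);
}

function signPayload($payload, $hmacKey)
{
    $encoded = base64UrlEncode($payload);
    // A SHA-512 HMAC in hex is always 128 ASCII characters.
    return hash_hmac('sha512', $encoded, $hmacKey) . $encoded;
}

function verifyPayload($signed, $hmacKey)
{
    if (strlen($signed) <= 128) {
        return false; // too short to hold a MAC plus data
    }
    $mac      = substr($signed, 0, 128);
    $encoded  = substr($signed, 128);
    $expected = hash_hmac('sha512', $encoded, $hmacKey);
    // hash_equals() compares in constant time (PHP >= 5.6).
    if (!hash_equals($expected, $mac)) {
        return false;
    }
    return base64UrlDecode($encoded);
}

$hmacKey = 'example-hmac-key';                                      // made-up key
$payload = "\xC5\xA1\x00\xFF\x10\x20\x30\x40" . 'fake cipher text'; // made-up bytes

$signed = signPayload($payload, $hmacKey);
var_dump(verifyPayload($signed, $hmacKey) === $payload); // bool(true)
var_dump(verifyPayload($signed . 'x', $hmacKey));        // bool(false)
```

The signed string is plain ASCII, so it survives urlencode, UTF-8 input filters, and mb_* checks unchanged. Only the decoded result is binary, and from that point on it should be handled with byte functions only.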
Fatal answered 14/3, 2017 at 21:56 Comment(7)
So, the decryption sequence you depict would be for link2. I'll see if I can apply some of your UTF-8 advice to the crypto code (class Blowfish extends Cipher). I'll try omitting any use of mb_* functions at any time during the process of encryption or decryption. I do believe it is still wise to URL-safe base64 encode the cipher text after it has been signed (hash_hmac('sha512', $string, $this->hmacKey, false)), because that is a cheap obscurity step that keeps the underscore and two other characters out of the query string. But, if that is causing a problem, it has to go. – Kennethkennett
Both link and link2 work - it just needs the same steps (in reverse) on the decrypting end (see the if in decrypt.php). You can use URL-safe base64 encode, it's just not really necessary if you do urlencode. If it's clearer, you can do it, just make the reverse actions when decrypting. About the framework - I misunderstood that from your comment (where all inputs are assured to be UTF-8 via mb string functions in my filter framework) – Augmented
Ok, I will let you know how it goes. I'm actually in the middle of configuring a VirtualHost right now, but in a few hours, I might want to play. I will be in touch. Let us hope signing is not causing the problem. – Kennethkennett
Marius, hey, before I dig into this, perhaps you could take a look at the code in my answer to this question on encoding. It's just two methods of a class, but it may have a bearing. #7980067 – Kennethkennett
I've been thinking. The sequence of events depicted in your answer is not the same sequence of events when taking in input through a GET request. Specifically, it lacks any attempt to verify the encoding of the input. That said, when I get my input sanitizer reorganized, I will let you know how it goes. – Kennethkennett
As I've tried to explain in my answer - there is no "encoding" in binary data. It's just binary data, that's all. You could base64-encode it before sending and decode it after getting it in another script, but even that is not necessary. – Augmented
You are missing the point. When the data comes back to my web server, my PHP application is going to check the encoding (or are you unaware of such a step?). I am not doing encryption just within PHP. I will let you know how it goes. – Kennethkennett
