shorter php cipher than md5?
C

10

29

For a variety of stupid reasons, the maximum length of a given form variable that we are posting to an external server is 12 characters.

I wanted to obscure that value with md5, but obviously with 12 characters that isn't going to work. Is there a cipher with an already-made PHP function which will result in something 12 characters or less?

The security and integrity of the cipher isn't super important here. My last resort is to just write a function which moves each letter up or down an ascii value by x. So the goal isn't to obscure it from a cryptography expert, but just to not post it in plain text so a non-technical worker looking at it won't know what it is.

Thanks for any advice.

Cornelius answered 21/9, 2010 at 19:27 Comment(5)
I am tempted to recommend throwing in a str_rot13() call somewhere. On a more serious note, if you need to retrieve the decrypted version, you can't use md5() anyway since that's a hashing function, which makes it almost essentially a one-way encryption.Hydrodynamics
Please don't use the Caesar cipher. There are a lot of small/weak encryption/decryption algorithms out there to choose from that can be adapted to fit your 12 char mold. Caesar ciphers are simply too easy to figure out. Eventually you'll hit a pattern even a knuckle-dragger can see plain as day.Introductory
Which of them would you suggest, Joel?Cornelius
May I add that if this is a hidden form field, a non-technical worker probably won't be looking at it. The only one looking will be someone curious enough to try figuring it out. You can deter or slow them down by doing some character transitions (eg Caesar cipher). If you need the value to really not show at all, perhaps have the form post back to your server (where you can keep the variable hidden from plain view), and then your server can post to the remote one bringing back any results. I don't know if this fits your use case.Tabu
Pass your result from md5($data) to Alphabet::convert($hash, Alphabet::HEX, Alphabet::ALPHANUMERIC) and then substr($result, 0, 12) to preserve the highest amount of information.Elegance
W
17

This is an addition to this answer.

The answer proposes to take the first twelve characters from a 32 character representation of md5. Thus 20 characters of information will be lost - this will result in way more possible collisions.

You can reduce the loss of information by taking the first twelve characters of a 16 character representation (the raw form):

substr(md5($string, true), 0, 12);

This will maintain 75% of the data, whereas the use of the 32 char form only maintains 37.5% of the data.

Willy answered 21/9, 2010 at 20:52 Comment(2)
This results in 12 bytes, which are not necessarily 12 characters and it's also risky to use this with non binary safe functions, which are unfortunately still pretty common, since those will truncate at the first \0 character.Temikatemp
Does md5( $string, true ) not return also non-visible characters? I think this will cause problems in HTML - the hash might get urlencoded by the browser and change in length...Fanaticism
R
30

You can shorten the md5() result itself.

The binary form of md5() is only 16 bytes long.

Or you can encode it using base64, which gives you 22 characters (letters, digits, + and /) once the trailing = padding is stripped:

rtrim(base64_encode(md5($string, true)), '=');
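
To get under the 12-character limit from the question, the same idea can be applied to only part of the digest. The following is a minimal sketch, not part of the original answer: the function name shortMd5 is made up, and the swap to - and _ is an assumption about which characters are safe in the target form field. Nine raw bytes (72 bits) base64-encode to exactly 12 characters with no padding.

```php
<?php
// Hypothetical helper: 12 printable characters carrying 72 bits of the MD5 digest.
function shortMd5(string $input): string
{
    $raw = md5($input, true);        // 16 raw bytes
    $first9 = substr($raw, 0, 9);    // keep 9 bytes = 72 bits
    $b64 = base64_encode($first9);   // 9 bytes encode to exactly 12 chars, no '=' padding
    return strtr($b64, '+/', '-_');  // assumption: '+' and '/' are unwanted in the field
}

echo shortMd5('example'), "\n";
```

Like any truncated hash this cannot be decoded; it only works if you keep your own mapping from the short value back to the original.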
Renettarenew answered 19/2, 2015 at 9:14 Comment(4)
This should be the accepted answer. It's the full MD5 hash in a more compact representation.Hallett
This does not keep the full MD5 hash... According to the PHP docs, "Warning: base_convert() may lose precision on large numbers due to properties related to the internal "double" or "float" type used." On 32-bit systems, at least, converting a 32-digit hex to base-32 results in the last 16 digits being set to 0. Try it: print base_convert(md5('foo'), 16, 32); results in 5cnkcdmj62v000000000000000.Triptych
@Triptych This could be useful to avoid losing precision as you mentioned: php.net/manual/es/function.base-convert.php#109660Fredric
Just for the sake of completeness, you will need to install php7.0-bcmath in the server to use the function provided by @FredricStayathome
R
12

Try crc32() maybe?
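
A minimal sketch of what that could look like, assuming a fixed-width hex string is wanted: crc32() returns an integer, while hash() with the 'crc32b' algorithm returns the same checksum as 8 hex characters. The input value is a placeholder.

```php
<?php
$value = 'some form value';        // placeholder input
$short = hash('crc32b', $value);   // same checksum as crc32(), as 8 lowercase hex chars

echo $short, "\n";
```

With only 32 bits, collisions become likely after some tens of thousands of distinct values, so this fits small data sets only.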

Rehnberg answered 21/9, 2010 at 19:30 Comment(0)
N
8

If you just need a hash, you can still use the first 12 characters from the md5 hash.

substr(md5($yourString), 0, 12);
Novation answered 21/9, 2010 at 19:29 Comment(10)
I was just typing the very same!Helsie
I should clarify that we still need to decode it. There could be a case where the first 12 characters of one md5 are the same as another.Cornelius
@tiredofcoding: MD5, as any other hash, is a one-way function: you can't decode it. What you want is a cipher, not a hash.Leanaleanard
I know that. What I'm saying is that we'll maintain the values of each piece of data we encode with md5. So we'll know that value x = md5 value y.Cornelius
@tiredofcoding: Then you may consider picking out different parts of the MD5 hash to reduce chances of collision, or split the MD5 into four 8-character strings and fold them somehow into a single string.Leanaleanard
"value x = md5 value y" and first12LettersFrom(value x) = first12LettersFrom(md5 value y) too. I don't really see the problem here.Novation
but what about the chances of the first 12 characters being the same as another?Cornelius
Every hash function has collisions. If you want to avoid collisions entirely, you need to encipher and compress your string so that it fits in 12 characters.Novation
@Leanaleanard That won't improve collision odds - any well designed hash function will have good distribution over any subset.Dysgenics
@Cornelius With 12 hexadecimal characters, you have 48 bits of hash. According to the birthday paradox, you'd expect a collision after about 2^24 hashes - 16 million of them. If you use base64, then your 12 characters encode 9 bytes, which is 72 bits, for an expected collision after about 2^36 or 68 billion hashes.Dysgenics
P
4

All the answers suggest losing some of the data (higher collision probability), but base conversion looks like a better approach, e.g. as described here: http://proger.i-forge.net/Short_MD5/OMF

You may also generate a random string and insert it into a database, checking that it does not already exist before saving. This gives you short identifiers and ensures there are no collisions.
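
A rough sketch of the random-string approach, under stated assumptions: the array $table stands in for a database table with a unique token column, and the function name storeValue is made up for illustration. In real code the isset() check would be a query or a unique-key insert.

```php
<?php
// $table maps token => original value; it is a stand-in for a real database table.
function storeValue(array &$table, string $value, int $length = 12): string
{
    do {
        // Enough random bytes to fill $length base64 characters (9 bytes for 12 chars)
        $bytes = random_bytes((int) ceil($length * 3 / 4));
        $token = substr(strtr(base64_encode($bytes), '+/', '-_'), 0, $length);
    } while (isset($table[$token]));   // retry in the rare case the token is taken
    $table[$token] = $value;
    return $token;
}

$table = [];
$token = storeValue($table, 'secret value');
echo $token, ' => ', $table[$token], "\n";
```

Because the token is looked up rather than computed, the original value can always be recovered on your side, which matches the requirement to "decode" it.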

Parliamentarian answered 8/10, 2013 at 17:12 Comment(0)
S
3

I have to put this suggestion across as I have to assume you are in control of the script that your encrypted value is sent to....

I also have to assume that you can create many form fields but they can't have a length larger than 12 characters each.

If that's the case, could you not simply create more than one form field and spread the md5 string across multiple hidden fields?

You could just split the md5 string into chunks of 8 and submit each chunk in a hidden form field and then join them together at the other end.
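
A small sketch of the split-and-join idea, assuming you control both ends. The field names h0 to h3 are invented for the example.

```php
<?php
$hash = md5('example');          // 32 hex characters
$chunks = str_split($hash, 8);   // four pieces of 8 characters each

// Sender: one hidden field per chunk (field names are hypothetical)
foreach ($chunks as $i => $chunk) {
    echo '<input type="hidden" name="h' . $i . '" value="' . $chunk . '">', "\n";
}

// Receiver: join the pieces back together in the same order
$rejoined = implode('', $chunks);
```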

Just a thought...

Staal answered 24/10, 2013 at 13:13 Comment(0)
R
2

You can make use of a larger alphabet to make the hash shorter while keeping it reversible to the original hash value.

I implemented it here - for example, hash ee45187ab28b4814cf03b2b4224eb974 becomes 7fBKxltZiQd7TFsUkOp26w - it goes from 32 to 22 characters. It can become even shorter if you use a larger alphabet. If you use Unicode, you can even encode the hash with emoji...
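
For readers who want a PHP version, here is a sketch of the general technique, not the linked implementation: it re-encodes the hex digits in base 62 using schoolbook long division, so no bcmath or gmp extension is needed. The alphabet order and the function name convertBase are assumptions, so the output will not necessarily match the example string above.

```php
<?php
const HEX = '0123456789abcdef';
// Assumed digit order; any 62 distinct single-byte characters would work.
const BASE62 = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';

// Convert a digit string between two alphabets by repeated long division.
function convertBase(string $number, string $from, string $to): string
{
    $fromBase = strlen($from);
    $toBase = strlen($to);
    $digits = [];
    foreach (str_split($number) as $ch) {
        $digits[] = strpos($from, $ch);   // assumes every character is in $from
    }
    $out = '';
    while (count($digits) > 0) {
        $remainder = 0;
        $quotient = [];
        foreach ($digits as $d) {
            $acc = $remainder * $fromBase + $d;
            $q = intdiv($acc, $toBase);
            $remainder = $acc % $toBase;
            if ($q > 0 || count($quotient) > 0) {   // drop leading zeros
                $quotient[] = $q;
            }
        }
        $out = $to[$remainder] . $out;   // each remainder is the next lowest digit
        $digits = $quotient;
    }
    return $out;
}

$hex = 'ee45187ab28b4814cf03b2b4224eb974';
$short = convertBase($hex, HEX, BASE62);
// Leading zeros are lost in conversion, so pad back to 32 hex digits.
$back = str_pad(convertBase($short, BASE62, HEX), 32, '0', STR_PAD_LEFT);
echo $short, "\n", $back, "\n";
```

A 128-bit value needs at most 22 base-62 digits, which is why the result is still longer than 12 characters; the conversion is lossless but does not by itself meet the limit in the question.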

Robomb answered 30/5, 2019 at 15:18 Comment(1)
Do you have PHP version?Surd
M
0

This probably won't be of use to the OP since they were looking for 2 way function but may help someone looking for a shorter hash than md5. Here is what I came up with for my needs (thanks to https://rolandeckert.com/notes/md5 for highlighting the base64_encode function). Encode the md5 hash as base(64) and remove any undesirable base(64) characters. I'm removing vowels + and / so reducing the effective base from 64 to 52.

Note if you truncate a base(b) encoded hash after c characters it will allow for b ^ c unique hashes. Is this robust enough to avoid collisions? It depends on how many items (k) you are hashing. The probability of collision is roughly (k * k) / (b ^ c) / 2, so if you used the function below to hash k = 1 million items with base b = 52 encoding truncated after c = 12 characters the probability of collision is < 1 in 750 million. Compare to truncating the hex encoded (b = 16) hash after c = 12 characters. The probability of collision is roughly 1 in 500! Just say no to truncating hex encoded hashes. :)

I'll go out on a limb and say the function below (with length 12) is reasonably safe for 10 million items (< 1 in 7.5 million probability of collision), but if you want to be extra safe use base(64) encoding (comment out the $remove array) and/or truncate fewer characters.
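
The figures above can be checked with a few lines. This sketch only evaluates the approximation quoted in the text, (k * k) / (b ^ c) / 2; the function name is made up.

```php
<?php
// Birthday-bound approximation: p is roughly k^2 / (2 * b^c)
function collisionProbability(float $k, int $b, int $c): float
{
    return ($k * $k) / (2 * pow($b, $c));   // pow() falls back to float for large results
}

$hex12 = collisionProbability(1e6, 16, 12);   // 1 million items, 12 hex chars
$b52m1 = collisionProbability(1e6, 52, 12);   // 1 million items, 12 base-52 chars
$b52m10 = collisionProbability(1e7, 52, 12);  // 10 million items, 12 base-52 chars

printf("hex, 1M items:      about 1 in %.0f\n", 1 / $hex12);
printf("base 52, 1M items:  about 1 in %.0f\n", 1 / $b52m1);
printf("base 52, 10M items: about 1 in %.0f\n", 1 / $b52m10);
```

The approximation only holds while the result is small; for item counts near the square root of b ^ c it overestimates and should not be read as a probability.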

// convert md5 to base64, remove undesirable characters and truncate to $length
function tinymd5($str, $length) { // $length 20-22 not advised unless $remove = '';
    // remove vowels to prevent undesirable words and + / which may be problematic
    $remove = array('a', 'e', 'i', 'o', 'u', 'A', 'E', 'I', 'O', 'U', '+', '/');
    $salt = $str;
    do { // re-salt and loop if rebase removes too many characters
        $salt = $base64 = base64_encode(md5($salt, TRUE));
        $rebase = substr(str_replace($remove, '', $base64), 0, $length);
    } while ($length < 20 && substr($rebase, -1) == '=');
    return str_pad($rebase, min($length, 22), '='); // 22 is max possible length
}

$str = 'Lorem ipsum dolor sit amet 557726776';
echo '<br />' . md5($str);         // 565a0bf7e0ba474fdaaec57b82e6504a
$x = md5($str, TRUE);
echo '<br />' . base64_encode($x); // VloL9+C6R0/arsV7guZQSg==
echo '<br />' . tinymd5($str, 12); // VlL9C6R0rsV7
echo '<br />' . tinymd5($str, 17); // VlL9C6R0rsV7gZQSg
$x = md5(base64_encode($x), TRUE); // re-salt triggered < 20
echo '<br />' . base64_encode($x); // fmkPW/OQLqp7PTex0nK3NQ==
echo '<br />' . tinymd5($str, 18); // fmkPWQLqp7PTx0nK3N
echo '<br />' . tinymd5($str, 19); // fmkPWQLqp7PTx0nK3NQ
echo '<br />' . tinymd5($str, 20); // fmkPWQLqp7PTx0nK3NQ=
echo '<br />' . tinymd5($str, 22); // fmkPWQLqp7PTx0nK3NQ===
Meiny answered 10/2, 2019 at 2:24 Comment(1)
Thanks for the code. is there a way to decrypted?Tyrannize
L
0

I came up with base 90 for reducing md5 to 20 multi-byte characters (that I tested to fit properly in a mysql's varchar(20) column). Unfortunately this actually makes the string potentially larger than even the 32 bytes from php's md5, with the only advantage that they can be stored in varchar(20) columns. Of course you could just replace the alphabet with single-byte ones if your worries are about storage...

There are a couple of rules that are important to have in mind if your idea is to use this reduced hash as a lookup key in something like mysql and for other kinds of processing:

  1. By default, MySQL does not distinguish upper case from lower case in a typical WHERE clause, which removes a lot of characters from the possible target alphabets. This includes not only English characters but also almost all characters in other languages.

  2. It's important that your hash can be upper-cased and lower-cased transparently since many systems uppercase these keys, so to keep it consistent with md5 in that sense you should use only lowercase when using case-able characters.

This is the alphabet I used (I handpicked each character to make the hashes as nice as possible):

define('NICIESTCHARS', [
    "0","1","2","3","4","5","6","7","8","9",
    "a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z",
    "¢","£","¥","§","¶","ø","œ","ƒ","α","δ","ε","η","θ","ι","λ","μ","ν","π","σ","τ","φ","ψ","ω","ћ","џ","ѓ","ѝ","й","ќ","ў","ф","э","ѣ","ѷ","ѻ","ѿ","ҁ","∂","∆","∑","√","∫",
    "!","#","$","%","&","*","+","=","@","~","¤","±"
]);

Here is the code in PHP (it is probably not the best code, but it does the job). Keep in mind that it only works for hexadecimal strings (0-f) whose length is a multiple of 8, such as PHP's md5() output, which is 32 hex characters:

function mbStringToArray ($string) {
    $strlen = mb_strlen($string);
    while ($strlen) {
        $array[] = mb_substr($string,0,1,"UTF-8");
        $string = mb_substr($string,1,$strlen,"UTF-8");
        $strlen = mb_strlen($string);
    }
    return $array;
} 

class Base90{ 
    static function toHex5($s){
        // Converts a base 90 number with a multiple of 5 digits to hex (compatible with "hexdec").
        // Split into UTF-8 characters; -1 means no limit (passing null is deprecated since PHP 8.1)
        $chars = preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY);
        $map = array_flip(NICIESTCHARS);
        $rt = '';
        $part = [];
        $b90part = '';
        foreach($chars as $c){
            $b90part .= $c;
            $part[] = $map[$c];
            if(count($part) == 5){
                $int = base90toInt($part);
                $rt .= str_pad(dechex($int), 8, "0", STR_PAD_LEFT);
                $part = [];
                $b90part = '';
            }
        }
        return $rt;
    }
    
    static function fromHex8($m){
        // Converts an hexadecimal number compatible with "hexdec" to base 90
        $parts = [];
        $part = '';
        foreach(str_split($m) as $i => $c){
            $part.= $c;
            if(strlen($part) === 8){
                $parts[] = intToBase90(hexdec($part));
                $part = '';
            }
        }
        return implode('', $parts);
    }
}



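function intToBase90($int){
    // Always emit exactly 5 digits so Base90::toHex5() can split the string
    // back into fixed-size groups (90^5 > 2^32, so 5 digits cover any 8-hex-digit chunk).
    $residue = $int;
    $result = [];
    for($i = 0; $i < 5; $i++){
        $digit = $residue % 90;
        $residue -= $digit;
        $residue = $residue / 90;
        array_unshift($result,  NICIESTCHARS[$digit]);
    }
    return implode('', $result);
}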

function base90toInt($digits){
    $weight = 1;
    $rt = 0;
    while(count($digits)){
        $rt += array_pop($digits)*$weight;
        $weight *= 90;
    }
    return $rt;
}
Leicester answered 18/1, 2022 at 17:29 Comment(0)
D
-1

$hashlen = 4;
$cxtrong = TRUE;
$sslk = openssl_random_pseudo_bytes($hashlen, $cxtrong);
$rand = bin2hex($sslk);

echo $rand;

You can change the output length by changing the value of $hashlen: bin2hex() produces two hex characters per byte, so the result is 2 * $hashlen characters long (8 with the value above).

Descry answered 9/2, 2020 at 11:50 Comment(1)
These are just random numbers, rather than a hash of some content.Milburn
