If I have a 32-character string (an MD5 hash) and I encode it using Base64, what's the maximum length of the encoded string?
An MD5 value is always 22 (useful) characters long in Base64 notation. Many Base64 algorithms will also append 2 characters of padding when encoding an MD5 hash, bringing the total to 24 characters. The padding adds no useful information and can be discarded. Only the first 22 characters matter.
Here's why:
An MD5 hash is a 128-bit value. Every character in a Base64 string carries 6 bits of information, because there are 64 possible values for the character and 2^6 = 64. With 6 bits per character, 21 characters hold 126 bits and 22 characters hold 132 bits. Since 128 bits do not fit in 21 characters but do fit in 22 (with 4 bits to spare), a 128-bit value is always represented as 22 characters in Base64.
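The arithmetic above can be checked with a few lines of Python. This is only a sketch: the helper name `base64_chars_for_bits` and the sample input `b"hello world"` are made up for illustration.

```python
import base64
import hashlib
import math

def base64_chars_for_bits(bits: int) -> int:
    """Smallest number of 6-bit Base64 characters that can hold `bits` bits."""
    return math.ceil(bits / 6)

# Hypothetical sample input; any input gives a 16-byte (128-bit) MD5 digest.
digest = hashlib.md5(b"hello world").digest()
encoded = base64.b64encode(digest).decode("ascii")

print(base64_chars_for_bits(128))  # 22
print(len(encoded.rstrip("=")))    # 22 useful characters
print(len(encoded))                # 24 with the "==" padding
```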
A note on the padding:
I mentioned above that many Base64 encoding algorithms add a couple of characters of padding when encoding an MD5 value. This is because Base64 represents 3 bytes of information as 4 characters. Since MD5 has 16 bytes of information, many Base64 encoding algorithms append "==" to designate that the input of 16 bytes was 2 bytes short of the next multiple of 3, which would have been 18 bytes. These 2 equal signs add no information whatsoever to the string, and can be discarded when storing.
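If you do drop the padding for storage, you may have to put it back before decoding, since some decoders (Python's `base64.b64decode` among them) reject unpadded input. A minimal sketch, assuming Python; the helper names `encode_unpadded` and `decode_unpadded` and the sample input are hypothetical:

```python
import base64
import hashlib

def encode_unpadded(data: bytes) -> str:
    """Base64-encode and strip the trailing '=' padding."""
    return base64.b64encode(data).decode("ascii").rstrip("=")

def decode_unpadded(text: str) -> bytes:
    """Restore the '=' padding to a multiple of 4 characters, then decode."""
    padding = "=" * (-len(text) % 4)
    return base64.b64decode(text + padding)

# Hypothetical sample input, used only to get a 16-byte digest.
digest = hashlib.md5(b"hello world").digest()
short = encode_unpadded(digest)

print(len(short))                        # 22
print(decode_unpadded(short) == digest)  # True
```

The round trip succeeding shows that the two equal signs carry no information of their own.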
As per http://en.wikipedia.org/wiki/Base64
"Note that given an input of n bytes, the output will be (n + 2 - ((n + 2) % 3)) / 3 * 4 bytes long, which converges to n * 4 / 3 or 1.33333n for large n."
So, it will be (32 + 2 - ((32 + 2) % 3)) / 3 * 4 = (34 - (34 % 3)) / 3 * 4 = (34 - 1) / 3 * 4 = 33 / 3 * 4 = 44 characters.
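As a sanity check, the quoted formula can be compared against a real encoder. A rough sketch in Python; the function name `padded_base64_length` is made up for illustration, and integer division is assumed where the formula writes `/`:

```python
import base64

def padded_base64_length(n: int) -> int:
    """Length of the padded Base64 encoding of n input bytes (formula quoted above)."""
    return (n + 2 - ((n + 2) % 3)) // 3 * 4

# Compare the formula with the actual encoder for a range of input sizes.
for n in range(100):
    assert padded_base64_length(n) == len(base64.b64encode(bytes(n)))

print(padded_base64_length(32))  # 44: the 32-character hex string
print(padded_base64_length(16))  # 24: the raw 16-byte digest
```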
You could always extract the hash in raw binary form (128 bits) and encode that directly into Base64. That means converting 16 bytes instead of 32, which gives 24 characters (including padding) when Base64-encoded.
A 128-bit MD5 value is represented as 22 characters in Base64, followed in this case by 2 padding characters ('=').
How?
$ md5sum ./README.md
c6b5f48774aa0a87a82a276ff86be507 ./README.md
$ md5sum ./README.md | base64
YzZiNWY0ODc3NGFhMGE4N2E4MmEyNzZmZjg2YmU1MDcgIC4vUkVBRE1FLm1kCg==
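In this case the Base64-encoded string is not shorter than the MD5 hash.
That is because what gets encoded is the stored text form of the hash (the hexadecimal digits, plus the file name that md5sum prints), not the hash value itself.
Note how many bits are used to store one digit of the hash: each hexadecimal digit carries 4 bits of the value but is stored as an 8-bit character.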
The right way is to convert the hash value itself:
1. Convert the hexadecimal digits to binary.
2. Convert the binary to a Base64-encoded string.
$ cat ./README.md | openssl dgst -md5
c6b5f48774aa0a87a82a276ff86be507
$ cat ./README.md | openssl dgst -md5 -binary | openssl enc -base64
xrX0h3SqCoeoKidv+GvlBw==
or
$ md5sum ./LICENSE
e3fc50a88d0a364313df4b21ef20c29e ./LICENSE
$ cat ./LICENSE | openssl dgst -md5 -binary | openssl enc -base64
4/xQqI0KNkMT30sh7yDCng==
$ (echo 0:; echo e3fc50a88d0a364313df4b21ef20c29e) | xxd -rp -l 16 | base64
4/xQqI0KNkMT30sh7yDCng==
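For comparison, here is roughly the same hex-to-binary-to-Base64 conversion in Python, using the README.md hash from the transcript above. Treat it as a sketch; the variable names are only illustrative.

```python
import base64

hex_digest = "c6b5f48774aa0a87a82a276ff86be507"  # README.md hash from above

# Wrong way: encodes the 32 ASCII characters of the hex text.
from_text = base64.b64encode(hex_digest.encode("ascii")).decode("ascii")

# Right way: convert the hex digits to 16 raw bytes first, then encode.
raw = bytes.fromhex(hex_digest)
from_raw = base64.b64encode(raw).decode("ascii")

print(len(from_text))  # 44
print(from_raw)        # xrX0h3SqCoeoKidv+GvlBw==
```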