Maximum length for MD5 input/output
D

9

192

What is the maximum length of a string that can be MD5-hashed? Or, if there is no limit, what is the maximum length of the MD5 output value?

Dropout answered 3/8, 2010 at 7:43 Comment(1)
Follow the wiki: en.wikipedia.org/wiki/MD5Washtub
M
281

MD5 processes an arbitrary-length message into a fixed-length output of 128 bits, typically represented as a sequence of 32 hexadecimal digits.
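As a quick illustration of the fixed output size, here is a minimal sketch using Python's standard `hashlib` module. The helper name `md5_hex` and the sample inputs are only illustrative assumptions, not part of the original answer.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of `data` as a lowercase hex string."""
    return hashlib.md5(data).hexdigest()

# Inputs of very different lengths, including the empty input.
samples = [b"", b"a", b"how many characters can be in here?", b"x" * 1_000_000]

for sample in samples:
    digest = md5_hex(sample)
    # Every digest is 128 bits: 16 raw bytes, shown here as 32 hex digits.
    print(len(sample), len(digest), digest)
```

Whatever the input length, `len(digest)` stays at 32, and the raw digest (`hashlib.md5(data).digest()`) stays at 16 bytes.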

Modigliani answered 3/8, 2010 at 7:46 Comment(11)
Note to self: MD5 hash length = 128 bits = 16 bytes = 32 hex digitsSpotlight
[A normal Edit] 32 hex digits and the string contains only words from 'a-f' and digits from '0-9'Synge
I noticed a little mistake in previous comments. Text should be as quoted :) "32 hex digits and the string contains only letters from 'a-f' and digits from '0-9'"Actinomycin
Or: 128 bits = 32 hex digits × 4 bits eachDogmatic
Could you please tell me exactly how many characters MD5's input can have? MD5("how many characters can be in here?");Rematch
@Rematch As the answer states, the input has an arbitrary length. This means the parameter can be any length you need.Superincumbent
@Peping A small correction: the input can be as long as the datatype of the programming language in use allows. Example: Java's strings use an array internally, so a string can contain at most (2^31)-1 characters (or fewer, depending on the heap size). That would also be your maximum input for the MD5 function in Java. But purely theoretically, the MD5 function could indeed process an input of arbitrary length. ;)Deen
FWIW - @Deen - I suppose you were making a tongue-in-cheek comment, because OP mentioned an input "string", but I'd like to point out that input could be as large as desired in practice, not just in theory. For example, a Java function could take as input the name of a file. That file could be as large as available storage; not limited by any datatype. It could even be from some streaming source, that far exceeds available local storage. Just saying. :PLonnie
@Lonnie Well, it depends on your implementation. If your hash function supports streaming data, then you're correct, obviously. While the hashing algorithm itself shouldn't have a limitation in theory, the practical implementation itself might have some form of limitation. Thank you for the heads up. ;)Deen
In principle MD5 uses the input length as a variable in the final block, but it wraps around at 2^64 bits, i.e. 2^61 bytes or about 2.3 exabytes, at which point I would suggest some implementations might fail. Most strings however will stay slightly below 2.3 exabytes :PMadelina
Additional explanation: 1 byte = 8 bits, a hex digit (0~F) represents 4 bits, and a pair of hex digits (00~FF) represents 1 byte; that's why 128 bits = 16 bytes = 32 hex digits.Echt
W
46

Append Length

A 64-bit representation of b (the length of the message before the padding bits were added) is appended to the result of the previous step. In the unlikely event that b is greater than 2^64, then only the low-order 64 bits of b are used.
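The padding and length steps described in this quote can be sketched roughly as follows. This is an illustrative Python sketch, not a full MD5 implementation; the function name `md5_padding` is hypothetical, and it assumes the message is a whole number of bytes.

```python
import struct

def md5_padding(message_len_bytes: int) -> bytes:
    """Build the bytes MD5 appends to a message of the given byte length."""
    bit_len = message_len_bytes * 8
    # Padding step: one 1 bit (the byte 0x80), then 0 bits until the
    # total length is 56 bytes modulo 64 (448 bits modulo 512).
    zero_count = (55 - message_len_bytes) % 64
    padding = b"\x80" + b"\x00" * zero_count
    # Append Length step: the bit length as a 64-bit little-endian value.
    # Only the low-order 64 bits are kept, so the field wraps at 2**64 bits.
    padding += struct.pack("<Q", bit_len & 0xFFFFFFFFFFFFFFFF)
    return padding

# A 3-byte message gets padded up to exactly one 64-byte block.
print(3 + len(md5_padding(3)))
```

The mask in the last step is where the 64-bit length field wraps around: the message itself can keep growing, only the recorded length is reduced modulo 2^64.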

  • The hash is always 128 bits. If you encode it as a hexadecimal string you can encode 4 bits per character, giving 32 characters.
  • MD5 is not encryption. You cannot in general "decrypt" an MD5 hash to get the original string.

See more here.

Wickedness answered 3/8, 2010 at 7:48 Comment(3)
"The length of the message is unlimited": what do you mean by "message"? Is it the input? My question is MD5("how many characters exactly?");Rematch
@Rematch Your input can be as long as possible in your current programming language, in Java this would be (2^31)-1 characters in a string. And yes, the "message" is the input.Deen
@Rematch ... or from a file, the input could be as large as available storage.Lonnie
A
15

You can have any length, but of course, there can be a memory issue on the computer if the String input is too long. The output is always 32 characters.

Achromatize answered 3/8, 2010 at 7:47 Comment(1)
If the string input is too long it wouldn't exist in the system in the first place, unless it's in a file, in which case you can pass in blocks to the digest function as they are read, in other words, you only need to have block bytes of the input available at a time.Thingumabob
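The block-by-block approach mentioned in this comment could look something like the sketch below, using Python's `hashlib`. The function name `md5_of_file` and the 64 KiB chunk size are arbitrary choices made for illustration.

```python
import hashlib

def md5_of_file(path: str, chunk_size: int = 64 * 1024) -> str:
    """Hash a file incrementally so only one chunk is in memory at a time."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Feeding the data through repeated `update()` calls gives the same digest as hashing the whole content in one call, so the file size is bounded by storage rather than by memory.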
C
7

The algorithm has been designed to support arbitrary input length. That is, you can compute hashes of big files, such as the ISO image of a DVD.

If there is a limitation on the input, it could come from the environment where the hash function is used. Let's say you want to compute the hash of a file and the environment has a MAX_FILE limit.

But the output string will always be the same length: 32 hex chars (128 bits)!

Ciliate answered 3/8, 2010 at 7:45 Comment(0)
M
4

You may want to use SHA-1 instead of MD5, as MD5 is considered broken.

You can read more about MD5 vulnerabilities in this Wikipedia article.

Modigliani answered 3/8, 2010 at 7:44 Comment(14)
Its creator, as well as Bruce Schneier and Homeland Security are in agreement that it's broken... How many more 'rumorspreading' do you need to convince you that it's actually been broken for some time? Fact is that it's arbitrarily easy to find an input that generates a specific hash. Of course you can mitigate this risk by salting your inputs, using sufficiently large salts. On a side-note: SHA-1 is considered to be just as broken. If you advise people to upgrade, advise them to upgrade to SHA-2, please.Deneendenegation
@Deneendenegation oh I need a very little. An example. Given a hash, will you bring a source string? Not a link to some great article, not someone's opinion but just a source string?Wherefrom
@Col. Shrapnel: Producing a source string is good, but I'd say if he can produce an article showing how to crack it at a cost of $xxxxxxx that would also be enough.Wickedness
@Mark no problem, I haven't said "original source". You are free to provide any string that will produce the same hash. Money? Who said money? Fact is that it's arbitrarily easy to find an input that generates a specific hash.. He hasn't said it will cost any moneyWherefrom
I actually do see a problem in using unsalted md5 for saving passwords. Fact is (and everybody may verify this using a simple Google search) that there are many providers who allow you to "crack" some of the "easier" hashes (i.e. having a short alphanumeric source string for example - exactly what you have in passwords...) I actually tried that on some of the hashes in a database and found that I can at least find 1/4 using this. So even a hobby developer having not the smallest knowledge about cryptography can find the password behind the hash - if the password is bad - which it is!Dennie
But I don't want to say that using md5(md5()) or sha1() instead will fix the problem. There are providers for those, too, though far less.Dennie
@Col. Shrapnel: Are you a Mathematician or a Cryptographer? If not, why do you doubt the opinion of those scientists? In a mathematical sense, MD5 is definitely broken because there are methods which are faster than brute force. This might not have an impact on the daily use of MD5 yet (in a large scale). But a hash or encryption scheme has not only to be secure now but also in the future. And if there are already ways to break it, it will become even easier in the future. Of course it depends in which context you want to use MD5, but broken is broken.Grampus
Collisions were also found for SHA-1 (theoretical attack). See Comparison of SHA functionsBeaman
Nobody really mentioned what they really mean under the term "broken". Although, @YourCommonSense makes sense.Stinko
Hashes aren't only used for security.Sorcim
You are talking about security uses of MD5. But MD5 (or any other hashing technique) has a ton of other uses. I, for one, want to use it to rename a file by its hash. I'm surely not concerned about the collision resistance of MD5. Everything you posted is still true, just my 2 cents.Beadledom
MD5 is considered broken/insecure and should not be used for critical stuff like passwords in databases anymore. It can still be used for file checksums, though.Deen
That depends on the use. The question does not say what the use of the hash will be. MD5 is not considered broken. It is however completely unsuitable for passwords etc. as it is NOT used as a CRYPTOGRAPHIC hash function in any modern sense. It is however considerably faster than SHA1, and so is a very nice one to use if we just care about e.g. finding an insecure cache file name or creating a fast hash lookup for e.g. counting, checksums etc. It does its job very well, however this job is not security related.Tranquilize
How does this answer the question?Sublittoral
I
4

A 128-bit MD5 hash is represented as a sequence of 32 hexadecimal digits.

Ineffable answered 3/8, 2010 at 7:46 Comment(0)
M
3

There is no limit to the input of md5 that I know of. Some implementations require the entire input to be loaded into memory before passing it into the md5 function (i.e., the implementation acts on a block of memory, not on a stream), but this is not a limitation of the algorithm itself. The output is always 128 bits. Note that md5 is not an encryption algorithm, but a cryptographic hash. This means that you can use it to verify the integrity of a chunk of data, but you cannot reverse the hashing. Also note that md5 is considered broken, so you shouldn't use it for anything security-related (it's still fine to verify the integrity of downloaded files and such).

Marchpast answered 3/8, 2010 at 7:48 Comment(0)
C
2

The MD5 algorithm appends the message length as the last 64 bits of the last block, so it would be fair to say that the message can be up to 2^64 bits long (about 18 e18 bits).
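For reference, the arithmetic behind that figure can be checked with a few lines of Python. The constant names are only illustrative, and decimal (SI) prefixes are assumed, so 1 exabit = 10^18 bits.

```python
# Size of the 64-bit length field's range, in bits and in bytes.
MAX_BITS = 2 ** 64
MAX_BYTES = MAX_BITS // 8  # equals 2**61

print(MAX_BITS)          # 18446744073709551616
print(MAX_BYTES)         # 2305843009213693952
print(MAX_BITS / 1e18)   # roughly 18.4 exabits
print(MAX_BYTES / 1e18)  # roughly 2.3 exabytes
```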

Curlew answered 27/6, 2021 at 15:38 Comment(1)
That would be 18 exabits, or 18 e3 petabits, or 18 e6 terabits, or 18 e9 gigabits.Curlew
M
0

Max length for MD5 input: the largest definable and usable stream of bits. The constraints on how a stream of bits can be defined depend on the operating system, hardware, programming language and more...

Length for MD5 output: fixed length, always 128 bits. For easier display, the output is usually shown in hex; because each hex digit (0-1-2-3-4-5-6-7-8-9-A-B-C-D-E-F) takes up 4 bits, the output can be displayed as 32 hex digits. 128 bits = 16 bytes = 32 hex digits

Medamedal answered 30/7, 2021 at 16:17 Comment(0)
