What is the clash rate for md5? [closed]

2

39

What's the probability of a clash (collision) for the MD5 algorithm? I believe it is extremely low.

Gerdi answered 13/1, 2012 at 15:12 Comment(0)
47

You need to hash about 2^64 values to get a single collision among them, on average, if you don't try to deliberately create collisions. Hash collisions are very similar to the Birthday problem.

If you look at two arbitrary values, the collision probability is only 2^-128.
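As a rough illustration of the two figures above, here is a minimal sketch of the usual birthday-problem approximation, p ≈ 1 - exp(-k(k-1) / 2N), with N = 2^128. It assumes MD5 output behaves like uniformly random 128-bit values, which only holds for non-adversarial input. The function name `collision_probability` is made up for this example.

```python
import math

HASH_BITS = 128  # MD5 digest size in bits


def collision_probability(num_items: int, bits: int = HASH_BITS) -> float:
    """Approximate chance of at least one accidental collision among
    num_items uniformly random hash values (birthday approximation)."""
    space = 2 ** bits
    pairs = num_items * (num_items - 1) / 2  # number of distinct pairs
    # 1 - exp(-pairs / space); expm1 keeps precision when the result is tiny
    return -math.expm1(-pairs / space)


# Two values, a million, a billion, and the 2^64 birthday bound
for n in (2, 10**6, 10**9, 2**64):
    print(f"{n:>26,} items -> p = {collision_probability(n):.3e}")
```

Because the number of pairs grows with the square of the number of items, doubling the number of hashed values roughly quadruples the collision probability while it is still small.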

The problem with md5 is that it's relatively easy to craft two different texts that hash to the same value. But this requires a deliberate attack, and doesn't happen accidentally. And even with a deliberate attack it's currently not feasible to get a plain text matching a given hash.

In short, MD5 is safe for non-security purposes, but broken in many security applications.

Suggestion answered 13/1, 2012 at 15:15 Comment(8)
2^(n/2) as predicted by the birthday problem. - Suggestion
Given this information, is it suitable to create document IDs for a system containing millions of documents, based on the MD5 hash of each document's content? @Suggestion - Sienese
@sємsєм I'd rather use SHA256, but MD5 shouldn't be a problem as long as the documents are created by a benign party. - Suggestion
I prefer MD5 for performance. I think MD5 is much quicker than SHA256, isn't it? @Suggestion - Sienese
@sємsєм It is faster, but even SHA-2 and SHA-3 can handle several hundred MB/s on a desktop CPU. If that's still not good enough, you can look at Skein or BLAKE2, which are almost as fast as MD5 while still being secure. | Alternatively, if you can use a secret key, HMAC-MD5 is still relatively secure. - Suggestion
Great answer, thanks! - Chouest
@Albert "that's 1 clash every X files" You can't really say it like that, because the probability scales quadratically with the number of files. - Suggestion
Doesn't being attackable mean that the algorithm is no longer perfectly random? Thus precluding "two arbitrary values, the collision probability is only 2^-128"? - Zerla
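For the document-ID idea discussed in the comments above, a minimal sketch using Python's standard `hashlib` and `hmac` modules might look like the following. The helper names `content_id` and `keyed_content_id`, the sample document, and the key are hypothetical and only for illustration. Skein is not in the standard library, so it is left out.

```python
import hashlib
import hmac


def content_id(data: bytes, algorithm: str = "sha256") -> str:
    """Return a hex digest of the content to use as a document ID."""
    return hashlib.new(algorithm, data).hexdigest()


def keyed_content_id(data: bytes, key: bytes) -> str:
    """HMAC-MD5 variant, for when a secret key is available."""
    return hmac.new(key, data, hashlib.md5).hexdigest()


doc = b"example document body"  # hypothetical document content
print(content_id(doc, "md5"))      # 32 hex characters (128 bits)
print(content_id(doc, "sha256"))   # 64 hex characters (256 bits)
print(content_id(doc, "blake2b"))  # 128 hex characters (512 bits by default)
print(keyed_content_id(doc, b"hypothetical-secret-key"))
```

Swapping the algorithm is a one-argument change here, so it is easy to measure whether MD5's speed advantage actually matters for a given workload before committing to it.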
7

It generates a 128-bit value. The accidental clash rate should therefore be 2^-64 (because of the Birthday Paradox).

Dunagan answered 13/1, 2012 at 15:16 Comment(1)
The collision probability becomes significant around 2^64 values, but the clash rate for two arbitrary values is only 2^-128. - Suggestion
