Is there a standard for using PBKDF2 as a password hash?
Asked Answered
U

2

13

Join me in the fight against weak password hashes.

A PBKDF2 password hash should contain the salt, the number of iterations, and the hash itself so it's possible to verify later. Is there a standard format, like RFC2307's {SSHA}, for PBKDF2 password hashes? BCRYPT is great but PBKDF2 is easier to implement.

Apparently, there's no spec. So here's my spec.

>>> from base64 import urlsafe_b64encode
>>> password = u"hashy the \N{SNOWMAN}"
>>> salt = urlsafe_b64decode('s8MHhEQ78sM=')
>>> encoded = pbkdf2_hash(password, salt=salt)
>>> encoded
'{PBKDF2}1000$s8MHhEQ78sM=$hcKhCiW13OVhmLrbagdY-RwJvkA='

Update: http://www.dlitz.net/software/python-pbkdf2/ defines a crypt() replacement. I updated my little spec to match his, except his starts with $p5k2$ instead of {PBKDF2}. (I have the need to migrate away from other LDAP-style {SCHEMES}).

That's {PBKDF2}, the number of iterations in lowercase hexadecimal, $, the urlsafe_base64 encoded salt, $, and the urlsafe_base64 encoded PBKDF2 output. The salt should be 64 bits, the number of iterations should be at least 1000, and the PBKDF2 with HMAC-SHA1 output can be any length. In my implementation it is always 20 bytes (the length of a SHA-1 hash) by default.

The password must be encoded to utf-8 before being sent through PBKDF2. No word on whether it should be normalized into Unicode's NFC.

This scheme should be on the order of iterations times more costly to brute force than {SSHA}.

Uncaused answered 24/9, 2009 at 18:19 Comment(1)
Some time after I wrote this, I became aware of the SHA-based crypt replacement used in RedHat Linux. It is probably better designed than the PBKDF2 scheme in question. en.wikipedia.org/wiki/Crypt_%28Unix%29#SHA-based_schemeUncaused
C
4

There is a specification for the parameters (salt and iterations) of PBKDF2, but it doesn't include the hash. This is included in PKCS #5 version 2.0 (see Appendix A.2). Some platforms have built-in support for encoding and decoding this ASN.1 structure.

Since PBKDF2 is really a key derivation function, it doesn't make sense for it to specify a way to bundle the "hash" (which is the really a derived key) together with the derivation parameters—in normal usage, the key must remain secret, and is never stored.

But for usage as a one-way password hash, the hash can be stored in a record with the parameters, but in its own field.

Catastrophism answered 24/9, 2009 at 18:34 Comment(0)
R
4

I'll join you in the fight against weak hashes.

OWASP has a Password Storage Cheat Sheet (https://www.owasp.org/index.php/Password_Storage_Cheat_Sheet) with some guidance; they recommend 64,000 PBKDF2 iterations minimum as of 2012, doubling every two years (i.e. 90,510 in 2012).

Note that a storing a long, cryptographically random salt per-userid is always basic.

Note that having a widely variable per-userid number of iterations and storing the number of iterations along with the salt will add some complexity to cracking software, and may help preclude certain optimizations. For instance, "bob" gets encrypted with 135817 iterations, while "alice" uses 95,121 iterations, i.e. perhaps a minimum of(90510 + RAND(90510)) for 2013.

Note also that all of this is useless if users are allowed to choose weak passwords like "password", "Password1!", "P@$$w0rd", and "P@$$w0rd123", all of which will be found by rules based dictionary attacks very quickly indeed (the latter is simply "password" with the following rules: uppercase first letter, 1337-speak, add a three digit number to the end). Take a basic dictionary list (phpbb, for a good, small starter wordlist) and apply rules like this to it, and you'll crack a great many passwords where people try "clever" tricks.

Therefore, when checking new passwords, don't just apply "All four of upper, lower, number, digit, at least 11 characters long", since "P@$$w0rd123" complies with this seemingly very tough rule. Instead, use that basic dictionary list and see if basic rules would crack it (it's a lot simpler than actually trying a crack - you can lower-case your list and their word, and then simply write code like "if the last 4 characters are a common year, check all but the last four characters against the wordlist", and "if the last 3 characters are digits, check all but the last 3 characters against the wordlist" and "check all but the last two characters against the wordlist" and "De-1337 the password - turn @'s into a, 3 into e, and so on, and then check it against the wordlist and try those other rules too."

As far as passphrases go, in general are a great idea, particularly if some other characters are added to the middle of words, but if and only if they're long enough, since you're giving up a lot of possible combinations.

Note that modern machines with GPU's are up to the tens of billions of hash iterations (MD5, SHA1, SHA-256, SHA-512, etc.) per second, even in 2012. As far as word combination "correct horse battery staple" type passwords, this one is at best a very modest password- it's only 4 all lower case English words of length 7 or less with spaces. So, if we go looking for XKCD style passwords with an 18 billion guess a second setup: A modern small american english dictionary has: 6k words of length 5 or less 21k words of length 7 or less 36k words of length 9 or less 46k words of length 11 or less 49k words of length 13 or less

With an XKCD style passphrase, and without bothering to filter words by popularity ("correct" vs. "chair's" vs. "dumpier" vs. "hemorrhaging") we have 21k^4, which is only about 2E17 possibilities. With the 18 billion/sec setup (a single machine with 8 GPU's if we're facing a single SHA1 iteration), that's about 4 months to exhaustively search the keyspace. If we had ten such setups, that's about two weeks. If we excluded unlikely words like "dumpier", that's a lot faster for a quick first pass.

Now, if you get words out of a "huge" linux american english wordlist, like "Balsamina" or "Calvinistically" (both chosen by using the "go to row" feature", then we'd have 30k words of length 5 or less 115k words of length 7 or less 231k words of length 9 or less 317k words of length 11 or less 362k words of length 13 or less

Even with the 7 length max limit, with this huge dictionary as a base and randomly chosen words, we have 115k^4 ~= 1.8E20 possibilities, or about 12 years if the setup is kept up to date (doubling in power every 18 months). This is extremely similar to a 13 character, lower case + number only password. "300 years" is what most estimates will tell you, but they fail to take Moore's Law into account.

Raneeraney answered 2/1, 2013 at 16:30 Comment(2)
That cheat sheet looks useful. Your post ends mid-sentence, can you fix it? Also, how do you feel about passPHRASES instead of passWORDS? See xkcd.com/936Bargain
I just merged the two half-answers by @Raneeraney into a single answer. I haven't made any changes, just copied the other post and pasted it in here.Handkerchief

© 2022 - 2024 — McMap. All rights reserved.