I'm asking about the format used to encode a password hash for storage once it has been computed. The dollar sign ($) notation seems to be widespread. Is it described in a standard somewhere (including the identifiers for algorithms)?
For example, Go's golang.org/x/crypto/bcrypt returns such an encoded string (playground):
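package main

import (
	"fmt"
	"golang.org/x/crypto/bcrypt"
)

func main() {
	// GenerateFromPassword returns the full encoded string, not just the raw hash.
	h, err := bcrypt.GenerateFromPassword([]byte("foo"), bcrypt.DefaultCost)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s", h)
	// Output (varies per run, the salt is random): $2a$10$g1d5KuvDIrRoUyWL2BQs7uLOWCzlM.zqbRm8o364u20p20YNmJ.Ve
}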
However, other hashing packages like scrypt (example) and argon2 return just the resulting hash. The argon2 shell command, on the other hand, does return an encoded string:
echo "foo" | argon2 saltsalt
Type: Argon2i
Iterations: 3
Memory: 4096 KiB
Parallelism: 1
Hash: d9e4f94546b9e5b0cfb2dbf9dad81d41371845d8b6a8c25ce7caf23e13f1ef72
Encoded: $argon2i$v=19$m=4096,t=3,p=1$c2FsdHNhbHQ$2eT5RUa55bDPstv52tgdQTcYRdi2qMJc58ryPhPx73I
0.005 seconds
Verification ok
I found a blog post specific to Go and argon2 that explains this encoding, so far so good.
Variations I found
My trouble lies with the definition of the dollar-separated string, its portability, and the variations I found.
glibc
The man 3 crypt page gives some pointers. There is a table of identifiers:
ID  Method
───────────────────────────────────────────────────────────
1   MD5
2a  Blowfish (not in mainline glibc; added in some Linux distributions)
5   SHA-256 (since glibc 2.7)
6   SHA-512 (since glibc 2.7)
But this doesn't cover newer types, like argon2i or scrypt.
Then there are the example strings:
$id$salt$encrypted
$id$rounds=yyy$salt$encrypted
The latter is only supported since glibc 2.7.
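To make the two layouts concrete, here is a minimal sketch in Go that splits either form into its fields. The `cryptHash` type and the `parseCrypt` function are names made up for this illustration, not part of glibc or any Go package, and the input strings are placeholders rather than real hashes.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// cryptHash holds the fields of a crypt(3)-style string.
// The type and its field names are hypothetical, chosen for this sketch.
type cryptHash struct {
	ID     string // e.g. "1", "5", "6"
	Rounds int    // 0 when no rounds= field is present
	Salt   string
	Hash   string
}

// parseCrypt splits "$id$salt$encrypted" or "$id$rounds=yyy$salt$encrypted".
func parseCrypt(s string) (cryptHash, error) {
	if !strings.HasPrefix(s, "$") {
		return cryptHash{}, errors.New("missing leading $")
	}
	fields := strings.Split(s[1:], "$")
	switch {
	case len(fields) == 3:
		return cryptHash{ID: fields[0], Salt: fields[1], Hash: fields[2]}, nil
	case len(fields) == 4 && strings.HasPrefix(fields[1], "rounds="):
		n, err := strconv.Atoi(strings.TrimPrefix(fields[1], "rounds="))
		if err != nil {
			return cryptHash{}, fmt.Errorf("bad rounds field: %w", err)
		}
		return cryptHash{ID: fields[0], Rounds: n, Salt: fields[2], Hash: fields[3]}, nil
	}
	return cryptHash{}, fmt.Errorf("unexpected field count %d", len(fields))
}

func main() {
	// Placeholder values, not real hashes.
	for _, s := range []string{"$1$salt$encrypted", "$6$rounds=5000$salt$encrypted"} {
		h, err := parseCrypt(s)
		fmt.Printf("%+v %v\n", h, err)
	}
}
```

Note that a bcrypt string such as `$2a$10$...` also has three fields, so this naive parser would report the cost as the salt, which is exactly the kind of variation I'm worried about.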
bcrypt
While bcrypt uses the 2a (Blowfish) identifier from glibc, its encoding seems to be slightly different, as seen in the example above:
$2a$10$g1d5KuvDIrRoUyWL2BQs7uLOWCzlM.zqbRm8o364u20p20YNmJ.Ve
$id$cost$<dot separated line of what exactly?>
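As far as I can tell, the 53 characters after the cost are not dot-separated fields: bcrypt implementations emit a 22-character salt followed directly by a 31-character hash, both in bcrypt's own base64 alphabet, in which `.` is just one of the 64 characters. A minimal sketch of that split, assuming this fixed-width layout (the `splitBcrypt` helper is a hypothetical name, not part of golang.org/x/crypto/bcrypt):

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// splitBcrypt breaks a bcrypt string into version, cost, salt and hash.
// It assumes the usual fixed widths: 22 characters of salt, 31 of hash.
func splitBcrypt(s string) (version string, cost int, salt, hash string, err error) {
	fields := strings.Split(s, "$")
	// Expect ["", version, cost, salt+hash].
	if len(fields) != 4 || fields[0] != "" || len(fields[3]) != 53 {
		return "", 0, "", "", errors.New("not a bcrypt string")
	}
	cost, err = strconv.Atoi(fields[2])
	if err != nil {
		return "", 0, "", "", fmt.Errorf("bad cost field: %w", err)
	}
	return fields[1], cost, fields[3][:22], fields[3][22:], nil
}

func main() {
	s := "$2a$10$g1d5KuvDIrRoUyWL2BQs7uLOWCzlM.zqbRm8o364u20p20YNmJ.Ve"
	version, cost, salt, hash, err := splitBcrypt(s)
	if err != nil {
		panic(err)
	}
	fmt.Println("version:", version)
	fmt.Println("cost:   ", cost)
	fmt.Println("salt:   ", salt)
	fmt.Println("hash:   ", hash)
}
```

The salt and hash are left as text here because decoding them needs bcrypt's non-standard base64 alphabet, which this sketch does not implement.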
argon2
Argon2 uses 5 fields and a full-name identifier like argon2i:
$argon2i$v=19$m=4096,t=3,p=1$c2FsdHNhbHQ$2eT5RUa55bDPstv52tgdQTcYRdi2qMJc58ryPhPx73I
$id$version$parameters$salt$encrypted
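The five fields can be pulled apart with the standard library alone. The following is only a sketch: `argon2Hash` and `parseArgon2` are names invented for this example, and it assumes the parameters always appear in the order m, t, p and that salt and hash are unpadded standard base64, as in the string above.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// argon2Hash holds the decoded fields of an Argon2 encoded string.
// The type is hypothetical, defined only for this sketch.
type argon2Hash struct {
	Variant string // e.g. "argon2i"
	Version int    // the v= field
	Memory  uint32 // m=, in KiB
	Time    uint32 // t=, iterations
	Threads uint8  // p=, parallelism
	Salt    []byte
	Hash    []byte
}

// parseArgon2 parses "$id$v=..$m=..,t=..,p=..$salt$hash".
func parseArgon2(s string) (argon2Hash, error) {
	var h argon2Hash
	fields := strings.Split(s, "$")
	// A leading "$" yields an empty first element, so 5 fields give 6 parts.
	if len(fields) != 6 || fields[0] != "" {
		return h, errors.New("expected 5 dollar-separated fields")
	}
	h.Variant = fields[1]
	if _, err := fmt.Sscanf(fields[2], "v=%d", &h.Version); err != nil {
		return h, fmt.Errorf("bad version field: %w", err)
	}
	if _, err := fmt.Sscanf(fields[3], "m=%d,t=%d,p=%d", &h.Memory, &h.Time, &h.Threads); err != nil {
		return h, fmt.Errorf("bad parameter field: %w", err)
	}
	var err error
	if h.Salt, err = base64.RawStdEncoding.DecodeString(fields[4]); err != nil {
		return h, fmt.Errorf("bad salt: %w", err)
	}
	if h.Hash, err = base64.RawStdEncoding.DecodeString(fields[5]); err != nil {
		return h, fmt.Errorf("bad hash: %w", err)
	}
	return h, nil
}

func main() {
	s := "$argon2i$v=19$m=4096,t=3,p=1$c2FsdHNhbHQ$2eT5RUa55bDPstv52tgdQTcYRdi2qMJc58ryPhPx73I"
	h, err := parseArgon2(s)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s v=%d m=%d t=%d p=%d\n", h.Variant, h.Version, h.Memory, h.Time, h.Threads)
	fmt.Printf("salt: %s\nhash: %x\n", h.Salt, h.Hash)
}
```

Printing the hash as hex makes it easy to compare against the "Hash:" line of the argon2 command output.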
Why?
I want to write a package that hashes and verifies passwords in an algorithm-agnostic way, allowing consumers to change parameters and algorithms without refactoring their code. During verification, the package should therefore be able to determine the algorithm and parameters that were used when the password was stored. If the stored algorithm or parameters differ from the ones currently in use, the password is re-hashed and a new encoded string is returned.
As a bonus, I would like the package to be able to re-hash "legacy" passwords which might have been stored by older (non-Go) applications, for instance with md5. To do all this, I would like a deeper understanding of the storage format itself.
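To illustrate what I have in mind, here is a rough sketch of the detection and re-hash decision. The function names `algorithmID` and `needsRehash` are hypothetical, and the prefix comparison assumes that each algorithm always serializes its parameters in the same order and spelling, which is precisely what I'd like a standard to guarantee.

```go
package main

import (
	"fmt"
	"strings"
)

// algorithmID returns the identifier between the first two dollar signs,
// e.g. "2a" for bcrypt or "argon2i" for Argon2i.
func algorithmID(encoded string) (string, bool) {
	if !strings.HasPrefix(encoded, "$") {
		return "", false
	}
	id, _, found := strings.Cut(encoded[1:], "$")
	if !found || id == "" {
		return "", false
	}
	return id, true
}

// needsRehash reports whether a stored string was produced with a different
// algorithm or different parameters than the current policy. The policy is
// expressed as the expected prefix up to and including the "$" before the salt.
func needsRehash(encoded, wantPrefix string) bool {
	return !strings.HasPrefix(encoded, wantPrefix)
}

func main() {
	// Hypothetical current policy: Argon2i with the parameters shown earlier.
	const current = "$argon2i$v=19$m=4096,t=3,p=1$"

	stored := []string{
		"$2a$10$g1d5KuvDIrRoUyWL2BQs7uLOWCzlM.zqbRm8o364u20p20YNmJ.Ve",
		"$argon2i$v=19$m=4096,t=3,p=1$c2FsdHNhbHQ$2eT5RUa55bDPstv52tgdQTcYRdi2qMJc58ryPhPx73I",
	}
	for _, s := range stored {
		id, ok := algorithmID(s)
		fmt.Println(id, ok, needsRehash(s, current))
	}
}
```

The identifier would select the verifier for the stored string, and after a successful verification the password would be hashed again under the current policy whenever `needsRehash` returns true.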