What column type/length should I use for storing a Bcrypt hashed password in a Database?
Asked Answered
Q

5

372

I want to store a hashed password (using BCrypt) in a database. What would be a good type for this, and which would be the correct length? Are passwords hashed with BCrypt always of same length?

EDIT

Example hash:

$2a$10$KssILxWNR6k62B7yiX0GAe2Q7wwHlrzhF3LqtVvpyvHZf0MwvNfVu

After hashing some passwords, it seems that BCrypt always generates 60 character hashes.

EDIT 2

Sorry for not mentioning the implementation. I am using jBCrypt.

Quicksilver answered 4/5, 2011 at 9:16 Comment(2)
Also see Openwall's PHP password hashing framework (PHPass). Its portable and hardened against a number of common attacks on user passwords. The guy who wrote the framework (SolarDesigner) is the same guy who wrote John The Ripper and sits as a judge in the Password Hashing Competition. So he knows a thing or two about attacks on passwords.Desrochers
If anyone falls on this looking for a solution for scrypt : Gumbo's answer also applies to scrypt. I personally applied BINARY(64) in MySQL and it allowed me to test for byte equality under Python later on.Predominant
J
419

The modular crypt format for bcrypt consists of

  • $2$, $2a$ or $2y$ identifying the hashing algorithm and format
  • a two digit value denoting the cost parameter, followed by $
  • a 53 characters long base-64-encoded value (they use the alphabet ., /, 09, AZ, az that is different to the standard Base 64 Encoding alphabet) consisting of:
    • 22 characters of salt (effectively only 128 bits of the 132 decoded bits)
    • 31 characters of encrypted output (effectively only 184 bits of the 186 decoded bits)

Thus the total length is 59 or 60 bytes respectively.

As you use the 2a format, you’ll need 60 bytes. And thus for MySQL I’ll recommend to use the CHAR(60) BINARYor BINARY(60) (see The _bin and binary Collations for information about the difference).

CHAR is not binary safe and equality does not depend solely on the byte value but on the actual collation; in the worst case A is treated as equal to a. See The _bin and binary Collations for more information.

Jesuit answered 4/5, 2011 at 11:11 Comment(8)
Be aware - storing as binary(60) can cause unexpected behavior for string equality (among other things). In .NET this can be overcome by using String.Equals(fromDataBaseBinary60string, typicalishString, StringComparison.InvariantCulture)Trichinopoly
If you define the column as CHAR(60) CHARACTER SET latin1 COLLATE latin1_bin, you now get the advantages of accurate string comparison without needing a binary column.Countertenor
About CHAR comparison, just use COLLATE SQL_Latin1_General_CP1_CS_AS in query or column specificationColston
@AndreFigueiredo SQL_Latin1_General_CP1_CS_AS is unknown in MySQL. What is known is latin1_general_cs.Jesuit
I'm not clear should we store as char which is not binary safe or binary(60) which has unexpected behaviour.....Stale
I don't get the reason why I should bother about binary or not binary at all, if I don't want to sort by password hash, compare hashes on database layer or make the password column unique (“Sorry, this password is already in use”) at all. I'd recommend using UTF8 everywhere. In that case there will be no problems, will there?Counteract
@Neon The problem is that you might compare different hashes to be equal. If you explicitly specify that it's a binary column (or a VARCHAR with the right collation), you don't run the risk of, somewhere else, changing some setting that makes it a case-insensitive compare. It also makes your intent more clear, which is generally a good thing -- you're storing binary data; you should store it as binary data.Pyrography
One shouldn't be comparing hashes for password checks. Every bcrypt library I've encountered also has a password check function. It's fine to store the password as a string (that is, an array of characters using CHAR or VARCHAR or VARCHAR2) and compare that string to the user supplied password using a library's password checking function.Amoeboid
C
61

A Bcrypt hash can be stored in a BINARY(40) column.

BINARY(60), as the other answers suggest, is the easiest and most natural choice, but if you want to maximize storage efficiency, you can save 20 bytes by losslessly deconstructing the hash. I've documented this more thoroughly on GitHub: https://github.com/ademarre/binary-mcf

Bcrypt hashes follow a structure referred to as modular crypt format (MCF). Binary MCF (BMCF) decodes these textual hash representations to a more compact binary structure. In the case of Bcrypt, the resulting binary hash is 40 bytes.

Gumbo did a nice job of explaining the four components of a Bcrypt MCF hash:

$<id>$<cost>$<salt><digest>

Decoding to BMCF goes like this:

  1. $<id>$ can be represented in 3 bits.
  2. <cost>$, 04-31, can be represented in 5 bits. Put these together for 1 byte.
  3. The 22-character salt is a (non-standard) base-64 representation of 128 bits. Base-64 decoding yields 16 bytes.
  4. The 31-character hash digest can be base-64 decoded to 23 bytes.
  5. Put it all together for 40 bytes: 1 + 16 + 23

You can read more at the link above, or examine my PHP implementation, also on GitHub.

Consuela answered 17/5, 2013 at 9:9 Comment(6)
Cost of longer field: 20 bytes times even a million + records: 20MB, once you reach a million records +. Cost of improperly implementing a shortened field length, in a highly complex security & engineering field: $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ You do the math.Coachandfour
@Kzqai, like I said, the larger 60-byte column is the most natural choice, but how aggressively to pursue storage efficiency is project dependent. For example, it's common to try to fit the entire database in memory, and 20 MB here and another 20 there can add up quickly in a memory-constrained environment.Consuela
Your example feeds into my point. --- If you want to put your database into memory, optimize every other column before touching the bcrypt storage column. --- If you have optimized every other column to insane degrees, and only the bcrypt hash column is left, get another gig of memory just for bcrypt. --- If you have done both of the above... ...stop, you haven't optimized every other column of low hanging fruit, and you're about to mess with a tested cryptographic security system that works, and replace it with a more complicated home-grown system with a chance of implementation failure.Coachandfour
@Coachandfour There is no risk of weakening the security of your Bcrypt library here. It's a data encoding that gets undone on retrieval from storage prior to the password check. This is not "don't roll your own crypto" territory.Consuela
Nice explanation. :) Though your explanation gave a great idea, I just want to go with 60 chars, even 100 chars, just to be on safe side. Nice debate too @Coachandfour and AndreDDamar
@AndreD I am so happy to find your spec. I was doing something similar in 41 bytes, by storing the scheme and cost separately. (And I also called it BMCF). Call me a weirdo, but I love that I can save 1 byte. To other readers: you don't actually need to "undo" the compression before checking.Papiamento
R
29

If you are using PHP's password_hash() with the PASSWORD_DEFAULT algorithm to generate the bcrypt hash (which I would assume is a large percentage of people reading this question) be sure to keep in mind that in the future password_hash() might use a different algorithm as the default and this could therefore affect the length of the hash (but it may not necessarily be longer).

From the manual page:

Note that this constant is designed to change over time as new and stronger algorithms are added to PHP. For that reason, the length of the result from using this identifier can change over time. Therefore, it is recommended to store the result in a database column that can expand beyond 60 characters (255 characters would be a good choice).

Using bcrypt, even if you have 1 billion users (i.e. you're currently competing with facebook) to store 255 byte password hashes it would only ~255 GB of data - about the size of a smallish SSD hard drive. It is extremely unlikely that storing the password hash is going to be the bottleneck in your application. However in the off chance that storage space really is an issue for some reason, you can use PASSWORD_BCRYPT to force password_hash() to use bcrypt, even if that's not the default. Just be sure to stay informed about any vulnerabilities found in bcrypt and review the release notes every time a new PHP version is released. If the default algorithm is ever changed it would be good to review why and make an informed decision whether to use the new algorithm or not.

Robertoroberts answered 20/7, 2015 at 21:30 Comment(0)
C
25

I don't think that there are any neat tricks you can do storing this as you can do for example with an MD5 hash.

I think your best bet is to store it as a CHAR(60) as it is always 60 chars long

Chartist answered 4/5, 2011 at 9:38 Comment(4)
Although, the PHP documentation notes that columns should be able to hold more data, for future releases...Versicle
No reason to gold plate. If the software that you are using requires sixty bytes, then allocate sixty bytes. If there is a future release to your software that changes this, then you can worry about it when that release happens. You shouldn't be automatically installing functionality-changing updates.Archiearchiepiscopacy
I think this is the best answer. There's no need to get into the complexity of the algorithm as the other answers do. All the details about binary, collation, etc. should be handled by whatever library is in use. 60 chars. That's the answer.Amoeboid
Note that with some databases (postgresql) the column "size" isn't mandatory.Linguini
I
1

I think best choice is nonbinary type, because in comparison is less combination and should be faster. If data is encoded with base64_encode then each position, each byte have only 64 possible values. If encoded with bin2hex each byte have only 16 possible values, but string is much longer. In binary byte have 256 position on each. I use for hashes in form of encode64 VARCHAR(255) column with ascii character set and the same collation. VARBINARY causes comparison problem as described in MySQL documentation. I don't know why answers advice to use VARBINARY have so many positives. I checked this on my author site, where measure time (just refresh to see).

Insomnia answered 6/12, 2022 at 0:15 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.