Salt and hash a password in Python
H

10

135

This code is supposed to hash a password with a salt. The salt and hashed password are being saved in the database. The password itself is not.

Given the sensitive nature of the operation, I wanted to make sure everything was kosher.

import hashlib
import base64
import uuid

password = 'test_password'
salt     = base64.urlsafe_b64encode(uuid.uuid4().bytes)


t_sha = hashlib.sha512()
t_sha.update(password+salt)
hashed_password =  base64.urlsafe_b64encode(t_sha.digest())
Historiography answered 7/3, 2012 at 0:34 Comment(5)
Why are you b64 encoding the salt? It would be simpler just to use the salt directly and then b64 encode both together: t_sha.digest() + salt. You can split the salt out again later, once you've decoded the salted hash, because you know the decoded SHA-512 digest is exactly 64 bytes.Escaut
@Escaut - I base64 encoded the salt so I could do string operations on it without having to worry about weird issues. Will the "bytes" version work as a string? If that is the case, then I don't need to base64 encode t_sha.digest() either. I probably wouldn't save the hashed password and the salt together just because that seems a little more complicated and a little less readable.Historiography
If you're using Python 2.x then the bytes object will work perfectly well as a string. Python doesn't put any restrictions on what you can have in a string. However the same might not apply if you pass the string to any external code such as a database. Python 3.x distinguishes byte types and strings so in that case you wouldn't want to use string operations on the salt.Escaut
I can't tell you how to do it in python, but plain SHA-512 is a bad choice. Use a slow hash such as PBKDF2, bcrypt or scrypt.Homochromatic
Side note: I'd advise against using UUIDs as a source of cryptographic randomness. Yes, the implementation used by CPython is cryptographically secure, but that's not dictated by Python's spec nor the UUID spec, and vulnerable implementations exist. If your codebase gets run using a Python implementation without secure UUID4s, you'll have weakened your security. That may be an unlikely scenario, but it costs nothing to use secrets instead.Pacer
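
Following the comment above, here is a minimal sketch of generating the salt with the standard-library secrets module instead of uuid. The 16-byte length and the variable names are assumptions chosen for illustration; they are not part of the original question.

```python
import secrets

# 16 random bytes from the operating system's cryptographically secure source.
salt = secrets.token_bytes(16)

# If a text-safe value is needed (for example, for a text column),
# token_urlsafe returns a URL-safe Base64 string built from 16 random bytes.
salt_text = secrets.token_urlsafe(16)
```

Either value can take the place of the uuid-based salt; which one fits depends on whether the storage layer expects bytes or text.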
U
60

EDIT: This answer is wrong. A single iteration of SHA512 is fast, which makes it inappropriate for use as a password hashing function. Use one of the other answers here instead.


Looks fine by me. However, I'm pretty sure you don't actually need base64. You could just do this:

import hashlib, uuid
salt = uuid.uuid4().hex
hashed_password = hashlib.sha512(password + salt).hexdigest()

If it doesn't create difficulties, you can get slightly more efficient storage in your database by storing the salt and hashed password as raw bytes rather than hex strings. To do so, replace hex with bytes and hexdigest with digest.

Ubald answered 7/3, 2012 at 2:44 Comment(8)
Yes, hex would work just fine. I prefer base64 because the strings are a little shorter. It's more efficient to pass around and do operations on shorter strings.Historiography
Now, how do you reverse it to get the password back?Milreis
You don't reverse it, you never reverse a password. That's why we hash it and we don't encrypt it. If you need to compare an input password with a stored password, you hash the input and compare the hashes. If you encrypt a password anyone with the key can decrypt it and see it. It's not safeFootrace
uuid.uuid4().hex is different each time it is generated. How are you going to compare a password for checking purposes if you can't get the same uuid back?Bogusz
@Bogusz I think salt is stored in the database and the salty hashed password too.Hydrophobic
@Bogusz, the salt is stored in the database along with the username and the hash of the corresponding password. When the user types a password, it is combined with the stored salt and hashed, and the result is compared with the stored hash. If they match, the login is considered successful.Forelli
Rather than leaving an edit on how your accepted answer is wrong, you should come back and fix it...Fingerbreadth
If the answer is wrong - delete it!Announcement
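
To illustrate the point made in the comments: the salt is random, but it is stored next to the hash, so the same salt is reused when the user logs in. This is a minimal sketch that assumes an in-memory dict in place of the database. The names users, register and login are hypothetical, and PBKDF2 is swapped in for the single SHA-512 pass because the answer itself is flagged as unsuitable.

```python
import hashlib
import hmac
import os

# Hypothetical stand-in for a database table: username -> (salt, hash).
users = {}

def register(username, password):
    salt = os.urandom(16)  # generated once, at registration
    pw_hash = hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'), salt, 100000)
    users[username] = (salt, pw_hash)  # the salt is stored alongside the hash

def login(username, password):
    if username not in users:
        return False
    salt, stored_hash = users[username]  # reuse the stored salt
    candidate = hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'), salt, 100000)
    # Compare the freshly computed hash with the stored one.
    return hmac.compare_digest(candidate, stored_hash)

register('alice', 'test_password')
```

Nothing is ever reversed: login succeeds only when hashing the typed password with the stored salt reproduces the stored hash.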
H
112

Based on the other answers to this question, I've implemented a new approach using bcrypt.

Why use bcrypt

If I understand correctly, the argument to use bcrypt over SHA512 is that bcrypt is designed to be slow. bcrypt also has an option to adjust how slow you want it to be when generating the hashed password for the first time:

# The '12' is the work factor (log2 of the number of rounds) that dictates the 'slowness'
bcrypt.hashpw(password, bcrypt.gensalt(12))

Slow is desirable because if a malicious party gets their hands on the table containing hashed passwords, then it is much more difficult to brute force them.
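
As a rough worked example of how the cost number scales: bcrypt's cost is the base-2 logarithm of the number of rounds, so each increment doubles the work per guess. The helper names below are hypothetical and only do the arithmetic; they do not call bcrypt.

```python
def rounds_for_cost(cost):
    """Number of rounds bcrypt performs for a given cost value."""
    return 2 ** cost

def relative_slowdown(old_cost, new_cost):
    """How many times more work new_cost requires than old_cost."""
    return rounds_for_cost(new_cost) / rounds_for_cost(old_cost)

print(rounds_for_cost(12))        # 4096
print(relative_slowdown(10, 12))  # 4.0
```

The same factor applies to an attacker: raising the cost by 2 makes every brute-force guess roughly four times as expensive.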

Implementation

import bcrypt

def get_hashed_password(plain_text_password):
    # Hash a password for the first time.
    # bcrypt operates on bytes, so encode the str first (required in Python 3).
    # Using bcrypt, the salt is saved into the hash itself.
    return bcrypt.hashpw(plain_text_password.encode('utf-8'), bcrypt.gensalt())

def check_password(plain_text_password, hashed_password):
    # Check a password against a stored hash (bytes, as returned by hashpw).
    # Using bcrypt, the salt is read back from the hash itself.
    return bcrypt.checkpw(plain_text_password.encode('utf-8'), hashed_password)

Notes

I was able to install the library pretty easily on a Linux system using:

pip install py-bcrypt

However, I had more trouble installing it on my Windows systems. It appears to need a patch. See this Stack Overflow question: py-bcrypt installing on win 7 64bit python

Historiography answered 20/5, 2014 at 19:35 Comment(4)
12 is the default value for gensaltTactics
According to pypi.python.org/pypi/bcrypt/3.1.0, the maximum password length for bcrypt is 72 bytes. Any characters beyond that are ignored. For this reason, they recommend hashing with a cryptographic hash function first and to then base64-encode the hash (see the link for details). Side remark: It seems py-bcrypt is the old pypi package and has since been renamed to bcrypt.Rinker
Hashing is OK, but check_password always returns False in Django.Azoic
This worked for me in Python 3.8.5, but I had to encode the password as bytes as follows: plain_text_password.encode('utf-8')Biel
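
The 72-byte limit mentioned in the comments can be worked around by hashing the password first and Base64-encoding the digest, as the bcrypt package documentation suggests. The sketch below shows only that pre-hash step using the standard library. The function name prehash_for_bcrypt and the sample passwords are hypothetical.

```python
import base64
import hashlib

def prehash_for_bcrypt(password):
    """Reduce a password of any length to 44 ASCII bytes."""
    digest = hashlib.sha256(password.encode('utf-8')).digest()  # 32 raw bytes
    return base64.b64encode(digest)                              # 44 Base64 bytes

# Two passwords that differ only after byte 72, where bcrypt alone stops reading.
long_a = 'x' * 80 + 'A'
long_b = 'x' * 80 + 'B'

print(prehash_for_bcrypt(long_a) != prehash_for_bcrypt(long_b))  # True
```

The returned bytes would then be passed to bcrypt.hashpw and bcrypt.checkpw in place of the encoded password, so the whole password contributes to the hash.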
R
48

Edit:

The library suggested in this answer is now outdated, and the hashlib key derivation functionality mentioned in this answer: https://mcmap.net/q/167044/-salt-and-hash-a-password-in-python is a good suggestion to use nowadays.

Original Answer:

The smart thing is not to write the crypto yourself but to use something like passlib: https://passlib.readthedocs.io/en/stable/#

It is easy to make mistakes when writing your own crypto code. The nasty thing is that with non-crypto code you usually notice immediately when it is not working, because your program crashes, whereas with crypto code you often only find out after it is too late and your data has been compromised. Therefore I think it is better to use a package written by someone knowledgeable about the subject and based on battle-tested protocols.

Also passlib has some nice features which make it easy to use and also easy to upgrade to a newer password hashing protocol if an old protocol turns out to be broken.

Also, a single round of SHA-512 is more vulnerable to dictionary attacks. SHA-512 is designed to be fast, which is actually a bad thing when trying to store passwords securely. Other people have thought long and hard about these sorts of issues, so you had better take advantage of that.

Ratline answered 8/6, 2012 at 12:11 Comment(3)
I suppose the advice of using crypto libraries is good, but the OP is already using hashlib, a crypto library which is also in the Python standard library (unlike passlib). I would continue to use hashlib if I were in the OP's situation.Miskolc
@dghubble hashlib is for cryptographic hash functions. passlib is for securely storing passwords. They're not the same thing (although a lot of people seem to think so... and then their users' passwords get cracked).Grapple
In case anyone is wondering: passlib generates its own salt, which is stored in the returned hash string (at least for certain schemes such as BCrypt+SHA256) - so you don't need to worry about it.Diabolism
P
41

As of Python 3.4, the hashlib module in the standard library contains key derivation functions which are "designed for secure password hashing".

So use one of those, like hashlib.pbkdf2_hmac, with a salt generated using os.urandom:

from typing import Tuple
import os
import hashlib
import hmac

def hash_new_password(password: str) -> Tuple[bytes, bytes]:
    """
    Hash the provided password with a randomly-generated salt and return the
    salt and hash to store in the database.
    """
    salt = os.urandom(16)
    pw_hash = hashlib.pbkdf2_hmac('sha256', password.encode(), salt, 100000)
    return salt, pw_hash

def is_correct_password(salt: bytes, pw_hash: bytes, password: str) -> bool:
    """
    Given a previously-stored salt and hash, and a password provided by a user
    trying to log in, check whether the password is correct.
    """
    return hmac.compare_digest(
        pw_hash,
        hashlib.pbkdf2_hmac('sha256', password.encode(), salt, 100000)
    )

# Example usage:
salt, pw_hash = hash_new_password('correct horse battery staple')
assert is_correct_password(salt, pw_hash, 'correct horse battery staple')
assert not is_correct_password(salt, pw_hash, 'Tr0ub4dor&3')
assert not is_correct_password(salt, pw_hash, 'rosebud')

Note that:

  • The use of a 16-byte salt and 100000 iterations of PBKDF2 match the minimum numbers recommended in the Python docs. Further increasing the number of iterations will make your hashes slower to compute, and therefore more secure.
  • os.urandom always uses a cryptographically secure source of randomness
  • hmac.compare_digest, used in is_correct_password, is basically just the == operator for strings but without the ability to short-circuit, which makes it immune to timing attacks. That probably doesn't really provide any extra security value, but it doesn't hurt, either, so I've gone ahead and used it.
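
Since the iteration count may be raised later, one option is to store it together with the salt and hash, so that older records can still be verified. The sketch below assumes a hypothetical text layout, iterations$salt_hex$hash_hex. The layout and the function names make_record and check_record are illustrative and not a standard format.

```python
import hashlib
import hmac
import os

def make_record(password, iterations=100000):
    """Return a single string holding the iteration count, salt and hash."""
    salt = os.urandom(16)
    pw_hash = hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'), salt, iterations)
    # Hypothetical layout: iterations$salt_hex$hash_hex
    return '{}${}${}'.format(iterations, salt.hex(), pw_hash.hex())

def check_record(record, password):
    """Verify a password using the parameters stored in the record itself."""
    iterations, salt_hex, hash_hex = record.split('$')
    candidate = hashlib.pbkdf2_hmac(
        'sha256', password.encode('utf-8'), bytes.fromhex(salt_hex), int(iterations))
    return hmac.compare_digest(candidate, bytes.fromhex(hash_hex))
```

Because each record carries its own iteration count, new passwords can be hashed with a higher count while existing records keep verifying with the count they were created with.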

For theory on what makes a good password hash and a list of other functions appropriate for hashing passwords with, see https://security.stackexchange.com/q/211/29805.

Pacer answered 6/7, 2019 at 15:8 Comment(3)
@stackoverflow.com/users/84131/chris-dutrow , this should be the accepted answer for python code starting with version 3.4Gesticulate
Note that hashlib.pbkdf2_hmac() has been deprecated since Python 3.10.Dubbing
No, the actual note is that the "Slow Python implementation of pbkdf2_hmac is deprecated" and that the (faster) pbkdf2_hmac function "will only be available when Python is compiled with OpenSSL".Headway
L
27

For this to work in Python 3 you'll need to UTF-8 encode, for example:

hashed_password = hashlib.sha512(password.encode('utf-8') + salt.encode('utf-8')).hexdigest()

Otherwise you'll get:

Traceback (most recent call last):
File "", line 1, in
hashed_password = hashlib.sha512(password + salt).hexdigest()
TypeError: Unicode-objects must be encoded before hashing
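
A small sketch that reproduces the difference in Python 3: passing a str to hashlib raises TypeError (the exact message may differ between Python versions), while encoded bytes are accepted. The salt value here is a hypothetical placeholder, and the snippet illustrates only the encoding step.

```python
import hashlib

password = 'test_password'
salt = 'abc123'  # hypothetical salt value, for illustration only

try:
    hashlib.sha512(password + salt)  # str input is rejected in Python 3
    raised = False
except TypeError:
    raised = True

# Encoding the concatenated string gives the same bytes as encoding each part.
encoded_digest = hashlib.sha512((password + salt).encode('utf-8')).hexdigest()

print(raised)  # True
```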

Ligon answered 10/6, 2012 at 15:12 Comment(4)
No. Don't use any sha hash function for hashing passwords. Use something like bcrypt. See the comments to other questions for the reason.Longanimity
@Longanimity Which comments?Metheglin
@CoolCloud ctrl+f "Why use bcrypt"Longanimity
@Longanimity Great, thanks :)Metheglin
S
12

passlib seems to be useful if you need to use hashes stored by an existing system. If you have control of the format, use a modern hash like bcrypt or scrypt. At this time, bcrypt seems to be much easier to use from python.

passlib supports bcrypt, and it recommends installing py-bcrypt as a backend: http://pythonhosted.org/passlib/lib/passlib.hash.bcrypt.html

You could also use py-bcrypt directly if you don't want to install passlib. The readme has examples of basic use.

see also: How to use scrypt to generate hash for password and salt in Python

Straightlaced answered 28/8, 2013 at 13:9 Comment(0)
C
8

I don't want to resurrect an old thread, but anyone who wants a modern, up-to-date, secure solution should use argon2.

https://pypi.python.org/pypi/argon2_cffi

It won the Password Hashing Competition (https://password-hashing.net/). It is easier to use than bcrypt, and it is more secure than bcrypt.

Comedietta answered 14/8, 2017 at 16:56 Comment(0)
F
1
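import bcrypt

password = "You know my name"
# gensalt's argument is the work factor; the salt ends up embedded in the hash
salt = bcrypt.gensalt(2)
hashed = bcrypt.hashpw(password, salt)

# Passing the stored hash as the salt re-hashes the candidate with the same salt
hashed2 = bcrypt.hashpw("password1", hashed)

# The two hashes are equal only if the candidate matches the original password
if hashed2 == hashed:
    print("It matches")
else:
    print("It does not match")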

Ref: https://github.com/erlichmen/py-bcrypt/blob/master/simple_test.py

pip install py-bcrypt
Fredia answered 7/2 at 1:0 Comment(0)
P
0

I did the same thing in NodeJs before:

 echo "console.log(require('crypto').createHmac('sha256', 'salt').update('password').digest('hex'))" | node

Its equivalent in Python is:

python3 -c 'import hashlib;import base64;import hmac;print(hmac.new(b"salt", "password".encode(), hashlib.sha256).hexdigest())'

And the equivalent shell command is:

echo -n "password" | openssl sha256 -hmac "salt"
Petropavlovsk answered 10/3, 2022 at 10:11 Comment(0)
D
0

Personally, I'd use the 'swiftcrypt' module.

pip install swiftcrypt

It has many functions for security-related reasons.

import swiftCrypt

password= "coolPassword here"

salt = swiftCrypt.Salts().generate_salt(14)
hashedPass = swiftCrypt.Hash().hash_password(password,salt,"sha256")

print(hashedPass)

You can generate a salt with a custom length; if you want a random length, simply leave the argument empty.

It will create a hashed password using the salt and an algorithm of your choice.

You can then verify the password like this:

verifiedPass = swiftCrypt.Checker().verify_password(password, hashedPass, salt, "sha256")
if verifiedPass:
    print("Password is correct!")
else:
    print("Password is incorrect!")

The verify_password method takes the password that the user entered, the stored hashed password, the salt used to create the password, and the algorithm used.

Dunigan answered 28/8, 2023 at 19:52 Comment(0)
