Securely Erasing Password in Memory (Python)
I

7

53

How do you store a password entered by the user in memory and erase it securely after it is no longer needed?

To elaborate, currently we have the following code:

import getpass
import imaplib

username = raw_input('User name: ')
password = getpass.getpass()
mail = imaplib.IMAP4(MAIL_HOST)
mail.login(username, password)

After calling the login method, what do we need to do to fill the area of memory that contains password with garbled characters so that someone cannot recover the password by doing a core dump?

There is a similar question, however it is in Java and the solution uses character arrays: How does one store password hashes securely in memory, when creating accounts?

Can this be done in Python?

Iyar answered 8/4, 2009 at 1:13 Comment(3)
Near the bottom of this IBM article, they talk about using a mutable data structure instead of an immutable string.Zither
The link to the IBM article in the comment above doesn't work anymore; use an archived page.Elegist
I was trying to achieve something similar and came across this : sjoerdlangkemper.nl/2016/06/09/clearing-memory-in-pythonBabs
P
56

Python doesn't have that low of a level of control over memory. Accept it, and move on. The best you can do is to del password after calling mail.login so that no references to the password string object remain. Any solution that purports to be able to do more than that is only giving you a false sense of security.

Python string objects are immutable; there's no direct way to change the contents of a string after it is created. Even if you were able to somehow overwrite the contents of the string referred to by password (which is technically possible with stupid ctypes tricks), there would still be other copies of the password that have been created in various string operations:

  • by the getpass module when it strips the trailing newline off of the inputted password
  • by the imaplib module when it quotes the password and then creates the complete IMAP command before passing it off to the socket

You would somehow have to get references to all of those strings and overwrite their memory as well.
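The points above can be seen in a minimal Python 3 sketch. The value "hunter2" and the LOGIN command format are made up for illustration and only stand in for what getpass and imaplib do internally:

```python
# Build the string at run time so it is not a compile-time constant.
password = "".join(["hunter2", "\n"])

# Stripping the newline (as getpass does) yields a second, separate object.
stripped = password.rstrip("\n")
assert stripped is not password

# Formatting it into a command (as a protocol library might) yields a third.
command = 'LOGIN user "%s"' % stripped
assert command is not stripped

# Strings are immutable: item assignment raises TypeError.
try:
    stripped[0] = "x"
except TypeError:
    can_mutate = False
else:
    can_mutate = True

# del removes only this one name; the other objects still hold the secret,
# and the freed memory is not overwritten.
del password
```

After the del, both stripped and command still contain the password, which is why dropping one reference is the most you can reliably do from pure Python.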

Poser answered 8/4, 2009 at 2:1 Comment(3)
Not to mention the possibility that the OS will swap your whole memory page out to disk, where it could sit for months.Legg
The swap issue is not Python-specific, of course, but here is a discussion about that part: security.stackexchange.com/questions/29350/…Corvin
If they can read the page file, your problems are way bigger than a password in the heap.Thermionic
C
21

There actually -is- a way to securely erase strings in Python; use the memset C function, as per Mark data as sensitive in python

Edited to add, long after the post was made: here's a deeper dive into string interning. There are some circumstances (primarily involving non-constant strings) where interning does not happen, making cleanup of the string value slightly more explicit, based on CPython reference counting GC. (Though still not a "scrubbing" / "sanitizing" cleanup.)
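For illustration only, here is a rough Python 3 sketch of the memset idea applied to a bytes object. It is not the code from the linked post. It assumes CPython, where id() returns the object's address and the payload (plus one trailing NUL) sits at the end of the object; the helper name scrub_bytes is hypothetical. It erases only this one object, not any copies made elsewhere:

```python
import ctypes
import sys


def scrub_bytes(obj):
    """Zero the payload of a bytes object in place (CPython-specific hack)."""
    # Empty and one-byte bytes objects can be shared singletons in CPython,
    # and subclasses may have a different layout, so refuse those.
    if type(obj) is not bytes or len(obj) < 2:
        raise ValueError("expects a plain bytes object of length >= 2")
    # Assumption: payload followed by one NUL byte ends the object, so the
    # payload starts at getsizeof(obj) - len(obj) - 1 bytes from its address.
    offset = sys.getsizeof(obj) - len(obj) - 1
    ctypes.memset(id(obj) + offset, 0, len(obj))


# Create a fresh object at run time rather than scrubbing a code constant.
secret = bytes(bytearray(b"hunter2"))
scrub_bytes(secret)
```

Running this on a value that happens to be shared (an interned string, a constant in a code object) would corrupt that value for the whole interpreter, so treat it as a demonstration that it can be done, not as a recommendation.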

Cannon answered 18/11, 2009 at 4:32 Comment(5)
Note that this is OS-dependent. Windows and Linux code is given in the linked post.Reinhart
It's also highly dependent on internal interpreter details such as: id having the same value as the object pointer, the offset of string data from the object pointer, etc. Incredibly brittle; do not recommend.Pyroxenite
@ConradMeyer Of course it is. While this may be abstractly considered "brittle", and certainly no-one is recommending it, it does answer the question of "is this possible" better than the currently accepted answer beginning with "Python doesn't have that low of a level of control over memory. Accept it, and move on." which is absolutely false and unhelpful, as immediately demonstrated by the existence of ctypes. This solution is actually even worse than you might be suggesting; you would be modifying hashed data values application-wide and destroying the ability to represent certain strings.Cannon
I find the argument this answers "is it possible" better than the accepted answer pretty silly. As you mention, it totally breaks the interpreter; and additionally, it doesn't work with any other regular Python string functionality or libraries that make copies or temporary values. And it relies on something with even weaker type safety / warnings / errors than regular C. So you're better off just using C in the first place. I wouldn't characterize that as "possible in Python." I'm also not happy that the first answer is the correct one, but unfortunately, it is.Pyroxenite
@ConradMeyer "Just use C in the first place." 🤔 No. Honestly, memory scrubbing has never actually come up in my career developing Python web applications and devops systems, despite the fact that Heartbleed was actively wielded against my infrastructure. (Edited to add: all things are possible. Not all things are reasonable. "Can this be done?" is a yup. Next up: "Should this be done?" Probably not unless you have very explicit needs. That's a different question.)Cannon
W
6

The correct solution is to use a bytearray() ... which is mutable, and you can safely clear keys and sensitive material from RAM.

However, some libraries, notably the Python "cryptography" library, prevent "bytearray" from being used. This is problematic... to some extent these cryptographic libraries should ensure that only mutable types are used for key material.

There is SecureString, a pip-installable module that allows you to fully remove a key from memory... (I refactored it a bit and called it SecureBytes). I wrote some unit tests that demonstrate that the key is fully removed.

But there is a big caveat: if someone's password is "type", then the word "type" will get wiped from all of Python... including in function definitions and object attributes.

In other words... mutating immutable types is a terrible idea, and unless you're extremely careful, can immediately crash any running program.

The right solution is: never use immutable types for key material, passwords, etc. Anyone building a cryptographic library or a routine like "getpass" should be working with a "bytearray" instead of Python strings.
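A minimal Python 3 sketch of what clearing a bytearray looks like. The helper name wipe and the sample value are made up for illustration. It also shows the limit of the approach: any immutable copy made along the way is untouched.

```python
def wipe(buf):
    """Overwrite every byte of a mutable buffer in place with zeros."""
    for i in range(len(buf)):
        buf[i] = 0


secret = bytearray(b"hunter2")

# A memoryview shares the bytearray's memory, so it sees the wipe too.
view = memoryview(secret)

# Converting to bytes (e.g. to hand the value to a library) makes an
# independent, immutable copy that wipe() cannot reach.
leaked_copy = bytes(secret)

wipe(secret)
```

After wipe(secret), the bytearray and the memoryview hold only zeros, while leaked_copy still holds the original value. Resizing a bytearray may also reallocate its buffer and leave the old contents behind, so this works best with a buffer that is filled once and never grown.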

Woman answered 20/8, 2018 at 18:52 Comment(1)
As a follow-up to this, I ported SecureString to work with integers and bytes (called SecureBytes). Both are horribly unsafe unless you are careful to work with cryptographic key material... and not immutable things that could propagate to the rest of Python. Tested on win/mac/linux.Woman
D
4

If you don't need the mail object to persist once you are done with it, I think your best bet is to perform the mailing work in a subprocess (see the subprocess module.) That way, when the subprocess dies, so goes your password.
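A rough Python 3 sketch of that layout, assuming the child reports back only a non-sensitive result. The CHILD program and the run_isolated helper are hypothetical. For the sake of a runnable demo the secret is fed in on stdin; in real use the child would inherit the terminal and call getpass.getpass() itself, so the parent never holds the password at all.

```python
import subprocess
import sys

# Hypothetical child program: it is the only process that handles the secret.
CHILD = r"""
import sys
secret = sys.stdin.readline().rstrip("\n")  # real code: getpass.getpass()
# ... log in and do the mail work here ...
print(len(secret))  # report only a non-sensitive result to the parent
"""


def run_isolated(stdin_text):
    """Run the child in a separate interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

When the child exits, its address space is released by the operating system. That shortens the lifetime of the secret, but the released memory is not guaranteed to be overwritten.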

Deborahdeborath answered 8/4, 2009 at 2:26 Comment(1)
Not unless actively scrubbed within that subprocess, or extremely luckily reallocated by the system to another process and overwritten rapidly enough, …and even then, in some circumstances through nearby memory cell inference — the value would persist and be reachable through things like spectre, heartbleed, and so forth.Cannon
G
0

This could be done using numpy chararray:

import imaplib
import numpy as np

username = raw_input('User name: ')
mail = imaplib.IMAP4(MAIL_HOST)
x = np.chararray((20,))
x[:] = list("{:<20}".format(raw_input('Password: ')))
mail.login(username, x.tobytes().strip())
x[:] = ''

You would have to determine the maximum size of password, but this should remove the data when it is overwritten.

Gaggle answered 23/7, 2015 at 14:47 Comment(1)
Unfortunately, you've already lost when raw_input() returns. And again when tobytes() is invoked. You've maybe erased one copy, but not either of those other copies.Pyroxenite
C
-4

EDIT: removed the bad advice...

You can also use arrays like the Java example if you like, but just overwriting it should be enough.

http://docs.python.org/library/array.html

Conger answered 8/4, 2009 at 1:17 Comment(1)
All password = "somethingelse" does is remove the reference to the old password one line earlier. It doesn't actually overwrite anything.Poser
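A short Python 3 sketch of the point made in this comment. The alias variable is hypothetical and stands in for any other reference or copy held elsewhere in the program:

```python
# Build the string at run time so it is a distinct object.
password = "".join(["hunter", "2"])

# Another reference to the same object, as a library or caller might hold.
alias = password

# Rebinding the name creates a new string and points the name at it.
# The original object is not modified in any way.
password = "somethingelse"
```

After the rebinding, alias still refers to the untouched original string. Even with no alias, the old object would only be freed, not overwritten.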
F
-5

Store the password in a list, and if you just set the list to None, the memory of the array stored in the list is automatically freed.

Fattish answered 8/4, 2009 at 1:20 Comment(3)
The level of indirection of storing the string in a list offers zero protection.Poser
Also, there is no specification to clear the memory after being freed. The memory will remain intact and will be vulnerable to being imaged or swapped to disk over time.Bavardage
There is a nice article on why this doesn't work properly: effbot.org/pyfaq/…Spectatress
