One of the basic data structures in Python is the dictionary, which allows one to record "keys" for looking up "values" of any type. Is this implemented internally as a hash table? If not, what is it?
Yes, it is a hash mapping or hash table. You can read a description of Python's dict implementation, as written by Tim Peters, here.
That's why you can't use something 'not hashable' as a dict key, like a list:
>>> a = {}
>>> b = ['some', 'list']
>>> hash(b)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: list objects are unhashable
>>> a[b] = 'some'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: list objects are unhashable
You can read more about hash tables, or check how the dict has been implemented in Python and why it is implemented that way.
.keys() can retrieve a list of keys. A real hash table wouldn't store keys, just hashes to save space. – Mosby
There must be more to a Python dictionary than a table lookup on hash(). By brute experimentation I found this hash collision:
>>> hash(1.1)
2040142438
>>> hash(4504.1)
2040142438
Yet it doesn't break the dictionary:
>>> d = { 1.1: 'a', 4504.1: 'b' }
>>> d[1.1]
'a'
>>> d[4504.1]
'b'
Sanity check:
>>> for k,v in d.items(): print(hash(k))
2040142438
2040142438
Possibly there's another lookup level beyond hash() that avoids collisions between dictionary keys. Or maybe dict() uses a different hash.
(By the way, this is in Python 2.7.10. Same story in Python 3.4.3 and 3.5.0, with a collision at hash(1.1) == hash(214748749.8).)
(I haven't found any collisions in Python 3.9.6. Since the hashes are bigger -- hash(1.1) == 230584300921369601 -- I estimate it would take my desktop a thousand years to find one. So I'll get back to you on this.)
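A collision does not need to be found by brute force to test this. Here is a minimal sketch, assuming the dictionary falls back to comparing keys with == once two hashes match. The class name `Collider` and the constant 42 are made up for illustration; every instance is forced to the same hash value, yet the dictionary still tells the keys apart:

```python
class Collider:
    """Hypothetical key type whose instances all share one hash value."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 42  # deliberately force a collision for every instance

    def __eq__(self, other):
        return isinstance(other, Collider) and self.name == other.name


d = {Collider("a"): "first", Collider("b"): "second"}

# Both keys hash to 42, but the equality comparison separates them.
value_a = d[Collider("a")]
value_b = d[Collider("b")]
print(value_a, value_b, len(d))
```

So the hash only narrows down where to look; the final decision rests on key equality, which is why the dictionary has to store the keys themselves and not just their hashes.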
hash('I wandered lonely as a cloud, that drifts on high o\'er vales and hills, when all at once, I saw a crowd, a host of golden daffodils.')
This gives a 19-digit decimal, -4037225020714749784, if you're geeky enough to care. Continue in your own words, kids, and the hash is still a 19-digit number. I assume there is a limit on the length of string you can hash in Python, but it is safe to say there are many more possible strings than possible hash values. And hash(False) = 0, by the way. – Inactivate
Yes. Internally it is implemented as a hash table with open addressing; older versions of the implementation based the probe sequence on a primitive polynomial over Z/2 (source).
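To make the idea concrete, here is a toy table that resolves collisions by probing for another slot in the same array, which is the general open-addressing scheme. It is only a sketch: the class `ToyTable` is hypothetical, it uses plain linear probing for simplicity (CPython's real probe sequence is different), and it never resizes or deletes.

```python
class ToyTable:
    """Toy open-addressing table (linear probing); not CPython's actual code."""

    def __init__(self, size=8):
        # Each slot holds a (hash, key, value) tuple, or None when empty.
        self.slots = [None] * size

    def _find(self, key):
        h = hash(key)
        i = h % len(self.slots)
        for _ in range(len(self.slots)):
            entry = self.slots[i]
            # Stop at an empty slot or at the slot that holds this exact key.
            if entry is None or (entry[0] == h and entry[1] == key):
                return i
            i = (i + 1) % len(self.slots)  # collision: probe the next slot
        raise RuntimeError("table full (a real table would resize)")

    def __setitem__(self, key, value):
        self.slots[self._find(key)] = (hash(key), key, value)

    def __getitem__(self, key):
        entry = self.slots[self._find(key)]
        if entry is None:
            raise KeyError(key)
        return entry[2]


t = ToyTable()
t[1] = "a"
t[9] = "b"  # hash(9) % 8 == hash(1) % 8, so this lands in the next free slot
print(t[1], t[9])
```

Note that each slot keeps the key alongside its hash, since the lookup has to compare keys to distinguish two entries that started at the same index.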
To expand upon nosklo's explanation:
a = {}
b = ['some', 'list']
a[b] = 'some' # this won't work
a[tuple(b)] = 'some' # this will, same as a['some', 'list']
dict implementation. – Aubyn