John Millikin proposed a solution similar to this:
    class A(object):
        def __init__(self, a, b, c):
            self._a = a
            self._b = b
            self._c = c

        def __eq__(self, othr):
            return (isinstance(othr, type(self))
                    and (self._a, self._b, self._c) ==
                        (othr._a, othr._b, othr._c))

        def __hash__(self):
            return hash((self._a, self._b, self._c))
The problem with this solution is that hash(A(a, b, c)) == hash((a, b, c)). In other words, the hash collides with that of the tuple of its key members. Maybe this does not matter very often in practice?
Update: the Python docs now recommend using a tuple, as in the example above. The documentation states:
The only required property is that objects which compare equal have the same hash value
The converse does not hold: objects which do not compare equal may still have the same hash value. Such a hash collision will not cause one object to replace another when used as a dict key or set element, as long as the objects do not also compare equal.
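A small sketch of that point, repeating class A so the snippet runs by itself: the instance and the plain tuple share a hash, yet both survive as separate dict keys because they do not compare equal. The names obj, tup and d are only for illustration.

```python
class A(object):
    def __init__(self, a, b, c):
        self._a = a
        self._b = b
        self._c = c

    def __eq__(self, othr):
        return (isinstance(othr, type(self))
                and (self._a, self._b, self._c) ==
                    (othr._a, othr._b, othr._c))

    def __hash__(self):
        return hash((self._a, self._b, self._c))


obj = A(1, 2, 3)
tup = (1, 2, 3)

same_hash = hash(obj) == hash(tup)  # the hashes collide by construction
are_equal = obj == tup              # but the objects are not equal

# Both keys are kept: a collision alone never replaces an entry.
d = {obj: "instance", tup: "tuple"}
print(same_hash, are_equal, len(d))  # True False 2
```

The dict only has to do one extra equality check when it finds the matching hash, which is the whole cost of the collision here.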
Outdated/bad solution
The Python documentation on __hash__ used to suggest combining the hashes of the sub-components using something like XOR, which gives us this:
    class B(object):
        def __init__(self, a, b, c):
            self._a = a
            self._b = b
            self._c = c

        def __eq__(self, othr):
            if isinstance(othr, type(self)):
                return ((self._a, self._b, self._c) ==
                        (othr._a, othr._b, othr._c))
            return NotImplemented

        def __hash__(self):
            return (hash(self._a) ^ hash(self._b) ^ hash(self._c) ^
                    hash((self._a, self._b, self._c)))
Update: as Blckknght points out, changing the order of a, b, and c could cause problems. I added an additional ^ hash((self._a, self._b, self._c)) to capture the order of the values being hashed. This final ^ hash(...) can be removed if the values being combined cannot be rearranged (for example, if they have different types and therefore the value of _a will never be assigned to _b or _c, etc.).
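To make the ordering problem concrete, here is a minimal sketch using two hypothetical helper functions (xor_hash and xor_hash_with_tuple are not part of the classes above; they just isolate the hashing expressions). It assumes plain integers as values.

```python
def xor_hash(a, b, c):
    # Order-insensitive: XOR is commutative and associative.
    return hash(a) ^ hash(b) ^ hash(c)


def xor_hash_with_tuple(a, b, c):
    # Same expression as B.__hash__: the tuple term depends on order.
    return xor_hash(a, b, c) ^ hash((a, b, c))


# Reordering the values never changes the plain XOR result.
plain_collides = xor_hash(1, 2, 3) == xor_hash(3, 2, 1)

# Equal values cancel each other out, leaving only the third hash.
cancels = xor_hash(5, 5, 7) == hash(7)

# With the tuple term mixed in, reordered values collide only if the
# two tuple hashes happen to match.
ordered_collides = (xor_hash_with_tuple(1, 2, 3) ==
                    xor_hash_with_tuple(3, 2, 1))

print(plain_collides, cancels, ordered_collides)
```

The first two results are always True for plain XOR, which is why the extra tuple term was added; at that point hashing the tuple alone already does the job.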
__key function, this is about as fast as any hash can be. Sure, if the attributes are known to be integers, and there aren't too many of them, I suppose you could potentially run slightly faster with some home-rolled hash, but it likely wouldn't be as well distributed. hash((self.attr_a, self.attr_b, self.attr_c)) is going to be surprisingly fast (and correct), as creation of small tuples is specially optimized, and it pushes the work of getting and combining hashes to C builtins, which is typically faster than Python level code. – Fulmis