Creating a namedtuple with a custom hash function

Say I have a namedtuple like this:

FooTuple = namedtuple("FooTuple", "item1, item2")

And I want the following function to be used for hashing:

def foo_hash(self):
    return hash(self.item1) * hash(self.item2)

I want this because I want the order of item1 and item2 to be irrelevant (I will do the same for the comparison-operator). I thought of two ways to do this. The first would be:

FooTuple.__hash__ = foo_hash

This works, but it feels hacked. So I tried subclassing FooTuple:

class EnhancedFooTuple(FooTuple):
    def __init__(self, item1, item2):
        FooTuple.__init__(self, item1, item2)

    # custom hash function here

But then I get this:

DeprecationWarning: object.__init__() takes no parameters

So, what can I do? Or is this a bad idea altogether and I should just write my own class from scratch?

Thedrick answered 11/7, 2010 at 13:37 Comment(0)

I think there is something wrong with your code (my guess is that you created an instance of the tuple under the same name, so FooTuple is now a tuple instance, not a tuple class), because subclassing the named tuple like that should work. Either way, you don't need to redefine the constructor; you can just add the hash function:

In [1]: from collections import namedtuple

In [2]: Foo = namedtuple('Foo', ['item1', 'item2'])

In [3]: class ExtendedFoo(Foo):
   ...:     def __hash__(self):
   ...:         return hash(self.item1) * hash(self.item2)
   ...: 

In [4]: foo = ExtendedFoo(1, 2)

In [5]: hash(foo)
Out[5]: 2
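
Some background on the warning in the question: tuples are immutable, so their contents are fixed in __new__, and a namedtuple class does not define its own __init__. FooTuple.__init__ therefore resolves to object.__init__, which accepts no extra arguments. If you ever do need constructor logic, a minimal sketch (reusing the names from the question, and assuming Python 3) is to override __new__ instead:

```python
from collections import namedtuple

FooTuple = namedtuple("FooTuple", "item1, item2")


class EnhancedFooTuple(FooTuple):
    __slots__ = ()  # avoid a per-instance __dict__, like the base namedtuple

    def __new__(cls, item1, item2):
        # Tuple contents must be supplied at creation time, so any
        # constructor logic belongs here rather than in __init__.
        return super().__new__(cls, item1, item2)

    def __hash__(self):
        return hash(self.item1) * hash(self.item2)


foo = EnhancedFooTuple(1, 2)
swapped = EnhancedFooTuple(2, 1)
print(hash(foo) == hash(swapped))  # True: the product ignores order
```

Note that only the hash is order-independent here; foo == swapped is still False until the comparison is overridden as well.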
Addison answered 11/7, 2010 at 13:45 Comment(2)
Note that repr(foo) will still be talking of Foo. This could be done better as class Foo(namedtuple('Foo', ['item1', 'item2'])):Sheriesherif
pay attention to @Sven's answer hereVehement

Starting in Python 3.6.1, this can be achieved more cleanly with the typing.NamedTuple class (as long as you are OK with type hints):

from typing import NamedTuple, Any


class FooTuple(NamedTuple):
    item1: Any
    item2: Any

    def __hash__(self):
        return hash(self.item1) * hash(self.item2)
Stranger answered 8/10, 2019 at 21:10 Comment(3)
Starting in Python 3.7 I'm not sure what NamedTuple brings to the party compared to using a dataclassNano
@MartinCR - for one thing, tuples are guaranteed to be immutable. Yes, you can make a dataclass frozen, but it is still possible to modify it by peeling away one layer of abstraction. Regardless, they may work for most cases, but this question was specifically about namedtuples.Stranger
@MartinCR Named tuples are iterable and therefore can be unpacked. Dataclasses can’t.Weatherproof
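
The two points made in these comments can be illustrated with a short sketch. FooData is a hypothetical frozen-dataclass counterpart introduced only for comparison; it is not part of the answer above.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Any, NamedTuple


class FooTuple(NamedTuple):
    item1: Any
    item2: Any


@dataclass(frozen=True)
class FooData:  # hypothetical dataclass counterpart, for comparison only
    item1: Any
    item2: Any


first, second = FooTuple(1, 2)  # named tuples unpack like any tuple

try:
    first, second = FooData(1, 2)
except TypeError:
    print("a dataclass instance is not iterable")

data = FooData(1, 2)
try:
    data.item1 = 10
except FrozenInstanceError:
    print("normal assignment is blocked on a frozen dataclass")

# Going one layer down bypasses the frozen check.
object.__setattr__(data, "item1", 10)
print(data.item1)  # 10
```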

A namedtuple with a custom __hash__ function is useful for storing immutable data models in a dict or set.

For example:

from collections import namedtuple

class Point(namedtuple('Point', ['label', 'lat', 'lng'])):
    def __eq__(self, other):
        return self.label == other.label

    def __hash__(self):
        return hash(self.label)

    def __str__(self):
        return ", ".join([str(self.lat), str(self.lng)])

Overriding both __eq__ and __hash__ lets you group businesses into a set so that each business line (the label) appears only once in the collection:

walgreens = Point(label='Drugstore', lat=37.78735890, lng=-122.40822700)
mcdonalds = Point(label='Restaurant', lat=37.78735890, lng=-122.40822700)
pizza_hut = Point(label='Restaurant', lat=37.78735881, lng=-122.40822713)

businesses = [walgreens, mcdonalds, pizza_hut]
businesses_by_line = set(businesses)

assert len(businesses) == 3
assert len(businesses_by_line) == 2
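
The same pattern can be applied to the pair from the original question, where the order of item1 and item2 should not matter. The following is only a sketch: UnorderedPair is a hypothetical name, and it assumes both items are hashable. One detail to watch is that tuple defines its own __ne__, so overriding __eq__ alone would leave != comparing field by field.

```python
from collections import namedtuple


class UnorderedPair(namedtuple("UnorderedPair", ["item1", "item2"])):
    """Hypothetical pair whose equality and hash ignore field order."""

    __slots__ = ()

    def __eq__(self, other):
        if not isinstance(other, UnorderedPair):
            return NotImplemented
        mine = (self.item1, self.item2)
        return mine in ((other.item1, other.item2), (other.item2, other.item1))

    def __ne__(self, other):
        # tuple defines its own __ne__, so it has to be overridden too.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # frozenset hashing does not depend on element order.
        return hash(frozenset((self.item1, self.item2)))


a = UnorderedPair(1, 2)
b = UnorderedPair(2, 1)
print(a == b, a != b, len({a, b}))  # True False 1
```

Hashing a frozenset instead of multiplying the two hashes also avoids every pair containing a zero-hash item collapsing to the same hash value.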
Unpretentious answered 12/4, 2020 at 14:50 Comment(0)