How to access an element of a set using an equivalent object?

Asked 23/12, 2011 at 12:48 Answered 25/3, 2024 at 12:33

If I have an object that compares equal to an element of a Python set, but is not the same object, is there a reasonable way to get a reference to the object in the set? The use case would be using the set to identify and share duplicated data.

Example (Python 2.7):

>>> a = "This is a string"
>>> b = "This is a string"
>>> a is b
False
>>> a == b
True
>>> s = set((a,))
>>> b in s
True

How to get a reference to a using b and s? I can think of one way, but I'm not sure if it is not implementation-dependent whether you get a or b. EDIT: This does not work when s has more than one element; intersection is quite naturally implemented something like [x for x in smaller_set if x in larger_set]

>>> for x in set((b,)).intersection(s): c = x
...
>>> c is a
True

Perhaps a good workaround would be to use a dict that maps each key to itself, instead of the set.

Loeffler answered 23/12, 2011 at 12:48 Comment(4)

If you need a specific one of two equal, hashable objects, it seems likely that the objects shouldn't be equal and/or hashable. Why do you need this? – Intake 23/12, 2011 at 12:54

I think your suspicions are justified: pypy 1.7.0 and ironpython 3.0 both (can) return False for your final c is a. – Barmen 23/12, 2011 at 12:59

I could save memory by changing references to equal object to references to the same object. – Loeffler 23/12, 2011 at 13:1

I'll be fine using a dict, like I mention in the last sentence. I'm asking if I'm missing something here. – Loeffler 23/12, 2011 at 13:5

I found a similar question on python-list: Get item from set. There is a clever answer with reference to get_equivalent(container, item) (Python recipe).

The trick is to construct a wrapper object for the 'key' object, and check if wrapper is in the set using the in operator. If the wrapper hashes equal to the key, its __eq__ method can gain access to the object in the set, and save a reference to it. An important point from the discussion is that the __eq__ method of the set elements must return NotImplemented for unrecognized types, otherwise the wrapper's __eq__ may not get called.

Loeffler answered 29/12, 2011 at 20:32 Comment(0)

Your use case sounds like it is a use case for dictionaries. Use, as keys, the attribute of the object that compares equal to the "foreign" object, and as values the desired objects themselves.

If it is a simple use case, and you can have a linear search, however, you could do the obvious - it would not be bad:

def get_equal(in_set, in_element):
   for element in in_set:
       if element == in_element:
           return element
   return None

If you need what exactly what you ar asking for (I can wonder some use cases for that) - the way to go is to create a custom dictionary class that has a set as one of its members, do implement proxy methods to the member set, and in both dictionary and set methods, keeps sync of both the dictionary and set contents. This would be time consuming to implement right, but relatively straightforward, and have a O(1) time.

If having to copy references to all data around is not an issue (this is linear, but probably worst than the straightforward search above), you can use the expression

(data - (data - {key})).pop()

as in:

In [40]: class A:
    ...:     def __init__(self, id, extra):
    ...:         self.id = id
    ...:         self.extra = extra
    ...:     def __eq__(self, other):
    ...:         return self.id == other.id
    ...:     def __hash__(self):
    ...:         return hash(self.id)
    ...:     def __repr__(self):
    ...:         return f"({self.id}, {self.extra})"
    ...: 
    ...: 

In [41]: data = set(A(i, "initial") for i in range(10))

In [42]: (data - (data - {A(5, None)})).pop()
Out[42]: (5, initial)

Garbe answered 23/12, 2011 at 13:39 Comment(0)

Here's a quick solution I made by exploiting the behavior of the 'eq' and 'contains' methods. Code comments are (hopefully) self documenting, but it's otherwise pretty simple.

As the example shows, it works for set, dict and list, and in theory could work for any object that implements 'contains'.

import typing as _ts
from typing import Any


class Getter:
    __slots__ = "key", "value"

    def __init__(self, key, value=None):
        self.key = key
        self.value = value

    def __repr__(self):
        return "{}({}, {})".format(
            type(self).__name__,
            repr(self.key), repr(self.value),
        )

    def __hash__(self):
        return hash(self.key)

    def __eq__(self, other):
        self.value = other
        return self.key == other


RAISES = object()


def getkey(keyed: _ts.Container, key: Any, default: Any = RAISES):
    getter = Getter(key)
    if getter in keyed:
        # providing '__contains__' is implemented to call
        #  the '__eq__' method (which in any sane case it
        #  should be), this results in our special
        #  'Getter.__eq__' method being called with the
        #  element we're trying to get as the 'other' argument
        return getter.value
    if default is RAISES:
        raise KeyError(key)
    return default


if __name__ == '__main__':
    # testing

    class T(int):
        def __repr__(self):
            return "T({})".format(int.__repr__(self))

    def main():
        # works for both builtin set and dict
        hm1 = {T(1), T(2), T(3)}
        hm2 = {T(1): 1, T(2): 2, T(3): 3}
        print(getkey(hm1, 2))
        print(getkey(hm2, 2))
        # should print "T(2)"

        # even works for list
        lst = [T(1), T(2), T(3)]
        print(getkey(lst, 3))
        # should print "T(3)"

        # in theory could work for any type that
        #  implements '__contains__' by calling '__eq__'

    main()

Otolith answered 25/3, 2024 at 12:33 Comment(0)

Recommended topics

Hot tags