Using descriptors in unhashable classes - python

A common design pattern when using python descriptors is to have the descriptor keep a dictionary of instances using that descriptor. For example, suppose I want to make an attribute that counts the number of times it's accessed:

class CountingAttribute(object):

    def __init__(self):
        self.count = 0
        self.value = None


class MyDescriptor(object):

    def __init__(self):
        self.instances = {} #instance -> CountingAttribute

    def __get__(self, inst, cls):
        if inst in self.instances:
            ca = self.instances[inst]
        else:
            ca = CountingAttribute()
            self.instances[inst] = ca
        ca.count += 1
        return ca


class Foo(object):
    x = MyDescriptor()


def main():
    f = Foo()
    f.x
    f.x
    print("f.x has been accessed %d times (including the one in this print)"%(f.x.count,))

if __name__ == "__main__":
    main()

This is a completely silly example that doesn't do anything useful; I'm trying to isolate the main point.

The problem is that I can't use this descriptor in a class which isn't hashable, because the line

self.instances[inst] = ca

uses instances as dictionary keys. Is there a sensible way to handle this sort of case? For example, one immediately thinks of keying on the instance's id, but I'm not sure whether doing that will break something about how hashes are supposed to be used.
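To make the failure concrete, here is a minimal sketch of it. The class Bar and the helper try_access are hypothetical names made up for this illustration, and the descriptor is a condensed version of the one above.

```python
class CountingAttribute(object):
    def __init__(self):
        self.count = 0
        self.value = None


class MyDescriptor(object):
    def __init__(self):
        self.instances = {}  # instance -> CountingAttribute

    def __get__(self, inst, cls):
        # The membership test hashes inst, so it fails for unhashable objects.
        if inst not in self.instances:
            self.instances[inst] = CountingAttribute()
        ca = self.instances[inst]
        ca.count += 1
        return ca


class Bar(object):
    """Hypothetical unhashable class: hashing is disabled explicitly."""
    __hash__ = None
    x = MyDescriptor()


def try_access():
    """Return the TypeError message raised by Bar().x, or None if it works."""
    try:
        Bar().x
    except TypeError as exc:
        return str(exc)
    return None


error_message = try_access()
print(error_message)
```

The exact wording of the message differs between Python versions, but it is a TypeError complaining that Bar is unhashable.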

EDIT: I realize that instances should be something like a weakref.WeakKeyDictionary but I'm trying to keep it simple here to focus on the issue of hashability.

Doglike answered 12/4, 2014 at 23:44 Comment(2)
My first thought was to try using a WeakKeyDictionary for instances. Unfortunately, this doesn't seem to fix the underlying problem of unhashability, but it's a good idea nonetheless. (Otherwise, the descriptor would keep an otherwise-dead object alive.) - Viburnum
Yes, in my actual implementation I do use a WeakKeyDictionary. I didn't do that here because I am trying to keep the focus on the question at hand. - Doglike

You could use id(inst) as a key.

Be aware that this doesn't cover the case where an object is destroyed and a new one is created with the same id. Ids are only guaranteed to be unique among objects that are alive at the same time.

To detect this properly, you should store both the ca and a weakref to the instance in the dictionary. If you detect that the weakref's referent is gone, you have to assume that the given id has been reused.

Something like

import weakref

class MyDescriptor(object):

    def __init__(self):
        self.instances = {}  # id(instance) -> (CountingAttribute, weakref to instance)

    def __get__(self, inst, cls):
        if inst is None: return self.instances # operating on the class, we get the dictionary.
        i = id(inst)
        if i in self.instances:
            ca, wr = self.instances[i]
            if wr() is None: del self.instances[i]
        if i not in self.instances:
            ca = CountingAttribute()
            self.instances[i] = (ca, weakref.ref(inst))
        ca.count += 1
        return ca

This avoids the hashability problems connected with a WeakKeyDictionary.
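As a possible variation (a sketch, not a tested recipe), the weakref can be given a callback that removes the entry as soon as the instance dies, so stale entries don't pile up in the dictionary. The callback name _forget and the test class Foo with `__hash__ = None` are assumptions made up for this example. Note that the instances must support weak references, which ordinary classes without `__slots__` do.

```python
import weakref


class CountingAttribute(object):
    def __init__(self):
        self.count = 0
        self.value = None


class MyDescriptor(object):
    def __init__(self):
        self.instances = {}  # id(instance) -> (CountingAttribute, weakref to instance)

    def __get__(self, inst, cls):
        if inst is None:
            return self.instances  # accessed on the class: expose the dictionary
        key = id(inst)
        entry = self.instances.get(key)
        # A missing entry, or one whose weakref no longer points at inst,
        # means we have not seen this object yet (the id may have been reused).
        if entry is None or entry[1]() is not inst:
            def _forget(ref, key=key, instances=self.instances):
                # Runs when the referent is collected; only drop our own entry.
                current = instances.get(key)
                if current is not None and current[1] is ref:
                    del instances[key]
            entry = (CountingAttribute(), weakref.ref(inst, _forget))
            self.instances[key] = entry
        ca = entry[0]
        ca.count += 1
        return ca


class Foo(object):
    __hash__ = None  # hypothetical unhashable class
    x = MyDescriptor()


f = Foo()
f.x
f.x
count = f.x.count             # third access, so 3
tracked_before = len(Foo.x)   # one live instance is tracked
del f
tracked_after = len(Foo.x)    # 0 in CPython, where refcounting frees f immediately
print(count, tracked_before, tracked_after)
```

On implementations without reference counting, the entry disappears only once the garbage collector actually reclaims the instance.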

But maybe you don't need the dict at all. A completely different approach could be

class MyDescriptor(object):

    def __get__(self, inst, cls):
        if inst is None: return self, cls
        try:
            ca = inst.__the_ca
        except AttributeError:
            ca = inst.__the_ca = CountingAttribute()
        ca.count += 1
        return ca

This approach has its downsides as well. For example, you cannot easily use the descriptor more than once in a class, because every use would share the same __the_ca attribute, and working around that makes it ugly. Thus, it should only be used with care. The first solution, while more complex, has fewer pitfalls.
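On Python 3.6 and later (which postdates this answer), the `__set_name__` hook offers one way to give each use of the descriptor its own storage attribute. The following is only a sketch under that assumption: the `_counting_` prefix and the attribute storage_name are made-up names, and the instances need a `__dict__` (or a matching slot) for setattr to work.

```python
class CountingAttribute(object):
    def __init__(self):
        self.count = 0
        self.value = None


class MyDescriptor(object):

    def __set_name__(self, owner, name):
        # Called automatically when the owning class is created (Python 3.6+).
        # "_counting_" is a hypothetical prefix chosen for this sketch.
        self.storage_name = "_counting_" + name

    def __get__(self, inst, cls):
        if inst is None:
            return self
        try:
            ca = getattr(inst, self.storage_name)
        except AttributeError:
            ca = CountingAttribute()
            setattr(inst, self.storage_name, ca)
        ca.count += 1
        return ca


class Foo(object):
    __hash__ = None  # unhashable, which does not matter for this approach
    x = MyDescriptor()
    y = MyDescriptor()


f = Foo()
f.x
f.x
f.y
x_count = f.x.count  # third access of x, so 3
y_count = f.y.count  # second access of y, so 2
print(x_count, y_count)
```

The data still lives on the instance, so the objection about descriptors pasting things onto the objects they manage applies here as well.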

Haematocele answered 13/4, 2014 at 19:57 Comment(6)
To what is the weak reference referring? :) - Doglike
In the second version, you'd probably be wise to use the __varname private variable syntax, so that you won't accidentally break things if the class you're put in already has a the_ca attribute that means something else. So: instance.__the_ca. This is exactly the kind of thing the __varname syntax is for! - Equivalence
To be honest, I really don't like the second version. Having descriptors paste things onto the objects they manage seems... backwards. At the same time, I agree that with name mangling it's ok. - Doglike
@Doglike Both versions have their downsides. I don't like it much either, but it is simpler than the other. - Haematocele
@glglgl: A major problem with the second method is that you can't use the descriptor more than once in a class, because .__the_ca gets used by each one. This could be fixed by adding a label to each use of the descriptor, but then it gets ugly. This is an illustration of why I don't like descriptors pasting data onto the instances they manage, and a good reason to use the first method from your post. - Doglike
@Doglike Here I agree. I'll edit that into the post. - Haematocele
