How to properly subclass dict and override __getitem__ & __setitem__

Asked 6/3, 2010 at 0:24 Answered 2/11, 2020 at 13:32

Solved python dictionary inheritance subclass

104

I am debugging some code and I want to find out when a particular dictionary is accessed. Well, it's actually a class that subclasses dict and implements a couple extra features. Anyway, what I would like to do is subclass dict myself and override __getitem__ and __setitem__ to produce some debugging output. Right now, I have

class DictWatch(dict):
    def __init__(self, *args):
        dict.__init__(self, args)

    def __getitem__(self, key):
        val = dict.__getitem__(self, key)
        log.info("GET %s['%s'] = %s" % str(dict.get(self, 'name_label')), str(key), str(val)))
        return val

    def __setitem__(self, key, val):
        log.info("SET %s['%s'] = %s" % str(dict.get(self, 'name_label')), str(key), str(val)))
        dict.__setitem__(self, key, val)

'name_label' is a key which will eventually be set that I want to use to identify the output. I have then changed the class I am instrumenting to subclass DictWatch instead of dict and changed the call to the superconstructor. Still, nothing seems to be happening. I thought I was being clever, but I wonder if I should be going a different direction.

Enmity answered 6/3, 2010 at 0:24 Comment(4)

Did you try using print instead of log? Also, could you explain how you create/configure your log? – Technical 6/3, 2010 at 0:39

Doesn't dict.__init__ take *args? – Contour 28/10, 2017 at 23:4

Looks a bit like a good candidate for a decorator. – Contour 28/10, 2017 at 23:6

realpython.com/inherit-python-dict – Kwarteng 18/10, 2023 at 2:5

What you're doing should absolutely work. I tested out your class, and aside from a missing opening parenthesis in your log statements, it works just fine. There are only two things I can think of. First, is the output of your log statement set correctly? You might need to put a logging.basicConfig(level=logging.DEBUG) at the top of your script.

Second, __getitem__ and __setitem__ are only called during [] accesses. So make sure you only access DictWatch via d[key], rather than d.get() and d.set()
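To illustrate both points, here is a minimal sketch of the question's class with the parenthesis problem fixed and the logger level set explicitly. The logger name "dictwatch" is an assumption made for this example, and the lazy %-style arguments replace the original string formatting.

```python
import logging

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger("dictwatch")  # hypothetical logger name
log.setLevel(logging.DEBUG)           # make sure INFO records are not filtered out


class DictWatch(dict):
    def __getitem__(self, key):
        val = dict.__getitem__(self, key)
        # Arguments are passed separately and formatted only if the record is emitted.
        log.info("GET %s[%r] = %r", dict.get(self, "name_label"), key, val)
        return val

    def __setitem__(self, key, val):
        log.info("SET %s[%r] = %r", dict.get(self, "name_label"), key, val)
        dict.__setitem__(self, key, val)


d = DictWatch()
d["name_label"] = "demo"  # logged through __setitem__
d["x"] = 1                # logged through __setitem__
d["x"]                    # logged through __getitem__
d.get("x")                # not logged: dict.get does not call __getitem__
```

If nothing shows up with a setup like this, the logging configuration elsewhere in the program is the first thing to check.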

Batrachian answered 6/3, 2010 at 0:42 Comment(4)

Actually it's not extra parens, but a missing opening paren around (str(dict.get(self, 'name_label')), str(key), str(val))) – Cistercian 6/3, 2010 at 0:44

True. To the OP: For future reference, you can simply do log.info('%s %s %s', a, b, c), instead of a Python string formatting operator. – Batrachian 6/3, 2010 at 0:50

Logging level ended up being the issue. I'm debugging someone else's code and I was originally testing in another file which had a different level of debugging set. Thanks! – Enmity 6/3, 2010 at 3:1

What is dict.set? It doesn't exist. dict doesn't have a set attribute. – Kwarteng 18/10, 2023 at 2:3

Another issue when subclassing dict is that the built-in __init__ doesn't call update, and the built-in update doesn't call __setitem__. So, if you want every set operation to go through your __setitem__, you have to route those calls through it yourself:

class DictWatch(dict):
    def __init__(self, *args, **kwargs):
        self.update(*args, **kwargs)

    def __getitem__(self, key):
        val = dict.__getitem__(self, key)
        print('GET', key)
        return val

    def __setitem__(self, key, val):
        print('SET', key, val)
        dict.__setitem__(self, key, val)

    def __repr__(self):
        dictrepr = dict.__repr__(self)
        return '%s(%s)' % (type(self).__name__, dictrepr)
        
    def update(self, *args, **kwargs):
        print('update', args, kwargs)
        for k, v in dict(*args, **kwargs).items():
            self[k] = v
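The same reasoning applies to other built-in mutators. As a sketch under that assumption, setdefault can be routed through __setitem__ in the same way; this override is an addition for illustration, not part of the class above.

```python
class DictWatch(dict):
    def __init__(self, *args, **kwargs):
        super().__init__()
        self.update(*args, **kwargs)

    def __setitem__(self, key, val):
        print('SET', key, val)
        dict.__setitem__(self, key, val)

    def update(self, *args, **kwargs):
        # Route every pair through the overridden __setitem__.
        for k, v in dict(*args, **kwargs).items():
            self[k] = v

    def setdefault(self, key, default=None):
        # dict.setdefault would insert the key without calling __setitem__.
        if key not in self:
            self[key] = default
        return dict.__getitem__(self, key)


d = DictWatch(a=1)     # prints: SET a 1
d.setdefault('b', 2)   # prints: SET b 2
d.setdefault('a', 99)  # prints nothing, 'a' already exists
```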

Studio answered 6/3, 2010 at 1:27 Comment(8)

I have tried your solution, but it seems to work for only one level of indexing (i.e., dict[key] and not dict[key1][key2] ...) – Huerta 4/4, 2019 at 16:42

d[key1] returns something, perhaps a dictionary. The second key indexes that. This technique can’t work unless that returned thing supports the watch behavior also. – Studio 4/4, 2019 at 16:48

@AndrewNaguib: Why should it work with nested arrays? Nested arrays do not work with a normal Python dict either (unless you implement it yourself) – Chrissa 1/5, 2019 at 11:32

Yes I did not know so :), for nested indexing level DictWatch(val) should be returned instead. – Huerta 1/5, 2019 at 11:34

@AndrewNaguib: __getitem__ would need to test val and only do that conditionally — i.e. if isinstance(val, dict): ... – Cheops 18/9, 2019 at 18:46

Having to override 5 methods for a simple case feels overcomplicated. This is why collections.UserDict exists. UserDict only requires overriding __setitem__ to be compatible with __init__, setdefault, update, ... – Morphosis 2/11, 2020 at 17:1

Subclassing MutableMapping or UserDict is preferred over subclassing dict in most cases. However UserDict does not subclass dict so if you need the real builtin python dict as your parent class, this does not help you. @Morphosis – Studio 18/11, 2020 at 18:47

Does the update method take any more argument than a positional argument for the other dictionary that is used to update the first dictionary? – Mauro 19/7, 2022 at 9:10
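The nested-indexing discussion in the comments above could be sketched as follows. This is a variation on the suggestion made there: instead of wrapping inside __getitem__ (which would hand back a copy), plain dict values are converted to DictWatch when they are stored, so that d[key1][key2] = v reaches the stored object. It is an illustrative sketch, not the answer author's code.

```python
class DictWatch(dict):
    def __init__(self, *args, **kwargs):
        super().__init__()
        self.update(*args, **kwargs)

    def __getitem__(self, key):
        val = dict.__getitem__(self, key)
        print('GET', key)
        return val

    def __setitem__(self, key, val):
        # Convert plain dicts on the way in so nested levels are watched too.
        if isinstance(val, dict) and not isinstance(val, DictWatch):
            val = DictWatch(val)
        print('SET', key)
        dict.__setitem__(self, key, val)

    def update(self, *args, **kwargs):
        for k, v in dict(*args, **kwargs).items():
            self[k] = v


d = DictWatch({'outer': {'inner': 1}})
d['outer']['inner'] = 2  # GET on the outer level, then SET on the nested DictWatch
```

Values that are not dict instances (lists, custom mappings) are left untouched by this sketch.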

Consider subclassing UserDict or UserList. These classes are intended to be subclassed, whereas the built-in dict and list are not: they contain optimisations that can bypass overridden methods.

Bish answered 26/3, 2018 at 19:21 Comment(4)

For reference, the documentation in Python 3.6 says "The need for this class has been partially supplanted by the ability to subclass directly from dict; however, this class can be easier to work with because the underlying dictionary is accessible as an attribute". – Sludge 16/9, 2018 at 17:33

@andrew an example might be helpful. – Skiba 26/9, 2019 at 9:40

@VasanthaGaneshK treyhunner.com/2019/04/… – Aeolus 11/2, 2020 at 15:53

Another reason to use UserDict: It makes copy() behave correctly. – Lassitude 4/1 at 13:48
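The copy() remark in the last comment can be checked with a small sketch. The class names PlainSub and UserSub are made up for this example.

```python
import collections


class PlainSub(dict):
    pass


class UserSub(collections.UserDict):
    pass


plain_copy = PlainSub(a=1).copy()
user_copy = UserSub(a=1).copy()

# dict.copy() on a subclass hands back a plain dict, losing the subclass.
print(type(plain_copy).__name__)
# UserDict.copy() preserves the subclass.
print(type(user_copy).__name__)
```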

As Andrew Pate's answer proposed, subclassing collections.UserDict instead of dict is much less error prone.

Here is an example showing an issue when inheriting dict naively:

class MyDict(dict):

  def __setitem__(self, key, value):
    super().__setitem__(key, value * 10)


d = MyDict(a=1, b=2)  # Bad! MyDict.__setitem__ not called
d.update(c=3)  # Bad! MyDict.__setitem__ not called
d['d'] = 4  # Good!
print(d)  # {'a': 1, 'b': 2, 'c': 3, 'd': 40}

UserDict inherits from collections.abc.MutableMapping, so this works as expected:

import collections


class MyDict(collections.UserDict):

  def __setitem__(self, key, value):
    super().__setitem__(key, value * 10)


d = MyDict(a=1, b=2)  # Good: MyDict.__setitem__ correctly called
d.update(c=3)  # Good: MyDict.__setitem__ correctly called
d['d'] = 4  # Good
print(d)  # {'a': 10, 'b': 20, 'c': 30, 'd': 40}

Similarly, you only have to implement __getitem__ for it to be picked up automatically by my_dict.get, … (key in my_dict is an exception: UserDict checks its underlying data directly for that).

Note: UserDict is not a subclass of dict, so isinstance(UserDict(), dict) will return False (but isinstance(UserDict(), collections.abc.MutableMapping) will return True).
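A short sketch of the last two points, using a made-up Doubling class: overriding only __getitem__ on a UserDict subclass is enough for get() to follow it, and the isinstance checks behave as described in the note.

```python
import collections
import collections.abc


class Doubling(collections.UserDict):
    def __getitem__(self, key):
        # Return twice the stored value; the stored data itself is unchanged.
        return super().__getitem__(key) * 2


d = Doubling(a=1)
print(d['a'])       # goes through the override
print(d.get('a'))   # get() is built on __getitem__, so it is doubled as well
print(d.data)       # the underlying plain dict, still holding the raw value

print(isinstance(d, dict))
print(isinstance(d, collections.abc.MutableMapping))
```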

Morphosis answered 2/11, 2020 at 13:32 Comment(0)

This should not change the outcome (your code should work once the logging threshold is set appropriately), but your __init__ should be:

def __init__(self, *args, **kwargs):
    dict.__init__(self, *args, **kwargs)

instead. As written, calling DictWatch([(1,2),(2,3)]) passes the whole args tuple to dict and builds the wrong dictionary, and DictWatch(a=1,b=2) raises a TypeError because keyword arguments are not accepted.

(Or, better, don't define a constructor at all.)
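To make the difference concrete, here is a small sketch comparing the two constructors. The class names Broken and Fixed are made up for this example.

```python
class Broken(dict):
    def __init__(self, *args):
        # Passes the args tuple itself, so dict sees a one-element sequence.
        dict.__init__(self, args)


class Fixed(dict):
    def __init__(self, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)


# The single element [(1, 2), (2, 3)] is treated as one key/value pair.
print(Broken([(1, 2), (2, 3)]))
print(Fixed([(1, 2), (2, 3)]))

try:
    Broken(a=1, b=2)
except TypeError as exc:
    print('TypeError:', exc)

print(Fixed(a=1, b=2))
```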

Stockmon answered 6/3, 2010 at 0:48 Comment(1)

I'm only worried about the dict[key] form of access, so this isn't an issue. – Enmity 6/3, 2010 at 2:16

All you will have to do is

class BatchCollection(dict):
    def __init__(self, inpt={}):
        super(BatchCollection, self).__init__(inpt)

A sample usage for my personal use

### EXAMPLE
import collections.abc


class BatchCollection(dict):
    def __init__(self, inpt={}):
        super(BatchCollection, self).__init__(inpt)

    def __setitem__(self, key, item):
        # Accept only (database_name, table_name) keys with iterable values.
        # collections.Iterable was removed in Python 3.10; use collections.abc.
        if (isinstance(key, tuple) and len(key) == 2
                and isinstance(item, collections.abc.Iterable)):
            super(BatchCollection, self).__setitem__(key, item)
        else:
            raise Exception(
                "Valid key should be a tuple (database_name, table_name) "
                "and value should be iterable")

Note: tested only in python3
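A possible usage sketch for a class like this, with the constructor omitted for brevity and collections.abc.Iterable used for the check. The key ("sales_db", "orders") and the values are made-up sample data. It also shows that, as discussed elsewhere in this thread, entries passed to the constructor are not validated.

```python
import collections.abc


class BatchCollection(dict):
    def __setitem__(self, key, item):
        if (isinstance(key, tuple) and len(key) == 2
                and isinstance(item, collections.abc.Iterable)):
            super().__setitem__(key, item)
        else:
            raise Exception(
                "Valid key should be a tuple (database_name, table_name) "
                "and value should be iterable")


batches = BatchCollection()
batches[("sales_db", "orders")] = [1, 2, 3]  # accepted

try:
    batches["orders"] = [1, 2, 3]            # key is not a 2-tuple
except Exception as exc:
    print("rejected:", exc)

# dict.__init__ does not call __setitem__, so this invalid entry
# is stored without any validation.
unchecked = BatchCollection({"orders": 42})
```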

Lashoh answered 6/10, 2017 at 8:3 Comment(1)

Since this is Python 3, I recommend just using super() instead of super(BatchCollection, self) – Sheol 15/10, 2021 at 11:47
