How to make a dictionary that returns key for keys missing from the dictionary instead of raising KeyError?
H

7

67

I want to create a Python dictionary that returns the key itself for keys that are missing from the dictionary.

Usage example:

dic = smart_dict()
dic['a'] = 'one a'
print(dic['a'])
# >>> one a
print(dic['b'])
# >>> b
Haunted answered 3/6, 2011 at 15:23 Comment(3)
I was expecting to be able to get this behavior with collections.defaultdict(), but I'm missing something about how it works.Haunted
There's a number of ways to do this. One possibly important distinction / consideration is whether or not they also add the missing key to the underlying dictionary.Either
related: Is there a clever way to pass the key to defaultdict's default_factoryMacur
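The distinction raised in the comments above, whether a lookup also stores the missing key, can be sketched as follows. Both class names are hypothetical and exist only for this illustration; neither comes from the thread.

```python
class ReturnOnly(dict):
    """Returns the key for a missing lookup without storing it."""
    def __missing__(self, key):
        return key


class StoreOnMiss(dict):
    """Returns the key for a missing lookup and also stores it."""
    def __missing__(self, key):
        self[key] = key
        return key


a = ReturnOnly()
b = StoreOnMiss()
a['x']  # lookup only, nothing is added
b['x']  # lookup adds 'x' -> 'x'
print(len(a), len(b))  # 0 1
```

Which variant is preferable depends on whether later iteration, len(), or membership tests should see the keys that were only ever looked up.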
F
96

dicts have a __missing__ hook for this:

class smart_dict(dict):
    def __missing__(self, key):
        return key

Could simplify it as (since self is never used):

class smart_dict(dict):
    @staticmethod
    def __missing__(key):
        return key
Flyn answered 3/6, 2011 at 15:36 Comment(3)
Furthermore the docs say "defaultdict objects support [__missing__] in addition to the standard dict operations" which is just wrong.Lutero
Why is that wrong? I'd guess __missing__ is the primary method by which defaultdict works.Hey
Bear in mind that while smart_dict()['a'] returns 'a', smart_dict().get('a') returns None (or whatever value passed as default), which might not be the desired behaviour.Cartesian
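The caveat in the last comment can be checked directly. This is a minimal sketch that repeats the smart_dict class from the answer; the variable names are only illustrative. Only subscript lookup consults __missing__, while get() and the in operator do not.

```python
class smart_dict(dict):
    def __missing__(self, key):
        return key


d = smart_dict()
by_index = d['a']                 # goes through __missing__
by_get = d.get('a')               # bypasses __missing__
by_get_default = d.get('a', 'a')  # explicit default is needed here
contains = 'a' in d               # membership is unaffected
print(by_index, by_get, by_get_default, contains)  # a None a False
```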
H
36

Why don't you just use

dic.get('b', 'b')

Sure, you can subclass dict as others point out, but I find it handy to remind myself every once in a while that get can have a default value!

If you want to have a go at the defaultdict, try this:

dic = defaultdict()
dic.__missing__ = lambda key: key
dic['b'] # should set dic['b'] to 'b' and return 'b'

Except... well: AttributeError: 'collections.defaultdict' object attribute '__missing__' is read-only, so you will have to subclass:

from collections import defaultdict
class KeyDict(defaultdict):
    def __missing__(self, key):
        return key

d = KeyDict()
print(d['b'])          # prints b
print(list(d.keys()))  # prints [] (the missing key is not stored)
Holub answered 3/6, 2011 at 15:26 Comment(2)
Why subclass defaultdict vs subclassing dict?Lefebvre
The real usefulness of defaultdict is in cases like dd=defaultdict(list), where you can unconditionally write dd[key].append(item), which is a lot more cumbersome with .get, so this solution to the trivial lookup problem is not always what is needed.Timbering
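The grouping idiom mentioned in the comment above can be sketched like this. The sample data and variable names are made up for illustration; setdefault is shown as the plain-dict counterpart.

```python
from collections import defaultdict

pairs = [('fruit', 'apple'), ('veg', 'kale'), ('fruit', 'pear')]

# With defaultdict(list), a missing key is created as an empty list on access.
grouped = defaultdict(list)
for kind, name in pairs:
    grouped[kind].append(name)

# The plain-dict counterpart needs setdefault on every access.
plain = {}
for kind, name in pairs:
    plain.setdefault(kind, []).append(name)

print(dict(grouped) == plain)  # True
```

Note that plain.get(kind, []).append(name) would not work for a missing key, because the list returned by get() is never stored in the dictionary.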
H
15

Congratulations. You too have discovered the uselessness of the standard collections.defaultdict type. If that execrable midden heap of code smell offends your delicate sensibilities as much as it did mine, this is your lucky StackOverflow day.

Thanks to the forbidden wonder of the 3-parameter variant of the type() builtin, crafting a non-useless default dictionary type is both fun and profitable.

What's Wrong with dict.__missing__()?

Absolutely nothing, assuming you like excess boilerplate and the shocking silliness of collections.defaultdict – which should behave as expected but really doesn't. To be fair, Jochen Ritzel's accepted solution of subclassing dict and implementing the optional __missing__() method is a fantastic workaround for small-scale use cases only requiring a single default dictionary.

But boilerplate of this sort scales poorly. If you find yourself instantiating multiple default dictionaries, each with their own slightly different logic for generating missing key-value pairs, an industrial-strength alternative automating boilerplate is warranted.

Or at least nice. Because why not fix what's broken?

Introducing DefaultDict

In less than ten lines of pure Python (excluding docstrings, comments, and whitespace), we now define a DefaultDict type initialized with a user-defined callable generating default values for missing keys. Whereas the callable passed to the standard collections.defaultdict type uselessly accepts no parameters, the callable passed to our DefaultDict type usefully accepts the following two parameters:

  1. The current instance of this dictionary.
  2. The current missing key to generate a default value for.

Given this type, solving sorin's question reduces to a single line of Python:

>>> dic = DefaultDict(lambda self, missing_key: missing_key)
>>> dic['a'] = 'one a'
>>> print(dic['a'])
one a
>>> print(dic['b'])
b

Sanity. At last.

Code or It Didn't Happen

def DefaultDict(keygen):
    '''
    Sane **default dictionary** (i.e., dictionary implicitly mapping a missing
    key to the value returned by a caller-defined callable passed both this
    dictionary and that key).

    The standard :class:`collections.defaultdict` class is sadly insane,
    requiring the caller-defined callable accept *no* arguments. This
    non-standard alternative requires this callable accept two arguments:

    #. The current instance of this dictionary.
    #. The current missing key to generate a default value for.

    Parameters
    ----------
    keygen : CallableTypes
        Callable (e.g., function, lambda, method) called to generate the default
        value for a "missing" (i.e., undefined) key on the first attempt to
        access that key, passed first this dictionary and then this key and
        returning this value. This callable should have a signature resembling:
        ``def keygen(self: DefaultDict, missing_key: object) -> object``.
        Equivalently, this callable should have the exact same signature as that
        of the optional :meth:`dict.__missing__` method.

    Returns
    -------
    MappingType
        Empty default dictionary creating missing keys via this callable.
    '''

    # Global variable modified below.
    global _DEFAULT_DICT_ID

    # Unique classname suffixed by this identifier.
    default_dict_class_name = 'DefaultDict' + str(_DEFAULT_DICT_ID)

    # Increment this identifier to preserve uniqueness.
    _DEFAULT_DICT_ID += 1

    # Dynamically generated default dictionary class specific to this callable.
    default_dict_class = type(
        default_dict_class_name, (dict,), {'__missing__': keygen,})

    # Instantiate and return the first and only instance of this class.
    return default_dict_class()


_DEFAULT_DICT_ID = 0
'''
Unique arbitrary identifier with which to uniquify the classname of the next
:func:`DefaultDict`-derived type.
'''

The key (get it, key?) to this arcane wizardry is the call to the 3-parameter variant of the type() builtin:

type(default_dict_class_name, (dict,), {'__missing__': keygen,})

This single line dynamically generates a new dict subclass aliasing the optional __missing__ method to the caller-defined callable. Note the distinct lack of boilerplate, reducing DefaultDict usage to a single line of Python.
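For readers unfamiliar with the three-argument form, here is a small sketch showing that type(name, bases, namespace) builds the same kind of class as an ordinary class statement. The names return_key, ViaStatement, and ViaType are hypothetical and chosen only for this comparison.

```python
def return_key(self, key):
    """Same signature as dict.__missing__: the dict instance, then the key."""
    return key


# Ordinary class statement.
class ViaStatement(dict):
    __missing__ = return_key


# Equivalent class built dynamically: name, tuple of bases, namespace dict.
ViaType = type('ViaType', (dict,), {'__missing__': return_key})

print(ViaStatement()['b'], ViaType()['b'])  # b b
print(issubclass(ViaType, dict))            # True
```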

Automation for the egregious win.

Heiress answered 23/5, 2017 at 6:7 Comment(2)
Cool, didn't know the 3-parameter variant! I agree with your sentiment (so defaultdict needs a PEP?), however I guess most people would just need a "sane" defaultdict without the industrial strength class generation. Actually, why create many classes at all? Why not have the default (callable) as a member of the class? This inspired me to write https://mcmap.net/q/293632/-how-to-make-a-dictionary-that-returns-key-for-keys-missing-from-the-dictionary-instead-of-raising-keyerror.Spillman
Those were exactly my initial thoughts, but then I realized the probable rationale. It seems the author's initial idea was to allow a constructor call like default_str_dict = defaultdict(str) instead of default_str_dict = defaultdict(''). Anyway, this was definitely a poor decision and led to unexpected behavior and inconvenience.Gallant
M
13

The first respondent mentioned defaultdict, but you can define __missing__ for any subclass of dict:

>>> class Dict(dict):
...     def __missing__(self, key):
...         return key
...

>>> d = Dict(a=1, b=2)
>>> d['a']
1
>>> d['z']
'z'

Also, I like the second respondent's approach:

>>> d = dict(a=1, b=2)
>>> d.get('z', 'z')
'z'
Middleclass answered 18/10, 2011 at 18:27 Comment(0)
S
5

I agree this should be easy to do, and also easy to set up with different defaults or functions that transform a missing value somehow.

Inspired by Cecil Curry's answer, I asked myself: why not have the default-generator (either a constant or a callable) as a member of the class, instead of generating different classes all the time? Let me demonstrate:

# default behaviour: return missing keys unchanged
dic = FlexDict()
dic['a'] = 'one a'
print(dic['a'])
# 'one a'
print(dic['b'])
# 'b'

# regardless of default: easy initialisation with existing dictionary
existing_dic = {'a' : 'one a'}
dic = FlexDict(existing_dic)
print(dic['a'])
# 'one a'
print(dic['b'])
# 'b'

# using constant as default for missing values
dic = FlexDict(existing_dic, default = 10)
print(dic['a'])
# 'one a'
print(dic['b'])
# 10

# use callable as default for missing values
dic = FlexDict(existing_dic, default = lambda missing_key: missing_key * 2)
print(dic['a'])
# 'one a'
print(dic['b'])
# 'bb'
print(dic[2])
# 4

How does it work? Not so difficult:

class FlexDict(dict):
    '''Subclass of dictionary which returns a default for missing keys.
    This default can either be a constant, or a callable accepting the missing key.
    If "default" is not given (or None), each missing key will be returned unchanged.'''
    def __init__(self, content = None, default = None):
        if content is None:
            super().__init__()
        else:
            super().__init__(content)
        if default is None:
            default = lambda missing_key: missing_key
        self.default = default # sets self._default

    @property
    def default(self):
        return self._default

    @default.setter
    def default(self, val):
        if callable(val):
            self._default = val
        else: # constant value
            self._default = lambda missing_key: val

    def __missing__(self, x):
        return self.default(x)

Of course, one can debate whether one wants to allow changing the default-function after initialisation, but that just means removing @default.setter and absorbing its logic into __init__.

Enabling introspection into the current (constant) default value could be added with two extra lines.
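One possible way to add that introspection is sketched below. This is an assumption about what the extra lines might look like, not the author's code: the class name InspectableFlexDict and the attribute raw_default are hypothetical, and unlike the original, the None case is handled inside the setter.

```python
class InspectableFlexDict(dict):
    """Hypothetical FlexDict variant that remembers the default as given."""

    def __init__(self, content=None, default=None):
        super().__init__({} if content is None else content)
        self.default = default  # goes through the property setter below

    @property
    def default(self):
        return self._default

    @default.setter
    def default(self, val):
        self.raw_default = val  # extra line: keep the value exactly as passed
        if val is None:
            self._default = lambda missing_key: missing_key
        elif callable(val):
            self._default = val
        else:  # constant value
            self._default = lambda missing_key: val

    def __missing__(self, key):
        return self.default(key)


d = InspectableFlexDict({'a': 'one a'}, default=10)
print(d['b'], d.raw_default)  # 10 10
```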

Spillman answered 13/12, 2017 at 10:24 Comment(0)
C
0

Subclass dict and override its __getitem__ method. For an example, see "How to properly subclass dict and override __getitem__ & __setitem__".
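A minimal sketch of that approach, assuming the goal is simply to return the key when the lookup fails. The class name KeyFallbackDict is hypothetical.

```python
class KeyFallbackDict(dict):
    """Returns the key itself when a subscript lookup fails."""

    def __getitem__(self, key):
        try:
            return super().__getitem__(key)
        except KeyError:
            return key


d = KeyFallbackDict(a='one a')
print(d['a'], d['b'])  # one a b
```

Defining __missing__ gives the same result with less code, since dict.__getitem__ already calls it for missing keys.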

Countermark answered 3/6, 2011 at 15:25 Comment(0)
H
0

VERY late to the party, but this has bothered me so many times that I decided to research it myself.

The web docs say nothing about the exact semantics of this overridable __missing__ method (and are misleading to some degree), but the output of help(defaultdict.__missing__) (which I doubt any of us would read) actually gives you the information you need:

>>> help(defaultdict.__missing__)
Help on method_descriptor:
__missing__(...)
    __missing__(key) # Called by __getitem__ for missing key; pseudo-code:
    if self.default_factory is None: raise KeyError((key,))
    self[key] = value = self.default_factory()
    return value

So it's now clear that a __missing__ override in a subclass that wants to keep defaultdict's behaviour must follow the same procedure (store the value under the key, then return it), instead of just a plain return foo(key).

Here is an example you can copy:

from collections import defaultdict

class MyDefaultDict(defaultdict):
    def __missing__(self, key):
        value = key + 1    # transform of your choice
        self[key] = value  # store it, as defaultdict does
        return value

Replace key + 1 with the type of transform of your choice.

Helio answered 25/5, 2023 at 12:1 Comment(0)