Why doesn't Python have a hybrid getattr + __getitem__ built in?

Asked 18/7, 2011 at 19:14 Answered 18/7, 2011 at 20:3

I have methods that accept dicts or other objects and the names of "fields" to fetch from those objects. If the object is a dict then the method uses __getitem__ to retrieve the named key, or else it uses getattr to retrieve the named attribute. This is pretty common in web templating languages. For example, in a Chameleon template you might have:

<p tal:content="foo.keyname">Stuff goes here</p>

If you pass in foo as a dict like {'keyname':'bar'}, then foo.keyname fetches the 'keyname' key to get 'bar'. If foo is an instance of a class like:

class Foo(object):
    keyname = 'baz'

then foo.keyname fetches the value from the keyname attribute. Chameleon itself implements that function (in the chameleon.py26 module) like this:

def lookup_attr(obj, key):
    try:
        return getattr(obj, key)
    except AttributeError as exc:
        try:
            get = obj.__getitem__
        except AttributeError:
            raise exc
        try:
            return get(key)
        except KeyError:
            raise exc

I've implemented it in my own package like:

try:
    value = obj[attribute]
except (KeyError, TypeError):
    value = getattr(obj, attribute)

The thing is, that's a pretty common pattern. I've seen that method or one awfully similar to it in a lot of modules. So why isn't something like it in the core of the language, or at least in one of the core modules? Failing that, is there a definitive way that code should be written?

Lao answered 18/7, 2011 at 19:14 Comment(5)

I strongly object to "The thing is, that's a pretty common pattern". Explicit is better than implicit. Attribute access is something completely different from access by key... – Papa 18/7, 2011 at 19:17

Yes, but it's still pretty common in certain contexts like page template languages. I've seen it (and needed to implement it) often enough to wish it were refactored into core or a common module. – Lao 18/7, 2011 at 19:21

There is nothing like that, likely because the Python developers don't want to encourage bad programming style. The point is: if you want to write bad code, then you have to write it explicitly. Explicit is better than implicit. – Papa 18/7, 2011 at 19:23

@Blackmoon, I disagree that it's bad code, at least in this context. Note that the exact same pattern is available in Django page templates, and I don't think its users would agree that it's a bad idea. Again: this is a common idiom in certain contexts, and those contexts are common enough that I was wondering why they all had to re-implement it. – Lao 18/7, 2011 at 19:34

"Certain contexts" does not mean "common contexts". And certain contexts are often very specific to particular systems, so such functionality is pointless in general. If you need it, you implement it according to your own needs in your specific context. – Papa 18/7, 2011 at 19:37

I sort of half-read your question, wrote the below, and then reread your question and realized I had answered a subtly different question. But I think the below actually still provides an answer, after a fashion. If you don't think so, pretend instead that you had asked this more general question, which I think includes yours as a sub-question:

"Why doesn't Python provide any built-in way to treat attributes and items as interchangeable?"

I've given a fair bit of thought to this question, and I think the answer is very simple. When you create a container type, it's very important to distinguish between attributes and items. Any reasonably well-developed container type will have a number of attributes -- often though not always methods -- that enable it to manage its contents in graceful ways. So for example, a dict has items, values, keys, iterkeys and so on. These attributes are all accessed using . notation. Items, on the other hand, are accessed using [] notation. So there can be no collisions.

What happens when you enable item access using . notation? Suddenly you have overlapping namespaces. How do you handle collisions now? If you subclass a dict and give it this functionality, either you can't use keys like items as a rule, or you have to create some kind of namespace hierarchy. The first option creates a rule that is onerous, hard to follow, and hard to enforce. The second option creates an annoying amount of complexity, without fully resolving the collision problem, since you still have to have an alternative interface to specify whether you want items the item or items the attribute.
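To make the collision concrete, here is a minimal sketch. The `AttrDict` class and its sample keys are hypothetical names made up for illustration, not anything from the standard library. It shows that a key called `items` cannot be reached with dot notation, because the real `dict.items` method is found first.

```python
class AttrDict(dict):
    """Hypothetical dict subclass that falls back to keys on attribute lookup."""

    def __getattr__(self, name):
        # __getattr__ only runs when normal attribute lookup fails,
        # so real dict methods such as items() always take precedence.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)


d = AttrDict(items=[1, 2, 3], colour="red")

colour_via_attr = d.colour   # resolved through __getattr__, i.e. the key
items_via_attr = d.items     # the bound dict.items method, not the key
items_via_key = d["items"]   # the stored list

print(colour_via_attr)
print(callable(items_via_attr))
print(items_via_key)
```

The key is still there, but only `[]` access can reach it, which is the "alternative interface" problem described above.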

Now, for certain kinds of very primitive types, this is acceptable. That's probably why there's namedtuple in the standard library, for example. (But note that namedtuple is subject to these very problems, which is probably why it was implemented as a factory function (prevents inheritance) and uses weird, private method names like _asdict.)

It's also very, very, very easy to create a subclass of object with no (public) attributes and use setattr on it. It's even pretty easy to override __getitem__, __setitem__, and __delitem__ to invoke __getattribute__, __setattr__ and __delattr__, so that item access just becomes syntactic sugar for getattr(), setattr(), etc. (Though that's a bit more questionable since it creates somewhat unexpected behavior.)
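As a rough sketch of that second idea, assuming a hypothetical class called `Record`, item access can be routed to the attribute machinery like this. It uses the `getattr`/`setattr`/`delattr` built-ins rather than calling the dunder methods directly, and it translates `AttributeError` into `KeyError` so that item access fails the way callers expect.

```python
class Record(object):
    """Hypothetical attribute bag whose items are simply its attributes."""

    def __getitem__(self, name):
        try:
            return getattr(self, name)
        except AttributeError:
            # Item access is expected to raise KeyError for a missing key.
            raise KeyError(name)

    def __setitem__(self, name, value):
        setattr(self, name, value)

    def __delitem__(self, name):
        try:
            delattr(self, name)
        except AttributeError:
            raise KeyError(name)


r = Record()
r.title = "draft"   # plain attribute assignment
r["pages"] = 12     # item assignment, stored as an attribute

print(r["title"], r.pages)
```

Note that every attribute becomes reachable as an item, so `r["__class__"]` returns the class itself. That is the kind of somewhat unexpected behavior the paragraph above warns about.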

But for any kind of well-developed container class that you want to be able to expand and inherit from, adding new, useful attributes, a __getattr__ + __getitem__ hybrid would be, frankly, an enormous PITA.

Guenther answered 18/7, 2011 at 20:3 Comment(0)

The closest thing in the Python standard library is namedtuple(): http://docs.python.org/dev/library/collections.html#collections.namedtuple

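from collections import namedtuple

Foo = namedtuple('Foo', ['key', 'attribute'])
foo = Foo(5, attribute=13)
print(foo[1])   # 13, item access by position
print(foo.key)  # 5, attribute access by field name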

Or you can easily define your own type that always actually stores into its dict but allows the appearance of attribute setting and getting:

class MyDict(dict):
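    def __getattr__(self, attr):
        try:
            return self[attr]
        except KeyError:
            # __getattr__ is expected to raise AttributeError, not KeyError
            raise AttributeError(attr)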
    def __setattr__(self, attr, value):
        self[attr] = value

d = MyDict()

d.a = 3
d[3] = 'a'
print(d['a']) # 3
print(d[3]) # 'a'
print(d['b']) # raises KeyError

But don't write d.3, because that's a syntax error. There are of course more complicated ways of making a hybrid storage type like this; search the web for many examples.

As far as how to check both, the Chameleon way looks thorough. When it comes to "why isn't there a way to do both in the standard library", it's because ambiguity is BAD. Yes, we have duck typing and all other kinds of masquerading in Python, and classes are really just dictionaries anyway, but at some point we want different functionality from a container like a dict or list than we want from a class, with its method resolution order, overriding, etc.

Abrasion answered 18/7, 2011 at 19:37 Comment(0)

You can pretty easily write your own dict subclass that natively behaves this way. A minimal implementation, which I like to call a "pile" of attributes, is like so:

class Pile(dict):
    # raise AttributeError for missing key here to fulfill API
    def __getattr__(self, key):
        if key in self:
            return self[key]
        else:
            raise AttributeError(key)
    def __setattr__(self, key, value):
        self[key] = value
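The comment about raising AttributeError matters in practice. The sketch below assumes Python 3 and uses a hypothetical `LeakyPile` variant alongside a trimmed copy of `Pile` to show the difference: `getattr()` with a default and `hasattr()` only swallow AttributeError, so a KeyError escaping from `__getattr__` breaks both.

```python
class LeakyPile(dict):
    """Hypothetical variant that lets KeyError escape from __getattr__."""

    def __getattr__(self, key):
        return self[key]


class Pile(dict):
    """Trimmed copy of the class above: missing keys become AttributeError."""

    def __getattr__(self, key):
        if key in self:
            return self[key]
        raise AttributeError(key)


# With AttributeError, the usual attribute protocol works.
good_default = getattr(Pile(), "missing", "fallback")
good_hasattr = hasattr(Pile(), "missing")

# With KeyError, the default is never used and the exception propagates.
try:
    getattr(LeakyPile(), "missing", "fallback")
    leaked = None
except KeyError as exc:
    leaked = type(exc).__name__

print(good_default, good_hasattr, leaked)
```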

Unfortunately, if you need to deal with either dictionaries or attribute-laden objects passed to you, rather than having control of the object from the beginning, this won't help.

In your situation I would probably use something very much like what you have, except break it out into a function so I don't have to repeat it all the time.
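For illustration, one way that helper might look is sketched here. The names `lookup`, `_MISSING` and the optional `default` parameter are my own assumptions, not part of the question's package or of Chameleon; the try/except itself is the one from the question.

```python
_MISSING = object()  # sentinel so that None can be passed as a real default


def lookup(obj, name, default=_MISSING):
    """Return obj[name] if that works, otherwise fall back to getattr(obj, name)."""
    try:
        return obj[name]
    except (KeyError, TypeError):
        # KeyError: a mapping without that key.
        # TypeError: the object is not subscriptable or rejects a string key.
        pass
    if default is _MISSING:
        return getattr(obj, name)  # raises AttributeError if absent
    return getattr(obj, name, default)


class Foo(object):
    keyname = "baz"


print(lookup({"keyname": "bar"}, "keyname"))
print(lookup(Foo(), "keyname"))
print(lookup(Foo(), "missing", None))
```

Because this tries item access first, a dict key such as 'items' wins over the dict method of the same name, which is the opposite of the attribute-first order in the Chameleon function. One caveat: catching TypeError also hides a TypeError raised inside a custom `__getitem__`, so a stricter version might check for `__getitem__` explicitly instead.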

Deepsea answered 18/7, 2011 at 19:28 Comment(3)

That's just the thing; I don't really have control over what gets passed in and I want to make it as easy as possible for my package's users, preferably using the same semantics as everyone else who's already solved it. :-) Thanks for the answer, though. Sorry about whoever downvoted it. :-/ – Lao 18/7, 2011 at 19:38

Be careful! - return self[key] - will raise "KeyError" instead of "AttributeError" required by API docs: docs.python.org/3.9/reference/datamodel.html#object.__getattr__ Instead use this: if key in self: return self[key] else: raise AttributeError(key) – Antilebanon 27/10, 2022 at 14:24

Great comment, I've updated. – Deepsea 24/1, 2023 at 14:34
