Truth value of empty set

I am interested in the truth value of Python sets like {'a', 'b'}, or the empty set set() (which is not the same as the empty dictionary {}). In particular, I would like to know whether bool(my_set) is False if and only if the set my_set is empty.

Ignoring primitive types (such as numbers) as well as user-defined types, https://docs.python.org/3/library/stdtypes.html#truth says:

The following values are considered false:

  • [...]
  • any empty sequence, for example, '', (), [].
  • any empty mapping, for example, {}.
  • [...]

All other values are considered true

According to https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range, a set is not a sequence (it is unordered, its elements do not have indices, etc.):

There are three basic sequence types: lists, tuples, and range objects.

And, according to https://docs.python.org/3/library/stdtypes.html#mapping-types-dict,

There is currently only one standard mapping type, the dictionary.

So, as far as I understand, the set type is not a type that can ever be False. However, when I try, bool(set()) evaluates to False.
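A minimal check of the observation above, using only built-ins. The variable names are made up for illustration, and the loop only samples a few sets rather than proving the "if and only if" in general:

```python
empty_set = set()
non_empty_set = {'a', 'b'}
empty_braces = {}  # this is a dict, not a set

print(bool(empty_set))      # False
print(bool(non_empty_set))  # True
print(type(empty_braces))   # <class 'dict'>

# Sets holding only falsy elements are still non-empty, hence true.
samples = (set(), {0}, {None}, {''}, frozenset(), frozenset({False}))
for candidate in samples:
    assert bool(candidate) == (len(candidate) > 0)
```

In these samples the truth value tracks emptiness, not the truth values of the elements.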

Questions:

  • Is this a documentation problem, or am I getting something wrong?
  • Is the empty set the only set whose truth value is False?
Kanchenjunga answered 28/6, 2017 at 22:13 Comment(8)
same behavior for any iterable ? i.e. with dict.items()Conceptualism
It almost certainly is a mistaken omission: the set built-in type came relatively late to the game (version 2.2 or 2.3). They likely never updated the docs here to add "or an empty set".Butter
@Butter But doesn't the same apply to dicts as well? It's been around pretty much since the beginning, but was omitted.Showery
@ChristianDean I believe dict was addressed here: "any empty mapping, for example {}"Butter
@Butter Oh, OK. I guess you're right.Showery
Question solved, but someone should file a documentation bug bugs.python.orgBughouse
@Bughouse I created an issue on the Python bug tracker, see bugs.python.org/issue30803Kanchenjunga
the fix was merged.Kanchenjunga
B
29

After looking at the source code for CPython, I would guess this is a documentation error. However, it could be implementation-dependent, so it would be a good issue to raise on the Python bug tracker.

Specifically, object.c defines the truth value of an item as follows:

int
PyObject_IsTrue(PyObject *v)
{
    Py_ssize_t res;
    if (v == Py_True)
        return 1;
    if (v == Py_False)
        return 0;
    if (v == Py_None)
        return 0;
    else if (v->ob_type->tp_as_number != NULL &&
             v->ob_type->tp_as_number->nb_bool != NULL)
        res = (*v->ob_type->tp_as_number->nb_bool)(v);
    else if (v->ob_type->tp_as_mapping != NULL &&
             v->ob_type->tp_as_mapping->mp_length != NULL)
        res = (*v->ob_type->tp_as_mapping->mp_length)(v);
    else if (v->ob_type->tp_as_sequence != NULL &&
             v->ob_type->tp_as_sequence->sq_length != NULL)
        res = (*v->ob_type->tp_as_sequence->sq_length)(v);
    else
        return 1;
    /* if it is negative, it should be either -1 or -2 */
    return (res > 0) ? 1 : Py_SAFE_DOWNCAST(res, Py_ssize_t, int);
}

We can clearly see that the value is always true unless the object is True, False, or None, or its type provides a length or truth slot, which requires tp_as_number (with nb_bool), tp_as_mapping (with mp_length), or tp_as_sequence (with sq_length) to be set.

Fortunately, looking at setobject.c shows that sets do implement tp_as_sequence, which suggests the documentation is incomplete rather than the behavior being accidental.

PyTypeObject PySet_Type = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    "set",                              /* tp_name */
    sizeof(PySetObject),                /* tp_basicsize */
    0,                                  /* tp_itemsize */
    /* methods */
    (destructor)set_dealloc,            /* tp_dealloc */
    0,                                  /* tp_print */
    0,                                  /* tp_getattr */
    0,                                  /* tp_setattr */
    0,                                  /* tp_reserved */
    (reprfunc)set_repr,                 /* tp_repr */
    &set_as_number,                     /* tp_as_number */
    &set_as_sequence,                   /* tp_as_sequence */
    0,                                  /* tp_as_mapping */
    /* ellipsed lines */
};

Dicts also implement tp_as_sequence, so it seems that a type need not be a sequence to be sequence-like enough for its truth value to depend on its length.

In my opinion, the documentation should clarify this: the truth value of mapping-like or sequence-like types depends on their length.

Edit: As user2357112 correctly points out, tp_as_sequence and tp_as_mapping do not mean the type is a sequence or a map. For example, dict implements tp_as_sequence, and list implements tp_as_mapping.
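The lookup order in PyObject_IsTrue can be modelled roughly in Python. This is only a sketch under assumptions: is_true_sketch is a hypothetical helper, __bool__ stands in for the nb_bool slot, and __len__ stands in for both mp_length and sq_length. It skips the error handling of the C code.

```python
def is_true_sketch(v):
    """Rough Python-level model of the checks made by PyObject_IsTrue."""
    if v is True:
        return True
    if v is False or v is None:
        return False
    cls = type(v)
    # Stand-in for tp_as_number->nb_bool.
    bool_method = getattr(cls, '__bool__', None)
    if bool_method is not None:
        return bool_method(v)
    # Stand-in for tp_as_mapping->mp_length and tp_as_sequence->sq_length.
    len_method = getattr(cls, '__len__', None)
    if len_method is not None:
        return len_method(v) > 0
    # No truth or length slot at all: the object is considered true.
    return True


sketch_samples = [set(), {'a'}, frozenset(), {}, {'k': 1}, [], [0],
                  (), '', 'x', 0, 1, 0.0, None, object()]
for sample in sketch_samples:
    assert is_true_sketch(sample) == bool(sample)
```

For a set, the model reaches the length branch, which matches the observation that only the empty set is false.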

Bingle answered 28/6, 2017 at 22:32 Comment(8)
Slightly aside -- is the Python language actually defined by the documentation, or by the reference implementation?Kanchenjunga
@peterthomassen: There's barely anything that can be considered a spec, really. It's not like C or C++, with a standards body and a mostly-thorough official standard. The docs are often incomplete, and the exact behavior of the interpreter isn't normative for other implementations.Tyne
Note that the presence of tp_as_sequence doesn't mean an object is a sequence and the presence of tp_as_mapping doesn't mean an object is a mapping. The language has evolved since those members were designed and named.Tyne
Where in the above is the __nonzero__ (or __bool__ in python 3) taken into account?Foltz
@shadow: Python-level __bool__ corresponds to the v->ob_type->tp_as_number->nb_bool part, which is the C-level counterpart. Depending on whether the type is written in C or Python, either __bool__ or nb_bool will be a wrapper for the other.Tyne
@shadow, It's actually not considered. Builtin types differ from user-defined types. github.com/python/cpython/blob/…Bingle
@Tyne Beating me to the punch again.Bingle
@AlexanderHuszagh I raised this on the Python bug tracker, see bugs.python.org/issue30803Kanchenjunga
M
28

The documentation for __bool__ states that this method is called for truth value testing and if it is not defined then __len__ is evaluated:

Called to implement truth value testing and the built-in operation bool(); [...] When this method is not defined, __len__() is called, if it is defined, and the object is considered true if its result is nonzero. If a class defines neither __len__() nor __bool__(), all its instances are considered true.

This holds for any Python object. As we can see, set does not define a __bool__ method:

>>> set.__bool__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'set' has no attribute '__bool__'

so the truth testing falls back on __len__:

>>> set.__len__
<slot wrapper '__len__' of 'set' objects>

Therefore only an empty set (zero-length) is considered false.
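The fallback described in the quoted documentation can be illustrated with a few toy classes. The class names below are hypothetical and exist only for this sketch:

```python
class OnlyLen:
    """Defines __len__ but not __bool__, like set."""
    def __init__(self, size):
        self.size = size

    def __len__(self):
        return self.size


class BoolAndLen:
    """Defines both; __bool__ takes precedence over __len__."""
    def __len__(self):
        return 0

    def __bool__(self):
        return True


class Neither:
    """Defines neither, so every instance is considered true."""


print(bool(OnlyLen(0)))    # False
print(bool(OnlyLen(3)))    # True
print(bool(BoolAndLen()))  # True
print(bool(Neither()))     # True

# set sits in the first category: no __bool__, but a __len__.
print(hasattr(set, '__bool__'), hasattr(set, '__len__'))  # False True
```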

The truth value testing section of the documentation is incomplete in this respect.

Merrifield answered 28/6, 2017 at 22:48 Comment(0)
T
18

That part of the docs is poorly written, or rather, poorly maintained. The following clause:

instances of user-defined classes, if the class defines a __bool__() or __len__() method, when that method returns the integer zero or bool value False.

really applies to all classes, user-defined or not, including set, dict, and even the types listed in all the other clauses (all of which define either __bool__ or __len__). (In Python 2, None is false despite not having a __len__ or Python 2's equivalent of __bool__, but that exception is gone since Python 3.3.)
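The claim that the built-in types all define __bool__ or __len__ can be spot-checked. This is a quick sketch assuming a Python 3 interpreter. The list of types is a hand-picked sample, not an exhaustive inventory, and which of the two methods each type has may vary between versions:

```python
checked_types = [type(None), bool, int, float, complex, str, bytes,
                 tuple, list, range, dict, set, frozenset]

report = {}
for t in checked_types:
    has_bool = hasattr(t, '__bool__')
    has_len = hasattr(t, '__len__')
    report[t.__name__] = (has_bool, has_len)
    print(f"{t.__name__:10} __bool__={has_bool} __len__={has_len}")

# Every sampled type provides at least one of the two methods.
assert all(has_bool or has_len for has_bool, has_len in report.values())
```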

I say poorly maintained because this section has been almost unchanged since at least Python 1.4, and maybe earlier. It's been updated for the addition of False and the removal of separate int/long types, but not for type/class unification or the introduction of sets.

Back when the quoted clause was written, user-defined classes and built-in types really did behave differently, and I don't think built-in types actually had __bool__ or __len__ at the time.

Tyne answered 28/6, 2017 at 22:35 Comment(2)
Just a heads up that magic methods can be implementation-dependent: in Python 2, __bool__ does not exist, while __nonzero__ provides the same functionality, which I discovered the hard way.Bingle
@AlexanderHuszagh: Yup. The __nonzero__ name dates back to before Python had a bool type. Back then, it returned either 0 or 1, and the method was thought of as testing nonzeroness instead of coercing to bool.Tyne
