Printing boolean values True/False with the format() method in Python

I was trying to print a truth table for Boolean expressions. While doing this, I stumbled upon the following:

>>> format(True, "") # shows True in a string representation, same as str(True)
'True'
>>> format(True, "^") # centers True in the middle of the output string
'1'

As soon as I specify a format specifier, format() converts True to 1. I know that bool is a subclass of int, so that True evaluates to 1:

>>> format(True, "d") # shows True in a decimal format
'1'

But why does using the format specifier change 'True' to 1 in the first example?

I turned to the docs for clarification. The only thing they say is:

A general convention is that an empty format string ("") produces the same result as if you had called str() on the value. A non-empty format string typically modifies the result.

So the string gets modified when you use a format specifier. But why the change from True to 1 if only an alignment operator (e.g. ^) is specified?

Costly answered 14/5, 2014 at 12:37 Comment(5)
"A general convention is that an empty format string ("") produces the same result as if you had called str() on the value. A non-empty format string typically modifies the result." -- docs - Disburse
I don't know why it's done like that, but if you want to fix it you could do format(str(True), "^") - Youlandayoulton
Thanks, I already fixed it, but I was just curious as to the "why" :) - Costly
My guess is it has to do with an implicit conversion for the '^' operator, although that is weird. Also note that '^' is the align-center operator, not the width operator. - Decahedron
Yes, it is weird. Corrected the question. - Costly

Excellent question! I believe I have the answer. This requires digging around through the Python source code in C, so bear with me.

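First, format(obj, format_spec) is essentially syntactic sugar for obj.__format__(format_spec). To see where this happens, look at this function in abstract.c. Note that the excerpt shows the classic-class branch; as far as I can tell, new-style objects such as bool have __format__ looked up in the same spirit: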

PyObject *
PyObject_Format(PyObject* obj, PyObject *format_spec)
{
    PyObject *empty = NULL;
    PyObject *result = NULL;

    ...

    if (PyInstance_Check(obj)) {
        /* We're an instance of a classic class */
        PyObject *bound_method = PyObject_GetAttrString(obj, "__format__");  /* <- HERE */
        if (bound_method != NULL) {
            result = PyObject_CallFunctionObjArgs(bound_method,
                                                  format_spec,
                                                  NULL);

    ...
}
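
The delegation can also be observed from the Python side. This is only a minimal sketch: Shout is a made-up class (it is not part of Python or of the question) whose __format__ simply reports the spec it was handed.

```python
class Shout:
    """Hypothetical class, used only to observe what format() passes along."""

    def __format__(self, format_spec):
        # Report the spec instead of formatting anything.
        return "spec=" + repr(format_spec)


print(format(Shout(), "^10"))      # spec='^10'
print("{:^10}".format(Shout()))    # spec='^10'
print(format(Shout()))             # spec=''
```

Both the format() built-in and str.format() hand the spec to the object's own __format__, which is why the behaviour for True is decided by whichever __format__ bool ends up using.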

To find the exact call, we have to look in intobject.c:

static PyObject *
int__format__(PyObject *self, PyObject *args)
{
    PyObject *format_spec;

    ...

    /* Let's find _PyInt_FormatAdvanced next. */
    return _PyInt_FormatAdvanced(self,
                                 PyBytes_AS_STRING(format_spec),
                                 PyBytes_GET_SIZE(format_spec));
    ...
}

_PyInt_FormatAdvanced is not an ordinary function: the name is set up through a macro in formatter_string.c and resolves to code in formatter.h, which ends up in this function:

static PyObject*
format_int_or_long(PyObject* obj,
                   STRINGLIB_CHAR *format_spec,
                   Py_ssize_t format_spec_len,
                   IntOrLongToString tostring)
{
    PyObject *result = NULL;
    PyObject *tmp = NULL;
    InternalFormatSpec format;

    /* check for the special case of zero length format spec, make
       it equivalent to str(obj) */
    if (format_spec_len == 0) {
        result = STRINGLIB_TOSTR(obj);   /* <- EXPLICIT CAST ALERT! */
        goto done;
    }

    ... // Otherwise, format the object as if it were an integer
}

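And therein lies your answer. The function checks whether format_spec_len is 0 and, if so, converts obj with str(); as you well know, str(True) is 'True'. Any non-empty spec, even a bare alignment character such as "^", skips that branch, so the object is formatted as the integer it is underneath, and you get '1'. The mystery is over!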
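
The same two-way branch can be modelled in a few lines of Python. This is a rough sketch of the logic only, not CPython's implementation; format_int_like is a name invented for this illustration.

```python
def format_int_like(value, format_spec):
    """Rough Python model of the C branch above (hypothetical helper)."""
    if len(format_spec) == 0:
        # Empty spec: same result as str(value), so True stays 'True'.
        return str(value)
    # Any non-empty spec: treat the value as a plain integer.
    return format(int(value), format_spec)


print(repr(format_int_like(True, "")))    # 'True'
print(repr(format_int_like(True, "^")))   # '1'
print(repr(format_int_like(True, "^7")))  # '   1   '
```

For the specs tried here the model agrees with the built-in format() on both True and False.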

Springbok answered 14/5, 2014 at 23:36 Comment(14)
I think the OP is asking why format(True, '^') == '1', not why format(True, '') == 'True'. Or rather, why there would be an implicit cast to int for the former when the latter evaluates as expected. - Decahedron
True is an int, and it's only cast out of an int and into a str when format_spec is empty. - Springbok
In Python it isn't: type(True) is <type 'bool'>, and (type(True) == type(1)) is False. In the underlying C code of the interpreter it is obviously converted into an int, since C lacks a bool primitive, but that wouldn't explain the implicit conversion on the Python side. - Decahedron
The conversion happens on the C side, not the Python side. All of the format business I reference above happens entirely in C. - Springbok
Again, that wouldn't explain the implicit conversion. 'The Bool is {}'.format(True) == 'The Bool is True'. It seems to only do it when you specify arguments inside the {}, i.e. 'The Bool is {:^}'.format(True) == 'The Bool is 1', however 'The Bool is {!r:^}'.format(True) == 'The Bool is True' - Decahedron
I don't think we disagree here. I haven't looked at it closely, but str.format(*args, **kwargs) (which I believe is implemented on the C side here) seems like a helper for obj.__format__(format_spec). So '{!cast:format_spec}'.format(True) --> (cast(obj)).__format__(format_spec) appear completely consistent with each other. - Springbok
Here we go. When rendering the objects passed into str.format(*args, **kwargs), a call to obj.__format__(format_spec) is made. There's some handling for doing a cast here, before the __format__ call is made. - Springbok
@HuuNguyen Thanks for looking into this. I've tried reading the C source code, but my C isn't too good. I do see that using an empty format specifier makes format(obj, "") == str(obj). But I can't really find anything that would explain why format(True) == 'True' != format(True, "^"). I would very much appreciate it if you could elaborate on that. - Costly
If you're wondering why format(True, "^") is '1' and not 'True', then you already know. It's because internally, True is represented as an integer. The only time it is not treated as an integer is when you cast it explicitly using '{!r:}'.format(True), or when you specify an empty format_spec. Even though '^' is just an alignment operator, it is still non-empty, so the code path for the explicit conversion above is bypassed. In every case that is not one of these two exceptions, True is just a 1, and is formatted as an integer. - Springbok
So it would be correct to say that the printed 'True' is the exception, not the printed '1'? It really is a corner case, by the way, because I can't think of any other type that would behave like members of class bool in combination with format(). - Costly
Correct, it is very much the exception. For all other objects without an explicit C-side __format__ implementation, the object is converted into a string and then __format__ is called on that string. You can see that happening here. For funsies, try format(True, "c") and you'll truly see that bool is an int to the bone! Except, of course, in the two cases I pointed out. - Springbok
To clarify: since bool is a subclass of int but bool doesn't specify its own __format__ method, the one from int is used through regular inheritance. In the case of the special handling of an empty format string, str() is called on the object, which is implemented for bool. - Retrogressive
So, in order to have both a non-empty format string (such as for alignment) and the boolean printed as True/False, we should first call str(boolean) and pass the result (as a string) to the format function? - Maunsell
@Maunsell exactly! Keep in mind the formatting rules will treat it as if it were a string for the entire duration of the function call, from beginning to end. - Springbok
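
For the truth table that started the question, the str()-first workaround from the last two comments could look roughly like this. It is a sketch under assumptions: truth_table_rows, its parameters, and the "|" separator are made up for illustration and do not come from the thread.

```python
from itertools import product


def truth_table_rows(func, arity, width=7):
    """Return centred rows for every input combination (hypothetical helper)."""
    spec = "^{}".format(width)
    rows = []
    for values in product([False, True], repeat=arity):
        cells = list(values) + [func(*values)]
        # str() first, so the spec is handled by str.__format__
        # rather than by the int.__format__ that bool inherits.
        rows.append("|".join(format(str(cell), spec) for cell in cells))
    return rows


for row in truth_table_rows(lambda a, b: a and b, 2):
    print(row)
```

The "!s" conversion flag does the same thing inside a replacement field: "{!s:^7}".format(True) centres the text True instead of the digit 1.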
