Why must Python list addition be homogenous?

Can anyone familiar with Python's internals (CPython, or other implementations) explain why list addition is required to be homogeneous:

In [1]: x = [1]

In [2]: x+"foo"
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
C:\Users\Marcin\<ipython-input-2-94cd84126ddc> in <module>()
----> 1 x+"foo"

TypeError: can only concatenate list (not "str") to list

In [3]: x+="foo"

In [4]: x
Out[4]: [1, 'f', 'o', 'o']

Why shouldn't the x+"foo" above return the same value as the final value of x in the above transcript?

This question follows on from NPE's question here: Is the behaviour of Python's list += iterable documented anywhere?

Update: I know it is not required that heterogeneous += work (but it does) and likewise, it is not required that heterogeneous + be an error. This question is about why that latter choice was made.

It is too much to say that the results of adding a sequence to a list are uncertain. If that were a sufficient objection, it would make sense to prevent heterogeneous += as well. Update2: In particular, Python always delegates operator calls to the left-hand operand, so no issue of "what is the right thing to do" arises: the left-hand object always governs (unless it delegates to the right).
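As an illustration of the delegation described in Update2, here is a minimal sketch. The classes `Left` and `Right` are hypothetical names invented for this example; the point is only that `__add__` on the left operand is tried first, and the right operand's `__radd__` is consulted when the left returns `NotImplemented`.

```python
class Left:
    """Hypothetical class whose __add__ only understands ints."""

    def __add__(self, other):
        if isinstance(other, int):
            return "Left.__add__ handled it"
        # Signal that this operand cannot handle the operation,
        # so Python goes on to try the right operand.
        return NotImplemented


class Right:
    """Hypothetical class that accepts being added on the right."""

    def __radd__(self, other):
        return "Right.__radd__ handled it"


handled_by_left = Left() + 1         # the left operand governs
handled_by_right = Left() + Right()  # the left operand delegated to the right

print(handled_by_left)
print(handled_by_right)
```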

Update3: For anyone arguing that this is a design decision, please explain (a) why it is not documented; or (b) where it is documented.

Update4: "what should [1] + (2, ) return?" It should return a value equal to the value of a variable x that initially holds [1], immediately after x+=(2, ). This result is well-defined.
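The result asked for in Update4 can be sketched in a few lines. The helper name `add_like_iadd` is made up for this illustration and is not part of Python; it simply copies the list and then applies +=, so the original is left untouched.

```python
def add_like_iadd(left, right):
    """Hypothetical helper: what the question expects from left + right."""
    result = list(left)  # copy, so the caller's list is not mutated
    result += right      # list += accepts any iterable
    return result


original = [1]
with_tuple = add_like_iadd(original, (2,))
with_string = add_like_iadd(original, "foo")

print(with_tuple)   # [1, 2]
print(with_string)  # [1, 'f', 'o', 'o']
print(original)     # [1]
```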

Sunglass answered 16/12, 2012 at 20:4 Comment(0)
S
9

These bug reports suggest that this design quirk was a mistake.

Issue12318:

Yes, this is the expected behavior and yes, it is inconsistent.

It's been that way for a long while and Guido said he wouldn't do it again (it's in his list of regrets). However, we're not going to break code by changing it (list.__iadd__ working like list.extend).

Issue575536:

The intent was that list.__iadd__ correspond exactly to list.extend(). There's no need to hypergeneralize list.__add__() too: it's a feature that people who don't want to get surprised by Martin-like examples can avoid them by using plain + for lists.

(Of course, there are those of us who find this behaviour quite surprising, including the developer who opened that bug report).

(Thanks to @Mouad for finding these).
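For reference, a short sketch of the behaviour the two bug reports describe: += on a list matches list.extend(), while plain + rejects a non-list operand. The variable names are only for illustration.

```python
via_iadd = [1]
via_iadd += "foo"         # behaves like extend: consumes any iterable

via_extend = [1]
via_extend.extend("foo")  # same effect as the += above

try:
    [1] + "foo"           # plain + insists on another list
    plus_error = None
except TypeError as exc:
    plus_error = exc

print(via_iadd == via_extend)  # True
print(plus_error)              # the TypeError message from list + str
```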

Sunglass answered 17/12, 2012 at 13:50 Comment(2)
I think my answer was deleted because someone didn't think it was an answer! But I don't mind, because we all learned from your question. Thanks for it :) and thanks for extracting the good part from my answer. I also want to add what Raymond said, quoting GvR: he wouldn't do it again (it's in his list of regrets). +1. BTW, I think you should accept your answer.Fullscale
@Fullscale Yes. I'd like to know why your answer was deleted. It was deleted by ThiefMaster with no comment.Sunglass
D
11

From the Zen of Python:

In the face of ambiguity, refuse the temptation to guess.

Let's look at what happens here:

x + y

This gives us a value, but of what type? When we add things in real life, we expect the type to be the same as the input types, but what if they are disparate? Well, in the real world, we refuse to add 1 and "a" because it doesn't make sense.

What if we have similar types? In the real world, we look at context. The computer can't do this, so it has to guess. Python picks the left operand and lets that decide. Your issue occurs because of this lack of context.

Say a programmer wants to do ["a"] + "bc" - this could mean they want "abc" or ["a", "b", "c"]. Currently, the solution is to either call "".join() on the first operand or list() on the second, which allows the programmer to do what they want and is clear and explicit.
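The two explicit spellings mentioned above might look like this (a minimal sketch; the variable names are made up for the example):

```python
items = ["a"]
suffix = "bc"

# Intent 1: build a single string.
as_string = "".join(items) + suffix

# Intent 2: build a list of characters.
as_list = items + list(suffix)

print(as_string)  # abc
print(as_list)    # ['a', 'b', 'c']
```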

Your suggestion is for Python to guess (by having a built-in rule to pick a given operand), so the programmer can do the same thing by doing the addition - why is that better? It just means it's easier to get the wrong type by mistake, and we have to remember an arbitrary rule (left operand picks type). Instead, we get an error so we can give Python the information it needs to make the right call.

So why is += different? Well, that's because we are giving Python that context. With the in-place operation we are telling Python to modify a value, so we know that the result should be of the same type as the value we are modifying. This is the context Python needs to make the right call, so it doesn't need to guess.

When I talk about guessing, I'm talking about Python guessing the programmer's intent. This is something Python does a lot - see division in 3.x. / does float division, correcting the error of it being integer division in 2.x.

This is because we are implicitly asking for float division when we try to divide. Python takes this into account and its operations are done according to that. Likewise, it's about guessing intent here. When we add with + our intent is unclear. When we use +=, it is very clear.
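For readers unfamiliar with the division change referred to above, a quick sketch of Python 3 behaviour:

```python
true_quotient = 7 / 2    # / is true division in Python 3
floor_quotient = 7 // 2  # // is floor division
even_case = 6 / 2        # still a float, even when the result is whole

print(true_quotient)   # 3.5
print(floor_quotient)  # 3
print(even_case)       # 3.0
```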

Dyslalia answered 16/12, 2012 at 20:53 Comment(12)
"In the face of ambiguity, refuse the temptation to guess." This answer does not explain (a) why any guesswork is involved; or (b) why there is no guesswork in a heterogenous +=. The result in the latter is well-defined exactly because python always delegates operator calls to the left operand. "because we are giving Python that context." Python always has that context because of how operator calls are evaluated.Sunglass
But += does "guess", I would argue. At the very least, the difference between a = a + b and a += b is confusing. Note that for a = 1 and b = 0.5, a += b gives a == 1.5.Cheadle
@delnan Yes. I'm asking why there is a difference, because the two choices appear inconsistent. I note that list addition is never commutative.Sunglass
I've added to the end. It's all about programmer intent. Not surprising the programmer is a tenet of Python, and these choices are how that is done.Dyslalia
@Sunglass In case you meant that as rebuttal to my comment, I wasn't addressing you, I was addressing the answer.Cheadle
As to the argument about the difference in / in 3 vs 2, note that this is the exact opposite choice of the one you are arguing for: 3 has become more type-permissive.Sunglass
@delnan Cool. Not as a rebuttal, but to say that I didn't understand the import.Sunglass
Yes. Because that is the expectation. People come into programming with biases from what they know outside of programming. Python conforms to those, to make the language easier to use. Knowing an arbitrary rule is not natural, but knowing that division works the way it does in maths is.Dyslalia
@Lattyware To me it makes the most sense that x += y should have the same result as x = x + y. I would be quite surprised (and I am in this case) if they were different. In this case I would expect them both to not work.Choctaw
@Sunglass That is an argument why x += y where x is a list and y isn't shouldn't work, and I would say maybe that is right, but it's not an argument to make x = x + y in the context work, which would definitely make things less obvious.Dyslalia
@delnan Yes, but that's a learned rule. It's less obvious to newcomers. Python chooses to aim towards normal for the average person, not normal for a programmer.Dyslalia
Whoops, wrong reply there, two comments up was intended to be @Choctaw - not Marcin, apologies.Dyslalia
O
1

I believe Python designers made addition this way so that '+' operator stays consistently commutative with regard to result type: type(a + b) == type(b + a)

Everybody expects that 1 + 2 has the same result as 2 + 1. Would you expect [1] + 'foo' to be the same as 'foo' + [1]? If yes, what should be the result?

You have 3 choices: pick the left operand's type as the result type, pick the right operand's type as the result type, or raise an error.

+= is not commutative because it contains assignment. In this case you either pick the left operand's type as the result type or raise an error. The surprise here is that a += b is not the same as a = a + b. In English, a += b does not mean "add b to a and assign the result to a"; it means "add b to a in place". That's why it cannot modify immutables such as str or tuple in place: for those, += falls back to + and rebinds the name to a new object.
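The difference between a += b and a = a + b shows up once a second name refers to the same object. A small sketch (all names are illustrative); note that += on a tuple still runs, but it rebinds the name instead of mutating:

```python
# List: += mutates in place, so an alias sees the change.
a = [1]
alias_a = a
a += [2]

# List: a = a + b builds a new list, so the alias is left behind.
b = [1]
alias_b = b
b = b + [2]

# Tuple: no in-place add exists, so += falls back to + and rebinds t.
t = (1,)
alias_t = t
t += (2,)

print(alias_a, alias_a is a)  # [1, 2] True
print(alias_b, alias_b is b)  # [1] False
print(alias_t, t)             # (1,) (1, 2)
```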

Thanks for the comments. Edited the post.

Ozieozkum answered 17/12, 2012 at 3:30 Comment(9)
-1 List addition is not commutative: ([1] + [2]) != ([2]+[1]).Sunglass
@Sunglass While he's a bit fuzzy on that, the third paragraph clarifies that he's talking about the result's types, and that is correct. Not necessarily the answer to this question, but list addition is indeed commutative with respect to result types.Cheadle
@delnan Sure, but that is destroyed if any sequence type at all allows heterogenous addition.Sunglass
@Sunglass Do any sequence types allow heterogeneous?Cheadle
@delnan Sure, any sequence type that you want to create that does that. pastebin.com/N16EK4BMSunglass
@Sunglass That wasn't the question. Badly-written user code can wreak havoc in countless ways, this is normal. We're talking about the design decisions for the standard library types. And it seems indeed consistent in this regard: The designers valued this property, so they designed their types such that they uphold it.Cheadle
@delnan Well, if that code is badly-written, you're assuming that there is a problem with heterogeneous addition. What that problem is, if anything, is the question at hand.Sunglass
sigh Badly-written under the assumption one wants to avoid it (and thereby be consistent with the stdlib). Replace "badly-written" by "heterogenous-addition allowing" if that suits your ilk.Cheadle
@delnan Your unexpressed assumption is still that the stdlib is working correctly. As this behaviour is not documented, it would appear to be open to other implementers to provide different behaviour.Sunglass
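To make the thread above concrete, here is one way a user-defined sequence type could permit heterogeneous addition. This is a hypothetical sketch written for illustration, not the contents of the pastebin link; the class name `PermissiveList` is invented.

```python
class PermissiveList(list):
    """Hypothetical list subclass whose + accepts any iterable."""

    def __add__(self, other):
        result = PermissiveList(self)  # copy the left operand
        result.extend(other)           # same rule that list += already uses
        return result


combined = PermissiveList([1]) + "foo"

print(combined)                  # [1, 'f', 'o', 'o']
print(type(combined).__name__)   # PermissiveList
```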
V
0

My guess is that it's because Python is strongly typed, and there's no clear indication of the right thing to do here. Are you asking Python to append the string itself, or to cast the string to a list (which is what you indicated you'd like it to do)?

Remember: explicit is better than implicit. In the most common case, neither of those guesses is correct and you're accidentally trying to do something you didn't intend. Raising a TypeError and letting you sort it out is the safest, most Pythonic thing to do here.

Vulcanology answered 16/12, 2012 at 20:24 Comment(10)
"Are you asking Python to append the string itself" What does this even mean? The operation is sufficiently well defined that += works. There is no reason of simple logic or expectations why the two operations should not have the same (as in value-equal) result.Sunglass
"Raising a TypeError and letting you sort it out is the safest, most Pythonic thing to do here." Rinse and repeat until you're programming in C++.Sunglass
@Sunglass Slippery slope is a logical fallacy. Reverse your example and continue making Python more loosely typed and we are all writing JavaScript.Dyslalia
@Lattyware (a) Slippery slope is not a logical fallacy (b) I am not making a slippery slope argument - I am making the argument that the proposed rationale could be applied to any language feature to remove the usual operation of duck typing, but this has not in fact been done, so if this is the rationale, it is an anomaly which requires explanation.Sunglass
@Sunglass An obvious result of list += string is list = list + [string], or list.append(string). I don't see that as any less "obviously" correct than list += list(string). You and I have different opinions of the correct outcome, so I think a reasonable answer is "none of the above".Vulcanology
But list.__iadd__ does, in fact, go for one option and disregards the other (it acts as list.extend, appending the items of the iterable). According to your argument, it wouldn't.Cheadle
@KirkStrauser If that is your objection, then heterogenous += should be disallowed. This is not the choice that the CPython implementors have made.Sunglass
@KirkStrauser list+=string is actually list.extend(string).Foothold
@AshwiniChaudhary Fair enough. It still goes against strong typing as it's implicitly casting one type (string) to another not-blatantly-similar type (list). Adding a float and int has a fairly obvious (and very well defined in the language spec) result, but I personally don't want Python guessing about how to cast other random types around. If I'd wanted PHP, Perl, or REXX, I would've stuck with them.Vulcanology
@KirkStrauser Neither list.__iadd__ nor list.extend cast anything. They simply work on any iterable, and string happens to be an iterable of one-character strings.Cheadle
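The last comment's point, that list.extend works on any iterable rather than casting, can be checked with a short sketch (variable names are illustrative):

```python
target = [1]
target.extend("foo")                    # a str yields one-character strings
target.extend(n * n for n in range(3))  # a generator works just as well

try:
    target.extend(5)                    # an int is not iterable
    extend_error = None
except TypeError as exc:
    extend_error = exc

print(target)        # [1, 'f', 'o', 'o', 0, 1, 4]
print(extend_error)  # the TypeError raised for the non-iterable int
```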
