Is there an explanation for inline operators in "k += c += k += c;"?

7

89

What is the explanation for the result from the following operation?

k += c += k += c;

I was trying to understand the output result from the following code:

int k = 10;
int c = 30;
k += c += k += c;
//k=80 instead of 110
//c=70

and I am currently struggling to understand why the result for "k" is 80. Why does the assignment k=40 have no effect (Visual Studio actually tells me that this value is not used anywhere)?

Why is k 80 and not 110?

If I split the operation to:

k+=c;
c+=k;
k+=c;

the result is k=110.

I tried looking through the CIL, but I am not very proficient at interpreting generated CIL and cannot work out a few details:

 // [11 13 - 11 24]
IL_0001: ldc.i4.s     10
IL_0003: stloc.0      // k

// [12 13 - 12 24]
IL_0004: ldc.i4.s     30
IL_0006: stloc.1      // c

// [13 13 - 13 30]
IL_0007: ldloc.0      // k expect to be 10
IL_0008: ldloc.1      // c
IL_0009: ldloc.0      // k why do we need the second load?
IL_000a: ldloc.1      // c
IL_000b: add          // I expect it to be 40
IL_000c: dup          // What for?
IL_000d: stloc.0      // k - expected to be 40
IL_000e: add
IL_000f: dup          // I presume the "magic" happens here
IL_0010: stloc.1      // c = 70
IL_0011: add
IL_0012: stloc.0      // k = 80??????
Colporteur answered 13/2, 2019 at 16:14 Comment(6)
Bozen: You got a different result because you split the statement. k += c += k += c gives 80 because the values of k and c stay the same in all the sums, so k += c += k += c is equal to 10 + 30 + 10 + 30.
Junior: Interesting exercise, but, in practice, never write code chaining like that unless you want your coworkers to hate you. :)
Colporteur: @JoãoPauloAmorim why is c 70 then?
Bozen: @AndriiKotliarov because k += c += k += c is 10 + 30 + 10 + 30, so k receives all the values, and c gets only the last 3 operands: 30 + 10 + 30 = 70.
Lynnell: Also worth reading - Eric Lippert's answer to What is the difference between i++ and ++i?
Hospitalize: "Doctor, doctor, it hurts when I do this!" "So don't DO that."
106

An operation like a op= b; is equivalent to a = a op b; (except that a is evaluated only once). An assignment can be used as a statement or as an expression; as an expression, it yields the assigned value. Your statement ...

k += c += k += c;

... can, since the assignment operator is right-associative, also be written as

k += (c += (k += c));

or (expanded)

k =  k +  (c = c +  (k = k  + c));
     10    →   30    →   10 → 30   // operand evaluation order is from left to right
      |         |        ↓    ↓
      |         ↓   40 ← 10 + 30   // operator evaluation
      ↓   70 ← 30 + 40
80 ← 10 + 70

Throughout the whole evaluation, the old values of the variables involved are used, because every operand is read before any assignment takes place. This matters especially for k (see my review of the IL below and the link Wai Ha Lee provided). Therefore, you are not getting 70 + 40 (new value of k) = 110, but 70 + 10 (old value of k) = 80.

The point is that (according to the C# spec) "Operands in an expression are evaluated from left to right" (the operands are the variables c and k in our case). This is independent of the operator precedence and associativity which in this case dictate an execution order from right to left. (See comments to Eric Lippert's answer on this page).
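To make the difference between operand evaluation order and operator precedence concrete, here is a minimal sketch. The class OperandOrderDemo and the helper Trace are hypothetical names introduced only for this illustration; they are not part of the original question.

```csharp
using System;
using System.Collections.Generic;

public static class OperandOrderDemo
{
    // Records the order in which the operands are evaluated.
    public static readonly List<string> Log = new List<string>();

    // Hypothetical helper: logs its name, then returns the value unchanged.
    public static int Trace(string name, int value)
    {
        Log.Add(name);
        return value;
    }

    public static int Evaluate()
    {
        Log.Clear();
        // Precedence makes * run before +, yet operand A is still evaluated first.
        return Trace("A", 1) + Trace("B", 2) * Trace("C", 3);
    }

    public static void Main()
    {
        int result = Evaluate();
        Console.WriteLine(string.Join(" ", Log)); // A B C
        Console.WriteLine(result);                // 7
    }
}
```

The multiplication is performed before the addition, but the log still shows A, B, C: precedence only decides how the operands are grouped, not when they are read.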


Now let's look at the IL. IL assumes a stack-based virtual machine, i.e. it does not use registers.

IL_0007: ldloc.0      // k (is 10)
IL_0008: ldloc.1      // c (is 30)
IL_0009: ldloc.0      // k (is 10)
IL_000a: ldloc.1      // c (is 30)

The stack now looks like this (from left to right; top of stack is right)

10 30 10 30

IL_000b: add          // pops the 2 top (right) positions, adds them and pushes the sum back

10 30 40

IL_000c: dup

10 30 40 40

IL_000d: stloc.0      // k <-- 40

10 30 40

IL_000e: add

10 70

IL_000f: dup

10 70 70

IL_0010: stloc.1      // c <-- 70

10 70

IL_0011: add

80

IL_0012: stloc.0      // k <-- 80

Note that IL_000c: dup and IL_000d: stloc.0, i.e. the first assignment to k, could be optimized away. The jitter probably does this for local variables when converting IL to machine code.

Note also that all the values required by the calculation are either pushed to the stack before any assignment is made or are calculated from these values. Assigned values (by stloc) are never re-used during this evaluation. stloc pops the top of the stack.
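The stack walkthrough above can be replayed with an ordinary Stack<int>. This is only a sketch that mimics the listed IL instructions one by one; the class name IlStackSketch is made up for the illustration, and it is not how the runtime actually executes IL.

```csharp
using System;
using System.Collections.Generic;

public static class IlStackSketch
{
    // Replays the IL for "k += c += k += c;" on a simulated evaluation stack.
    public static (int k, int c) Run(int k, int c)
    {
        var stack = new Stack<int>();
        stack.Push(k);                          // ldloc.0  (old k)
        stack.Push(c);                          // ldloc.1  (old c)
        stack.Push(k);                          // ldloc.0  (old k)
        stack.Push(c);                          // ldloc.1  (old c)
        stack.Push(stack.Pop() + stack.Pop());  // add
        stack.Push(stack.Peek());               // dup
        k = stack.Pop();                        // stloc.0  (first assignment to k)
        stack.Push(stack.Pop() + stack.Pop());  // add
        stack.Push(stack.Peek());               // dup
        c = stack.Pop();                        // stloc.1
        stack.Push(stack.Pop() + stack.Pop());  // add
        k = stack.Pop();                        // stloc.0  (final assignment to k)
        return (k, c);
    }

    public static void Main()
    {
        var result = Run(10, 30);
        Console.WriteLine($"k={result.k}, c={result.c}"); // k=80, c=70
    }
}
```

The first value stored in k is never pushed again, which is why it does not influence the final sum.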


The output of the following console test is (Release mode with optimizations on)

evaluating k (10)
evaluating c (30)
evaluating k (10)
evaluating c (30)
40 assigned to k
70 assigned to c
80 assigned to k

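using System;

// Hypothetical wrapper class, added only so that the snippet compiles on its own.
public static class ChainTest
{
    // Backing field for k; the property logs every read and every write.
    private static int _k = 10;
    public static int k
    {
        get { Console.WriteLine($"evaluating k ({_k})"); return _k; }
        set { Console.WriteLine($"{value} assigned to k"); _k = value; }
    }

    // Backing field for c; the property logs every read and every write.
    private static int _c = 30;
    public static int c
    {
        get { Console.WriteLine($"evaluating c ({_c})"); return _c; }
        set { Console.WriteLine($"{value} assigned to c"); _c = value; }
    }

    public static void Test()
    {
        k += c += k += c;
    }
}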
Photophobia answered 13/2, 2019 at 16:21 Comment(3)
Pandorapandour: You could add the final result with the numbers in the formula to make it even more complete: the final k = 10 + (30 + (10 + 30)) = 80, and c's final value is set in the first parenthesis: c = 30 + (10 + 30) = 70.
Environmentalist: Indeed if k is a local then the dead store is almost certainly removed if optimizations are on, and preserved if they are not. An interesting question is whether the jitter is permitted to elide the dead store if k is a field, property, array slot, and so on; in practice I believe it does not.
Photophobia: A console test in Release mode indeed shows that k is assigned twice if it is a property.
26

First off, Henk and Olivier's answers are correct; I want to explain it in a slightly different way. Specifically, I want to address this point you made. You have this set of statements:

int k = 10;
int c = 30;
k += c += k += c;

And you then incorrectly conclude that this should give the same result as this set of statements:

int k = 10;
int c = 30;
k += c;
c += k;
k += c;

It is informative to see how you got that wrong, and how to do it right. The right way to break it down is like this.

First, rewrite the outermost +=

k = k + (c += k += c);

Second, rewrite the outermost +. I hope you agree that x = y + z must always be the same as "evaluate y to a temporary, evaluate z to a temporary, sum the temporaries, assign the sum to x". So let's make that very explicit:

int t1 = k;
int t2 = (c += k += c);
k = t1 + t2;

Make sure that is clear, because this is the step you got wrong. When breaking down complex operations into simpler operations, you must make sure that you do so slowly and carefully and do not skip steps. Skipping steps is where we make mistakes.

OK, now break down the assignment to t2, again, slowly and carefully.

int t1 = k;
int t2 = (c = c + (k += c));
k = t1 + t2;

The assignment will assign the same value to t2 as is assigned to c, so let's say that:

int t1 = k;
int t2 = c + (k += c);
c = t2;
k = t1 + t2;

Great. Now break down the second line:

int t1 = k;
int t3 = c;
int t4 = (k += c);
int t2 = t3 + t4;
c = t2;
k = t1 + t2;

Great, we are making progress. Break down the assignment to t4:

int t1 = k;
int t3 = c;
int t4 = (k = k + c);
int t2 = t3 + t4;
c = t2;
k = t1 + t2;

Now break down the third line:

int t1 = k;
int t3 = c;
int t4 = k + c;
k = t4;
int t2 = t3 + t4;
c = t2;
k = t1 + t2;

And now we can look at the whole thing:

int k = 10;  // 10
int c = 30;  // 30
int t1 = k;  // 10
int t3 = c;  // 30
int t4 = k + c; // 40
k = t4;         // 40
int t2 = t3 + t4; // 70
c = t2;           // 70
k = t1 + t2;      // 80

So when we are done, k is 80 and c is 70.
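As a sanity check on the breakdown, the following sketch runs the chained statement, the step-by-step lowering, and the three split statements from the question side by side. The class and method names (LoweringCheck, Chained, Lowered, SplitStatements) are invented for this illustration.

```csharp
using System;

public static class LoweringCheck
{
    // The original chained statement.
    public static (int k, int c) Chained(int k, int c)
    {
        k += c += k += c;
        return (k, c);
    }

    // The lowering derived step by step above.
    public static (int k, int c) Lowered(int k, int c)
    {
        int t1 = k;
        int t3 = c;
        int t4 = k + c;
        k = t4;
        int t2 = t3 + t4;
        c = t2;
        k = t1 + t2;
        return (k, c);
    }

    // The incorrect breakdown from the question: three separate statements.
    public static (int k, int c) SplitStatements(int k, int c)
    {
        k += c;
        c += k;
        k += c;
        return (k, c);
    }

    public static void Main()
    {
        var a = Chained(10, 30);
        var b = Lowered(10, 30);
        var s = SplitStatements(10, 30);
        Console.WriteLine($"chained: k={a.k}, c={a.c}"); // chained: k=80, c=70
        Console.WriteLine($"lowered: k={b.k}, c={b.c}"); // lowered: k=80, c=70
        Console.WriteLine($"split:   k={s.k}, c={s.c}"); // split:   k=110, c=70
    }
}
```

The chained form and the lowering agree, while the split form rereads k after it has been updated and therefore ends at 110.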

Now let's look at how this is implemented in the IL:

int t1 = k;
int t3 = c;  
  is implemented as
ldloc.0      // stack slot 1 is t1
ldloc.1      // stack slot 2 is t3

Now this is a bit tricky:

int t4 = k + c; 
k = t4;         
  is implemented as
ldloc.0      // load k
ldloc.1      // load c
add          // sum them to stack slot 3
dup          // t4 is stack slot 3, and is now equal to the sum
stloc.0      // k is now also equal to the sum

We could have implemented the above as

ldloc.0      // load k
ldloc.1      // load c
add          // sum them
stloc.0      // k is now equal to the sum
ldloc.0      // t4 is now equal to k

but we use the "dup" trick because it makes the code shorter and makes it easier on the jitter, and we get the same result. In general, the C# code generator tries to keep temporaries "ephemeral" on the stack as much as possible. If you find it easier to follow the IL with fewer ephemerals, turn optimizations off, and the code generator will be less aggressive.

We now have to do the same trick to get c:

int t2 = t3 + t4; // 70
c = t2;           // 70
  is implemented as:
add          // t3 and t4 are the top of the stack.
dup          
stloc.1      // again, we do the dup trick to get the sum in 
             // both c and t2, which is stack slot 2.

and finally:

k = t1 + t2;
  is implemented as
add          // stack slots 1 and 2 are t1 and t2.
stloc.0      // Store the sum to k.

Since we do not need the sum for anything else, we do not dup it. The stack is now empty, and we're at the end of the statement.

The moral of the story is: when you are trying to understand a complicated program, always break down operations one at a time. Don't take short cuts; they will lead you astray.

Environmentalist answered 13/2, 2019 at 18:43 Comment(14)
Photophobia: From the C# spec for Simple assignment I cannot tell that the old value of k would be used. Can you point me to the right spot in the spec?
Environmentalist: @OlivierJacot-Descombes: I'm not sure why you're looking at simple assignment; the question is about compound assignment. Can you clarify the question?
Environmentalist: @OlivierJacot-Descombes: There is nowhere in the spec that says that the old value is used. What the spec says is that x += y is the same as x = x + y except that any side effects of computing x only happen once. The spec for x + y says that this is evaluated as "evaluate x, evaluate y, evaluate their sum". Does that answer your question?
Photophobia: How does this tell me that in x + (x = y) the first x is the old one and not the newly assigned one (i.e. y)?
Environmentalist: @OlivierJacot-Descombes: Because the algorithm given is "evaluate the left, evaluate the right, sum the evaluations". Side effects such as assignment are a result of evaluation.
Environmentalist: @OlivierJacot-Descombes: The relevant line of the spec is in the section "Operators" and says "Operands in an expression are evaluated from left to right. For example, in F(i) + G(i++) * H(i), method F is called using the old value of i, then method G is called with the old value of i, and, finally, method H is called with the new value of i. This is separate from and unrelated to operator precedence." (Emphasis added.) So I guess I was wrong when I said there is nowhere that "the old value is used" occurs! It occurs in an example. But the normative bit is "left to right".
Environmentalist: @OlivierJacot-Descombes: The spec could go on to point out in that example that (1) the new value is assigned to i before the call to G, (2) the "finally" is a bit premature, since the call to H happens before the multiplication and addition, but I think the authors did not wish to labour the point.
Photophobia: This was the missing link. The quintessence is that we must differentiate between operand evaluation order and operator precedence. Operand evaluation goes from left to right and in the OP’s case the operator execution from right to left.
Environmentalist: @OlivierJacot-Descombes: That's exactly right. Precedence and associativity have nothing whatsoever to do with the order in which subexpressions are evaluated, other than the fact that precedence and associativity determine where the subexpression boundaries are. Subexpressions are evaluated left-to-right.
Fennec: @EricLippert would the results change if K was a reference type instead with an overloaded += operator?
Environmentalist: @johnny5: I invite you to try that experiment and report back. Your first problem will be figuring out how to overload the += operator in C#. Give it a shot and let us know how it goes.
Fennec: Ooops looks like you can't overload assignment operators :/
Environmentalist: @johnny5: That's correct. But you can overload +, and then you will get += for free because x += y is defined as x = x + y except x is evaluated only once. That's true regardless of whether the + is built in or user-defined. So: try overloading + on a reference type and see what happens.
Environmentalist: @johnny5: Alternatively, several reference types already implement + and therefore also +=. Try running some programs on those reference types, and see if you get the same or different semantics than the lowering I described.
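One way to run the experiment suggested in the last comment is to use string, a reference type that already supports +. This is only a sketch; the class name StringChainDemo and the sample values "k" and "c" are assumptions chosen for the illustration.

```csharp
using System;

public static class StringChainDemo
{
    // Applies the same chained compound assignment to strings.
    public static (string k, string c) Run()
    {
        string k = "k";
        string c = "c";
        // Lowers to: k = "k" + (c = "c" + (k = "k" + "c"));
        k += c += k += c;
        return (k, c);
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine(result.k); // kckc
        Console.WriteLine(result.c); // ckc
    }
}
```

The concatenation order makes the evaluation visible: the outermost k contributes its old value "k", just as the old value 10 is used in the integer case.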
14

It boils down to: is the very first += applied to the original k or to the value that was computed further to the right?

The answer is that although assignments bind from right to left, operands are still evaluated from left to right.

So the leftmost += is executing 10 += 70.

Viviyan answered 13/2, 2019 at 16:31 Comment(2)
Tuchman: This puts it nicely in a nutshell.
Photophobia: It's actually the operands which are evaluated from left to right.
0

I tried the example with gcc and pgcc (C compilers, so this is not C#) and got 110. I checked the IR they generated, and the compilers did expand the expression to:

k = 10;
c = 30;
k = c+k;
c = c+k;
k = c+k;

which looks reasonable to me.

Tripalmitin answered 19/2, 2019 at 19:43 Comment(0)
-1

You can solve this by counting.

a = k += c += k += c

There are two cs and two ks so

a = 2c + 2k

And, as a consequence of the operators of the language, k also equals 2c + 2k

This will work for any combination of variables in this style of chain:

a = r += r += r += m += n += m

So

a = 2m + n + 3r

And r will equal the same.

You can work out the values of the other numbers by only calculating up to their leftmost assignment. So m equals 2m + n and n equals n + m.

This demonstrates that k += c += k += c; is different to k += c; c += k; k += c; and hence why you get different answers.

Some folks in the comments seem to be worried that you might try to overgeneralise from this shortcut to all possible types of addition. So, I'll make it clear that this shortcut is only applicable to this situation, i.e. chaining together addition assignments for the built-in number types. It doesn't (necessarily) work if you add other operators, e.g. () or +, or if you call functions, or if you've overloaded +, or if you're using something other than the basic number types. It's only meant to help with the particular situation in the question.
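A small check of the counting shortcut and of its limits might look like the sketch below. The names CountingShortcutCheck, Chain, LeftFirst and RightFirst are hypothetical, and the sample values r = 1, m = 2, n = 3 are arbitrary.

```csharp
using System;

public static class CountingShortcutCheck
{
    // Pure chain of += on ints: the shortcut predicts a = 2m + n + 3r.
    public static int Chain(int r, int m, int n)
    {
        int a = r += r += r += m += n += m;
        return a;
    }

    // Once parentheses and + are mixed in, counting occurrences no longer works.
    public static int LeftFirst(int x) => (x += x) + x;   // the right-hand x is read after the update
    public static int RightFirst(int x) => x + (x += x);  // the left-hand x is read before the update

    public static void Main()
    {
        Console.WriteLine(Chain(1, 2, 3));   // 10, which equals 2*2 + 3 + 3*1
        Console.WriteLine(LeftFirst(1));     // 4, not 3 * x
        Console.WriteLine(RightFirst(1));    // 3
    }
}
```

The first line agrees with the counting rule, while the other two show why the rule cannot be extended beyond plain chains of += on built-in number types.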

Georgiegeorgina answered 14/2, 2019 at 8:59 Comment(11)
Fennec: This does not answer the question
Georgiegeorgina: @johnny5 it explains why you get the result you get, i.e. because that's how maths works.
Fennec: Math and the order of operations in which a compiler evaluates a statement are two different things. Under your logic k+=c; c+= k; k+=c should evaluate to the same result.
Georgiegeorgina: No, johnny 5, that's not what it means. Mathematically they are different things. The three separate operations evaluate to 3c + 2k.
Fennec: Thanks for the clarification, plus one.
Environmentalist: Unfortunately your "algebraic" solution is only coincidentally correct. Your technique does not work in general. Consider x = 1; and y = (x += x) + x; Is it your contention that "there are three x's and so y is equal to 3 * x"? Because y is equal to 4 in this case. Now what about y = x + (x += x); is it your contention that the algebraic law "a + b = b + a" is fulfilled and this is also 4? Because this is 3. Unfortunately, C# does not follow the rules of high school algebra if there are side effects in the expressions. C# follows the rules of a side effecting algebra.
Georgiegeorgina: @ericlippert thanks for your reply. I was not trying to make it work in general, only for the posed question.
Environmentalist: I note that you correctly point out that "the brackets are evaluated first" but it is important to realize that the brackets are not evaluated first in general. A less misleading way to say that would be "the parenthesized expression on the left of the operator is evaluated before the expression on the right because of the general rule in C# that things to the left are evaluated before things to the right".
Environmentalist: We can illustrate the point easily. Suppose we have int F() { Console.Write("F"); return 1; } and similarly for G and H. If we say x = F() * (G() + H()); the brackets are not done first. We get the output FGH because the rule is things to the left are done before things to the right. The parentheses determine the order in which the operators run: the + runs before the *. But they do not determine the order in which the operands run; the operands run left to right, regardless. Make sure this is clear in your mind.
Environmentalist: That is, the order here is as though you'd written f = F(); g = G(); h = H(); gh = g + h; x = f * gh; Again do not try to apply the rules of algebra to C#. C# can be analyzed algebraically, but you need to use an algebra that can manipulate side effecting operations.
Georgiegeorgina: @ericlippert thanks for the explanation. I will update my post accordingly. Again, I'm not trying to apply the rules of algebra to C#. I am only trying to answer the question, which only mentioned the CIL as a way to understand what is happening, but could easily be ignored for their specific example.
-1

For this kind of chained assignment, you have to assign the values starting from the rightmost side. You calculate a value, assign it to the variable on its left, and continue this way up to the final (leftmost) assignment. That is how k ends up as 80.

Dupondius answered 15/2, 2019 at 0:49 Comment(1)
Environmentalist: Please do not post answers that simply re-state what numerous other answers already state.
-1

Simple answer: Replace the variables with their values and you've got it:

int k = 10;
int c = 30;
k += c += k += c;
10 += 30 += 10 += 30
= 10 + 30 + 10 + 30
= 80 !!!
Koph answered 16/2, 2019 at 9:2 Comment(1)
Environmentalist: This answer is wrong. Though this technique works in this specific case, that algorithm does not work in general. For example, k = 10; m = (k += k) + k; does not mean m = (10 + 10) + 10. Languages with mutating expressions cannot be analyzed as though they have eager value substitution. Value substitution happens in a particular order with respect to the mutations and you must take that into account.
