Chained compound assignments with C++17 sequencing are still undefined behaviour?

Asked 4/10, 2020 at 13:54 Answered 4/10, 2020 at 19:3

Solved c++c++17 language-lawyer sequence-points

Originally, I presented a more complicated example, this one was proposed by @n. 'pronouns' m. in a now-deleted answer. But the question became too long, see edit history if you are interested.

Has the following program well-defined behaviour in C++17?

int main()
{
    int a=5;
    (a += 1) += a;
    return a;
}

I believe this expression is well-defined and evaluated like this:

The right side a is evaluated to 5.
There are no side-effects of the right side.
The left side is evaluated to a reference to a, a += 1 is well-defined for sure.
The left-side side-effect is executed, making a==6.
The assignment is evaluted, adding 5 to the current value of a, making it 11.

The relevant sections of the standard:

[intro.execution]/8:

An expression X is said to be sequenced before an expression Y if every value computation and every side effect associated with the expression X is sequenced before every value computation and every side effect associated with the expression Y.

[expr.ass]/1 (emphasis mine):

The assignment operator (=) and the compound assignment operators all group right-to-left. All require a modifiable lvalue as their left operand; their result is an lvalue referring to the left operand. The result in all cases is a bit-field if the left operand is a bit-field. In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression. The right operand is sequenced before the left operand. With respect to an indeterminately-sequenced function call, the operation of a compound assignment is a single evaluation.

The wording originally comes from the accepted paper P0145R3.

Now, I feel there is some ambiguity, even contradiction, in this second section.

The right operand is sequenced before the left operand.

Together with the definition of sequenced before strongly implies the ordering of side-effects, yet the previous sentence:

In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression

only explicitly sequences the assignment after value computation, not their side-effects. Thus allowing this behaviour:

The right side a is evaluated to 5.
The left side is evaluated to a reference of a, a += 1 is well-defined for sure.
The assignment is evaluted, adding 5 to the current value of a, making it 10.
The left-side side-effect is executed, making a==11 or maybe even 6 if the old values was used even for the side-effect.

But this ordering clearly violates the definition of sequenced before since the side-effects of the left operand happened after the value computation of the right operand. Thus left operand was not sequenced after the right operand which violets the above mentioned sentence. No I done goofed. This is allowed behaviour, right? I.e. the assignment can interleave the right-left evaluation. Or it can be done after both full evaluations.

Running the code gcc outputs 12, clang 11. Furthermore, gcc warns about

<source>: In function 'int main()':

<source>:4:8: warning: operation on 'a' may be undefined [-Wsequence-point]
    4 |     (a += 1) += a;
      |     ~~~^~~~~

I am terrible at reading assembly, maybe someone can at least rewrite how gcc got to 12? (a += 1), a+=a works but that seems extra wrong.

Well, thinking more about it, the right side also does evaluate to a reference to a, not just to a value 5. So Gcc could still be right, in that case clang could be wrong.

Swimmingly answered 4/10, 2020 at 13:54 Comment(18)

Reading assembly does not help to debug compile time calculations as compiler outputs the results of those. – Sinew 4/10, 2020 at 14:13

Why insist on using *= and += on the same line? Just break it on two lines. Even it it were sequenced correctly, it would still be unreadable to humans (or increase cognitive load if you want a milder statement) – Drambuie 4/10, 2020 at 14:54

@Drambuie I explicitly stated I do not care about readability of such code, I want to know why it does not work. How do you write fold expression in two lines? – Swimmingly 4/10, 2020 at 15:8

Rule #20 on the cppreference page doesn't matter, since when the constants 2 and Args are evaluated makes no difference. The other important one is rule #8: The side effect (modification of the left argument) of the built-in assignment operator and of all built-in compound assignment operators is sequenced after the value computation (but not the side effects) of both left and right arguments, and is sequenced before the value computation of the assignment expression (that is, before returning the reference to the modified object). – Fled 4/10, 2020 at 15:21

@Fled Yes, #20 is unnecessary in the first case, but I believe it is relevant in the right-left one. I did not noticed that rule. So, you are saying that res*=2 returning the reference as the value computation must happen before the ref+=... , yet remains unordered with respect when res*=2 is truly executed. Thus the increment can still happen before the multiplication? Seems to be in contradiction with #20. Nonetheless, can you make it an answer? – Swimmingly 4/10, 2020 at 15:41

Oh right, you need #20 for the order of operands of |=. I haven't written an answer because I'm still not sure what, if anything, is the issue here. – Fled 4/10, 2020 at 15:49

@Fled I edited the question with a followup, but I cannot say it made me any wiser :( Perhaps is (a += 1) += a; simply undefined, but I do not see how. – Swimmingly 4/10, 2020 at 16:9

I think that should be a separate question anyway. – Territus 4/10, 2020 at 16:14

@Territus I was thinking about it, but that will leave this unanswered and the title is already correct, it has no answers yet. Only the first presented example was unnecessarily complicated. – Swimmingly 4/10, 2020 at 16:15

Ok, but now the question is pretty complicated. If you're sure that the problem can be reduced to (a += 1) += a; then remove everything else. Otherwise, make a new question asking about that, and link to this question as the motivation if you want. (Note that the question was already asking for 2 things, an explanation, and a fix. This might be a good way to separate that). – Territus 4/10, 2020 at 16:20

@Territus I did trim it down a lot :). Yes, if this is undefined, my example was too. Otherwise I can ask a followup with the harder example. – Swimmingly 4/10, 2020 at 16:34

Yes, it's much clearer now. Yes, if this isn't UB, then posting the original question separately is a good idea. Also, I twiddled the tags a little. – Territus 4/10, 2020 at 16:35

@Territus Thank you for help so far :] – Swimmingly 4/10, 2020 at 16:41

It is to be noted that the behavior for user type is consistent both in clang and gcc and adheres to sequence before: simple case and fold expression. It doesn't close the gap for primitive types, but might be a solution to achieve consistent fold expressions. – Rountree 4/10, 2020 at 17:8

"The right side also does evaluate to a reference to a" - I think the lvalue-to-rvalue conversion required by [basic.lval]/6 for an operand should be considered a value computation of that operand. Perhaps both the glvalue identity determination and the read access of the lvalue-to-rvalue conversion are two value computations of the same expression? – Fled 4/10, 2020 at 17:38

The gcc warning could be a bug. Here's another. Clang is weird in that evaluation in static_assert is seemingly done in a different order and the results are different from evaluation outside of static_assert. This is acceptable if behaviour is undefined, however I believe that if clang considers the operations unsequenced, it should warn about it, as it is doing with -std=c++14 (where they are unsequenced). – Yurev 4/10, 2020 at 18:0

@n.'pronouns'm. UB and unsequenced are very much different. The former must be diagnosed in a constant expression. See static_assert(). – Ensheathe 4/10, 2020 at 20:5

@Ensheathe Unsequenced access and modification of the same variable is UB. – Yurev 5/10, 2020 at 9:23

In order to follow better what is actually performed, let's try to mimic the same with our own type and add some printouts:

class Number {
    int num = 0;
public:
    Number(int n): num(n) {}
    Number operator+=(int i) {
        std::cout << "+=(int) for *this = " << num
                  << " and int = " << i << std::endl;
        num += i;
        return *this;
    }
    Number& operator+=(Number n) {
        std::cout << "+=(Number) for *this = " << num
                  << " and Number = " << n << std::endl;
        num += n.num;
        return *this;
    }
    operator int() const {
        return num;
    }
};

Then when we run:

Number a {5};
(a += 1) += a;
std::cout << "result: " << a << std::endl;

We get different results with gcc and clang (and without any warning!).

gcc:

+=(int) for *this = 5 and int = 1
+=(Number) for *this = 6 and Number = 6
result: 12

clang:

+=(int) for *this = 5 and int = 1
+=(Number) for *this = 6 and Number = 5
result: 11

Which is the same result as for ints in the question. Even though it is not the same exact story: built-in assignment has its own sequencing rules, as opposed to overloaded operator which is a function call, still the similarity is interesting.

It seems that while gcc keeps the right side as a reference and turns it to a value on the call to +=, clang on the other hand turns the right side to a value first.

The next step would be to add a copy constructor to our Number class, to follow exactly when the reference is turned into a value. Doing that results with calling the copy constructor as the first operation, both by clang and gcc, and the result is the same for both: 11.

It seems that gcc delays the reference to value conversion (both in the built-in assignment as well as with user defined type without a user defined copy constructor). Is it coherent with C++17 defined sequencing? To me it seems as a gcc bug, at least for the built-in assignment as in the question, as it sounds that the conversion from reference to value is part of the "value computation" that shall be sequenced before the assignment.

As for a strange behavior of clang reported in previous version of the original post - returning different results in assert and when printing:

constexpr int foo() {
    int res = 0;
    (res = 5) |= (res *= 2);
    return res;
}

int main() {
    std::cout << foo() << std::endl; // prints 5
    assert(foo() == 5); // fails in clang 11.0 - constexpr foo() is 10
                        // fixed in clang 11.x - correct value is 5
}

This relates to a bug in clang. The failure of the assert is wrong and is due to wrong evaluation order of this expression in clang, during constant evaluation in compile time. The value should be 5. This bug is already fixed in clang trunk.

Rountree answered 4/10, 2020 at 19:3 Comment(2)

I believe I made a mistake, my second order might be allowed - assignment can interleave the right-left sequencing. In that case, yes gcc's value computation seems sketchy but clang could be right with my explanation. Maybe it is undefined precisely because of the interleaving if the value computation results in a reference. – Swimmingly 4/10, 2020 at 20:42

it looks like gcc treats (a += 1) += a; and a += (a += 1) as equivalents because of parenthesis and postpones reference conversion because of that. This behaviour is consistent for all versions of gcc, since pre-C++11 – Werbel 4/10, 2020 at 23:30

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags