Unsequenced value computations (a.k.a sequence points)

Asked 4/10, 2010 at 4:6 Answered 1/11, 2011 at 14:55

Solved c++ language-lawyer side-effects sequence-points

Sorry for opening this topic again, but thinking about this topic itself has started giving me an Undefined Behavior. Want to move into the zone of well-defined behavior.

Given

int i = 0;
int v[10];
i = ++i;     //Expr1
i = i++;     //Expr2
++ ++i;      //Expr3
i = v[i++];  //Expr4

I think of the above expressions (in that order) as

operator=(i, operator++(i))    ; //Expr1 equivalent
operator=(i, operator++(i, 0)) ; //Expr2 equivalent
operator++(operator++(i))      ; //Expr3 equivalent
operator=(i, operator[](v, operator++(i, 0))); //Expr4 equivalent

Now coming to behaviors here are the important quotes from C++ 0x.

$1.9/12- "Evaluation of an expression (or a sub-expression) in general includes both value computations (including determining the identity of an object for lvalue evaluation and fetching a value previously assigned to an object for rvalue evaluation) and initiation of side effects."

$1.9/15- "If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined."

[ Note: Value computations and side effects associated with different argument expressions are unsequenced. —end note ]

$3.9/9- "Arithmetic types (3.9.1), enumeration types, pointer types, pointer to member types (3.9.2), std::nullptr_t, and cv-qualified versions of these types (3.9.3) are collectively called scalar types."

In Expr1, the evaluation of the expression i (first argument) is unsequenced with respect to the evaluation of the expression operator++(i) (which has a side effect).

Hence Expr1 has undefined behavior.

In Expr2, the evaluation of the expression i (first argument) is unsequenced with respect to the evaluation of the expression operator++(i, 0) (which has a side effect).

Hence Expr2 has undefined behavior.

In Expr3, the evaluation of the lone argument operator++(i) is required to be complete before the outer operator++ is called.

Hence Expr3 has well-defined behavior.

In Expr4, the evaluation of the expression i (first argument) is unsequenced with respect to the evaluation of operator[](v, operator++(i, 0)) (which has a side effect).

Hence Expr4 has undefined behavior.

Is this understanding correct?

P.S. The method of analyzing the expressions used in the OP is not correct. This is because, as @Potatoswatter notes, "clause 13.6 does not apply. See the disclaimer in 13.6/1, "These candidate functions participate in the operator overload resolution process as described in 13.3.1.2 and are used for no other purpose." They are just dummy declarations; no function-call semantics exist with respect to built-in operators."

Cranmer answered 4/10, 2010 at 4:6 Comment(9)

+1: Good question. I'll keep an eye out for the answers. – Operable 4/10, 2010 at 4:11

@Cranmer : I agree with what @James McNellis said in his answer (which he deleted afterwards). All the 4 expressions invoke UB in C++0x [IMHO]. I think you should ask this question at csc++ (comp.std.c++). :) – Carlycarlye 4/10, 2010 at 8:15

@Prasoon Saurav: Why is Expr3 having undefined behavior? I thought this should be fine. gcc/comeau/llvm(demo) also all compile without any warning. – Cranmer 4/10, 2010 at 8:29

That's because the side effects associated with ++ [inner] and ++ [outer] are not sequenced relative to each other (although the value computations are sequenced). :) – Carlycarlye 4/10, 2010 at 8:36

Check out this. It has been mentioned that "Some more complicated cases are not diagnosed by -Wsequence-point option, and it may give an occasional false positive result, ..." – Carlycarlye 4/10, 2010 at 8:38

@Prasoon if you say that this is undefined behavior, you will have to come up with wording of C++0x (currently n3126) that supports the point. Just quoting James Kanze won't prove the point, mate. – Caston 5/10, 2010 at 4:54

Both "Kai-Uwe Bux" and I have shown how the definedness follows from various C++0x rules. The intent may or may not be what the wording reflects, but that's an entirely different story. – Caston 5/10, 2010 at 4:59

Johannes Schaub - litb: While you are at it, please tell us whether this is the right way to visualize these expressions, or whether I miss any case by thinking this way (in terms of operator function calls for native types, even though these do not exist in practice), except as in $13.6/18. – Cranmer 5/10, 2010 at 5:21

For example, this thinking also explains ++i = 0 (operator=(operator++(i), 0)) as well-defined behavior. – Cranmer 5/10, 2010 at 5:31

Native operator expressions are not equivalent to overloaded operator expressions. There is a sequence point at the binding of values to function arguments, which makes the operator++() versions well-defined. But that doesn't exist for the native-type case.
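To make the contrast concrete, here is a minimal sketch using a hypothetical class type, Counter (not something from the question). Because its ++ operators are real functions, each call finishes before the enclosing assignment runs, so the class-type counterparts of Expr1 to Expr3 each have exactly one possible result. It says nothing about the built-in int case, which is the whole point.

```cpp
#include <cassert>

// Hypothetical class type: its ++ operators are ordinary member functions,
// so function-call sequencing applies to them.
struct Counter {
    int value = 0;

    // Pre-increment: modify, then return the object itself.
    Counter& operator++() {
        ++value;
        return *this;
    }

    // Post-increment: modify, but return a copy of the old state.
    Counter operator++(int) {
        Counter old = *this;
        ++value;
        return old;
    }
};

// Class-type counterpart of Expr1: c.operator=(c.operator++())
int class_pre_assign() {
    Counter c;
    c = ++c;  // the increment completes before the implicit operator= runs
    return c.value;
}

// Class-type counterpart of Expr2: c.operator=(c.operator++(0))
int class_post_assign() {
    Counter c;
    c = c++;  // c becomes 1 inside operator++, then is overwritten with the old copy
    return c.value;
}

// Class-type counterpart of Expr3: c.operator++().operator++()
int class_double_pre() {
    Counter c;
    ++ ++c;
    return c.value;
}

// Checks the only results the call order allows.
void check_counter_cases() {
    assert(class_pre_assign() == 1);
    assert(class_post_assign() == 0);
    assert(class_double_pre() == 2);
}
```

Swapping Counter for int in these three functions gives back exactly the expressions the question asks about, where no function call exists to order the side effects.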

In all four cases, i changes twice within the full-expression. Since no comma operator (,), ||, or && appears in the expressions, that's instant UB.

§5/4:

Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression.

Edit for C++0x (updated)

§1.9/15:

The value computations of the operands of an operator are sequenced before the value computation of the result of the operator. If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.

Note however that a value computation and a side effect are two distinct things. If ++i is equivalent to i = i+1, then + is the value computation and = is the side effect. From 1.9/12:

Evaluation of an expression (or a sub-expression) in general includes both value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation) and initiation of side effects.

So although the value computations are more strongly sequenced in C++0x than C++03, ~~the side effects are not.~~ Two side effects in the same expression, unless otherwise sequenced, produce UB.

Value computations are ordered by their data dependencies anyway and, side effects absent, their order of evaluation is unobservable, so I'm not sure why C++0x goes to the trouble of saying anything, but that just means I need to read more of the papers that Boehm and friends wrote.

Edit #3:

Thanks Johannes for coping with my laziness to type "sequenced" into my PDF reader search bar. I was going to bed and getting up on the last two edits anyway… right ;v) .

§5.17/1 defining the assignment operators says

In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression.

Also §5.3.2/1 on the preincrement operator says

If x is not of type bool, the expression ++x is equivalent to x+=1 [Note: see … addition (5.7) and assignment operators (5.17) …].

By this identity, ++ ++ x is shorthand for (x +=1) +=1. So, let's interpret that.

1. Evaluate the 1 on the far RHS and descend into the parens.
2. Evaluate the inner 1 and the value (prvalue) and address (glvalue) of x.
3. Now we need the value of the += subexpression.
   - We're done with the value computations for that subexpression.
   - The assignment side effect must be sequenced before the value of assignment is available!
4. Assign the new value to x, which is identical to the glvalue and prvalue result of the subexpression.
5. We're out of the woods now. The whole expression has now been reduced to x +=1.

So Expr1 and Expr3 are well-defined, and Expr2 and Expr4 are undefined behavior, which is what you would expect.
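As a small sketch of the identity used in that walk-through, the two functions below apply ++ ++x and (x += 1) += 1 to a local int. This assumes a C++11-or-later compiler and the reading of §5.17/1 and §5.3.2/1 given above; the function names are made up for illustration.

```cpp
#include <cassert>

// Applies the double pre-increment from Expr3 to a local copy of start.
int double_preincrement(int start) {
    int x = start;
    ++ ++x;  // inner increment is sequenced before the outer one
    return x;
}

// Same computation spelled with compound assignment, per the ++x == (x += 1) identity.
int double_compound_assign(int start) {
    int x = start;
    (x += 1) += 1;  // inner assignment completes before its result is used
    return x;
}

// Both spellings should agree for any starting value that does not overflow.
void check_double_increment() {
    assert(double_preincrement(0) == 2);
    assert(double_compound_assign(0) == 2);
    assert(double_preincrement(40) == double_compound_assign(40));
}
```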

The only other surprise I found by searching for "sequenced" in N3126 was 5.3.4/16, where the implementation is allowed to call operator new before evaluating constructor arguments. That's cool.
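A rough way to observe this is to log both events, as in the sketch below. Widget, make_argument and the event log are hypothetical names invented for the illustration. The relative order of the two log entries is deliberately not checked, since the paragraph cited above leaves it to the implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <vector>

// Hypothetical event log shared by the allocation function and the argument.
static std::vector<std::string> g_events;

// Hypothetical class with its own allocation function so the call is visible.
struct Widget {
    int value;
    explicit Widget(int v) : value(v) {}

    static void* operator new(std::size_t size) {
        g_events.push_back("operator new");
        return ::operator new(size);  // defer to the global allocator
    }
    static void operator delete(void* p) {
        ::operator delete(p);
    }
};

// Constructor argument with a visible side effect.
int make_argument() {
    g_events.push_back("argument");
    return 42;
}

// Runs one new-expression and returns the events in the order they happened.
std::vector<std::string> run_new_expression() {
    g_events.clear();
    Widget* w = new Widget(make_argument());
    assert(w->value == 42);
    delete w;
    return g_events;
}
```

Printing the returned vector on a given compiler shows which order that implementation picked; code should not rely on either one under the rules quoted here.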

Edit #4: (Oh, what a tangled web we weave)

Johannes notes again that in i = ++i; the glvalue (a.k.a. the address) of i is ambiguously dependent on ++i. The glvalue is certainly a value of i, but I don't think 1.9/15 is intended to include it for the simple reason that the glvalue of a named object is constant, and cannot actually have dependencies.

For an informative strawman, consider

( i % 2? i : j ) = ++ i; // certainly undefined

Here, the glvalue of the LHS of = is dependent on a side-effect on the prvalue of i. The address of i is not in question; the outcome of the ?: is.

Perhaps a good counterexample is

int i = 3, &j = i;
j = ++ i;

Here j has a glvalue distinct from (but identical to) i. Is this well-defined, yet i = ++i is not? This represents a trivial transformation that a compiler could apply to any case.
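The same aliasing can be hidden behind a function boundary, where the compiler cannot see that both names denote one object. A minimal sketch, with hypothetical helper names, assuming the C++11 sequencing rules for assignment (§5.17/1) and pre-increment (§5.3.2/1) quoted earlier:

```cpp
#include <cassert>

// Hypothetical helper: inside the body, i and j are just two references;
// nothing says whether they name the same object.
void assign_preincrement(int& i, int& j) {
    j = ++i;  // increment is sequenced before the assignment to j
}

// Calls the helper with both parameters bound to the same int.
int aliased_result(int start) {
    int i = start;
    assign_preincrement(i, i);
    return i;
}

// The aliased and non-aliased calls should leave the same value in i.
void check_alias_cases() {
    int a = 3, b = 0;
    assign_preincrement(a, b);
    assert(a == 4 && b == 4);
    assert(aliased_result(3) == 4);
}
```

If the glvalue of the left operand really counted as "using the value" of i, the aliased call would be undefined while the two-variable call would not, even though the function body is identical.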

1.9/15 should say

If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the prvalue of the same scalar object, the behavior is undefined.

Crossway answered 4/10, 2010 at 4:9 Comment(41)

Sorry, I mentioned C++0X a little bit late in my post – Cranmer 4/10, 2010 at 4:14

@Potato AFAIK it's just the term "sequence point" that's been deprecated in favor of more clear wording, but it's still there. – Trove 4/10, 2010 at 4:23

@NullUser: There's a concept of sequencing, but the C way of saying the machine is either in a fully-determined state or indeterminate is gone. – Crossway 4/10, 2010 at 4:25

I also thought so. But stackoverflow.com/questions/3850040/… brought in some changes to the thought process. Confusion is now on 'i = ++i;' and I have not been able to get my thinking straight on this one. With this post I am trying to verify if my thought process is right (thinking in terms of operator equivalent). – Cranmer 4/10, 2010 at 7:48

@Crossway : All four expressions invoke UB in C++0x – Carlycarlye 4/10, 2010 at 7:53

@Chubsdad: The value of ++i is an operand to the = operator, so at least according to this paragraph, the effect is sequenced. Whether something else makes it UB, I will check Prasoon's link next. – Crossway 4/10, 2010 at 8:38

@Potatoswatter: "Native operator expressions are not equivalent to overloaded operator expressions" Is this Standardese? Are you talking here about a difference in sequence point location at the binding of values to function arguments between a built-in/native/default operator= and an overloaded operator= defined as above in user code? Is it possible to give an example of the "native-type case"? So if I have defined an overloaded++ and write i++ it's well defined behavior - otherwise undefined? And this applies to C++98/03 and C++0x? – King 4/10, 2010 at 8:40

@Prasoon: No, those unsequenced examples use postincrement. Johannes gives a preincrement example i = v[++i] and argues that the side effect of storing (i=i+1) is unsequenced relative to the next, explicit, assignment… that is another argument, and perhaps a good one. But I'm too sleepy to independently evaluate it for now. – Crossway 4/10, 2010 at 8:43

@Peter: The beginning of the clause on overloaded operators and the beginning of the clause on expressions both discuss the difference in sequence points. For example, &&, ||, and , are sequenced quite differently. Overloaded operators are function calls, Clause 5 operators are not. (Edit: except the function call operator. I go to bed now.) – Crossway 4/10, 2010 at 8:45

@Crossway : I am not very sure but still think ++ ++i and i = ++i are both UB in C++0x. Read James Kanze's posts [at the end of that discussion]. ;) – Carlycarlye 4/10, 2010 at 8:50

@Prasoon: Yep, reading Usenet rants is a chore but §1.9/12 and 15 really are pretty clear about it. Updated answer. – Crossway 4/10, 2010 at 15:16

@Prasoon, the discussion you link to figures that expression 3 does not invoke undefined behavior in C++0x according to the draft. Even "James Kanze" agrees to that obvious thing after being told so one-hundred-and-one times (he doubts that "assign" is a side-effect... but if it isn't a modification, what is it?). James Kanze obviously does not listen to what one writes. Notice how I write in the middle of that discussion how "sequenced-before" is a transitive relation. He just ignores that, and finally when someone else mentions that, he says "Ohhh, you may have a point there, captain!". – Caston 5/10, 2010 at 4:48

The proof for the well-definedness of Expr3 and for undefinedness of Expr3 can be found here: stackoverflow.com/questions/3690141/… – Caston 5/10, 2010 at 5:2

@Johannes: I think you mean undefinedness of Expr1… anyway I don't see how you figure the side effect of the first ++ in Expr3 is sequenced differently from the = in Expr1. The two ++ operations generate two distinct assignments to two different values and 1.9/15 does not sequence them. – Crossway 5/10, 2010 at 5:18

@Crossway yes, i meant undefinedness of Expr1, sorry :) Have to head to work. In the meantime, look into the paragraph for assignment in 5.x, where it's all sequenced. Seeya :) – Caston 5/10, 2010 at 5:25

we are back full circle. Why is 1 having well-defined behavior? That's what was proved to be wrong in the usenet discussion isn't it? – Cranmer 5/10, 2010 at 9:2

@Chubdad: because the ++i is a subexpression equivalent to an assignment expression (5.3.2/1) whose side-effect is ordered "before the value computation of the assignment expression." (5.17/1) – Crossway 5/10, 2010 at 11:52

That's what I also felt initially. But the usenet discussion forum seems to conclude that behavior of expr1 is ill-formed. @litb also argues for that – Cranmer 5/10, 2010 at 14:31

@Crossway like @Cranmer says, 1 is definitely undefined behavior. Yes, the side effect of ++i is ordered before value computation of ++i. This sequences the increment side effect before the "real" assignment side effect to i. But we also have a value computation of i on the left side in i = ++i, which is not sequenced relative to value computation of the right side. This is what makes it undefined. Note that "value computation" can not only mean "read a value" but means "compute what object an lvalue refers to" for glvalue evaluation. See the stackoverflow link above. – Caston 5/10, 2010 at 18:36

Bah, sorry for the churn, hold on… – Crossway 5/10, 2010 at 18:55

@Crossway I see what you mean now. – Caston 5/10, 2010 at 18:58

@Johannes: sorry, my connection went down and I went away forgetting you were waiting… posted the update. – Crossway 5/10, 2010 at 19:36

@Crossway I removed my answer because you elaborated on yours to give a reasonable rationale. – Caston 5/10, 2010 at 19:48

@Johannes: I guess it's your call, I thought it was a good answer. Do you really think the working paper is still in that much flux? Do you think a DR has been submitted, and would you want to do so? Alisdair hasn't replied to my last few mails, so I think I might have annoyed him ;v) . – Crossway 5/10, 2010 at 19:54

@Potatoswatter: j = ++ i; should also be undefined behavior. j is just an alias for 'i', isn't it? I (capital I) fail to understand this and continue to exhibit undefined behavior...:) – Cranmer 6/10, 2010 at 2:48

@Chubsdad: Even though it's an alias, its glvalue evaluation does not require a glvalue evaluation of i. Generally speaking, evaluating a reference does not require the original object to be on hand. There's no reason it should be UB, so it makes sense there should be an easy loophole or transformation to code which is not UB. – Crossway 6/10, 2010 at 4:26

its glvalue evaluation does not require a glvalue evaluation of i. Generally speaking, evaluating a reference does not require the original object to be on hand : I really doubt this. Do you have any reference for this 'reference' :) – Cranmer 6/10, 2010 at 4:49

@Chubsdad: Consider the function f( int &i, int &j ) { j = ++ i; } … f( i, i ); Still think so? A reference is an alias because its glvalue evaluates identically, not because it syntactically refers to the original object. – Crossway 6/10, 2010 at 5:11

@Potatoswatter: My understanding is 'j' is an alias for 'i' and even though 'i' is not in scope, 'j' can be used as long as 'i' remains a valid object it originally was. I am not so sure if I understand 'glvalue evaluates identically' vs 'glvalue is the same'. – Cranmer 6/10, 2010 at 6:8

This is with reference to 5.5 - "If an expression initially has the type “reference to T” (8.3.2, 8.5.3), the type is adjusted to T prior to any further analysis. The expression designates the object or function denoted by the reference, and the expression is an lvalue or an xvalue, depending on the expression." – Cranmer 6/10, 2010 at 6:17

@Chubsdad: It designates it by having an identical lvalue; that's the long and the short of it. Lvalue-to-rvalue conversion implements references having the value of the referent object. The reference doesn't tell the compiler to go look at the referenced variable and get its lvalue, because it might not know what variable is referenced. The compiler computes the lvalue of the reference and that lvalue identifies an object. If you want to debate this further, please open a new question. – Crossway 6/10, 2010 at 6:46

@Potatoswatter: Your wish is my command :) (stackoverflow.com/questions/3870172/…) – Cranmer 6/10, 2010 at 10:52

I got it. But my doubt remains. Let's take 'i = ++i'. As per 13.6/18 it can be treated as 'operator=(i, operator++(i))'. The side effect on the scalar object ('i' due to the 2nd argument) is unsequenced relative to value computation of the same scalar object ('i' for the 1st argument). Hence the behavior should be undefined. Can you tell me why it should be well-defined (as you mentioned in your post) from this perspective of thought? – Cranmer 7/10, 2010 at 5:1

@Chubsdad: By the standard, it is undefined, as Johannes explained. (I wish he hadn't deleted his answer.) However, it's next to impossible for a compiler to produce any behavior besides the desired, because the supposedly dependent value is a constant. Again, clause 13.6 does not apply. See the disclaimer in 13.6/1, "These candidate functions participate in the operator overload resolution process as described in 13.3.1.2 and are used for no other purpose." They are just dummy declarations; no function-call semantics exist with respect to built-in operators. – Crossway 7/10, 2010 at 5:11

@Potatoswatter: So do you want to change your post as it says that Expr1 is well-formed? – Cranmer 7/10, 2010 at 5:23

@Chubsdad: No, the last edit is spent entirely on discussion of Expr1, so the issue is clearly stated. I'm a bit tired of this. – Crossway 7/10, 2010 at 5:31

@Cranmer I think what @Crossway refers to is that the Std says "using the value of the same scalar object". The value of an object is not used if you simply use an lvalue that refers to the object. You have to actually read the value, despite the weird term "value computation". But the term "value" in "value computation" does not seem to refer to the value of an object, but to the "value" of an expression. I.e the glvalue or prvalue result of an expression. But I really think the Standard should be clearer on this issue. – Caston 8/10, 2010 at 11:20

@Johannes Schaub - litb: Oh. Now I understand better. It was little too cryptic for me. I think I am starting to get a hang of the core issue but not there yet. – Cranmer 9/10, 2010 at 2:56

I have found two DRs that support this answer: open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#637 and open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#222 – Caston 10/10, 2010 at 13:56

@PrasoonSaurav "All four expressions invoke UB in C++0x": you are mistaken, since i = ++i is perfectly defined in C++0x. stackoverflow.com/questions/17400137/… – Statius 1/7, 2013 at 10:8

Thank you for this exhaustive answer. You mention operator new may be called prior to evaluation of constructor args. Is something similar true for delete? Take for example godbolt.org/z/n1qzrWsdq. It's basically delete a->b; where deleting b triggers delete a and the compiler is accessing through a again after the pointed-to object is dead. Speculating: delete is not a function but an operator, thus the deletion of a is not sequenced against dereferencing a? – Yorkist 25/4, 2023 at 19:24

In thinking about expressions like those mentioned, I find it useful to imagine a machine where memory has interlocks so that reading a memory location as part of a read-modify-write sequence will cause any attempted read or write, other than the concluding write of the sequence, to be stalled until the sequence completes. Such a machine would hardly be an absurd concept; indeed, such a design could simplify many multi-threaded code scenarios. On the other hand, an expression like "x=y++;" could fail on such a machine if 'x' and 'y' were references to the same variable, and the compiler's generated code did something like read-and-lock reg1=y; reg2=reg1+1; write x=reg1; write-and-unlock y=reg2. That would be a very reasonable code sequence on processors where writing a newly-computed value would impose a pipeline delay, but the write to x would lock up the processor if y were aliased to the same variable.
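On real hardware, the usual way to sidestep the question is to spell out each read and write as its own statement. The sketch below uses a hypothetical helper, assign_then_increment, and assumes the intended meaning of "x=y++;" is "x receives the old value of y". It finishes the read-modify-write on y before touching x, so it behaves the same whether or not the two references alias.

```cpp
#include <cassert>

// Hypothetical helper: the presumed intent of "x = y++;" with one access per statement.
void assign_then_increment(int& x, int& y) {
    const int old = y;  // single read of y
    y = old + 1;        // complete the read-modify-write on y
    x = old;            // only then write x, which may be the same object
}

// Exercises both the distinct-object and the aliased case.
void check_assign_then_increment() {
    int x = 0, y = 5;
    assign_then_increment(x, y);
    assert(x == 5 && y == 6);

    int v = 5;
    assign_then_increment(v, v);  // x and y alias the same variable
    assert(v == 5);               // the final write of the old value wins
}
```

On the imagined interlocked machine, this ordering releases the lock on y before the write to x is issued, so the stall described above cannot occur.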

Scabble answered 1/11, 2011 at 14:55 Comment(0)
