The difference between C and C++ regarding the ++ operator
I have been fooling around with some code and saw something that I don't understand the "why" of.

int i = 6;
int j;

int *ptr = &i;
int *ptr1 = &j;

j = i++;

//now j == 6 and i == 7. Straightforward.

What if you put the operator on the left side of the equals sign?

++ptr = ptr1;

is equivalent to

(ptr = ptr + 1) = ptr1; 

whereas

ptr++ = ptr1;

is equivalent to

ptr = ptr + 1 = ptr1;

The postfix version produces a compilation error, and I get it. You've got a constant "ptr + 1" on the left side of an assignment operator. Fair enough.

The prefix one compiles and WORKS in C++. Yes, I understand it's messy and you're dealing with unallocated memory, but it works and compiles. In C this does not compile, returning the same error as the postfix "lvalue required as left operand of assignment". This happens no matter how it's written, expanded out with two "=" operators or with the "++ptr" syntax.

What is the difference between how C handles such an assignment and how C++ handles it?

Unbearable answered 3/9, 2014 at 22:1 Comment(28)
As far as I know ++i doesn’t return an l-value in C. Regardless, this is UB as you modify the variable 2 times between two consecutive sequence points. In other words it’s unspecified whether the value is incremented first or it is assigned first.Munafo
@OllieFord I never understood UB to mean it might not compile. I mean, people say anything can happen, but I take it to mean anything can happen when you run the code.Parkland
@Parkland the code runs, it is UB so the program goes back in time and stops the compiling process. So… yeah…Munafo
@Parkland Wouldn't an "ideal compiler" catch all UB, with Error: UB?Jessabell
@juanchopanza: Perhaps the program goes back in time and interrupts the compilation. Edit: I see bolov had the same ideaRoguery
@OllieFord: I am using Eclipse in a Windows environment. Eclipse uses MinGW so, ostensibly, the respective compilers should behave somewhat similarly, no?Unbearable
@OllieFord An ideal compiler in a perfect world? Maybe. :)Lucilla
@OllieFord A warning (or error) on invoking UB would be great, but UB is not always detectable. Left-shifting a negative integer is UB, but the compiler doesn't know what value a particular signed integer variable may hold.Bibbye
@OllieFord "Wouldn't an "ideal compiler" catch all UB, with Error: UB? " No, they can't. UB is in situations, where something is (or was forced e.g. by improper casting) to be syntactically correct, but has wrong conditions at runtime (e.g. dereferencing uninitialized pointers, etc.). SCA tools provide such analysis for some cases (depends on their quality).Hankhanke
@Munafo I get that C doesn't return an lvalue for that operator and perhaps that's the big issue. But why does the compilation error still happen when written expanded out with brackets and two "=" operators?Unbearable
@πάνταῥεῖ "Wouldn't", not "couldn't"Jessabell
@OllieFord Well an ideal compiler still can't determine exactly what will happen at runtime, which is where UB lives. Some UB could be detected at compile-time (a warning then would be great!) but the large majority simply cannot be.Bibbye
@Moe45673: If ++ptr is not an l-value, why would you expect (ptr = ptr + 1) to be?Roguery
The result of assignment is an rvalue in C and an lvalue in C++ (and ++x is nothing more than x += 1).Bermejo
@OllieFord The subjunctive doesn't help. It's not possible to build such an ideal compiler that catches all these situations. Integration of automatic SCA might be a solution for specific toolchains.Hankhanke
@Parkland they tried... But when they tried, this question hadn't been created yet, so they failedEffy
@IwillnotexistIdonotexist The question is why doesn't the prefix version compile in C while it does in C++ (being more experienced with C++, I expected that to compile).Parkland
@Munafo I think ++ptr = ptr1 is not UB in C++ (>= 11). There is a sequenced-before relationship between the side-effect of the prefix ++ and the side-effect of =.Epp
@Epp I never fully got how = works in C++11 with respect to sequencing.Munafo
@Parkland Interesting, I tried myself and it does compile in C++. Myself I can only speak for C, and indeed it doesn't compile either way in C.Eckart
@Munafo Yeah, the sequencing rules are confusing sometimes. For =, it's still rather simple: the left and right hand side are computed, THEN the side-effect happens, THEN the value of the assignment-expression is computed. The (value) computation of the lhs and rhs are unsequenced. The value computation of the lhs in this case requires the side-effect of ++ to have happened.Epp
@IwillnotexistIdonotexist Yes, it is an lvalue in C++. I'm not sure what the reason for it not to be one in C could be.Parkland
C99 rationale, 6.5.2.4 "The C89 Committee did not endorse the practice in some implementations of considering post-increment and post-decrement operator expressions to be lvalues." The same applies (by reference) to prefix increment/decrement.Epp
@Epp it is indeed not UB, the logic used by defect report 637 really helps in making an easy to read explanation of why. These proofs are always hard to make understandable and 637 is one of the best I have read.Casillas
@Munafo you may find the defect report 637 which I quote in my answer very helpful in your understanding, it was one of the most helpful proofs I have read.Casillas
@OllieFord please see A C++ implementation that detects undefined behavior?, catching all UB would not be possible as some answers explain. John Regehr whom I link to in my answer has some of the best articles on this and related topics.Casillas
@Bibbye presumably clang runs its undefined behavior sanitizer at runtime so it can catch more cases.Casillas
@ShafikYaghmour It does have a mode that will emit extra code to catch some undefined behaviors and emit signals when this happens, and in fact it does have some compile-time UB checks, too. But I doubt it will catch everything. Of course, something is better than nothing.Bibbye

In both C and C++, the result of x++ is an rvalue, so you can't assign to it.

In C, ++x is equivalent to x += 1 (C standard §6.5.3.1/p2; all C standard cites are to WG14 N1570). In C++, ++x is equivalent to x += 1 if x is not a bool (C++ standard §5.3.2 [expr.pre.incr]/p1; all C++ standard cites are to WG21 N3936).

In C, the result of an assignment expression is an rvalue (C standard §6.5.16/p3):

An assignment operator stores a value in the object designated by the left operand. An assignment expression has the value of the left operand after the assignment, but is not an lvalue.

Because it's not an lvalue, you can't assign to it: (C standard §6.5.16/p2 - note that this is a constraint)

An assignment operator shall have a modifiable lvalue as its left operand.

In C++, the result of an assignment expression is an lvalue (C++ standard §5.17 [expr.ass]/p1):

The assignment operator (=) and the compound assignment operators all group right-to-left. All require a modifiable lvalue as their left operand and return an lvalue referring to the left operand.

So ++ptr = ptr1; is a diagnosable constraint violation in C, but does not violate any diagnosable rule in C++.
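The value categories described above can be checked at compile time in C++. The following is a minimal sketch, assuming a C++11 compiler; the variables ptr and ptr1 mirror the question, and the constant names are made up for illustration. Nothing inside decltype is evaluated, so no pointer is actually modified.

```cpp
#include <type_traits>

// Hypothetical variables; they appear only inside decltype, so nothing is evaluated.
int *ptr = nullptr;
int *ptr1 = nullptr;

// For an expression that is not a plain identifier, decltype yields T& exactly
// when the expression is an lvalue, and plain T when it is a prvalue.
constexpr bool prefix_is_lvalue     = std::is_same<decltype(++ptr), int *&>::value;
constexpr bool postfix_is_lvalue    = std::is_lvalue_reference<decltype(ptr++)>::value;
constexpr bool assignment_is_lvalue = std::is_same<decltype(ptr = ptr1), int *&>::value;

static_assert(prefix_is_lvalue, "++ptr is an lvalue in C++");
static_assert(!postfix_is_lvalue, "ptr++ is a prvalue");
static_assert(assignment_is_lvalue, "ptr = ptr1 is an lvalue in C++");
```

C has no equivalent of these checks; there the same distinction only shows up as the "lvalue required" diagnostic quoted in the question.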

However, pre-C++11, ++ptr = ptr1; has undefined behavior, as it modifies ptr twice between two adjacent sequence points.

In C++11, the behavior of ++ptr = ptr1 becomes well defined. It's clearer if we rewrite it as

(ptr += 1) = ptr1;

Since C++11, the C++ standard provides that (§5.17 [expr.ass]/p1)

In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression. With respect to an indeterminately-sequenced function call, the operation of a compound assignment is a single evaluation.

So the assignment performed by the = is sequenced after the value computation of ptr += 1 and ptr1. The assignment performed by the += is sequenced before the value computation of ptr += 1, and all value computations required by the += are necessarily sequenced before that assignment. Thus, the sequencing here is well-defined and there is no undefined behavior.
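To make the sequencing argument concrete, here is a small sketch, assuming C++11 or later. The function name prefix_then_assign is hypothetical; the variables follow the question. Some compilers may still emit a sequence-point warning for this line even though the behavior is defined under the C++11 rules.

```cpp
#include <cassert>  // only needed for the usage line shown after the block

// Returns true if, after `++ptr = ptr1`, ptr holds ptr1's value.
// Assumes C++11 or later, where the two modifications of ptr are sequenced.
bool prefix_then_assign() {
    int i = 6;
    int j = 0;
    int *ptr = &i;
    int *ptr1 = &j;

    ++ptr = ptr1;  // the increment happens first, then ptr is overwritten with ptr1

    // The increment moved the pointer, not the int it pointed to, so i is unchanged.
    return ptr == ptr1 && *ptr == 0 && i == 6;
}
```

Calling assert(prefix_then_assign()); from main should pass: the increment is simply overwritten by the later assignment.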

Bermejo answered 3/9, 2014 at 22:23 Comment(10)
In your last quote is "the assignment" supposed to mean "the side-effect of the assignment" ?Valorous
Actually I don't understand why in C the value of a (non compound) assignment expression is said to be that of its left operand; it is actually the value of its right operand (and then obviously it is an rvalue). Sure, in cases like i=i+1 it is not the value the right operand expression would get if it were evaluated again, but a similar statement is not true for the left operand expression either, in somewhat contrived cases like int a[2]={0,3}; a[a[0]]=1.Bresee
There's one more point: every object in C++ is equivalent to an array of one object — itself — and the one-past-the-end value of any array is a valid pointer value. So there's no "messy … unallocated memory."Necrophilism
@Necrophilism good point, added a note to my answer covering that.Casillas
++ptr = ptr1; is grammatically correct in both languages (the left hand expression can be an unary-expression in C); Or what did you mean by syntactical correctness?Scramble
@Bermejo "Diagnosable" implies that the violation does not require a diagnostic, but it requires at least one diagnostic as per 5.1.1.3/1, doesn't it? I'm solely asking because I saw the word "Diagnosable" a couple of times in such contexts, and it seems that it's not the right word to use.Scramble
@Scramble Well, C doesn't use this word normatively; C++ does, and it actually means "rules whose violation must be diagnosed" (see [intro.compliance]/p1).Bermejo
"The post-C++11 C++ standard" Hmmm, I think you mean "The C++11 C++ standard". (Interesting issue about English words pre/post in question about prefix/postfix.)Abana
@chux Better? (I wanted to also include C++14.)Bermejo
@MarcvanLeeuwen: Here's an example where there's a big difference, and clearly the value of the assignment expression is not the value of its right operand: rextester.com/CIWA70704Category

In C, the results of both pre- and post-increment are rvalues, and we cannot assign to an rvalue; we need an lvalue (also see: Understanding lvalues and rvalues in C and C++). We can see this by going to the draft C11 standard, section 6.5.2.4 Postfix increment and decrement operators, which says (emphasis mine going forward):

The result of the postfix ++ operator is the value of the operand. [...] See the discussions of additive operators and compound assignment for information on constraints, types, and conversions and the effects of operations on pointers. [...]

So the result of post-increment is a value, which is synonymous with rvalue, and we can confirm this by going to section 6.5.16 Assignment operators, which the paragraph above points us to for further understanding of constraints and results. It says:

[...] An assignment expression has the value of the left operand after the assignment, but is not an lvalue.[...]

which further confirms the result of post-increment is not an lvalue.

For pre-increment, section 6.5.3.1 Prefix increment and decrement operators says:

[...]See the discussions of additive operators and compound assignment for information on constraints, types, side effects, and conversions and the effects of operations on pointers.

This also points back to 6.5.16, as post-increment does, and therefore the result of pre-increment in C is also not an lvalue.

In C++, the result of post-increment is also an rvalue, more specifically a prvalue. We can confirm this by going to section 5.2.6 Increment and decrement, which says:

[...]The result is a prvalue. The type of the result is the cv-unqualified version of the type of the operand[...]

With respect to pre-increment, C and C++ differ. In C the result is an rvalue, while in C++ the result is an lvalue, which explains why ++ptr = ptr1; works in C++ but not in C.

For C++ this is covered in section 5.3.2 Increment and decrement which says:

[...]The result is the updated operand; it is an lvalue, and it is a bit-field if the operand is a bit-field.[...]
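One observable consequence of "the result is the updated operand" is that a non-const reference can bind to it. A minimal sketch, with a hypothetical function name chosen for illustration:

```cpp
#include <cassert>  // only needed for the usage line shown after the block

// Returns true if the result of ++x designates x itself rather than a copy.
bool prefix_result_aliases_operand() {
    int x = 6;
    int &ref = ++x;  // binds only because ++x is an lvalue in C++; x is now 7
    ref = 42;        // writes through the reference into x
    return &ref == &x && x == 42;
}
```

Calling assert(prefix_result_aliases_operand()); from main should pass. The analogous line with x++ on the right-hand side would be rejected, since a non-const lvalue reference cannot bind to a prvalue.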

To understand whether:

++ptr = ptr1;

is well defined or not in C++ we need two different approaches one for pre C++11 and one for C++11.

Pre C++11, this expression invokes undefined behavior, since it modifies the object more than once between two adjacent sequence points. We can see this by going to a pre-C++11 draft standard, section 5 Expressions, which says:

Except where noted, the order of evaluation of operands of individual operators and subexpressions of individual expressions, and the order in which side effects take place, is unspecified.57) Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored. The requirements of this paragraph shall be met for each allowable ordering of the subexpressions of a full expression; otherwise the behavior is undefined. [ Example:

 i = v[i++];       // the behavior is undefined
 i = 7, i++, i++;  // i becomes 9
 i = ++i + 1;      // the behavior is undefined
 i = i + 1;        // the value of i is incremented

—end example ]

We are incrementing ptr and then subsequently assigning to it, which is two modifications and in this case the sequence point occurs at the end of the expression after the ;.

For C++11, we should go to defect report 637: Sequencing rules and example disagree, which was the defect report that resulted in:

i = ++i + 1;

becoming well defined behavior in C++11, whereas prior to C++11 this was undefined behavior. The explanation in this report is one of the best I have ever seen, and reading it many times was enlightening and helped me understand many concepts in a new light.

The logic that led to this expression becoming well defined behavior goes as follows:

  1. The assignment side-effect is required to be sequenced after the value computations of both its LHS and RHS (5.17 [expr.ass] paragraph 1).

  2. The LHS (i) is an lvalue, so its value computation involves computing the address of i.

  3. In order to value-compute the RHS (++i + 1), it is necessary to first value-compute the lvalue expression ++i and then do an lvalue-to-rvalue conversion on the result. This guarantees that the incrementation side-effect is sequenced before the computation of the addition operation, which in turn is sequenced before the assignment side effect. In other words, it yields a well-defined order and final value for this expression.
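The three steps above can be traced on a concrete value. This is a sketch assuming C++11 or later; the function name dr637_example is made up for illustration, and some compilers may still warn about the expression.

```cpp
#include <cassert>  // only needed for the usage line shown after the block

// Evaluates the expression from defect report 637, starting from i == 0.
int dr637_example() {
    int i = 0;
    i = ++i + 1;  // C++11 order: increment (i == 1), add 1 (value 2), then assign
    return i;
}
```

Under the C++11 rules, assert(dr637_example() == 2); should hold; before C++11 the same line had undefined behavior.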

The logic is somewhat similar for:

++ptr = ptr1;

  1. The value computations of the LHS and RHS are sequenced before the assignment side-effect.

  2. The RHS is an lvalue, so its value computation involves computing the address of ptr1.

  3. In order to value-compute the LHS (++ptr), the increment must already have taken effect: ++ptr is equivalent to ptr += 1, and the assignment performed by a compound assignment is sequenced before the value computation of the assignment expression (5.17 [expr.ass] paragraph 1). This guarantees that the incrementation side-effect is sequenced before the assignment side effect. In other words, it yields a well-defined order and final value for this expression.

Note

The OP said:

Yes, I understand it's messy and you're dealing with unallocated memory, but it works and compiles.

Pointers to non-array objects are treated as pointers into arrays of size one for the additive operators. I am going to quote the draft C++ standard, but C11 has almost exactly the same text. From section 5.7 Additive operators:

For the purposes of these operators, a pointer to a nonarray object behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.

The same section further tells us that pointing one past the end of an array is valid as long as you don't dereference the pointer:

[...]If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

so after:

++ptr;

ptr is still a valid pointer.
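A short sketch of the one-past-the-end rule, with a hypothetical function name chosen for illustration. The incremented pointer is only compared and subtracted, never dereferenced.

```cpp
#include <cassert>  // only needed for the usage line shown after the block
#include <cstddef>

// Returns how far ptr has moved from &i after a single increment.
std::ptrdiff_t one_past_distance() {
    int i = 6;
    int *ptr = &i;   // i behaves like an array of length one here
    ++ptr;           // one past the end: valid to hold, compare, and subtract
    return ptr - &i; // pointer difference within the same "array"
}
```

Calling assert(one_past_distance() == 1); from main should pass. Incrementing ptr a second time, or reading *ptr, would step outside what the quoted text permits.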

Casillas answered 4/9, 2014 at 3:44 Comment(0)
