Thanks to comments by Deduplicator and user694733, here is a modified version of my original answer.
The C++ version has unspecified behaviour.
There is a subtle difference between "undefined" and "unspecified", in that the former allows a program to do anything (including crashing) whereas the latter allows it to choose from a set of particular allowed behaviours without dictating which choice is correct.
Except in very rare cases, you will always want to avoid both.
A good starting point for understanding the whole issue is the C++ FAQ, in particular the entries Why do some people think x = ++y + y++ is bad?, What’s the value of i++ + i++? and What’s the deal with “sequence points”?:
Between the previous and next sequence point a scalar object shall
have its stored value modified at most once by the evaluation of an
expression.
(...)
Basically, in C and C++, if you read a variable twice in an expression
where you also write it, the result is undefined.
(...)
At certain specified points in the execution sequence called sequence
points, all side effects of previous evaluations shall be complete and
no side effects of subsequent evaluations shall have taken place. (...)
The “certain specified points” that are called sequence points are (...)
after evaluation of all a function’s parameters but before the first
expression within the function is executed.
In short, modifying a variable twice between two consecutive sequence points yields undefined behaviour, but a function call introduces an intermediate sequence point (actually, two intermediate sequence points, because the return statement creates another one).
This means that the function call in your expression "saves" your Simple::X += Simple::f(); line from being undefined and turns it into "only" unspecified.
Both 1 and 11 are possible and correct outcomes, whereas printing 123, crashing or sending an insulting e-mail to your boss are not allowed behaviours; you'll just never get a guarantee whether 1 or 11 will be printed.
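To make the two permitted outcomes concrete, here is a minimal sketch that spells out each evaluation order as separate statements. The class body is an assumption: only the expression Simple::X += Simple::f() is quoted here, so the definition of f() below is a hypothetical reconstruction that is consistent with the outcomes 1 and 11. The helper names read_then_call and call_then_read are likewise made up for illustration. Note that this answer uses the sequence-point rules that predate C++17; as far as I know, C++17 sequences the right operand of an assignment before the left one, which would leave only the second order.

```cpp
// Hypothetical reconstruction of the class from the question; only the
// expression Simple::X += Simple::f() is actually quoted in this answer.
struct Simple {
    static int X;
    static int f() {
        X = X + 10;  // side effect on X
        return 1;
    }
};
int Simple::X = 0;

// Order A: read the left operand first, then call f().
int read_then_call() {
    Simple::X = 0;
    int left = Simple::X;      // reads 0
    int right = Simple::f();   // X becomes 10, f() returns 1
    Simple::X = left + right;  // 0 + 1
    return Simple::X;
}

// Order B: call f() first, then read the left operand.
int call_then_read() {
    Simple::X = 0;
    int right = Simple::f();   // X becomes 10, f() returns 1
    int left = Simple::X;      // reads 10
    Simple::X = left + right;  // 10 + 1
    return Simple::X;
}
```

Here read_then_call() returns 1 and call_then_read() returns 11. Writing the order out like this is also the simplest way to get a guaranteed result.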
The following example is slightly different. It's seemingly a simplification of the original code but really serves to highlight the difference between undefined and unspecified behaviour:
    #include <iostream>

    int main() {
        int x = 0;
        x += (x += 10, 1);
        std::cout << x << "\n";
    }
Here the behaviour is indeed undefined, because the function call has gone away, so both modifications of x occur between two consecutive sequence points. The compiler is allowed by the C++ language specification to create a program which prints 123, crashes or sends an insulting e-mail to your boss.
(The e-mail thing of course is just a very common humorous attempt at explaining how undefined really means anything goes. Crashes are often a more realistic result of undefined behaviour.)
In fact, the , 1 (just like the return statement in your original code) is a red herring. The following yields undefined behaviour, too:
    #include <iostream>

    int main() {
        int x = 0;
        x += (x += 10);
        std::cout << x << "\n";
    }
This may print 20 (it does so on my machine with VC++ 2013) but the behaviour is still undefined.
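If 20 is the result you actually want, a sketch of a well-defined alternative is to split the expression so that each statement modifies x exactly once. The function name add_ten_then_double is hypothetical and only serves the illustration.

```cpp
// Well-defined rewrite: every statement modifies x exactly once,
// and each statement is complete before the next one starts.
int add_ten_then_double(int x) {
    x += 10;  // first modification
    x += x;   // second modification, in its own statement
    return x;
}
```

Called with 0, this returns 20 on every conforming compiler, not just by luck.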
(Note: this applies to built-in operators. Operator overloading changes the behaviour back to specified, because overloaded operators copy the syntax from the built-in ones but have the semantics of functions, which means that an overloaded += operator of a custom type that appears in an expression is actually a function call. Therefore, not only are sequence points introduced but the entire ambiguity goes away, the expression becoming equivalent to x.operator+=(x.operator+=(10)), in which the argument (the inner call) must be fully evaluated before the outer call begins. This is probably irrelevant to your question but should be mentioned anyway.)
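A small sketch of the overloaded case, assuming a hypothetical wrapper type Number that is not part of the original question:

```cpp
// Hypothetical wrapper type, introduced only for illustration.
struct Number {
    int value;

    // Adds another Number; 'other' may refer to the same object as *this.
    Number& operator+=(const Number& other) {
        value += other.value;
        return *this;
    }

    // Adds a plain int.
    Number& operator+=(int amount) {
        value += amount;
        return *this;
    }
};

int nested_compound_add() {
    Number x{0};
    // Equivalent to x.operator+=(x.operator+=(10)):
    // the inner call finishes (x.value == 10) before the outer call runs.
    x += (x += 10);
    return x.value;
}
```

Because both += operators are function calls here, nested_compound_add() returns 20, and that result is guaranteed.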
In contrast, the Java version
    import java.io.*;

    class Ideone
    {
        public static void main(String[] args)
        {
            int x = 0;
            x += (x += 10);
            System.out.println(x);
        }
    }
must print 10. This is because Java has neither undefined nor unspecified behaviour with regard to evaluation order. There are no sequence points to be concerned about. See Java Language Specification 15.7. Evaluation Order:
The Java programming language guarantees that the operands of
operators appear to be evaluated in a specific evaluation order,
namely, from left to right.
So in the Java case, x += (x += 10), interpreted from left to right, means that first the value of the left-hand x is read (0), then something is added to it, and that something is 0 + 10. Hence 0 + (0 + 10) = 10.
See also example 15.7.1-2 in the Java specification.
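For comparison, the order that Java prescribes can be emulated in C++ by writing each step as its own statement. This is only a sketch of the Java rule expressed in C++; the function and variable names are hypothetical.

```cpp
// Emulates the left-to-right order Java prescribes for x += (x += 10).
int java_style_compound_add(int x) {
    int saved_left = x;      // the left operand's value is saved first
    x += 10;                 // then the right operand is evaluated
    int right = x;           // value of the inner assignment
    x = saved_left + right;  // finally the outer addition and store
    return x;
}
```

Called with 0, the steps are 0 saved, x set to 10, then 0 + 10 stored, so the function returns 10, matching the Java output.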
Going back to your original example, this also means that the more complex example with the static variable has defined and specified behaviour in Java.
Honestly, I don't know about C# and PHP, but I would guess that both of them have some guaranteed evaluation order as well. C++ (like C) tends to allow much more undefined and unspecified behaviour than most other programming languages. That's neither good nor bad. It's a tradeoff between robustness and efficiency. Choosing the right programming language for a particular task or project is always a matter of analysing tradeoffs.
In any case, expressions with such side effects are bad programming style in all four languages.
One final word:
I found a little bug in C# and Java.
You should not assume that you have found bugs in language specifications or compilers unless you have many years of professional experience as a software engineer.
f(). Just because different languages use the same operator syntax, does not make them operate the same way – Kendalkendall
var x = 0; function f() { x = x + 10; return 1; } console.log(x + f()); outputs 1. – Mouthpart