Does Java have undefined behavior like C++ does?

3

7

Undefined behavior and sequence points

The link above discusses sequence points and side effects in C++.

In short, if there is more than one side effect between two sequence points, the order in which those side effects take place is unspecified.

For example,

int x = 1;
int y = 2;
int z = x++ + y++;

What we can be sure of is that z equals 3. After z is assigned 3, x and y are incremented, but since there are two side effects, we don't know which increment happens first.

Besides, the link above has listed all kinds of sequence points.

My question is, does Java have exactly the same situation? I mean the same kinds of sequence points and the same undefined behaviors?

Clyve answered 6/10, 2016 at 2:9 Comment(2)
Does Java even have undefined behaviour? Related: programmers.stackexchange.com/q/153843/144548 - Sew
No, sequence points, or sequencing, are unique to C++. Java has its own kinds of undefined behavior, though. - Baedeker
H
11

A major difference between "modern" C and C++ and most other popular languages is this: other languages let compilers choose among several corner-case behaviors in an unspecified fashion, whereas the authors of the C and C++ Standards did not want to restrict the languages to platforms where any particular behavioral guarantee could be met easily.

Given a construct like:

int blah(int x)
{
  return x+10 > 20 ? x : 0;
}

Java precisely specifies the behavior for all values of x, including those which would cause integer wraparound; the design of early C compilers for two's-complement machines would yield the same behavior except that machines with different sizes of "int" (16 bit, 36 bit, etc.) would wrap at different places. Machines that use other representations for integers might behave differently, however.
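To make the wraparound concrete, here is a minimal Java sketch of the function above. The class name WrapDemo and the blahChecked variant are made up for illustration and are not part of the original answer; Math.addExact is the standard library method that throws instead of wrapping.

```java
// Hypothetical demo class; only blah() comes from the answer above.
class WrapDemo {
    // Same function as above. Java int arithmetic wraps modulo 2^32,
    // so x + 10 turns negative once x exceeds Integer.MAX_VALUE - 10.
    static int blah(int x) {
        return x + 10 > 20 ? x : 0;
    }

    // Illustrative variant (an assumption, not from the answer):
    // Math.addExact throws ArithmeticException instead of wrapping.
    static int blahChecked(int x) {
        return Math.addExact(x, 10) > 20 ? x : 0;
    }

    public static void main(String[] args) {
        System.out.println(blah(11));                     // 11
        System.out.println(blah(Integer.MAX_VALUE - 10)); // 2147483637
        System.out.println(blah(Integer.MAX_VALUE));      // 0, because x + 10 wrapped
    }
}
```

Every Java implementation has to produce these same results, which is the point being made about Java's precisely specified behavior.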

Further, it would not have been uncommon for even "traditional" C compilers to behave as though the computations were performed on a longer type. Some machines had a few instructions that operated with longer types, and using those instructions and keeping values as longer types could sometimes be cheaper than truncating/wrapping values into the range of "int". On such machines, it would not be surprising for a function like the above to yield x even for values that were within 10 of overflowing. Note that Java tries to minimize behavioral differences among implementations, and thus does not generally allow even that level of behavioral variation.

Modern C, however, goes a step further away from Java. Not only may compilers arbitrarily keep excess precision in integer values; a modern compiler given a function like the above may also reason as follows: the Standard allows anything whatsoever to happen if the function receives a value of x greater than INT_MAX-10, so any code that would have no effect unless such a value is received can be discarded as irrelevant. The net effect is that integer overflow can disrupt the behavior of preceding code in arbitrary ways.

Java is thus two steps removed from Modern C's model of "Undefined Behavior": it rigidly prescribes many more behaviors, and even where behaviors are not rigidly defined, implementations are still limited to choosing from among various possibilities. Unless one makes use of features of the Unsafe class (sun.misc.Unsafe) or links Java with code written in other languages, Java programs will have much more constrained behavior, and even when using such constructs Java programs will still obey laws of time and causality in ways C programs may not.

Hixson answered 6/10, 2016 at 15:18 Comment(0)
R
4

Expression evaluation results on a single thread are completely specified in the Java Language Specification. There is no undefined behaviour in expression evaluation (on a single thread).
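As a small illustration, the sketch below evaluates the expression from the question, plus one that modifies the same variable twice (which is undefined in C and C++). The class and method names are hypothetical. Java evaluates the operands of a binary operator left to right, so both results are fixed.

```java
import java.util.Arrays;

// Hypothetical demo class, for illustration only.
class EvalOrderDemo {
    // The expression from the question; returns {z, x, y}.
    static int[] question() {
        int x = 1, y = 2;
        int z = x++ + y++;   // left operand first, then right operand
        return new int[] { z, x, y };
    }

    // Same variable incremented twice in one expression; returns {r, i}.
    static int[] sameVariable() {
        int i = 1;
        int r = i++ + i++;   // left yields 1 (i becomes 2), right yields 2 (i becomes 3)
        return new int[] { r, i };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(question()));     // [3, 2, 3]
        System.out.println(Arrays.toString(sameVariable())); // [3, 3]
    }
}
```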

In C/C++, "undefined behaviour" means that anything may happen. If your C program contains an expression with undefined behaviour, such as i = i++ + ++i;, the compiler may decide to generate code that formats your hard drive and it would still be compliant with the standard. There's nothing like that in Java either.

The behaviour of some constructs in a multi-threaded application may be non-deterministic if the application is not properly synchronized, but it isn't completely undefined either: it is clear which outcomes are possible, and you simply don't know which of them will occur.
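A rough sketch of that kind of bounded non-determinism is shown below, using a hypothetical RaceDemo class. The unsynchronized counter may lose updates, so its final value varies from run to run, but it stays between 1 and the total number of increments. The AtomicInteger always ends at the exact total.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, for illustration only.
class RaceDemo {
    static int plain = 0;                                    // unsynchronized shared counter
    static final AtomicInteger atomic = new AtomicInteger(); // properly synchronized counter

    // Starts `threads` threads that each increment both counters `perThread`
    // times, waits for them, and returns {plain, atomic}.
    static int[] run(int threads, int perThread) {
        plain = 0;
        atomic.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    plain++;                  // data race: updates may be lost
                    atomic.incrementAndGet(); // atomic: no update is ever lost
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new int[] { plain, atomic.get() };
    }

    public static void main(String[] args) {
        int[] r = run(4, 10_000);
        System.out.println("plain  = " + r[0]); // varies between runs, never above 40000
        System.out.println("atomic = " + r[1]); // always 40000
    }
}
```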

There are also some Java APIs that don't define their result unless you call them in a certain specified way. Their behaviour may vary from one version of the library to another, but it is usually consistent within a given version.

Reverie answered 6/10, 2016 at 2:44 Comment(1)
This answer is correct. I have a very minor nitpick: technically there are expressions such as new int[1000000], new MyClass() or (Integer) 123456 which the JLS allows to have multiple possible results in the same program state. They may either create a new object and complete normally with a reference to that object, or complete abruptly by raising an OutOfMemoryError. The spec says this depends on whether there is "sufficient memory available" but doesn't mandate how that should be determined. Likewise for StackOverflowError on method/constructor calls. - Foret
S
1

If it is not specified, then it's undefined and you have no guarantees.

Java is a language, and therefore a specification exists: the so-called JLS (Java Language Specification).

If something is not specified (like the iteration order of the entries of a HashMap), you cannot expect that order to be kept. For library classes, the Javadoc plays a role similar to that of the JLS.

Correspondingly, if something is specified but the behavior is wrong, then that is a bug.

Your case is specified in the JLS, section 15.18 Additive Operators.
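To illustrate the HashMap point, here is a small sketch with a hypothetical MapOrderDemo class and arbitrary sample keys. The HashMap Javadoc makes no promise about iteration order, while LinkedHashMap documents insertion order and TreeMap documents sorted key order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; the keys are arbitrary sample data.
class MapOrderDemo {
    static final List<String> KEYS = List.of("delta", "alpha", "charlie", "bravo");

    // Inserts KEYS into the given map in list order and returns the
    // keys in the order the map iterates over them.
    static List<String> keysOf(Map<String, Integer> map) {
        for (int i = 0; i < KEYS.size(); i++) {
            map.put(KEYS.get(i), i);
        }
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysOf(new HashMap<>()));       // all four keys, order not guaranteed
        System.out.println(keysOf(new LinkedHashMap<>())); // [delta, alpha, charlie, bravo]
        System.out.println(keysOf(new TreeMap<>()));       // [alpha, bravo, charlie, delta]
    }
}
```

Code that needs a particular order should pick a map type that documents it, rather than rely on whatever order a given HashMap version happens to produce.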

Scansorial answered 6/10, 2016 at 2:27 Comment(5)
This is misleading. There is a major difference between unspecified behaviour and undefined behaviour. Unspecified behaviour means that there are two or more valid program states which can result from performing an operation. Undefined behaviour means there must be no program states in which the operation will be performed, and the programmer, not the compiler, is responsible for ensuring this. The iteration order of a HashMap is unspecified, but a compiler is not allowed by the spec to assume that correctly-written code will never iterate over a HashMap. - Foret
Unspecified behaviour may also mean that the behaviour can differ between JDK/JRE versions, JDK/JRE vendors and various external conditions without breaking the specification. I admit I do not completely get your point from the phrase "must be no program states". If I understand you correctly, you use "must not" as an absolute prohibition (of the operation). My interpretation of "must not" is taken from ietf.org/rfc/rfc2119.txt . I think the compiler has the responsibility to avoid such an absolute prohibition. I am not sure I understood you correctly. Kind regards. - Scansorial
In the case of undefined behaviour, e.g. writing to an array out of bounds in C, the spec says that the compiler may assume that no correct program ever performs an out-of-bounds write. That is, there is no valid program state in which an out-of-bounds write occurs, so the compiler is allowed to assume that any state where this would happen must be unreachable, even if the compiler cannot prove it (or even if the compiler can prove that it is reachable). The behaviour of an out-of-bounds write is not merely unspecified in the sense that there are two or more things the spec says can ... - Foret
... happen. It is undefined in the sense that the compiler is permitted by the spec to assume that the programmer has ensured that an out-of-bounds write will never happen, and if an out-of-bounds write can in fact occur in reachable code, then the behaviour of the whole program is undefined, not just the behaviour of the write itself. That means even operations which occur before the out-of-bounds write may behave differently at runtime from how the spec says those operations should behave, even though those other operations' behaviour is not itself undefined. This is sometimes described ... - Foret
... by saying that "undefined behaviour allows for time travel". In contrast, unspecified behaviour just means that the spec doesn't guarantee just one specific behaviour for the particular operation. A program exhibiting unspecified behaviour does not mean that the whole program's behaviour is unspecified, only the result of that one unspecified operation. Java has some unspecified behaviour but it has no undefined behaviour. - Foret
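For contrast with the C case described in these comments, here is a minimal sketch of an out-of-bounds write in Java, using a hypothetical BoundsDemo class. The faulty write has exactly one defined outcome, an ArrayIndexOutOfBoundsException, and the write that preceded it remains intact.

```java
// Hypothetical demo class, for illustration only.
class BoundsDemo {
    static String writePastEnd() {
        int[] data = new int[3];
        data[0] = 42;          // happens before the faulty write and stays visible
        try {
            data[3] = 7;       // index 3 is out of bounds for length 3
            return "no exception";
        } catch (ArrayIndexOutOfBoundsException e) {
            return "caught, data[0] = " + data[0];
        }
    }

    public static void main(String[] args) {
        System.out.println(writePastEnd()); // caught, data[0] = 42
    }
}
```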
