Does the Java compiler optimize an unnecessary ternary operator?
I’ve been reviewing code where some coders have been using redundant ternary operators “for readability.” Such as:

boolean val = (foo == bar && foo1 != bar) ? true : false;

Obviously it would be better to just assign the expression’s result to the boolean variable, but does the compiler care?
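For illustration, a minimal sketch comparing the two forms (the helper methods and sample values are hypothetical, just to make the snippet self-contained):

```java
public class TernaryDemo {
    // The redundant form from the review:
    static boolean redundant(int foo, int bar, int foo1) {
        return (foo == bar && foo1 != bar) ? true : false;
    }

    // Assigning the expression's result directly:
    static boolean direct(int foo, int bar, int foo1) {
        return foo == bar && foo1 != bar;
    }

    public static void main(String[] args) {
        // Both forms always produce the same value.
        System.out.println(redundant(1, 2, 3) == direct(1, 2, 3)); // prints true
        System.out.println(redundant(2, 2, 3) == direct(2, 2, 3)); // prints true
    }
}
```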

Hamer answered 2/2, 2019 at 18:6 Comment(11)
Out of curiosity, do these same coders do stuff like if( foo == true ) "for readability"?Incendiarism
That may or may not be part of the original ternary that sparked this question... but that’s outside the scope of this question. :)Hamer
I'm sorry. I too have to work with coders who do this. You lose all faith very quickly, in my experience. ;)Incendiarism
There is really no way of knowing other than trying it with a java decompiler or bytecode viewer, and even then it will probably depend on compiler version. I mean, suppose someone shows up and says "no it won't", why should you believe them? My guess is that it will, and if it doesn't then the JIT most probably will, but take my word with a grain of salt.Ideograph
Do they do that with all boolean expressions? It's quite strange. I'd be curious if they do things like if ( isValid(input) ? true : false ) too.Acclivity
I would worry less about the compiler and more about coding standards and code reviews. The expressions are never* going to be your hot spots when profiling, so my primary concern would be that such expressions are actually less readable, rather than that they might slow the program down by an instruction or two.Bottrop
This is only more readable if you're incompetent in Boolean logic. ...Maybe not even then.Kaylenekayley
@Paulpro I suspect it's more about making the code read more like text: int x = y reads naturally as "set the integer x to y," but bool z = x == y reads like "set the boolean z to x equal to y," which sounds funny.Bennir
There is no such thing as "the Java compiler". In order for your question to be answerable, you need to specify which Java compiler (and ideally which precise version) you are talking about. Are you talking about ECJ? GCJ? Jikes? JScp?Verbosity
I bet they write if (foo == false) because they think ! is hard to spot in !foo...Pentheam
@Paulpro if ( (isValid(input) ? true : false) == true )Dulcedulcea
I find that unnecessary usage of the ternary operator tends to make the code more confusing and less readable, contrary to the original intention.

That being said, the compiler's behaviour in this regard can easily be tested by comparing the bytecode it emits in each case.
Here are two mock classes to illustrate this:

Case I (without the ternary operator):

class Class {

    public static void foo(int a, int b, int c) {
        boolean val = (a == c && b != c);
        System.out.println(val);
    }

    public static void main(String[] args) {
       foo(1,2,3);
    }
}

Case II (with the ternary operator):

class Class {

    public static void foo(int a, int b, int c) {
        boolean val = (a == c && b != c) ? true : false;
        System.out.println(val);
    }

    public static void main(String[] args) {
       foo(1,2,3);
    }
}

Bytecode for foo() method in Case I:

       0: iload_0
       1: iload_2
       2: if_icmpne     14
       5: iload_1
       6: iload_2
       7: if_icmpeq     14
      10: iconst_1
      11: goto          15
      14: iconst_0
      15: istore_3
      16: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
      19: iload_3
      20: invokevirtual #3                  // Method java/io/PrintStream.println:(Z)V
      23: return

Bytecode for foo() method in Case II:

       0: iload_0
       1: iload_2
       2: if_icmpne     14
       5: iload_1
       6: iload_2
       7: if_icmpeq     14
      10: iconst_1
      11: goto          15
      14: iconst_0
      15: istore_3
      16: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
      19: iload_3
      20: invokevirtual #3                  // Method java/io/PrintStream.println:(Z)V
      23: return

Note that in both cases the bytecode is identical, i.e. the compiler disregards the redundant ternary operator when compiling the value of the val boolean.


EDIT:

The conversation around this question has gone in several directions.
As shown above, in both cases (with or without the redundant ternary) the compiled java bytecode is identical.
Whether this can be regarded as an optimization by the Java compiler depends somewhat on your definition of optimization. In some respects, as pointed out multiple times in other answers, it makes sense to argue that no, it isn't an optimization so much as the fact that in both cases the generated bytecode is the simplest set of stack operations that performs this task, regardless of the ternary.

However regarding the main question:

Obviously it would be better to just assign the expression’s result to the boolean variable, but does the compiler care?

The simple answer is no. The compiler doesn't care.

Baltazar answered 2/2, 2019 at 18:45 Comment(3)
+1 for "I find that unnecessary usage of the ternary operator tends to make the code more confusing and less readable, contrary to the original intention." I'd move that to the beginning, if I were you.Kaylenekayley
By "readability" the authors possibly meant "understandability for absolute noobs", which may be a valid point. if(myBoolean) is somewhat confusing for noobs whereas if(myBoolean == true) is straight-forward.Nauplius
You should never infer language behavior from a compiler example. It may be that optimization is optional and you could get different results on different compilers. Always refer to the language specification when making these decisions.Vocalist
Contrary to the answers of Pavel Horal, Codo and yuvgin I argue that the compiler does NOT optimize away (or disregard) the ternary operator. (Clarification: I refer to the Java to Bytecode compiler, not the JIT)

See the test cases.

Class 1: Evaluate boolean expression, store it in a variable, and return that variable.

public static boolean testCompiler(final int a, final int b)
{
    final boolean c = ...;
    return c;
}

So, for different boolean expressions, we inspect the bytecode:

1. Expression: a == b

Bytecode

   0: iload_0
   1: iload_1
   2: if_icmpne     9
   5: iconst_1
   6: goto          10
   9: iconst_0
  10: istore_2
  11: iload_2
  12: ireturn
2. Expression: a == b ? true : false

Bytecode

   0: iload_0
   1: iload_1
   2: if_icmpne     9
   5: iconst_1
   6: goto          10
   9: iconst_0
  10: istore_2
  11: iload_2
  12: ireturn
3. Expression: a == b ? false : true

Bytecode

   0: iload_0
   1: iload_1
   2: if_icmpne     9
   5: iconst_0
   6: goto          10
   9: iconst_1
  10: istore_2
  11: iload_2
  12: ireturn

Cases (1) and (2) compile to exactly the same bytecode, not because the compiler optimizes away the ternary operator, but because it essentially has to evaluate that trivial ternary operator every time: at the bytecode level it must specify whether to store true or false. To verify that, look at case (3): it is exactly the same bytecode, except that lines 5 and 9 are swapped.

Why, then, does a == b ? true : false produce a == b when decompiled? That is the decompiler's choice: it selects the simplest equivalent source form.

Furthermore, based on the "Class 1" experiment, it would be reasonable to assume that a == b ? true : false is translated to bytecode exactly like a == b. However, this is not true. To test that, we examine the following "Class 2", whose only difference from "Class 1" is that it does not store the boolean result in a variable but returns it immediately.

Class 2: Evaluate a boolean expression and return the result (without storing it in a variable)

public static boolean testCompiler(final int a, final int b)
{
    return ...;
}
1. Expression: a == b

Bytecode:

   0: iload_0
   1: iload_1
   2: if_icmpne     7
   5: iconst_1
   6: ireturn
   7: iconst_0
   8: ireturn
2. Expression: a == b ? true : false

Bytecode

   0: iload_0
   1: iload_1
   2: if_icmpne     9
   5: iconst_1
   6: goto          10
   9: iconst_0
  10: ireturn
3. Expression: a == b ? false : true

Bytecode

   0: iload_0
   1: iload_1
   2: if_icmpne     9
   5: iconst_0
   6: goto          10
   9: iconst_1
  10: ireturn

Here it is obvious that the a == b and a == b ? true : false expressions are compiled differently, as cases (1) and (2) produce different bytecodes (cases (2) and (3), as expected, have only their lines 5,9 swapped).

At first I found this surprising, as I was expecting all 3 cases to be the same (excluding the swapped lines 5 and 9 of case (3)). When the compiler encounters a == b, it evaluates the expression and returns immediately, in contrast to a == b ? true : false, where it uses a goto to reach the ireturn. I understand that this is done to leave room for potential statements inside the 'true' branch of the ternary operator, between the if_icmpne check and the goto. Even though in this case the branch is just the constant true, the compiler handles it as it would in the general case, where a more complex block could be present.
On the other hand, the "Class 1" experiment obscured that fact: since the true branch also contained istore and iload, not only ireturn, a goto was required either way, resulting in exactly the same bytecode in cases (1) and (2).
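The point about the general case can also be seen at the source level: once the true branch is anything other than a bare constant, the branch-then-join shape becomes unavoidable. A hypothetical sketch (the method names are illustrative only):

```java
public class BranchShape {
    static int calls = 0;

    // A true branch with a side effect: it cannot be folded away,
    // so the compiler must emit its code between the comparison and
    // the join point -- the same shape it uses for `? true : false`.
    static boolean expensiveCheck() {
        calls++;
        return true;
    }

    static boolean test(int a, int b) {
        return a == b ? expensiveCheck() : false;
    }

    public static void main(String[] args) {
        System.out.println(test(1, 2)); // prints false; expensiveCheck() not called
        System.out.println(test(3, 3)); // prints true; expensiveCheck() called once
    }
}
```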

As a note on the test environment, these bytecodes were produced with the latest Eclipse (4.10), which uses its own ECJ compiler, as opposed to the javac that IntelliJ IDEA uses.

However, reading the javac-produced bytecode in the other answers (which use IntelliJ), I believe the same logic applies there too, at least for the "Class 1" experiment, where the value is stored rather than returned immediately.

Finally, as already pointed out in other answers (such as those by supercat and jcsahnwaldt), both in this thread and in other SO questions, the heavy optimizing is done by the JIT compiler, not by the Java-to-bytecode compiler, so these inspections, while informative about the bytecode translation, are not a good measure of how the final optimized code will execute.

Complement: jcsahnwaldt's answer compares javac's and ECJ's bytecode output for similar cases.

(As a disclaimer, I have not studied the Java compiling or disassembly that much to actually know what it does under the hood; my conclusions are mainly based on the results of the above experiments.)

Kelby answered 2/2, 2019 at 20:32 Comment(20)
Case 3 has clearly undergone some optimizations in the compiler: there is never a value on the JVM stack that represents the value of the expression a == b; it has been optimized away. The fact that such optimizations generally happen whenever the result of a boolean operator is used as a condition does not stop it from being an optimization. And the very same optimization is at work in case 2.Wallen
If an optimization had not been in place, the code would first produce the boolean value of a == b on the stack (involving a conditional jump and two load-constant instructions), and then there would be another conditional jump on the resulting value to to other load-constant instructions that represent the two branches of the ternary operator.Wallen
The concept of "evaluate a conditional and branch to either of two places based if it is true or false" may be evaluated directly without decomposing it into "evaluate a conditional branch to yield a bool, and then branch based upon that."Proudman
@supercat: Yes, it may. That is an example of an optimization that the bytcode compiler does perform.Wallen
@HenningMakholm: A compiler could have separate logic for evaluating a bool expression that occurs in a context where it controls a branch, versus where it generates a value. The language spec need not distinguish the contexts from a "user" perspective, but for code generation purposes it's trivial and isn't really an "optimization".Proudman
@supercat: You're describing a simple compiler optimization that the bytecode compiler performs. It is a good optimization, worth performing in the compiler. I don't understand why you keep claiming that it is not an optimization. It is semantically valid and results in better code than a naive translation would produce. What other criteria do you have for using the word "optimization"?Wallen
Exactly. a == b ? true : false is a == b. They literally mean the same thing, and there are no additional operations to perform in the former case. This is not optimization, it's just a case of two entirely equivalent statements producing the same translated code, which ought not to be surprising. Java code is a description of a program; it is an abstraction. It is not, by default, a one-to-one mapping of English words to CPU or JVM instructions. There is no logical branch here, so no expectation of a practical one exists.Alimentary
@LightnessRacesinOrbit: Just because an optimization produces code with the correct semantics doesn't mean it's not an optimization. On the contrary, it is inherent in the concept of optimization that it produces equivalent code. If it didn't, it would not be an "optimization" but a "compiler bug".Wallen
@HenningMakholm I'm aware of that; it's not what I was saying. My point is that not every part of the process of translating source code to an actual program is "optimisation" and I think you've drawn the line between "just the normal translation process" and "optimisation" in the wrong place. An optimisation produces logic that, though it has the same semantics, is notably dissimilar to that of the original code in its approach. This does not - it's literally identical. Though it's really just, ironically, an argument over semantics.Alimentary
@LightnessRacesinOrbit: Of course not every part of translation is an optimization. But everything that happens during translation and leads to better code than a naive translation would produce is by definition an optimization.Wallen
@HenningMakholm The naive translation is what we get. That's what I'm saying. There is no more naive translation, at least not unless it were really contrived (like introducing branches where no logical branch was ever called for! the compiler would have to deliberately introduce this additional work)Alimentary
@LightnessRacesinOrbit: No, what is shown here is not a naive translation. A naive translation would translate every expression of type boolean to code that produces a boolean value on the stack, independently of the context. To do anything differently from a native translation is by definition an "optimization" (unless it's a bug).Wallen
@HenningMakholm: In Java, something like while (condition) statement1; gets processed effectively as label1: conditionalBranch(condition, label2, label3); label2: statement1; goto label1; label3:. The processing of conditionalBranch processes certain operators by calling conditionalBranch on the operands, e.g. if cond1 is constant true, that operation will become an unconditional branch to label2; if constant false, that operation would be replaced by an unconditional branch to label3.Proudman
If condition is cond1 ? cond2 : cond3; it would be evaluated by evaluating cond1 and going to either a function that branches based on cond2 or one that branches based on cond3. The Standard requires that implementations squawk if the body of a while() is not reachable, and while it might be possible to make such determinations while evaluating ?: constructs based on values, it's easier to evaluate a ?: within a conditional expression as conditionally executing one of two two-way branches.Proudman
@supercat: The language specification requires that a very precisely defined constant propagation algorithm be used for the "unreachable code" error. Nothing smarter nor dumber is allowed for this purpose. It's not something that's a good idea to do based on which actual code you generate; that would would block too many little code generation optimizations you might want to do at that phase.Wallen
EDIT: I now see the edit wasn't made by the OP. I will suggest to edit myself out. If this is accepted I will gladly undo the dv. I at no point claimed this was a compiler optimization, I rather compared the bytecode to show that the compiler disregards the redundant ternary op. As for this answer, I think it makes some very solid points.Baltazar
@Baltazar No it wasn't, though when initially posting the answer I had your answer in the "optimization category" in my mind as well. I am not sure I understand the difference there: if it is the compiler that disregards the redundant ternary operator isn't it an optimization? In any case, we still result in different conclusions. I understand, it does not disregard the redundant ternary operator. [1/2, too long comment]Kelby
I get that from the "Class 2" experiment, where when immediately returning the value, the a == b and a == b ? true : false generate different bytecode. The "Class 1", due to first storing the variable instead of immediately returning, it seems it just happens to produce the same bytecode in both cases (which was a surprise for me too, that's the "weird" exclamation about in my original answer). I will edit the "Class 2" conclusions as it seems they are not clear. I 'll happily accept an edit that reflects that. [2/2, too long comment]Kelby
@tryman: Given something like if (cond1 ? cond2 : cond3) X else Y;, cond1 selects between two branching actions: branch to X or Y based upon cond2, or branch to X or Y based upon cond3. A "conditional" branch based on a constant is of course a simple jump. I suppose replacing a branch to an unconditional branch with a branch to the target may be an "optimization", but it's a trivial one. Given bool1 = bool2 ? true : false;, bool2 selects between two value-producing actions: load true or load false.Proudman
I suppose a more interesting experiment might be something like while (a==b ? false : false) or someBool = (a==b) ? false:false;. I don't have javac handy, but those cases would be interesting since the body of the loop would be statically unreachable but I don't think cond ? const1 : const1 is considered a constant expression.Proudman
Yes, the Java compiler does optimize. It can be easily verified:

public class Main1 {
  public static boolean test(int foo, int bar, int baz) {
    return foo == bar && bar == baz ? true : false;
  }
}

After javac Main1.java and javap -c Main1:

  public static boolean test(int, int, int);
    Code:
       0: iload_0
       1: iload_1
       2: if_icmpne     14
       5: iload_1
       6: iload_2
       7: if_icmpne     14
      10: iconst_1
      11: goto          15
      14: iconst_0
      15: ireturn

public class Main2 {
  public static boolean test(int foo, int bar, int baz) {
    return foo == bar && bar == baz;
  }
}

After javac Main2.java and javap -c Main2:

  public static boolean test(int, int, int);
    Code:
       0: iload_0
       1: iload_1
       2: if_icmpne     14
       5: iload_1
       6: iload_2
       7: if_icmpne     14
      10: iconst_1
      11: goto          15
      14: iconst_0
      15: ireturn

Both examples end up with exactly the same bytecode.

Flack answered 2/2, 2019 at 18:38 Comment(0)
The javac compiler does not generally attempt to optimize code before outputting bytecode. Instead, it relies upon the Java virtual machine (JVM) and its just-in-time (JIT) compiler, which converts the bytecode to machine code, to recognize situations where a construct is equivalent to a simpler one.

This makes it much easier to determine if an implementation of a Java compiler is working correctly, since most constructs can only be represented by one predefined sequence of bytecodes. If a compiler produces any other bytecode sequence, it is broken, even if that sequence would behave in the same fashion as the original.

Examining the bytecode output of the javac compiler is not a good way of judging whether a construct is likely to execute efficiently or inefficiently. It would seem likely that there may be some JVM implementation where constructs like (someCondition ? true : false) would perform worse than (someCondition), and some where they would perform identically.
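If performance is the real concern, it can be measured directly rather than inferred from bytecode. A rough sketch of such a measurement (not a rigorous benchmark; a harness like JMH handles JIT warm-up and dead-code elimination properly):

```java
public class RoughTiming {
    static boolean plain(int a, int b)   { return a == b; }
    static boolean ternary(int a, int b) { return a == b ? true : false; }

    public static void main(String[] args) {
        final int n = 10_000_000;
        // XOR-accumulate the results so the JIT cannot discard the loops entirely.
        boolean acc1 = false, acc2 = false;

        long t0 = System.nanoTime();
        for (int i = 0; i < n; i++) acc1 ^= plain(i, i & 7);
        long t1 = System.nanoTime();
        for (int i = 0; i < n; i++) acc2 ^= ternary(i, i & 7);
        long t2 = System.nanoTime();

        System.out.println("plain:   " + (t1 - t0) / 1_000_000 + " ms (" + acc1 + ")");
        System.out.println("ternary: " + (t2 - t1) / 1_000_000 + " ms (" + acc2 + ")");
    }
}
```

A single run like this proves very little either way; it only shows how one would begin to check the claim on a particular JVM.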

Proudman answered 2/2, 2019 at 20:59 Comment(2)
Examining the bytecode output does show that the compiler performs certain optimizations. It doesn't do very many kinds of optimization, but the basic optimization of representing the value of a boolean expression either as control flow or as a value on the stack (and emitting code to convert between the representations if the expression wants to produce a representation different from what the context wants to consume) is done by the bytecode compiler. (Yes, this is an optimization; a "simplest possible correct" translation from Java to bytecode wouldn't do it).Wallen
@Proudman I believe this question is relevant and generally supports your case: Optimization by java compiler?Kelby
In IntelliJ, I've compiled your code and then opened the class file, which is automatically decompiled. The result is:

boolean val = foo == bar && foo1 != bar;

So yes, the Java compiler optimizes it.

Riobard answered 2/2, 2019 at 18:38 Comment(1)
I don't think that's an accurate way to test it. The decompiler could make some adjustments to the output to make it easier to read. Different decompilers can give different results too. It makes more sense to compare the actual bytecode itself.Lownecked
I'd like to synthesize the excellent information given in the previous answers.

Let's look at what Oracle's javac and Eclipse's ecj do with the following code:

boolean  valReturn(int a, int b) { return a == b; }
boolean condReturn(int a, int b) { return a == b ? true : false; }
boolean   ifReturn(int a, int b) { if (a == b) return true; else return false; }

void  valVar(int a, int b) { boolean c = a == b; }
void condVar(int a, int b) { boolean c = a == b ? true : false; }
void   ifVar(int a, int b) { boolean c; if (a == b) c = true; else c = false; }

(I simplified your code a bit - one comparison instead of two - but the behavior of the compilers described below is essentially the same, including their slightly different results.)

I compiled the code with javac and ecj and then decompiled it with Oracle's javap.

Here's the result for javac (I tried javac 9.0.4 and 11.0.2 - they generate exactly the same code):

boolean valReturn(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     9
     5: iconst_1
     6: goto          10
     9: iconst_0
    10: ireturn

boolean condReturn(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     9
     5: iconst_1
     6: goto          10
     9: iconst_0
    10: ireturn

boolean ifReturn(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     7
     5: iconst_1
     6: ireturn
     7: iconst_0
     8: ireturn

void valVar(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     9
     5: iconst_1
     6: goto          10
     9: iconst_0
    10: istore_3
    11: return

void condVar(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     9
     5: iconst_1
     6: goto          10
     9: iconst_0
    10: istore_3
    11: return

void ifVar(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     10
     5: iconst_1
     6: istore_3
     7: goto          12
    10: iconst_0
    11: istore_3
    12: return

And here's the result for ecj (version 3.16.0):

boolean valReturn(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     7
     5: iconst_1
     6: ireturn
     7: iconst_0
     8: ireturn

boolean condReturn(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     9
     5: iconst_1
     6: goto          10
     9: iconst_0
    10: ireturn

boolean ifReturn(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     7
     5: iconst_1
     6: ireturn
     7: iconst_0
     8: ireturn

void valVar(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     9
     5: iconst_1
     6: goto          10
     9: iconst_0
    10: istore_3
    11: return

void condVar(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     9
     5: iconst_1
     6: goto          10
     9: iconst_0
    10: istore_3
    11: return

void ifVar(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     10
     5: iconst_1
     6: istore_3
     7: goto          12
    10: iconst_0
    11: istore_3
    12: return

For five of the six functions, both compilers generate exactly the same code. The only difference is in valReturn: javac generates a goto to an ireturn, but ecj generates an ireturn. For condReturn, they both generate a goto to an ireturn. For ifReturn, they both generate an ireturn.

Does that mean that one of the compilers optimizes one or more of these cases? One might think that javac optimizes the ifReturn code but fails to optimize valReturn and condReturn, while ecj optimizes ifReturn and valReturn but fails to optimize condReturn.

But I don't think that's true. Java source code compilers basically don't optimize code at all. The compiler that does optimize the code is the JIT (just-in-time) compiler (the part of the JVM that compiles byte code to machine code), and the JIT compiler can do a better job if the byte code is relatively simple, i.e. has not been optimized.

In a nutshell: No, Java source code compilers do not optimize this case, because they don't really optimize anything. They do what the specifications require them to do, but nothing more. The javac and ecj developers simply chose slightly different code generation strategies for these cases (presumably for more or less arbitrary reasons).

See these Stack Overflow questions for a few more details.

(Case in point: both compilers nowadays ignore the -O flag. The ecj options explicitly say so: -O: optimize for execution time (ignored). javac doesn't even mention the flag anymore and just ignores it.)

Gilkey answered 3/2, 2019 at 8:14 Comment(0)
