Java for loop performance question
Asked Answered
A

13

22

considering this example:

public static void main(final String[] args) {
    final List<String> myList = Arrays.asList("A", "B", "C", "D");
    final long start = System.currentTimeMillis();
    for (int i = 1000000; i > myList.size(); i--) {
        System.out.println("Hello");
    }
    final long stop = System.currentTimeMillis();
    System.out.println("Finish: " + (stop - start));
}

vs

public static void main(final String[] args) {
    final List<String> myList = Arrays.asList("A", "B", "C", "D");
    final long start = System.currentTimeMillis();
    final int size = myList.size();
    for (int i = 1000000; i > size; i--) {
        System.out.println("Hello");
    }
    final long stop = System.currentTimeMillis();
    System.out.println("Finish: " + (stop - start));
}

Will this make any diffrence ? On my machine the second one seems to perform faster but i don't know if it is really accurate. Will the compiler optimze this code ? I could think that he would if the loop condition is an immutable object (e.g. String array).

Acklin answered 4/3, 2010 at 23:10 Comment(2)
If you want to measure elapsed time (and want it to be accurate) you should use System.nanoTime() BTW.Chatoyant
Will remember that good point.Acklin
S
25

If you want to test something like this, you really must optimize your microbenchmark to measure what you care about.

First, make the loop inexpensive but impossible to skip. Computing a sum usually does the trick.

Second, compare the two timings.

Here's some code that does both:

import java.util.*;

public class Test {

public static long run1() {
  final List<String> myList = Arrays.asList("A", "B", "C", "D");
  final long start = System.nanoTime();
  int sum = 0;
  for (int i = 1000000000; i > myList.size(); i--) sum += i;
  final long stop = System.nanoTime();
  System.out.println("Finish: " + (stop - start)*1e-9 + " ns/op; sum = " + sum);
  return stop-start;
}

public static long run2() {
  final List<String> myList = Arrays.asList("A", "B", "C", "D");
  final long start = System.nanoTime();
  int sum = 0;
  int limit = myList.size();
  for (int i = 1000000000; i > limit; i--) sum += i;
  final long stop = System.nanoTime();
  System.out.println("Finish: " + (stop - start)*1e-9 + " ns/op; sum = " + sum);
  return stop-start;
}

public static void main(String[] args) {
  for (int i=0 ; i<5 ; i++) {
    long t1 = run1();
    long t2 = run2();
    System.out.println("  Speedup = " + (t1-t2)*1e-9 + " ns/op\n");
  }
}

}

And if we run it, on my system we get:

Finish: 0.481741256 ns/op; sum = -243309322
Finish: 0.40228402 ns/op; sum = -243309322
  Speedup = 0.079457236 ns/op

Finish: 0.450627151 ns/op; sum = -243309322
Finish: 0.43534661700000005 ns/op; sum = -243309322
  Speedup = 0.015280534 ns/op

Finish: 0.47738474700000005 ns/op; sum = -243309322
Finish: 0.403698331 ns/op; sum = -243309322
  Speedup = 0.073686416 ns/op

Finish: 0.47729349600000004 ns/op; sum = -243309322
Finish: 0.405540508 ns/op; sum = -243309322
  Speedup = 0.071752988 ns/op

Finish: 0.478979617 ns/op; sum = -243309322
Finish: 0.36067492700000003 ns/op; sum = -243309322
  Speedup = 0.11830469 ns/op

which means that the overhead of the method call is approximately 0.1 ns. If your loop does things that take no more than 1-2 ns, then you should care about this. Otherwise, don't.

Syllogism answered 5/3, 2010 at 1:1 Comment(0)
V
10

Personally, I don't think you can draw any meaningful conclusions from a contrived example like this.

But if you really want to know, why not use javap to decompile the code and see what's different? Why guess about what the compiler's doing when you can see for yourself without asking here?

Byte code for the first case:

public class Stackoverflow extends java.lang.Object{
public Stackoverflow();
  Code:
   0:   aload_0
   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   4:   return

public static void main(java.lang.String[]);
  Code:
   0:   iconst_4
   1:   anewarray       #2; //class java/lang/String
   4:   dup
   5:   iconst_0
   6:   ldc     #3; //String A
   8:   aastore
   9:   dup
   10:  iconst_1
   11:  ldc     #4; //String B
   13:  aastore
   14:  dup
   15:  iconst_2
   16:  ldc     #5; //String C
   18:  aastore
   19:  dup
   20:  iconst_3
   21:  ldc     #6; //String D
   23:  aastore
   24:  invokestatic    #7; //Method java/util/Arrays.asList:([Ljava/lang/Object;)Ljava/util/List
   27:  astore_1
   28:  invokestatic    #8; //Method java/lang/System.currentTimeMillis:()J
   31:  lstore_2
   32:  ldc     #9; //int 1000000
   34:  istore  4
   36:  iload   4
   38:  aload_1
   39:  invokeinterface #10,  1; //InterfaceMethod java/util/List.size:()I
   44:  if_icmple       61
   47:  getstatic       #11; //Field java/lang/System.out:Ljava/io/PrintStream;
   50:  ldc     #12; //String Hello
   52:  invokevirtual   #13; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   55:  iinc    4, -1
   58:  goto    36
   61:  invokestatic    #8; //Method java/lang/System.currentTimeMillis:()J
   64:  lstore  4
   66:  getstatic       #11; //Field java/lang/System.out:Ljava/io/PrintStream;
   69:  new     #14; //class java/lang/StringBuilder
   72:  dup
   73:  invokespecial   #15; //Method java/lang/StringBuilder."<init>":()V
   76:  ldc     #16; //String Finish:
   78:  invokevirtual   #17; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/la
   81:  lload   4
   83:  lload_2
   84:  lsub
   85:  invokevirtual   #18; //Method java/lang/StringBuilder.append:(J)Ljava/lang/StringBuilder;
   88:  invokevirtual   #19; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
   91:  invokevirtual   #13; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   94:  return
}

Byte code for the second case:

public class Stackoverflow extends java.lang.Object{
public Stackoverflow();
  Code:
   0:   aload_0
   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   4:   return

public static void main(java.lang.String[]);
  Code:
   0:   iconst_4
   1:   anewarray       #2; //class java/lang/String
   4:   dup
   5:   iconst_0
   6:   ldc     #3; //String A
   8:   aastore
   9:   dup
   10:  iconst_1
   11:  ldc     #4; //String B
   13:  aastore
   14:  dup
   15:  iconst_2
   16:  ldc     #5; //String C
   18:  aastore
   19:  dup
   20:  iconst_3
   21:  ldc     #6; //String D
   23:  aastore
   24:  invokestatic    #7; //Method java/util/Arrays.asList:([Ljava/lang/Object;)Ljava/util/List;
   27:  astore_1
   28:  invokestatic    #8; //Method java/lang/System.currentTimeMillis:()J
   31:  lstore_2
   32:  aload_1
   33:  invokeinterface #9,  1; //InterfaceMethod java/util/List.size:()I
   38:  istore  4
   40:  ldc     #10; //int 1000000
   42:  istore  5
   44:  iload   5
   46:  iload   4
   48:  if_icmple       65
   51:  getstatic       #11; //Field java/lang/System.out:Ljava/io/PrintStream;
   54:  ldc     #12; //String Hello
   56:  invokevirtual   #13; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   59:  iinc    5, -1
   62:  goto    44
   65:  invokestatic    #8; //Method java/lang/System.currentTimeMillis:()J
   68:  lstore  5
   70:  getstatic       #11; //Field java/lang/System.out:Ljava/io/PrintStream;
   73:  new     #14; //class java/lang/StringBuilder
   76:  dup
   77:  invokespecial   #15; //Method java/lang/StringBuilder."<init>":()V
   80:  ldc     #16; //String Finish:
   82:  invokevirtual   #17; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   85:  lload   5
   87:  lload_2
   88:  lsub
   89:  invokevirtual   #18; //Method java/lang/StringBuilder.append:(J)Ljava/lang/StringBuilder;
   92:  invokevirtual   #19; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
   95:  invokevirtual   #13; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   98:  return
}

There are differences, but I'm not sure I can make a definitive statement about their effect on performance.

I'd code the second one, because it would mean (on the face of it) one method call as opposed to one per loop iteration. I don't know if the compiler can optimize it away, but I'm certain that I can do it pretty easily. So I do, regardless of its effect on wall time.

Viquelia answered 4/3, 2010 at 23:18 Comment(0)
S
10

I once worked on a project where my first task was to track down some insanely slow code (it was on a brand new 486 machine and it took about 20 minutes to execute):

for(size_t i = 0; i < strlen(data); i++)
{
    // do something with data[i]
}

The solution was (got it down to something like two minutes or less):

size_t length = strlen(data);

for(int i = 0; i < length; i++)
{
    // do something with data[i]
}

The issue is that "data" was over 1 million characters, and strlen has to count each one all the time.

In the case of Java the "size()" method probably returns a variable, and as such, the VM will inline it. On a VM like the one on Android it probably does not. So the answer is "it depends".

My personal preference is to never call a method more than one time if it is supposed to return the same result each time. That way if the method does involve a calculation it is performed only one time and then it is never an issue.

Spiraea answered 4/3, 2010 at 23:28 Comment(3)
Good points, but I would point out that the case of strlen is a bit different because it actually has to traverse data to figure out its length. So, inlining won't save you. You definitely need to save the result in that case. However, as you say, if you just always save the value, the optimization is guaranteed.Radiative
Err and this is java and the string length is cached in the object.Vista
I know it is Java... at the time the code I was showing was written Java was not public. The point of the story was that if a method (or function) is supposed to return the same value each time that there is no point in calling it over and over again... and infact it can be "harmful" since it may involve costly calculations each time it is called. In this case (as I said in my post!) it is likely just a variable return (infact I cannot imagine a case where it would not be just a variable return for this) but even on some VMs (like the one Andoid uses) the call may not be inlined.Spiraea
S
5

Note that the javac compiler has about nothing to do with optimization. The "important" compiler is the JIT compiler which lives within the JVM.

In your example, in the most generic case, the myList.size() is a simple method dispatch, which returns the contents of a field in the List instance. This is negligible work compared to what is implied by System.out.println("Hello") (at least one system call, hence hundreds of clock cycles, compared to no more than a dozen for the method dispatch). I very much doubt that your code could exhibit a meaningful difference in speed.

On a more general basis, the JIT compiler should recognize this call to size() as a call to a known instance, so that it may perform the method dispatch with a direct function call (which is faster), or even inline the size() method call, reducing the call to a simple instance field access.

Sissie answered 4/3, 2010 at 23:18 Comment(3)
Agreed on the System.out.println() call being the most expensive. I wonder what difference it would make if you used a StringBuilder to add those Hellos together and then print them using a single System.out.println() call outside the loop.Johannessen
@Thomas Pornin: yet the javac compiler does some optimization, like removing if ( cond ) {...} calls when, say, cond is a final boolean set to false. So it's not entirely true that javac has nothing to do with optimization.Pontianak
@cocotwo: the removal of if (false) {...} is part of the Java specification (code within the removed braces must not result in symbolic references to additional classes in the produced .class file) so I am not sure it can be called "optimization": it is not something that the compiler may or may not do, at its whim.Sissie
S
2

It can't optimize it, because mylist.size() could change during loop execution. Even if it's final, this just means that the reference is final (meaning you can't reassign myList to some other object), but methods on myList, such as remove() and add() are still available. Final does not make object immutable.

Stumper answered 4/3, 2010 at 23:18 Comment(1)
Good point! But depends on the situation. The examples provided, like duffymo answered, are too ambiguous.Caudad
C
1

The second one should be faster because the .size() does not have to be called every time the loop is performed. Its much faster to say 1+2=3 once than saying it many times.

Cluj answered 4/3, 2010 at 23:18 Comment(3)
@Spiraea how should that be possible? How should the compiler know whether the list is not changing in size? Can you provide a reference?Ezraezri
Look at the code, it cannot be changed. Also the inlining doesn't mean that, it means replacing the method call with a direct variable access.Spiraea
The size can still change even if a list is final.Acklin
S
1

It makes sense that the second implementation is faster, because you store a single, final, local copy of the variable. The compiler would have to figure out that the size can't change inside the loop in order for the performance to be roughly equivalent.

One question is -- does this kind of micro-optimization really matter? If it does, go with what is running faster in your tests and doesn't rely on a compiler optimization.

Slab answered 4/3, 2010 at 23:19 Comment(0)
P
1

Java compiler would have optimized it so, but didn't do so by seeing the funny condition. If you wrote it like this there would be no issue.

for (int i = myList.size(); i < 1000000; i--) {
    System.out.println("Hello");
}
Poohpooh answered 4/3, 2010 at 23:28 Comment(2)
You should do i >= 0 and i--, I guess.Caudad
I suspect you intended to count down to 0.Illdisposed
H
1

Almost certainly what you are seeing here is a difference in HotSpot inlining. With a simpler loop it is more likely to inline, and therefore get rid of all the redundant rubbish. It might do the same inlining, but do it earlier on or with less effort. Generally with Java microbenchmarks you should run the code multiple times, from which you can work out start up times, average times and deviations.

Himmler answered 4/3, 2010 at 23:46 Comment(0)
C
0

In cases of "compiler optimization", the best you can do is for-each loops:

for(final String x : myList) { ... }

Which lets the compiler provide the fastest implementation.

Edit:

The difference between your code examples is in the second argument of the for-loop. In the first example, the VM will do a method call (more expensive) and is thus slower (only significant when there are a lot of iterations). In your second example, the VM will do a stack pop (less expensive, and local variables are on the stack), and thus faster (only significant when there are a lot of iterations: for just one iteration, the first one is faster, in terms of memory usage).

Also: "Premature optimization is the root of all evil." Donald Knuth's infamous law.

Caudad answered 4/3, 2010 at 23:13 Comment(7)
In this case he isn't iterating over an array, though.Trinitrotoluene
That's not relevant here because the OP isn't actually iterating over a collection or array.Uund
But I guessed that was the intention of the question; why else did the questioner used an Array.asList? On the other hand: the for-each syntax can also be used for arrays.Caudad
The code Pindatjuh posted will work in this example, since we're using the size of an array as the number of times to loop.Evalyn
Actually, the for-each statement requires the compiler to use the iterator of myList (which must implement the Iterable interface). If the iterator is performing poorly, there's nothing the Java compiler can do to optimize the performance.Cabriolet
It's famous, not infamous; the latter has negative connotations and is used when someone is famous for doing something bad or undesirable. Although... Knuth is taken badly out of context for this, and has been used to poison the thinking of an entire generation of programmers, so maybe it is an infamous quote after all.Little
+1 for "Premature optimization is the root of all evil." Make the code optimally readable first; then measure; then only if measurement demonstrates it's too slow optimize. Which amounts to hardly ever, in most programmers' experience nowadays.Illdisposed
O
0

The difference is one method call less for each iteration, so the second version should run slightly faster. Although if you use Just-In-Time compiler, he may optimize that - figuring out that it doesn't change during the loop. Standard Java implementation features JIT, but not every Java implementation does.

Ogden answered 4/3, 2010 at 23:15 Comment(2)
Standard Java uses a just-in-time compiler.Carrel
I rather meant that not every Java implementation does that. Edited.Ogden
R
0

As always with such things, you're going to have to run them both to see which is faster given the implementation that you're using. However, the first one has the potential performance penalty of having to call size() every iteration, and a function call is more expensive than simply checking the variable directly. However, it's possible that that function call might be optimized away depending on your code and what the compiler does, so you'd have to run tests to see.

However, as was pointed out by Pindatjuh, it's better to use a foreach loop when you're going to be iterating over the whole collection like that. It should let the compiler optimize things better and is less error-prone.

Radiative answered 4/3, 2010 at 23:19 Comment(0)
A
0

With the last example you will not need to resolve the current size of the array so it will be slightly faster then the first example.

Just remember that this is only useful if you don't change the number of values in your array.

In Android it is recommended to use the latest example in there example, Designing for Performance. http://developer.android.com/guide/practices/design/performance.html#foreach

Annulus answered 4/3, 2010 at 23:29 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.