Groovy: Is for..in significantly faster than .each?

Asked 5/6, 2015 at 8:38 Answered 6/6, 2015 at 6:18

I'm curious if for..in should be preferred to .each for performance reasons.

Haug answered 5/6, 2015 at 8:38 Comment(7)

What have you tried so far? Note that JMH (openjdk.java.net/projects/code-tools/jmh) has a groovy template which makes it easy (but not trivial) to create a Benchmark in groovy. Be sure to return with your findings! – Inconsiderate 5/6, 2015 at 8:46

@Inconsiderate thanks for mentioning jmh! I'll give it a shot. – Haug 5/6, 2015 at 8:55

For me changing all each to for-in is probably the last on the list of optimization techniques. Just saying :-) Or to put it other way, one unnecessary DB call can equal to 1000s of optimized for-ins. – Kilgore 5/6, 2015 at 9:4

Is this really the bottleneck in your application? – Hammel 5/6, 2015 at 9:5

Related: #24010208 – Hemorrhoidectomy 5/6, 2015 at 10:50

Does it (really) matter? – Kyle 5/6, 2015 at 11:3

@Kilgore I asked this out of curiosity. I'm learning stuff as a consequence :) – Haug 5/6, 2015 at 14:12

For .. in is part of the standard language flow control.

Instead each calls a closure so with extra overhead.

.each {...} is syntax sugar equivalent to the method call .each({...})

Moreover, due to the fact that it is a closure, inside each code block you can't use break and continue statements to control the loop.

http://kunaldabir.blogspot.it/2011/07/groovy-performance-iterating-with.html

Updated benchmark Java 1.8.0_45 Groovy 2.4.3:

3327981 each {}
320949 for(){

Here is another benchmark with 100000 iterations:

lines = (1..100000)
// with list.each {}
start = System.nanoTime()
lines.each { line->
    line++;
}
println System.nanoTime() - start

// with loop over list
start = System.nanoTime()
for (line in lines){
    line++;
}
println System.nanoTime() - start

results:

261062715 each{}
64518703 for(){}

Heigho answered 5/6, 2015 at 9:9 Comment(7)

you are right but it's linked a gist, everyone can run the benchmark again. I just do it with java8 and the result is: 3056480 / 8181870 / 3327981 / 320949 – Heigho 5/6, 2015 at 10:1

How long is the list? – Cowgill 5/6, 2015 at 10:31

the list size is 66 , quite short. – Heigho 5/6, 2015 at 10:35

"inside each code block you can't use break and continue statements" - That is wrong. There is an issue of understanding the language and understanding why those don't do what some people think they do inside of the closure in that context, but it is perfectly legal and there are valid reason to use break and continue inside of a closure passed to the each method. For example, if you had a loop or a switch statement inside of the closure. – Stairs 7/6, 2015 at 16:27

Most often, when someone concludes that break and continue don't work there, that demonstrates a misunderstanding of what is happening when a closure is passed as an argument to the each method. If you wrote a method that contained a continue and you invoked that method from inside of a loop, what would you expect that continue to mean? That is similar to what is happening when continue is used inside of a closure that is passed to the each method. – Stairs 7/6, 2015 at 16:35

Obviously I mean that is not possible to use break and continue to modify the loop with a closure as the inner code block like .each{} Obviously in the closure we can create loops and use break and continue. – Heigho 7/6, 2015 at 16:49

To be clear IMHO the fact that we can't use break and continue in loop with closure block code is very good. It enforce the use of this constructs in the correct way: lambda calculus way. – Heigho 7/6, 2015 at 17:20

Let's have a theoretical look at things in terms of what calls are done dynamic and what calls are done more directly with Java logic (I will call those static calls).

In case of for-in, Groovy operates on an Iterator, to get it, we have one dynamic call to iterator(). If I am not mistaken, the hasNext and next calls are done using normal Java method call logic. Thus for each iteration we have here 2 static calls only. Depending on the benchmark it has to be noted that that first iterator() call can cause serious initialization times, since that may init the meta class system and that takes a moment.

In case of each we have the dynamic call to each itself, as well as the object creation for the open block (instance of Closure). each(Closure) will then call iterator() as well, but uncached... well all one-time cost. During the loop hasNext and next are done using Java logic that makes 2 static calls. The call into Closure instance is done with java standard logic for the method call, which then will invoke doCall using a dynamic call.

To sum it up, per iteration for-in uses only 2 static calls, while each has 3 static and 1 dynamic call. The dynamic call is much slower than multiple static calls and much more difficult to optimize for the JVM, thus dominating the timing. As a result of this each should always be slower, as long as the open block requires the dynamic call.

Because of the complicated logic for Closure#call it is difficult to optimize the dynamic call away. And that is annoying, because it is not really needed and will be removed as soon as we find a workaround. Should we ever succeed in that, each might still be slower, but it is a much more difficult thing, since bytecode sizes and invocation profiles play a role here. In theory they could be equal then (ignoring the init time), but the JVM has much more work to do. Of course the same applies for for examples stream based lambda processing in Java8,

Jambeau answered 6/6, 2015 at 6:18 Comment(3)

In Java8 forEach with closure seems fastest than standard for loop. gist.github.com/frhack/5db0485f9847e6b673be – Heigho 8/6, 2015 at 17:4

@Heigho you have to be careful with microbenchmarks like that. you have to give it a proper warmup time and best is to test only one construct at the same time. I am positive the times in both are very similar if tested with those constraints. But there are of course differences in code size. The forEach version has the for-each method plus the lambda that has to be optimized. The normal for-loop version has the loop body, but also the method the loop body is in to optimize. Which is why it is not good to have the loop plain in the test with everything else. – Jambeau 10/6, 2015 at 4:24

It results in a big codesize for the part to optimize and can bring the compiler to not to do that. In the end the forEach does the same logic steps as the normal loop, but with more method calls in between. So it takes longer to optimize... but in the end both should be equal. Everything else would mean wasted potential to me. There is for me no technical reason your example should behave all that different – Jambeau 10/6, 2015 at 4:27

Recommended topics

Hot tags