I'm curious if for..in should be preferred to .each for performance reasons.
for..in is part of the language's standard flow control, whereas each calls a closure, which adds extra overhead.
.each {...} is syntactic sugar for the method call .each({...}).
Moreover, because the body is a closure, you can't use break and continue statements inside an each code block to control the loop.
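
A minimal sketch of the usual workarounds, assuming a plain list (the names numbers, odds, seen and doubled are made up for illustration). Inside the closure, return ends only the current invocation, so it behaves like continue; to stop early, a method such as find can be used in place of each.

```groovy
def numbers = [1, 2, 3, 4, 5, 6]

// 'return' leaves only the current closure call, so it acts like 'continue'.
def odds = []
numbers.each { n ->
    if (n % 2 == 0) return   // skip even numbers
    odds << n
}

// 'find' stops at the first element for which the closure returns true,
// which gives the effect of 'break'.
def seen = []
numbers.find { n ->
    seen << n
    n >= 3                   // stop once 3 is reached
}

// The explicit method-call form behaves the same as the trailing-closure form.
def doubled = []
numbers.each({ n -> doubled << n * 2 })

println odds
println seen
println doubled
```

When the loop genuinely needs break or continue, for-in remains the more direct choice.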
http://kunaldabir.blogspot.it/2011/07/groovy-performance-iterating-with.html
Updated benchmark Java 1.8.0_45 Groovy 2.4.3:
- 3327981 each {}
- 320949 for(){
Here is another benchmark with 100000 iterations:
lines = (1..100000)
// with list.each {}
start = System.nanoTime()
lines.each { line->
line++;
}
println System.nanoTime() - start
// with loop over list
start = System.nanoTime()
for (line in lines){
line++;
}
println System.nanoTime() - start
results:
- 261062715 each{}
- 64518703 for(){}
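
Single-shot timings like the ones above also include one-time start-up costs, so the first measurement tends to look worse than steady-state behaviour. A hedged sketch of a slightly fairer comparison follows; bestOf is a hypothetical helper written for this example, not a Groovy API, and it simply repeats the work and keeps the fastest round. It is still only a rough measurement, not a rigorous benchmark.

```groovy
// Hypothetical helper: runs the given work several times and returns the
// fastest elapsed time in nanoseconds, so warm-up rounds are discounted.
def bestOf(int rounds, Closure work) {
    long best = Long.MAX_VALUE
    for (int i = 0; i < rounds; i++) {
        long start = System.nanoTime()
        work()
        best = Math.min(best, System.nanoTime() - start)
    }
    best
}

def lines = (1..100000).toList()

// Iterate with each {} and accumulate a sum so the body does real work.
long eachTime = bestOf(10) {
    long sum = 0
    lines.each { line -> sum += line }
}

// Same work with for-in.
long forTime = bestOf(10) {
    long sum = 0
    for (line in lines) { sum += line }
}

println "each {} : ${eachTime} ns"
println "for-in  : ${forTime} ns"
```

The absolute numbers will vary by machine, JVM, and Groovy version, so only the relative difference is worth reading.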
break and continue inside of a closure passed to the each method. For example, if you had a loop or a switch statement inside of the closure. – Stairs
break and continue don't work there, that demonstrates a misunderstanding of what is happening when a closure is passed as an argument to the each method. If you wrote a method that contained a continue and you invoked that method from inside of a loop, what would you expect that continue to mean? That is similar to what is happening when continue is used inside of a closure that is passed to the each method. – Stairs
Let's have a theoretical look at things in terms of which calls are done dynamically and which calls are done more directly with Java logic (I will call those static calls).
In the case of for-in, Groovy operates on an Iterator; to get it, we have one dynamic call to iterator(). If I am not mistaken, the hasNext and next calls are done using normal Java method call logic. Thus for each iteration we have only 2 static calls here. Depending on the benchmark, it has to be noted that this first iterator() call can cause serious initialization times, since it may initialize the meta class system, and that takes a moment.
In the case of each, we have the dynamic call to each itself, as well as the object creation for the open block (an instance of Closure). each(Closure) will then call iterator() as well, but uncached... well, all one-time costs. During the loop, hasNext and next are done using Java logic, which makes 2 static calls. The call into the Closure instance is done with standard Java logic for the method call, which will then invoke doCall using a dynamic call.
To sum it up, per iteration for-in uses only 2 static calls, while each has 3 static calls and 1 dynamic call. The dynamic call is much slower than multiple static calls and much more difficult for the JVM to optimize, thus dominating the timing. As a result, each should always be slower, as long as the open block requires the dynamic call.
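
The call pattern described above can be approximated by hand. This is only a rough illustration: the real for-in is compiled by Groovy and the real each lives in the Groovy runtime, so the two loops below are not the actual implementation, and all variable names are made up.

```groovy
def items = [10, 20, 30]

// Roughly what for-in amounts to: obtain an Iterator once, then call
// hasNext() and next() for every element.
def forInTotal = 0
Iterator forInIter = items.iterator()
while (forInIter.hasNext()) {
    def item = forInIter.next()
    forInTotal += item            // loop body runs inline
}

// Roughly what each amounts to: a Closure object is created first, and
// every element costs an extra call into that closure on top of
// hasNext() and next().
def eachTotal = 0
Closure body = { item -> eachTotal += item }
Iterator eachIter = items.iterator()
while (eachIter.hasNext()) {
    body.call(eachIter.next())    // extra per-element call into the closure
}

println forInTotal
println eachTotal
```

Both loops compute the same total; the difference is the additional closure invocation per element in the second one.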
Because of the complicated logic for Closure#call, it is difficult to optimize the dynamic call away. And that is annoying, because it is not really needed and will be removed as soon as we find a workaround. Should we ever succeed in that, each might still be slower, but it is a much more difficult thing, since bytecode sizes and invocation profiles play a role here. In theory they could be equal then (ignoring the init time), but the JVM has much more work to do. Of course the same applies to, for example, stream-based lambda processing in Java 8,
each to for-in is probably the last on the list of optimization techniques. Just saying :-) Or to put it another way, one unnecessary DB call can equal 1000s of optimized for-ins. – Kilgore