Scala performance question

Asked 4/2, 2011 at 14:28 Answered 10/2, 2011 at 21:1

Solved performance scala scala-collections

In the article written by Daniel Korzekwa, he said that the performance of following code:

list.map(e => e*2).filter(e => e>10)

is much worse than the iterative solution written using Java.

Can anyone explain why? And what is the best solution for such code in Scala (I hope it's not a Java iterative version which is Scala-fied)?

Traipse answered 4/2, 2011 at 14:28 Comment(0)

The reason that particular code is slow is because it's working on primitives but it's using generic operations, so the primitives have to be boxed. (This could be improved if List and its ancestors were specialized.) This will probably slow things down by a factor of 5 or so.

Also, algorithmically, those operations are somewhat expensive, because you make a whole list, and then make a whole new list throwing a few components of the intermediate list away. If you did it in one swoop, then you'd be better off. You could do something like:

list collect (case e if (e*2>10) => e*2)

but what if the calculation e*2 is really expensive? Then you could

(List[Int]() /: list)((ls,e) => { val x = e*2; if (x>10) x :: ls else ls }

except that this would appear backwards. (You could reverse it if need be, but that requires creating a new list, which again isn't ideal algorithmically.)

Of course, you have the same sort of algorithmic problems in Java if you're using a singly linked list--your new list will end up backwards, or you have to create it twice, first in reverse and then forwards, or you have to build it with (non-tail) recursion (which is easy in Scala, but inadvisable for this sort of thing in either language since you'll exhaust the stack), or you have to create a mutable list and then pretend afterwards that it's not mutable. (Which, incidentally, you can do in Scala also--see mutable.LinkedList.)

Bilge answered 4/2, 2011 at 15:2 Comment(3)

This answer gets the problem, but offers no nice solution. – Mychal 4/2, 2011 at 15:52

@Mychal - There isn't really one given the current state of the library. view/force is not going to save you when you're working with primitives. – Bilge 4/2, 2011 at 16:2

If e*2 were expensive, then the cost of having the intermediate step would diminish. Maybe memory issues, if you deal with large amounts of data. – Glenglencoe 5/2, 2011 at 9:54

Basically it's traversing a list twice. Once for multiplying every element with two. And then another time to filter the results.

Think of it as one while loop producing a LinkedList with the intermediate results. And then another loop applying the filter to produce the final results.

This should be faster:

list.view.map(e => e * 2).filter(e => e > 10).force

Seraphine answered 4/2, 2011 at 14:36 Comment(2)

This answer misses the point, although the solution happens to be correct. – Mychal 4/2, 2011 at 15:51

I could be clever now and point out that nowhere in the sample code it says it involves Ints. But I have to admit that the answer would have been way better if it would have mentioned that there is quite a bit of boxing and unboxing going on. – Seraphine 4/2, 2011 at 16:9

The solution lies mostly with JVM. Though Scala has a workaround in the figure of @specialization, that increases the size of any specialized class hugely, and only solves half the problem -- the other half being the creation of temporary objects.

The JVM actually does a good job optimizing a lot of it, or the performance would be even more terrible, but Java does not require the optimizations that Scala does, so JVM does not provide them. I expect that to change to some extent with the introduction of SAM not-real-closures in Java.

But, in the end, it comes down to balancing the needs. The same while loop that Java and Scala do so much faster than Scala's function equivalent can be done faster yet in C. Yet, despite what the microbenchmarks tell us, people use Java.

Ablate answered 4/2, 2011 at 20:46 Comment(0)

Scala approach is much more abstract and generic. Therefore it is hard to optimize every single case.

I could imagine that HotSpot JIT compiler might apply stream- and loop-fusion to the code in the future if it sees that the immediate results are not used.

Additionally the Java code just does much more.

If you really just want to mutate over a datastructure, consider transform. It looks a bit like map but doesn't create a new collection, e. g.:

val array = Array(1,2,3,4,5,6,7,8,9,10).transform(_ * 2)
// array is now WrappedArray(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)

I really hope some additional in-place operations will be added ion the future...

Dirac answered 4/2, 2011 at 16:38 Comment(0)

To avoid traversing the list twice, I think the for syntax is a nice option here:

val list2 = for(v <- list1; e = v * 2; if e > 10) yield e

Exemplary answered 4/2, 2011 at 18:33 Comment(0)

Rex Kerr correctly states the major problem: Operating on immutable lists, the stated piece of code creates intermediate lists in memory. Note that this is not necessarily slower than equivalent Java code; you just never use immutable datastructures in Java.

Wilfried Springer has a nice, Scala idomatic solution. Using view, no (manipulated) copies of the whole list are created.

Note that using view might not always be ideal. For example, if your first call is filter that is expected to throw away most of the list, is might be worthwhile to create the shorter version explicitly and use view only after that in order to improve memory locality for later iterations.

Mychal answered 4/2, 2011 at 15:56 Comment(3)

This is not an answer. It is a comment about two other real answers. – Sergeant 5/2, 2011 at 2:9

Oh, and maybe you just never use immutable datastructures in Java. I actually do use them a lot when appropriate. – Sergeant 5/2, 2011 at 2:13

1)I found both referenced answers lacking individually, so I decided to post a joint answer. The fact that I give proper references does not make this a comment. 2) The standard Java collections framework seems to be quite short on immutable implementations, providing only an unmodifiable (!= immutable) wrapper. Maybe that's why. – Mychal 5/2, 2011 at 12:38

list.filter(e => e*2>10).map(e => e*2)

This attempt reduces first the List. So the second traversing is on less elements.

Judicature answered 10/2, 2011 at 21:1 Comment(0)

Recommended topics

Hot tags