Why are so few things @specialized in Scala's standard library?
Asked Answered
S

3

43

I've searched for the use of @specialized in the source code of the standard library of Scala 2.8.1. It looks like only a handful of traits and classes use this annotation: Function0, Function1, Function2, Tuple1, Tuple2, Product1, Product2, AbstractFunction0, AbstractFunction1, AbstractFunction2.

None of the collection classes are @specialized. Why not? Would this generate too many classes?

This means that using collection classes with primitive types is very inefficient, because there will be a lot of unnecessary boxing and unboxing going on.

What's the most efficient way to have an immutable list or sequence (with IndexedSeq characteristics) of Ints, avoiding boxing and unboxing?

Stillage answered 29/3, 2011 at 19:14 Comment(2)
Wow, cheez, i assumed that List and IndexedSeq are specialized. Doing some :javap -c with the new scala 2.9 REPL shows they still do boxing all the time. The constructors are optimized though it seems (List(1,2,3) uses a wrapIntArray).Johannejohannes
That's really unwanted overhead when dealing with huge datasets of numbers.Kennedy
M
18

Specialization has a high cost on the size of classes, so it must be added with careful consideration. In the particular case of collections, I imagine the impact will be huge.

Still, it is an on-going effort -- Scala library has barely started to be specialized.

Mun answered 29/3, 2011 at 21:39 Comment(4)
This is correct. Video from Scala Days 2010 on @Specialized, where Iulian Dragos explains why it's not everywhere: days2010.scala-lang.org/node/138/151Trappist
I've also grepped through the source of the Scala 2.9 RC1 library and unfortunately it looks like nothing has changed yet!Stillage
@Daniel Specialization has a high cost on the size of classes, so it must be added with careful consideration As a naive, non-compiler engineer, why does the size of classes matter? I am not making an argument, but looking to learn why.Waggery
@KevinMeredith Well, for one thing, up to Java 7, it was limited to 64 KB. It can also present problems when coding for Android.Mun
O
17

Specialized can be expensive ( exponential ) in both size of classes and compile time. Its not just the size like the accepted answer says.

Open your scala REPL and type this.

import scala.{specialized => sp}
trait S1[@sp A, @sp B, @sp C, @sp D] { def f(p1:A): Unit }

Sorry :-). Its like a compiler bomb.

Now, lets take a simple trait

trait Foo[Int]{ }

The above will result in two compiled classes. Foo, the pure interface and Foo$1, the class implementation.

Now,

trait Foo[@specialized A] { }

A specialized template parameter here gets expanded/rewritten for 9 different primitive types ( void, boolean, byte, char, int, long, short, double, float ). So, basically you end up with 20 classes instead of 2.

Going back to the trait with 5 specialized template parameters, the classes get generated for every combination of possible primitive types. i.e its exponential in complexity.

2 * 10 ^ (no of specialized parameters)

If you are defining a class for a specific primitive type, you should be more explicit about it such as

trait Foo[@specialized(Int) A, @specialized(Int,Double) B] { }

Understandably one has to be frugal using specialized when building general purpose libraries.

Here is Paul Phillips ranting about it.

Oder answered 8/3, 2013 at 17:57 Comment(2)
But you don't always need to specialze for all eight primitive types plus void; you can also just specialize for Int, for example, using @specialized(Int).Stillage
yes, thats right. Added more notes to my answer. But, thats the case when you are doing something specific to a data type. Building general purpose libraries like the the scala language api needs to be more general purpose.Oder
S
6

Partial answer to my own question: I can wrap an array in an IndexedSeq like this:

import scala.collection.immutable.IndexedSeq

def arrayToIndexedSeq[@specialized(Int) T](array: Array[T]): IndexedSeq[T] = new IndexedSeq[T] {
  def apply(idx: Int): T = array(idx)
  def length: Int = array.length
}

(Ofcourse you could still modify the contents if you have access to the underlying array, but I would make sure that the array isn't passed to other parts of my program).

Stillage answered 29/3, 2011 at 19:19 Comment(5)
Since you are hiding behind IndexedSeq that still doesn't convince the compiler not to use autoboxing. For instance, if i run the :javap -c on the following: object Test { val a = arrayToIndexedSeq( Array(1,2,3)); val i: Int = a(0)} -- you will see that val i = a(0) does boxing (in SeqLike)....Johannejohannes
@Sciss Ok... grrr! Is there another way to get rid of the unnecessary autoboxing? Is directly using mutable arrays the only way to avoid it?Stillage
I don't know. I guess we need to make a petition to add more specializations to the collections. Martin Odersky kind of promised that here: scala-lang.org/node/7285 ; the problem i guess is due to the heavy factoring out of traits in the new collections that if you want Vector to be specialized, you will end up having to specialize a dozen involved traits, which in turn creates high bloat on the size of the library. Still, I think we should vote for that, it definitely sucks to have weak performance for Vector which is really the number crunching swiss army knife in scala.Johannejohannes
@Sciss I've carefully looked at the output of scalac -print ... and following the sequence of methods being called, it looks like it actually does avoid boxing when I call apply() on it. (Using scala 2.8.1). There are some calls to scala.Int.box() but those are not called for that code path.Stillage
Your collection however does have one advantage: it does not box stored elements, and that's a huge memory saving if the collection is (moderately) large. An integer is 4 bytes, boxing will probably add at least other 16 bytes to that (for object headers and the pointers to objects). When you extract one int, only that is boxed.Cherri

© 2022 - 2024 — McMap. All rights reserved.