What is the difference between Seq and IndexedSeq/LinearSeq in Scala?
Asked Answered
L

2

14

In Scala Collection documentation, there is some clue to this question:

Trait Seq has two subtraits LinearSeq, and IndexedSeq. These do not add any new operations, but each offers different performance characteristics: A linear sequence has efficient head and tail operations, whereas an indexed sequence has efficient apply, length, and (if mutable) update operations.

But this does not address me when to use IndexedSeq instead of Seq? I need some real example of IndexedSeq or LinearSeq where these collections do better than Seq.

Lonee answered 18/3, 2016 at 18:22 Comment(0)
C
21

Seq is the super-trait, so it's more generic, it has the characteristics common to all sequences, both Linear and Indexed.

If you're wondering what kind of sequence is created by the Seq.apply method in the companion object of Seq, we can have a look at the implementation.

Keep in mind that if you use Seq.apply, you're implying that you just need a Seq and that your code doesn't care if it's Linear or Indexed

The tl;dr answer is then: you use LinearSeq or IndexedSeq when you need to have a certain performance characteristics, you use the more general Seq when you don't care about the difference

This is the companion object of Seq:

object Seq extends SeqFactory[Seq] {
  implicit def canBuildFrom[A]: CanBuildFrom[Coll, A, Seq[A]] = ReusableCBF.asInstanceOf[GenericCanBuildFrom[A]]

  def newBuilder[A]: Builder[A, Seq[A]] = immutable.Seq.newBuilder[A]
}

The newBuilder[A] method is what's used to build the Seq, as you can verify in the Seq.apply method (defined on the trait GenericCompanion):

def apply[A](elems: A*): CC[A] = {
   if (elems.isEmpty) empty[A]
   else {
     val b = newBuilder[A]
     b ++= elems
     b.result()
   }
 }

Now the question is: what does an immutable.Seq.newBuilder[A] build? Let's go see, this time on the immutable.Seq companion object:

object Seq extends SeqFactory[Seq] {
// stuff
  def newBuilder[A]: Builder[A, Seq[A]] = new mutable.ListBuffer
}

It builds a mutable ListBuffer! Why is that? That's because a mutable.ListBuffer also happens to be a Builder[A, Seq[A]], that is a class that is used by the collection library to build new collections.

The actual output collection comes from this line (as you can see above):

b.result()

So, what's the return type of the ListBuffer.result()? Let's go see in the ListBuffer:

// Implementation of abstract method in Builder
def result: List[A] = toList

here you go: It's a list.

Seq(1,2,3) returns a List[Int] under hood, but the whole point here is that if you're using Seq(), you don't need to know what kind of collection you have because you're implying that the more abstract interface is enough for your needs

Canned answered 18/3, 2016 at 18:38 Comment(1)
Just to add - if you use IndexedSeq.apply (factory method), it will construct a Vector for you. If you use Seq.apply or LinearSeq.apply, it will give you a List. Since in many cases a Vector is preferred, you should use IndexedSeq.apply when in doubt.Impolitic
R
13

Seq is simply super trait of IndexedSeq and LinearSeq. Seq is ordered list, as opposed to Set, which is unique & unordered usually, and as opposed to Map, which is key value pair.
Now comes IndexedSeq vs LinearSeq.
IndexedSeq subclasses i.e. Vector, Range, String, are all fast index based access, and usually, fast updates. This is possible because they use data structures suitable for this like array internally (like in String) or a tree (actually a 32 fan out tree used by Vector.. for more details see this), or Range, which is just three numbers.. begin, end, step.. from which calculations of index are straightforward.
As opposed to this, LinearSeq are all slow index based and access, and slow in updates. Usually because they have a linked list kind of structure. Prepend is fast for all of them i.e. Queue, Stack, List.. but append is notoriously slow as it has to go to the edge of the internal linked list structure.. except for queue which uses a trick with it's own issues. The trick is described here.
So broadly, IndexedSeq are those data structures that have fast index based access and update, and LinearSeq are those data structures that have fast prepend.

Radiobroadcast answered 27/6, 2017 at 17:41 Comment(1)
LinearSeq also has fast head and tail ops.Disdain

© 2022 - 2024 — McMap. All rights reserved.