Scala: Parallel collection in object initializer causes a program to hang

I've just noticed a disturbing behavior. Let's say I have a standalone program consisting of a sole object:

object ParCollectionInInitializerTest {
  def doSomething(): Unit = { println("Doing something") }

  // This loop is part of the object's initializer (its constructor body)
  for (i <- (1 to 2).par) {
    println("Inside loop: " + i)
    doSomething()
  }

  def main(args: Array[String]): Unit = {
  }
}

The program looks perfectly innocent and, when the range used in the for loop is not a parallel one, it executes properly and produces the following output:

Inside loop: 1
Doing something
Inside loop: 2
Doing something

Unfortunately, when using the parallel collection, the program just hangs without ever invoking the doSomething method, so the output is as follows:

Inside loop: 2
Inside loop: 1

And then the program hangs.
Is this just a nasty bug? I'm using Scala 2.10.

Fag answered 2/3, 2013 at 15:36 Comment(1)
related: #27550171 - Humanity

This is an inherent problem in Scala that occurs when a reference to a singleton object is released before its construction is complete. Here, a different thread tries to access the object ParCollectionInInitializerTest before it has been fully constructed. It has nothing to do with the main method itself; it has to do with initializing the object that contains the main method. Try this in the REPL: type in the expression ParCollectionInInitializerTest and you'll get the same result. It also has nothing to do with fork-join worker threads being daemon threads.

Singleton objects are initialized lazily, and every singleton object can be initialized only once. That means that the first thread that accesses the object (in your case, the main thread) must acquire the object's initialization lock and then initialize it. Every thread that arrives afterwards must wait for the main thread to finish initializing the object and release the lock. This is how singleton objects are implemented in Scala.
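To see the lazy, once-only initialization for yourself, here is a minimal sketch. The names LazyInitDemo, Config and initCount are made up for this illustration and are not part of the question's code.

```scala
object LazyInitDemo {
  // Hypothetical counter: records how many times the body of Config has run.
  var initCount: Int = 0

  object Config {
    // This is the initializer of Config; it runs on first access only.
    initCount += 1
    println("Config initializer running")
    val value: Int = 42
  }

  def main(args: Array[String]): Unit = {
    println("main started, initCount = " + initCount)
    println(Config.value) // first access triggers the initializer
    println(Config.value) // second access does not run it again
    println("initCount = " + initCount)
  }
}
```

Running main prints "main started, initCount = 0" first, then "Config initializer running" once, then 42 twice, and finally "initCount = 1".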

In your case the parallel collection worker thread tries to access the singleton object to invoke doSomething, but cannot do so until the main thread finishes initializing the object, so it waits. Meanwhile, the main thread waits in the constructor until the parallel operation completes, which in turn requires all the worker threads to complete. The main thread holds the initialization lock for the singleton the whole time. Hence, a deadlock occurs.
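If you want to observe the blocked worker without actually hanging your program, a sketch along these lines should work when compiled as a standalone program. The names Holder, touch and workerWasBlocked are hypothetical, and the 500 ms timeout is an arbitrary choice so that the initializer eventually gives up waiting.

```scala
object Holder {
  def touch(): String = "touched"

  // The worker needs the fully initialized Holder in order to call touch().
  private val worker = new Thread(new Runnable {
    def run(): Unit = { Holder.touch(); () }
  })
  worker.start()

  // Unlike a plain join(), this wait is bounded, so no real deadlock occurs.
  worker.join(500)

  // Still alive after the timeout means the worker was waiting for the
  // initializer (this code) to finish.
  val workerWasBlocked: Boolean = worker.isAlive
}

object InitDeadlockDemo {
  def main(args: Array[String]): Unit = {
    println("worker blocked during initialization: " + Holder.workerWasBlocked)
  }
}
```

Once the initializer returns, the initialization lock is released and the worker runs to completion. With an unbounded join() in place of join(500), the initializer would never return, which is exactly the hang you are seeing.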

You can cause this behaviour with futures from 2.10, or with mere threads, as shown below:

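// Runs body on a new thread and blocks until that thread finishes
def execute(body: => Unit): Unit = {
  val t = new Thread() {
    override def run(): Unit = {
      body
    }
  }

  t.start()
  t.join()
}

object ParCollection {

  def doSomething(): Unit = { println("Doing something") }

  // Part of the initializer: the new thread needs ParCollection,
  // while this thread holds the initialization lock and waits in join()
  execute {
    doSomething()
  }

}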

Paste this into the REPL, and then write:

scala> ParCollection

and the REPL hangs.
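One way to sidestep the problem, sketched here under the assumption that the work does not have to run during initialization, is to move the blocking call out of the object body and into a method. The names ParCollectionFixed, runWork and calls are made up for this example.

```scala
object ParCollectionFixed {
  // Hypothetical counter so the effect of the worker thread can be checked.
  var calls: Int = 0

  // Same helper as above: runs body on a new thread and waits for it.
  def execute(body: => Unit): Unit = {
    val t = new Thread(new Runnable {
      def run(): Unit = body
    })
    t.start()
    t.join()
  }

  def doSomething(): Unit = {
    calls += 1
    println("Doing something")
  }

  // Not part of the initializer: this only runs when it is called, by which
  // time the object is fully initialized and the lock has been released.
  def runWork(): Unit = execute {
    doSomething()
  }

  def main(args: Array[String]): Unit = runWork()
}
```

Here the initializer contains nothing that blocks, so the worker thread finds the object already initialized when it calls doSomething.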

Sheedy answered 2/3, 2013 at 15:59 Comment(2)
Concurrent blocking execution and lazy initialization don't play nicely together. This is a more general problem in Scala (and Java, for that matter). See this SIP: docs.scala-lang.org/sips/pending/… - Sheedy
Not meant as retort, but I do think this is a pitfall for developers. I am very grateful for your initial answer and comments, extremely so. I believe the answer was crystal clear to me by the way. - En
