Asynchronous IO in Scala with futures
Let's say I'm getting a (potentially big) list of images to download from some URLs. I'm using Scala, so this is what I would do:

import scala.actors.Futures._

// Retrieve URLs from somewhere
val urls: List[String] = ...

// Download image (blocking operation)
val fimages: List[Future[...]] = urls.map (url => future { download(url) })

// Do something (display) when complete
fimages.foreach (_.foreach (display _))

I'm a bit new to Scala, so this still looks a little like magic to me:

  • Is this the right way to do it? Any alternatives if it is not?
  • If I have 100 images to download, will this create 100 threads at once, or will it use a thread pool?
  • Will the last instruction (display _) be executed on the main thread, and if not, how can I make sure it is?

Thanks for your advice!

Peoria answered 27/10, 2012 at 6:13 Comment(0)
S
137

Use Futures in Scala 2.10. They were joint work between the Scala team, the Akka team, and Twitter to reach a more standardized future API and implementation for use across frameworks. We just published a guide at: http://docs.scala-lang.org/overviews/core/futures.html

Beyond being completely non-blocking (by default, though we provide the ability to do managed blocking operations) and composable, Scala 2.10's futures come with an implicit thread pool to execute your tasks on, as well as some utilities to manage timeouts.

import scala.concurrent.{future, blocking, Future, Await}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Retrieve URLs from somewhere
val urls: List[String] = ...

// Download image (blocking operation)
val imagesFuts: List[Future[...]] = urls.map {
  url => future { blocking { download(url) } }
}

// Do something (display) when complete
val futImages: Future[List[...]] = Future.sequence(imagesFuts)
Await.result(futImages, 10.seconds).foreach(display)

Above, we first import a number of things:

  • future: API for creating a future.
  • blocking: API for managed blocking.
  • Future: Future companion object which contains a number of useful methods for collections of futures.
  • Await: singleton object used for blocking on a future (transferring its result to the current thread).
  • ExecutionContext.Implicits.global: the default global thread pool, a ForkJoin pool.
  • duration._: utilities for managing durations for timeouts.

imagesFuts remains largely the same as what you originally did. The only difference is that we wrap the download in blocking, which marks it as managed blocking. It notifies the thread pool that the block of code you pass to it contains long-running or blocking operations. This allows the pool to temporarily spawn new workers so that the workers are never all blocked at once, which prevents starvation (locking up the thread pool) in blocking applications. The thread pool also knows when the code in a managed blocking block completes, so it removes the spare worker thread at that point and shrinks back down to its expected size.
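As a minimal sketch of the managed-blocking pattern described above: the code below uses Future { ... } (Future.apply) rather than the lowercase future helper. slowDownload is a hypothetical stand-in for the real blocking download, and the 30-second timeout is an arbitrary assumption.

```scala
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ManagedBlockingSketch {
  // Hypothetical stand-in for a blocking download: sleeps, then returns a fake payload.
  def slowDownload(url: String): Array[Byte] = {
    Thread.sleep(200)
    url.getBytes("UTF-8")
  }

  // Starts one future per URL; `blocking` tells the pool the body may block,
  // so it can add temporary workers instead of starving.
  def downloadAll(urls: List[String]): List[Array[Byte]] = {
    val futures = urls.map(url => Future { blocking { slowDownload(url) } })
    // Waiting here is only for the sketch; the timeout value is an assumption.
    Await.result(Future.sequence(futures), 30.seconds)
  }
}
```

Each future starts running as soon as it is created inside map, not when a callback is attached later.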

(If you want to absolutely prevent additional threads from ever being created, then you ought to use an AsyncIO library, such as Java's NIO library.)

Then we use the collection methods of the Future companion object to convert imagesFuts from List[Future[...]] to a Future[List[...]].
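To make the conversion concrete, here is a small sketch. Future.sequence keeps the input order, and the combined future fails if any element fails. The settle helper is a hypothetical addition (not part of the answer) that wraps each outcome in a Try so one failed download does not discard the others.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

object SequenceSketch {
  // List[Future[A]] => Future[List[A]]; results keep the order of the input list.
  def collect[A](fs: List[Future[A]]): Future[List[A]] =
    Future.sequence(fs)

  // Hypothetical helper: capture each outcome as a Try so the batch always completes.
  def settle[A](fs: List[Future[A]]): Future[List[Try[A]]] =
    Future.sequence(
      fs.map(f => f.map[Try[A]](Success(_)).recover { case e => Failure(e) })
    )
}
```

With settle, the caller can display the successful images and report the failed ones separately.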

The Await object is how we ensure that display is executed on the calling thread: Await.result simply forces the current thread to wait until the future passed to it is completed. (This uses managed blocking internally.)
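Await.result throws a TimeoutException if the future is not complete within the given duration. The sketch below shows one way to handle that; resultWithin is a hypothetical helper name, and returning None on timeout is just one possible policy.

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._

object AwaitSketch {
  // Blocks the calling thread for at most `limit`; None means the future was not ready in time.
  def resultWithin[A](f: Future[A], limit: FiniteDuration): Option[A] =
    try Some(Await.result(f, limit))
    catch { case _: TimeoutException => None }
}
```

If the future itself failed, Await.result rethrows that failure, so this helper only hides timeouts.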

Sis answered 27/10, 2012 at 11:4 Comment(11)
Thanks for the in-depth answer! If I understand correctly, if you don't specify "blocking", then the thread pool can potentially run out of workers and block forever if every worker stays busy indefinitely? Also, can I create my own ExecutionContext to force the completion callback (but not the actual background process, of course) to be executed asynchronously on a specific thread (i.e. the UI thread, using a framework-specific method)? - Peoria
Technically: Avoid blocking at all costs. Only do blocking if you have no other choice. - Brakeman
In my understanding, network calls are blocking, are they not? If each network call has a timeout, would that count as blocking as well? - Peoria
Network calls would technically not be blocking, if you were to use an AsyncIO library like Java's NIO. - Sis
For Futures, there is no reason for multiple futures with timeouts not to be handled in a non-blocking way. - Sis
But to answer your earlier question: yes, the default FJPool can run out of workers if all threads block and you do not use managed blocking. And yes, you can create your own ExecutionContext, using Swing's invokeLater, for example, and explicitly pass that to the foreach on futImages instead of using Await.result. - Sis
Thanks, that's what I wanted to know ;) I'll play around with all that stuff and see what I can do with it! - Peoria
I'm a noob in this and I have a question: when do the imagesFuts futures start executing? I suppose not in the map command, because you are attaching the "listeners" after this starts executing? - Evangelist
How does the stuff behind blocking actually work? Does it have its own thread pool, or does it simply create a new thread when we submit a task via blocking? - Nap
Why are you using blocking in url => future { blocking { download(url) } }? Why not just use url => future { download(url) }? - Octagonal
By the way, the implicit managed blocking solution for Await isn't cool in practice. It allowed a very bad and stupid architectural solution (with blocking inside actors) in our project and caused serious performance leaks - #28045471. As a result, our routing framework worked only with an FJ pool and was creating about 5-6 threads per new message (under high load). And thanks to FJ, the guy (architect) who did it doesn't even remember his mistake, as it was so easy to make. - Stites
S
5
val all = Future.traverse(urls){ url =>
  val f = future(download(url)) /*(downloadContext)*/
  f.onComplete(display)(displayContext)
  f
}
Await.result(all, ...)
  1. Use scala.concurrent.Future in 2.10, which is RC now.
  2. which uses an implicit ExecutionContext
  3. The new Future doc is explicit that onComplete (and foreach) may evaluate immediately if the value is available. The old actors Future does the same thing. Depending on what your requirement is for display, you can supply a suitable ExecutionContext (for instance, a single thread executor). If you just want the main thread to wait for loading to complete, traverse gives you a future to await on.
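As a sketch of point 3, the code below builds a single-thread ExecutionContext and passes it explicitly to foreach, so the callback runs on that thread. The thread name "display-thread" and the helper callbackThreadName are assumptions made for illustration.

```scala
import java.util.concurrent.{Executors, ThreadFactory}
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._

object DisplayContextSketch {
  // Single daemon thread dedicated to display callbacks (the name is an assumption).
  private val executor = Executors.newSingleThreadExecutor(new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r, "display-thread")
      t.setDaemon(true)
      t
    }
  })

  val displayContext: ExecutionContext = ExecutionContext.fromExecutor(executor)

  // Runs the work on the global pool and the callback on displayContext,
  // then reports which thread the callback ran on.
  def callbackThreadName(): String = {
    val seen = Promise[String]()
    val f = Future { "image" }(ExecutionContext.global)
    f.foreach(_ => seen.success(Thread.currentThread.getName))(displayContext)
    Await.result(seen.future, 5.seconds)
  }
}
```

The same explicit-context argument can be given to onComplete in the snippet above.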
Selfdeceit answered 27/10, 2012 at 8:39 Comment(0)
C
3
  1. Yes, seems fine to me, but you may want to investigate more powerful twitter-util or Akka Future APIs (Scala 2.10 will have a new Future library in this style).

  2. It uses a thread pool.

  3. No, it won't. You need to use the standard mechanism of your GUI toolkit for this (SwingUtilities.invokeLater for Swing or Display.asyncExec for SWT). E.g.

    fimages.foreach (_.foreach(im => SwingUtilities.invokeLater(new Runnable { def run(): Unit = display(im) })))
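
With scala.concurrent futures, the same idea can be packaged as an ExecutionContext that hands every callback to SwingUtilities.invokeLater. This is a rough sketch for Swing only; swingContext and runsOnEdt are hypothetical names, and printing the stack trace in reportFailure is a placeholder policy.

```scala
import javax.swing.SwingUtilities
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._

object SwingContextSketch {
  // Every task submitted to this context is queued on the Swing event dispatch thread.
  val swingContext: ExecutionContext = new ExecutionContext {
    def execute(runnable: Runnable): Unit = SwingUtilities.invokeLater(runnable)
    def reportFailure(cause: Throwable): Unit = cause.printStackTrace()
  }

  // Checks that a callback passed this context really runs on the event dispatch thread.
  def runsOnEdt(): Boolean = {
    val seen = Promise[Boolean]()
    Future.successful("image")
      .foreach(_ => seen.success(SwingUtilities.isEventDispatchThread))(swingContext)
    Await.result(seen.future, 5.seconds)
  }
}
```

A call such as someFuture.foreach(display)(SwingContextSketch.swingContext) would then run display on the UI thread without wrapping it in a Runnable by hand.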
    
Companionate answered 27/10, 2012 at 6:57 Comment(8)
Thanks for the answer, I'm happy to know my approach is sensible! I'm actually trying out Scala for Android, so this'll come in handy, compared to the horrendous Java syntax! - Peoria
Regarding #3, I was thinking and trying out a few simple test cases right before you wrote your answer, and it seems that it does execute on the main thread. I just created a simple future{"test"} and ran foreach(s => println(Thread.currentThread.getName())) on it, which printed main. Am I misunderstanding something? - Peoria
@Peoria I just did the same twice in the Scala console (for the same future) and got Thread-15 and Thread-16. It may depend on the Scala version. - Companionate
I think the Scala console spawns threads for each command you type. I just tried println(...getName()); f.foreach(s => ...getName()) (in one line) and got Thread-20 twice. Weird. - Peoria
Yes, it seems so. At the very least, since the docs don't say it's called in the main thread, I wouldn't assume so. - Companionate
I'm going to ask on the Scala mailing list, in case they know. It would certainly make my code cleaner and easier to read! - Peoria
@Peoria I am trying Futures on Android and they seem to execute on the main thread regardless of the execution context, like you said above. Were you able to understand why? - Hamsun
@Hamsun No, I wasn't, but as Alexey said, I wouldn't assume that it is so all the time. It may have to do with the fact that future{"test"} completes near-instantly, and the later completion blocks just retrieve the result, hence no use for a secondary thread. Longer-running futures may run their completion block in the worker itself; IIRC the docs say nothing about this. - Peoria
