How atomic are GHC's thunks?

How does GHC handle thunks that are accessed by multiple threads (either explicit threads, or the internal ones that evaluate sparks)? Can it happen that multiple threads evaluate the same thunk, duplicating work? Or, if thunks are synchronized, how, so that performance doesn't suffer?

Roselani answered 12/8, 2015 at 18:30 Comment(3)
I don't have time to write up an answer, but here's a blog post (that I found by googling) that discusses this in some detail and points at a paper that goes into more detail. (Though the paper is about GHC 6.10, so perhaps newer GHCs do something different now.) mainisusuallyafunction.blogspot.com/2011/10/… - Dotard
IIRC, they are not atomic and might be evaluated twice, but the probability of a duplication due to a race condition is very low. Even if they are evaluated twice, purity makes that safe. I read about this a long time ago but I forgot where... :-/ - Hebdomad
@Dotard That blog post is pretty nice, although confusingly, I think its one use of the word "eager" is almost the exact opposite of GHC terminology. - Angadresma

From the blog post linked in the comments above, the GHC Commentary, and the GHC User's Guide, I piece together the following:

GHC tries to prevent re-evaluating thunks, but because true locking between threads is expensive, and because thunks are usually pure and therefore harmless to re-evaluate, it normally does so only loosely, accepting a small chance of duplicating work anyway (a small sketch after the list below shows one way to observe this).

The method it uses for avoiding work is to replace thunks by a blackhole, a special marker that tells other threads (or sometimes, the thread itself; that's how <<loop>> detection happens) that the thunk is being evaluated.

Given this, there are at least three options:

  • By default, it uses "lazy blackholing", where this is only done when the thread is about to pause. It then "walks" its stack and creates "true" blackholes for the new thunks, using locking to ensure that each thunk is blackholed by only one thread, and aborting its own evaluation if it discovers that another thread has already blackholed a thunk. This is cheaper because it never has to consider thunks whose evaluation is short enough to fit entirely between two pauses.

  • With the -feager-blackholing flag, blackholes are instead created as soon as a thunk starts being evaluated, and the User's Guide recommends this if you are doing a lot of parallelism. However, because locking on every thunk would be too expensive, these blackholes are cheaper "eager" ones that are not synchronized with other threads (although other threads can still see them if there is no race condition). Only when the thread pauses do these get turned into "true" blackholes.

  • A third case, which the blog post was particularly about, is used for special functions like unsafePerformIO, for which it's harmful to evaluate a thunk more than once. In this case, the thread uses a "true" blackhole with real locking, but creates it immediately, by inserting an artificial thread pause before the real evaluation.
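
To make the "small chance of duplicating work" above concrete, here is a minimal sketch of my own (not from the blog post or the Commentary): two threads force the same shared thunk, and a Debug.Trace message shows how many times its body is actually entered. Compile with ghc -threaded and run with +RTS -N2; whether you ever see the message printed twice depends on timing, and turning on -feager-blackholing should make it even less likely.

    import Control.Concurrent (forkIO, newEmptyMVar, putMVar, takeMVar)
    import Control.Exception (evaluate)
    import Debug.Trace (trace)

    main :: IO ()
    main = do
      -- One shared thunk; the trace message fires each time its body is entered.
      let shared = trace "evaluating the shared thunk" (sum [1 .. 10000000] :: Int)
      done1 <- newEmptyMVar
      done2 <- newEmptyMVar
      _ <- forkIO (evaluate shared >> putMVar done1 ())
      _ <- forkIO (evaluate shared >> putMVar done2 ())
      takeMVar done1
      takeMVar done2
      print shared

Most runs print the message once; a run that prints it twice is exactly the benign work duplication described above.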

Muscadine answered 12/8, 2015 at 19:48 Comment(0)

To summarize the article linked to in the comments: thunks in GHC are not strictly atomic between multiple threads: in a race condition, it is possible for multiple threads to evaluate the same thunk, duplicating work. But this is not a big deal in practice, because:

  1. Guaranteed purity means it never matters for program semantics whether a thunk is evaluated twice. unsafePerformIO is a case where that might be an issue, but it turns out that unsafePerformIO is careful to avoid re-running the same IO action (a short sketch after this list illustrates that distinction).
  2. Because everything is built lazily out of thunks, an individual thunk usually represents only a small step of work, so occasionally duplicating the work to force one is not a big deal for program speed. You might imagine that it's expensive to duplicate, for example, last [1,2..10000000], because the whole computation is expensive. But really the outermost thunk just resolves to another thunk, something like:

    case [1,2..10000000] of 
      [x] -> x
      (_:xs) -> last xs
      [] -> error "empty list"
    

    and if two threads duplicate the work to turn the call to last into a use of case, that's pretty cheap in the grand scheme of things.
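
As a side note on point 1: base actually exposes both variants, unsafePerformIO (protected against re-running the action) and unsafeDupablePerformIO (unprotected, and therefore a bit cheaper). The following is my own illustrative sketch, not code from the article; the names counter, ticketSafe and ticketDupable are made up:

    import Data.IORef (IORef, newIORef, atomicModifyIORef')
    import System.IO.Unsafe (unsafeDupablePerformIO, unsafePerformIO)

    -- A hypothetical global counter, purely for illustration.
    counter :: IORef Int
    counter = unsafePerformIO (newIORef 0)
    {-# NOINLINE counter #-}

    -- Protected: unsafePerformIO adds a noDuplicate barrier, so even if two
    -- threads race on this thunk, the increment runs at most once.
    ticketSafe :: Int
    ticketSafe = unsafePerformIO (atomicModifyIORef' counter (\n -> (n + 1, n)))
    {-# NOINLINE ticketSafe #-}

    -- Cheaper, but under a race the increment could in principle run twice.
    ticketDupable :: Int
    ticketDupable = unsafeDupablePerformIO (atomicModifyIORef' counter (\n -> (n + 1, n)))
    {-# NOINLINE ticketDupable #-}

    main :: IO ()
    main = print (ticketSafe, ticketDupable)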

Enticement answered 12/8, 2015 at 19:40 Comment(0)

Yes, sometimes the same thunk can be evaluated by multiple threads. The GHC runtime tries to minimize the probability of duplicated work, so in practice it is rare. See the "Haskell on a Shared-Memory Multiprocessor" paper for the low-level details, especially the "Lock-free thunk evaluation" section. (I'd recommend the paper to every professional Haskell developer, btw.)
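
The mechanism the paper describes lives in the RTS and is written in C, but the basic claim-then-evaluate idea behind blackholing can be modelled in a few lines of Haskell. This is only a toy model of my own (Cell, force and so on are invented names), not how the RTS represents thunks, and it uses an atomic claim rather than the cheaper lock-free scheme the paper's section is actually about:

    import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, readMVar)
    import Data.IORef (IORef, newIORef, atomicModifyIORef')

    -- A cell is either an unevaluated computation, a "blackhole" claimed by
    -- some thread (waiters block on the MVar), or an already-computed value.
    data Cell a
      = Unevaluated (() -> a)
      | BlackHole (MVar a)
      | Value a

    -- Force the cell: atomically try to claim it; if another thread already
    -- claimed it, wait for that thread's result instead of redoing the work.
    force :: IORef (Cell a) -> IO a
    force ref = do
      hole <- newEmptyMVar
      step <- atomicModifyIORef' ref $ \cell -> case cell of
        Unevaluated f -> (BlackHole hole, Left f)     -- we claimed the thunk
        BlackHole m   -> (cell, Right (readMVar m))   -- someone else is on it
        Value v       -> (cell, Right (return v))     -- already evaluated
      case step of
        Left f -> do
          let v = f ()                                -- do the work ourselves
          v `seq` atomicModifyIORef' ref (const (Value v, ()))
          putMVar hole v                              -- wake up any waiters
          return v
        Right wait -> wait

    main :: IO ()
    main = do
      ref <- newIORef (Unevaluated (\() -> sum [1 .. 1000000 :: Int]))
      v1 <- force ref
      v2 <- force ref   -- the second call just reads the cached Value
      print (v1, v2)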

Edgell answered 12/8, 2015 at 19:43 Comment(0)
