Why do we need 'seq' or 'pseq' with 'par' in Haskell?
Asked Answered
O

3

31

I'm trying to understand why we need all parts of the standard sample code:

a `par` b `pseq` a+b

Why won't the following be sufficient?

a `par` b `par` a+b

The above expression seems very descriptive: Try to evaluate both a and b in parallel, and return the result a+b. Is the reason only that of efficiency: the second version would spark off twice instead of once?

How about the following, more succinct version?

a `par` a+b

Why would we need to make sure b is evaluated before a+b as in the original, standard code?

Opal answered 2/1, 2011 at 2:0 Comment(0)
O
30

Ok. I think the following paper answers my question: http://community.haskell.org/~simonmar/papers/threadscope.pdf

In summary, the problem with

a `par` b `par` a+b 

and

a `par` a+b

is the lack of ordering of evaluation. In both versions, the main thread gets to work on a (or sometimes b) immediately, causing the sparks to "fizzle" away immediately since there is no more need to start a thread to evaluate what the main thread has already started evaluating.

The original version

a `par` b `pseq` a+b

ensures the main thread works on b before a+b (or else would have started evaluating a instead), thus giving a chance for the spark a to materialize into a thread for parallel evaluation.

Opal answered 2/1, 2011 at 2:26 Comment(3)
This is correct, and also explains why seq is insufficient for this problem. seq makes no guarantees about ordering of evaluation. In seq b (a+b), the main thread may evaluate a before b so long as b is in WHNF when (a+b) is evaluated.Sexed
I don't see how that argument describes the problem with par a (par b (a + b)) - sure, either a or b will immediately be evaluated, and the corresponding spark will fizzle, but the other spark should be very much alive, producing parallelism. Of course, creating then fizzling a spark might not be the most efficient way to do this, but it works and leaves the evaluation order question to the compiler.Tuberosity
In the case of par a (a + b), it's still possible to get a "lucky" parallelisation, if the runtime instead chooses b first. Then the a spark won't get fizzled. This is mentioned in the PDF: community.haskell.org/~simonmar/papers/threadscope.pdf (page 2)Margertmargery
P
15
a `par` b `par` a+b 

will evaluate a and b in parallel and returns a+b, yes.

However, the pseq there ensures both a and b are evaluated before a+b is.

See this link for more details on that topic.

Punctuate answered 2/1, 2011 at 2:19 Comment(0)
C
6

a `par` b `par` a+b creates sparks for both a and b, but a+b is reached immediately so one of the sparks will fizzle (i.e., it is evaluated in the main thread). The problem with this is efficiency, as we created an unnecessary spark. If you're using this to implement parallel divide & conquer then the overhead will limit your speedup.

a `par` a+b seems better because it only creates a single spark. However, attempting to evaluate a before b will fizzle the spark for a, and as b does not have a spark this will result in sequential evaluation of a+b. Switching the order to b+a would solve this problem, but as code this doesn't enforce ordering and Haskell could still evaluate that as a+b.

So, we do a `par` b `pseq` a+b to force evaluation of b in the main thread before we attempt to evaluate a+b. This gives the a spark chance to materialise before we try evaluating a+b, and we haven't created any unnecessary sparks.

Czarevna answered 19/5, 2014 at 9:40 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.