Parallel "any" or "all" in Haskell
Asked Answered
C

1

8

A pattern I have come across a number of times now is one where a list of values needs to be checked by mapping some test over it and seeing if any or all of the elements passed. The typical solution is just to use the convenient built-ins all and any.

The problem is that these evaluate in serial. In many cases it would be much faster to evaluate in parallel with the process being complete once any thread finds a "False" for all or a "True" for any. I'm pretty sure that short-circuiting behavior can't be implemented using Control.Parallel as it requires inter-process communication and I don't understand anywhere near enough of Control.Concurrent to implement this yet.

It's a pretty common pattern in math (e.g. Miller-Rabin Primality) so I feel like someone has probably come up with a solution for this already, but for obvious reasons doing a google search for "parallel or/and/any/all on list haskell" doesn't return many relevant results.

Calibre answered 10/2, 2020 at 23:38 Comment(10)
You may find Parallel and Concurrent Programming in Haskell useful, particularly Chapters 2, 3 and 4.Hooves
This is possible with unamb libraryLeath
@Hooves I'v seen that book referenced a few times, I'll have to pick it up somewhere.Calibre
@Leath Fascinating; I'll mess around with this. If I write a good parallel all/any with this I'll post it as an answer.Calibre
@Calibre It’s actually freely available online, which is why I recommended it: you can use the chapter links I gave to read those chapters.Hooves
@Hooves Oh, rad! I'm real tired right now and those chapter links didn't parse as links in my head. Thanks for your help!Calibre
Before trying to parallelize anything, consider how many conditions you can test in the time it takes to fork a new process.Blasphemous
@Blasphemous You don't fork a process, but a thread. In ghc these are green threads and a fairly cheap to fork.Dominion
Threads don't do anything for something that is compute-bound.Blasphemous
@Blasphemous what are you talking about? We aren't talking about bash here! We can do concurrency and parallelism with threads (be it pthreads in C or green threads in Haskell) You don start multiple webservers in order to handle concurrent web requests, instead you run multiple threads in a single process! Same applies to parallelism. You spin up as many threads as you have CPUs and split up your work evenly, thus taking care of CPU bound tasks. Try this library to convince yourself github.com/lehins/haskell-schedulerDominion
A
2

In many realistic programs, you can use parallel strategies for this purpose. That's because, even though there is no explicit mechanism to cancel unneeded computations, this will happen implicitly when the garbage collector runs. As a concrete example, consider the following program:

import Control.Concurrent
import Control.Parallel.Strategies
import Data.Int
import System.Mem

lcgs :: Int32 -> [Int32]
lcgs = iterate lcg
  where lcg x = 1664525 * x + 1013904223

hasWaldo :: Int32 -> Bool
hasWaldo x = waldo `elem` take 40000000 (lcgs x)

waldo :: Int32
waldo = 0

main :: IO ()
main = do
  print $ or (map hasWaldo [1..100] `using` parList rseq)

This uses a parallel list strategy to search for waldo = 0 (which will never be found) in the output of 100 PRNG streams of 40 million numbers each. Compile and run it:

ghc -threaded -O2 ParallelAny.hs
./ParallelAny +RTS -s -N4

and it pegs four cores for about 16s, eventually printing False. Note in the statistics that all 100 sparks are "converted" and so run to completion:

SPARKS: 100(100 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

Now, change waldo to a value that can be found early:

waldo = 531186389   -- lcgs 5 !! 50000

and modify main to keep the thread alive for 10 seconds:

main :: IO ()
main = do
  print $ or (map hasWaldo [1..100] `using` parList rseq)
  threadDelay 10000000

You'll observe that it prints True almost immediately, but 4 cores remain pegged at 100% CPU (at least for a little while), illustrating that unneeded computations keep running and are not short-circuited, just as you might have feared.

BUT, things change if you force a garbage collection after getting the answer:

main :: IO ()
main = do
  print $ or (map hasWaldo [1..100] `using` parList rseq)
  performGC
  threadDelay 10000000

Now, you'll see that the CPU drops to idle shortly after printing True, and the statistics show that most of the computations were garbage collected before running:

SPARKS: 100(9 converted, 0 overflowed, 0 dud, 91 GC'd, 0 fizzled)

In realistic programs, an explicit performGC will not be needed, as GCs will be performed regularly as a matter of course. Some unnecessary computations will continue to run after the answer is found, but in many realistic scenarios, the fraction of unnecessary computations will not be a particularly important factor.

In particular, if the list is large and each individual test of a list element is fast, parallel strategies will have excellent real-world performance and is easy to implement into the bargain.

Arteriovenous answered 2/3, 2020 at 22:59 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.