Cats effect - parallel composition of independent effects

I want to combine multiple IO values that should run independently in parallel.

```
val io1: IO[Int] = ???
val io2: IO[Int] = ???
```

As I see it, I have two options:

Use cats-effect's fibers with a fork-join pattern

```
val parallelSum1: IO[Int] = for {
  fiber1 <- io1.start
  fiber2 <- io2.start
  i1 <- fiber1.join
  i2 <- fiber2.join
} yield i1 + i2
```

Use the Parallel instance for IO with parMapN (or one of its siblings like parTraverse, parSequence, parTupled etc)
```
val parallelSum2: IO[Int] = (io1, io2).parMapN(_ + _)
```

I'm not sure about the pros and cons of each approach, or when I should choose one over the other. This becomes even trickier when abstracting over the effect type IO (tagless-final style):

```
def io1[F[_]]: F[Int] = ???
def io2[F[_]]: F[Int] = ???

def parallelSum1[F[_]: Concurrent]: F[Int] = for {
  fiber1 <- io1[F].start
  fiber2 <- io2[F].start
  i1 <- fiber1.join
  i2 <- fiber2.join
} yield i1 + i2

def parallelSum2[F[_], G[_]](implicit parallel: Parallel[F, G]): F[Int] =
  (io1[F], io2[F]).parMapN(_ + _)
```

The Parallel typeclass requires two type constructors, which makes it somewhat more cumbersome to use: it cannot be expressed as a context bound, and it adds an extra, vague type parameter G[_].

Your guidance is appreciated :)

Amitay

I want to combine multiple IO values that should run independently in parallel.

The way I view it, in order to figure out "when do I use which?", we need to return to the old parallel vs. concurrent discussion, which basically boils down to (quoting the accepted answer):

Concurrency is when two or more tasks can start, run, and complete in overlapping time periods. It doesn't necessarily mean they'll ever both be running at the same instant. For example, multitasking on a single-core machine.

Parallelism is when tasks literally run at the same time, e.g., on a multicore processor.

We often use IO-like operations, such as making a call over the wire or talking to disk, as an example of concurrency.

The question is: which one do you mean when you say you want to execute "in parallel", the former or the latter?

If we're referring to the former, then using Concurrent[F] both conveys the intention by the signature and provides the proper execution semantics. If it's the latter, and we, for example, want to process a collection of elements in parallel, then going with Parallel[F, G] would be the better solution.
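To make the two options concrete, here is a minimal sketch, assuming cats-effect 1.x/2.x (where `start` and IO's Parallel instance need an implicit ContextShift; cats-effect 3 changed this API). The object name and `fetchLength` are hypothetical stand-ins for real IO-bound work and are not part of the question.

```scala
import cats.effect.{ContextShift, IO}
import cats.implicits._
import scala.concurrent.ExecutionContext

object ConcurrentVsParallelSketch {
  // Assumption: cats-effect 1.x/2.x, where forking needs a ContextShift[IO].
  implicit val cs: ContextShift[IO] = IO.contextShift(ExecutionContext.global)

  // Hypothetical stand-in for an IO-bound call (network, disk, ...).
  def fetchLength(word: String): IO[Int] = IO(word.length)

  // Concurrent flavor: fork each task explicitly, then join the fibers.
  val viaFibers: IO[Int] = for {
    fiberA <- fetchLength("cats").start
    fiberB <- fetchLength("effect").start
    a <- fiberA.join
    b <- fiberB.join
  } yield a + b

  // Parallel flavor: process a whole collection through IO's Parallel instance.
  val viaParTraverse: IO[List[Int]] =
    List("cats", "effect", "io").parTraverse(word => fetchLength(word))
}
```

Both values are plain descriptions until they are run, for example with `unsafeRunSync()`. The fiber version spells out the fork and join points, while `parTraverse` leaves the scheduling to the Parallel instance.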

The semantics are often confusing in the case of IO, because it has instances of both Parallel and Concurrent and we mostly use it to opaquely define side-effecting operations.

As a side note, Parallel takes two unary type constructors because M (in Parallel[M[_], F[_]]) is always a Monad, and we need a way to prove that this Monad also has a corresponding Applicative[F] for parallel execution; when we think of a Monad, we always think of sequential execution semantics.
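The shape of that pairing can be sketched as follows. `ParallelSketch` is a simplified, hypothetical stand-in for the real typeclass, not the cats definition, and it is illustrated with the Either/Validated pair rather than IO because the difference is easy to observe there: the Monad (Either) stops at the first error, while the Applicative (Validated) accumulates errors. The type aliases and value names are assumptions made for the example.

```scala
import cats.{Applicative, Monad, ~>}
import cats.data.Validated
import cats.implicits._

object ParallelShapeSketch {
  type ErrorsOr[A]    = Either[List[String], A]    // the sequential Monad M
  type Accumulated[A] = Validated[List[String], A] // the "parallel" Applicative F

  // Hypothetical, simplified stand-in for cats' Parallel[M, F].
  trait ParallelSketch[M[_], F[_]] {
    def monad: Monad[M]
    def applicative: Applicative[F]
    def parallel: M ~> F   // move into the applicative world
    def sequential: F ~> M // and back into the monadic one

    // Combine two M values using F's applicative semantics.
    def parMap2[A, B, C](ma: M[A], mb: M[B])(f: (A, B) => C): M[C] =
      sequential(applicative.map2(parallel(ma), parallel(mb))(f))
  }

  val eitherValidated: ParallelSketch[ErrorsOr, Accumulated] =
    new ParallelSketch[ErrorsOr, Accumulated] {
      val monad: Monad[ErrorsOr] = Monad[ErrorsOr]
      val applicative: Applicative[Accumulated] = Applicative[Accumulated]
      val parallel: ErrorsOr ~> Accumulated = new (ErrorsOr ~> Accumulated) {
        def apply[A](fa: ErrorsOr[A]): Accumulated[A] = Validated.fromEither(fa)
      }
      val sequential: Accumulated ~> ErrorsOr = new (Accumulated ~> ErrorsOr) {
        def apply[A](fa: Accumulated[A]): ErrorsOr[A] = fa.toEither
      }
    }

  val first: ErrorsOr[Int]  = Left(List("first failed"))
  val second: ErrorsOr[Int] = Left(List("second failed"))

  // Monadic composition short-circuits on the first Left.
  val sequentialSum: ErrorsOr[Int] = for { a <- first; b <- second } yield a + b

  // Applicative composition through the sketch sees both failures.
  val parallelSum: ErrorsOr[Int] = eitherValidated.parMap2(first, second)(_ + _)
}
```

For IO the same idea applies with IO as the Monad and IO.Par as the Applicative, where "parallel" means the two effects are actually run side by side rather than merely having their errors combined.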
