What's the conceptual difference between Machines and Conduits (or other similar libraries)?
Asked Answered
B

1

29

I'd like to learn the concept, so that I'd be able to understand and use libraries such as machines.

I tried to follow Rúnar Bjarnason's talk on machines, but there is too little information, basically just a bunch of data types. I can't even understand what k is in

newtype Machine k o = Step k o (Machine k o)
data Step k o r = Stop
                | Yield o r
                | forall t . Await (t -> r) (k t) r

or what's t is and why it's quantified. Or, what's the conceptual difference between conduit-like libraries and machines?

Bick answered 24/6, 2013 at 17:36 Comment(0)
C
45

conduit and pipes are both far more mature than machines, but -- that said -- machines is trying to take a different path than conduit and pipes.

With machines, I'm trying for a relatively simple API in terms of type arguments. Both conduit and pipes have chosen to unify all of their concepts by using 5-6 different type variable arguments.

Machines takes a different approach of parameterizing a machine (or Plan) on its "input language", which puts all the onus on a single extra argument (or two in the case of a Plan). Also by choosing to parameterize the input language in this way it opens up possibilities of using machines that can accept input (non-)deterministically from multiple input sources. The result is basically just a free monad with an extra 'emit' instruction!

In exchange for a slightly more rigorous policy about how you build and use machines, it can eventually offer better safety about asymptotics of the resulting code.

That said, pipes and conduit have had a lot of real world use and machines is more or less a playground for me, Rúnar Bjarnason and Paul Chiusano.

It is currently suitable for working on input that you intend to consume entirely, but less so for working with complicated resources or for parsing than what you get with those other two APIs.

Now, about that quantifier!

t there is actually existentially quantified. By doing so we can make the Monad for machines not care about the functoriality of the k parameter. This is important because of the way Source is implemented. If I didn't need Source to work, then we could use the simpler

data Step k o r = Stop
                | Yield o r
                | Await (k r) r

This would have the unfortunate side-effect that when you went to compose a Machine with a Source, that the compiler wouldn't know what Functor instance to pick and you'd be swimming in unnecessary type annotations.

The existential quantification there is a trick I picked up when working on the kan-extensions package. It is a generalization of one of the Yoneda types from there.

Cultivation answered 24/6, 2013 at 18:4 Comment(8)
Thank you, could you perhaps elaborate a bit about k? Why does Await consist of k r, especially, why is k :: * -> * and parametrized by r? I tried to examine Tee, Wye and Is to see some examples of different ks, but still I'm far from understanding it.Bick
k a is the type of a request you are making of whatever is feeding you data. The a represents the result of the request. Await effectively has two arguments: "The request with what to do if the request succeeds" and "what do to on a failure". For a single input you could use k = (->) i` so that Machine ((->) i) is given a function from (i -> r) that you can supply the input to. With Tee you wind up with one of two functions so you can block on either input separately. Wye permits you to block on either or both inputs at the same time. Is is just used like (->) above.Cultivation
One more question, why there are Plans and Machines? Why the distinction?Bick
By separating Plan from Machine we can do two things. First Plan is in CPS/Codensity form internally. This makes (>>=) cheap even if you have left associated binds (unlike with pipes) while Machine is in a more traditional ADT form that makes pattern matching cheaper. That means its cheap to build a Plan, but cheap to run a Machine. The pipes approach of using one type for both means one role suffers. In particular if you pipe then bind, you pay for how many steps were taken in the pipe to get over the (>>=)!Cultivation
Also, by separating Plan from Machine, we get two different monads we can talk about. The monad for Plan is basically a free monad of requests in the input language followed by emissions on the output channel, while the monad for Machine is more like a list monad where you chain together the resulting machines forgetfully. Each has different uses but notably we can get by with pretty much standard combinators for working with each rather than make up our own classes/types for everything.Cultivation
Thanks! Now I understand. I had the same problem when implementing scala-conduit. Each monadic bind causes a new traversal of the whole structure.Bick
NP: Note the version of scala-machines that I think is on Runar's account actually suffers the cost of binding as it is based on the simpler unified Plan/Machine model, but then the codensity trick doesn't work in Scala, so there it'd have to trampoline the Plan to get the right asymptotics.Cultivation
If you look at the code for Trampoline in scalaz you'll see how you can avoid the complete structure re-traversal and how to avoid blowing your stack for non-trivial inputs. Sadly scalable/working monadic code in scala comes at a real cost. Trampolining can be 50x slower, so you have to choose between 3 options in scala: fast and loose while crashing on non-trivial inputs, slow but correct with frames on the heap, or littering your code with side-effects and non-local reasoning.Cultivation

© 2022 - 2024 — McMap. All rights reserved.