What's the benefit of conduit's leftovers?

I'm trying to understand the differences between conduit and pipes. Unlike pipes, conduit has the concept of leftovers. What are leftovers useful for? I'd like to see some examples where leftovers are essential.

And since pipes don't have the concept of leftovers, is there any way to achieve a similar behavior with them?

Pontus answered 6/3, 2013 at 21:38 Comment(0)

Gabriella's point that leftovers are always part of parsing is interesting. I'm not sure I would agree, but that may just depend on the definition of parsing.

There is a large category of use cases that require leftovers. Parsing is certainly one: any time a parse requires some kind of lookahead, you'll need leftovers. One example of this is in the markdown package's getIndented function, which isolates all of the upcoming lines with a certain indentation level, leaving the rest of the lines to be processed later.

But a much more mundane set of examples lives in conduit itself. Any time you're dealing with packed data (like ByteString or Text), you'll need to read a chunk, analyze it somehow, use leftover to push back the extra, and then do something with the original content. Perhaps the simplest example of this is dropWhile.
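To make the chunked case concrete, here is a minimal sketch of a dropWhile over ByteString chunks. It assumes the later `ConduitT`-style API exported by the `Conduit` module (conduit 1.3), not the 1.0 interface discussed here, and `dropWhileChars` and `demo` are hypothetical names used only for illustration. The leftover is the unconsumed tail of the chunk in which the predicate first fails.

```haskell
import Conduit (ConduitT, await, leftover, runConduitPure, sinkList, yieldMany, (.|))
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as BC

-- Hypothetical helper: drop leading characters matching the predicate,
-- even when they span several chunks.
dropWhileChars :: Monad m => (Char -> Bool) -> ConduitT ByteString o m ()
dropWhileChars p = loop
  where
    loop = do
      mchunk <- await
      case mchunk of
        Nothing -> return ()            -- upstream is exhausted
        Just chunk ->
          let rest = BC.dropWhile p chunk
          in if BC.null rest
               then loop                -- whole chunk matched, keep dropping
               else leftover rest       -- push the unmatched tail back

-- The consumer that runs afterwards sees the pushed-back tail first.
demo :: [ByteString]
demo =
  runConduitPure $
    yieldMany (map BC.pack ["   ", "  hello ", "world"])
      .| (dropWhileChars (== ' ') >> sinkList)
-- demo == [BC.pack "hello ", BC.pack "world"]
```

Without `leftover`, the only choices would be to discard the whole second chunk (losing "hello ") or to hand the tail to the caller through the return value, which every downstream consumer would then have to handle manually.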

In fact, I consider leftover to be such a core, basic feature of a streaming library that the new 1.0 interface for conduit doesn't even give users the option of disabling leftovers. I know of very few real-world use cases that don't need it in one way or another.

Meekins answered 7/3, 2013 at 5:22 Comment(5)
Thanks for the explanation, I'm quite convinced now. Meanwhile, I was thinking about how to implement leftovers on top of a conduit-like library that doesn't have them natively. The idea is that a conduit with leftovers can be represented as a conduit returning (Maybe i, r). My attempt (for conduit) is here. - Pontus
I think you have the right intuition; your implementation is very similar to how things work internally. I think you discovered the double-leftover issue, which is why leftovers are allowed to be stacked as multiple Leftover constructors in conduit. - Meekins
Yet another approach to leftovers (which I'll most likely use in my Scala library) is to view them as a kind of feedback: for Pipe Void i (Either i o) u m r, we send any Left i back to the pipe's input using an internal method that converts such a pipe into a standard one. - Pontus
Could you please point to a simple example of a leftover? I looked at your dropWhile link, but it 404s. In the dropWhile case, what is the leftover? If possible, could you please see stackoverflow.com/q/44402598/409976? Thanks! - Outlook
Hit spacebar to proceed through the slides from here: snoyman.com/reveal/conduit-yesod#/13/1. It's also covered in the Conduit tutorial, which I'd recommend reading: haskell-lang.org/library/conduit. - Meekins

I'll answer for pipes. The short answer to your question is that the upcoming pipes-parse library will have support for leftovers as part of a more general parsing framework. I find that in almost every case where people want leftovers, they actually want a parser, which is why I frame the leftovers problem as a subset of parsing. You can find the current draft of the library here.

However, if you want to understand how pipes-parse gets it to work, the simplest possible way to implement leftovers is to just use StateP to store the pushback buffer. This requires only defining the following two functions:

import Control.Proxy
import Control.Proxy.Trans.State

draw :: (Monad m, Proxy p) => StateP [a] p () a b' b m a
draw = do
    s <- get
    case s of
        []   -> request ()
        a:as -> do
            put as
            return a

unDraw :: (Monad m, Proxy p) => a -> StateP [a] p () a b' b m ()
unDraw a = do
    as <- get
    put (a:as)

draw first consults the pushback buffer to see if there are any stored elements, popping one element off the stack if available. If the buffer is empty, it instead requests a new element from upstream. Of course, there's no point having a buffer if we can't push anything back, so we also define unDraw to push an element onto the stack to save for later.
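The buffer discipline can be tried out without any streaming library. The following is only a model under stated assumptions, not the pipes API: upstream is a plain list, the state is a pair of (pushback buffer, remaining upstream), and `Stream`, `drawM`, `unDrawM` and `demo` are hypothetical names. It uses `State` from the transformers package.

```haskell
import Control.Monad.Trans.State.Strict (State, evalState, get, put)

-- Model only: (pushback buffer, remaining upstream input).
type Stream a = ([a], [a])

-- Pop from the pushback buffer first; fall back to upstream when it is empty.
drawM :: State (Stream a) (Maybe a)
drawM = do
  (buf, upstream) <- get
  case (buf, upstream) of
    (a : as, _)    -> put (as, upstream) >> return (Just a)
    ([], a : rest) -> put ([], rest) >> return (Just a)
    ([], [])       -> return Nothing

-- Push an element onto the buffer so the next drawM returns it.
unDrawM :: a -> State (Stream a) ()
unDrawM a = do
  (buf, upstream) <- get
  put (a : buf, upstream)

-- Two pushed-back elements come out in stack (last in, first out) order,
-- and only then does drawing resume from upstream.
demo :: [Maybe Int]
demo = evalState steps ([], [1, 2])
  where
    steps = do
      a <- drawM
      unDrawM 10
      unDrawM 20
      b <- drawM
      c <- drawM
      d <- drawM
      return [a, b, c, d]
-- demo == [Just 1, Just 20, Just 10, Just 2]
```

Using a list rather than a single `Maybe` slot for the buffer is what lets several elements be pushed back in a row.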

Edit: Oops, I forgot to include an example of when leftovers are useful. Like Michael says, takeWhile and dropWhile are good use cases for leftovers. Here's the drawWhile function (analogous to what Michael calls takeWhile):

drawWhile :: (Monad m, Proxy p) => (a -> Bool) -> StateP [a] p () a b' b m [a]
drawWhile pred = go
  where
    go = do
        a <- draw
        if pred a
        then do
            as <- go
            return (a:as)
        else do
            unDraw a
            return []

Now imagine that your producer was:

producer () = do
    respond 1
    respond 3
    respond 4
    respond 6

... and you hooked that up to a consumer that used:

consumer () = do
    odds  <- drawWhile odd   -- stops at the first even element and pushes it back
    evens <- drawWhile even  -- starts from the pushed-back element

If the first drawWhile odd didn't push back the final element it drew, then you would drop the 4, which wouldn't get correctly passed on to the second drawWhile even statement.
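For readers on a current toolchain, the same scenario can be sketched against the pipes-parse API as it was eventually released, which differs from the Control.Proxy code above: `draw` returns a `Maybe`, and parsers run in a `StateT` over the remaining `Producer`. Treat this as a sketch under that assumption. `drawWhileLossy` is a hypothetical variant, written only to show what goes wrong without pushback.

```haskell
import Control.Monad.Trans.State.Strict (StateT, evalStateT)
import Data.Functor.Identity (runIdentity)
import Pipes (Producer, each)
import Pipes.Parse (draw, unDraw)

-- Collect elements while the predicate holds, pushing the first failing
-- element back so that the next parser still sees it.
drawWhile :: Monad m => (a -> Bool) -> StateT (Producer a m x) m [a]
drawWhile p = go
  where
    go = do
      ma <- draw
      case ma of
        Nothing -> return []
        Just a
          | p a       -> fmap (a :) go
          | otherwise -> unDraw a >> return []

-- Hypothetical lossy variant: the first failing element is simply discarded.
drawWhileLossy :: Monad m => (a -> Bool) -> StateT (Producer a m x) m [a]
drawWhileLossy p = go
  where
    go = do
      ma <- draw
      case ma of
        Just a | p a -> fmap (a :) go
        _            -> return []

withPushback :: ([Int], [Int])
withPushback =
  runIdentity (evalStateT ((,) <$> drawWhile odd <*> drawWhile even) (each [1, 3, 4, 6]))
-- withPushback == ([1, 3], [4, 6])

withoutPushback :: ([Int], [Int])
withoutPushback =
  runIdentity (evalStateT ((,) <$> drawWhileLossy odd <*> drawWhileLossy even) (each [1, 3, 4, 6]))
-- withoutPushback == ([1, 3], [6])
```

The only difference between the two parsers is the `unDraw a` call, and that call is what keeps the 4 available to the second parser.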

Butanol answered 6/3, 2013 at 22:11 Comment(0)
