How do I make this function consume its input bit stream lazily?
Asked Answered
L

2

7

I'm imagining a function like

takeChunkUntil :: [a] -> ([a] -> Bool) -> ([a], [a])

Hopefully lazy.

It takes elements from the first list until the group of them satisfies the predicate, then returns that sublist as well as the remaining elements.

TO ANSWER SOME QUESTIONS:
The ultimate goal is to make something that reads Huffman codes lazily. So if you have a string of bits, here represented as Bool, bs, you can write take n $ decode huffmanTree bs to take the first n coded values while consuming only as much of bs as necessary. If you would like I'll post more details and my attempted solutions. This could get long :) (Note I'm a tutor who was given this problem by a student, but I didn't try to help him as it was beyond me, however I'm very curious now.)

CONTINUED: Here goes the whole thing:

Definition of Huffman tree:

data BTree a = Leaf a | Fork (BTree a) (BTree a) deriving (Show, Eq)

Goal: write a lazy decode function that returns a pair of the decoded values and a Boolean indicating if there were any values left over that were not fully long enough to be decoded into a value. Note: we are using Bool to represent a bit: True =1, False = 0.

decode :: BTree a -> [Bool] -> ([a], Bool)

Here's the essence: The first function I wrote was a function that decodes one value. Returns Nothing if the input list was empty, otherwise returns the decoded value and the remaining "bit".

decode1 :: BTree a -> [Bool] -> Maybe (a, [Bool])
decode1 (Leaf v) bs = Just (v, bs)
decode1 _ [] = Nothing
decode1 (Fork left right) (b:bs) 
  | b         = decode1 right bs
  | otherwise = decode1 left bs

First, I figured that I needed some kind of tail recursion to make this lazy. Here's what doesn't work. I think it doesn't, anyway. Notice how it's recursive, but I'm passing a list of "symbols decoded so far" and appending the new one. Inefficient and maybe (if my understanding is right) won't lead to tail recursion.

decodeHelp :: BTree a -> [a] -> [Bool] -> ([a],Bool)
decodeHelp t symSoFar bs = case decode1 t bs of
    Nothing -> (symSoFar,False)
    Just (s,remain) -> decodeHelp t (symSoFar ++ [s]) remain

So I thought, how can I write a better kind of recursion in which I decode a symbol and append it to the next call? The key is to return a list of [Maybe a], in which Just a is a successfully decoded symbol and Nothing means no symbol could be decoded (i.e. remaining booleans were not sufficient)

decodeHelp2 :: BTree a -> [Bool] -> [Maybe a]
decodeHelp2 t bs = case decode1 t bs of
    Nothing -> [Nothing]
    Just (s, remain) -> case remain of
        [] -> []
        -- in the following line I can just cons Just s onto the
        -- recursive call. My understand is that's what make tail
        -- recursion work and lazy.
        _  -> Just s : decodeHelp2 t remain 

But obviously this is not what the problem set wants out of the result. How can I turn all these [Maybe a] into a ([a], Bool)? My first thought was to apply scanl

Here's the scanning function. It accumulates Maybe a into ([a], Bool)

sFunc :: ([a], Bool) -> Maybe a -> ([a], Bool)
sFunc (xs, _) Nothing = (xs, False)
sFunc (xs, _) (Just x) = (xs ++ [x], True)

Then you can write

decodeSortOf :: BTree a -> [Bool] -> [([a], Bool)]
decodeSortOf t bs = scanl sFunc ([],True) (decodeHelp2 t bs)

I verified this works and is lazy:

take 3 $ decodeSortOf xyz_code [True,False,True,True,False,False,False,error "foo"] gives [("",True),("y",True),("yz",True)]

But this is not the desired result. Help, I'm stuck!

Levitation answered 25/10, 2019 at 18:14 Comment(6)
Can you show one of these awkward implementations?Instrumentality
@WillemVanOnsem, it's the group that satisfies the predicate. This is an inherently inefficient function.Hygienics
@dfeuer: yes, but not per se a non-lazy one :)Instrumentality
@WillemVanOnsem I never got any of them to work, hence didn't bother to copy in. But it could be instructive.Levitation
@composerMike: the fact that these do not (fully) work is not that important. Usually it helps to show what can be improved :)Instrumentality
What should takeChunkUntil return if the predicate is never satisfied?Kenrick
H
3

Here's a hint. I've swapped the argument order to get something more idiomatic, and I've changed the result type to reflect the fact that you may not find an acceptable chunk.

import Data.List (inits, tails)

takeChunkUntil :: ([a] -> Bool) -> [a] -> Maybe ([a], [a])
takeChunkUntil p as = _ $ zip (inits as) (tails as)
Hygienics answered 25/10, 2019 at 18:23 Comment(6)
Thanks for great idea with zipping the inits and tails. Are inits and tails lazy? They must be. I think you put find (p . fst) into the hole, right? Thanks for giving as hint. Much more fun!Levitation
You don't really need Maybe, do you? I don't think there's any difference between (x, []) and Nothing for input list x.Kenrick
@Kenrick There's certainly a difference, namely: Just (x, []) says that x satisfies the predicate, while Nothing says it doesn't.Guillot
I'm thinking of the function as being more like take, i.e. take n == fst . takeChunkUntil (\ys -> length ys == n); if you reach the end of the list without satisfying the predicate, you just take the entire list.Kenrick
@chepner, I would want to see the use case for this function to know which of these two boundary options to use.Tourniquet
@Tourniquet Agreed. The description in the question is vague, implying that a satisfactory prefix exists.Kenrick
I
2

We can use explicit recursion here, where if the predicate is satisfied, we prepend to the first item of the tuple. If not, we make a 2-tuple where we put the (remaining) list in the second item of the 2-tuple. For example:

import Control.Arrow(first)

takeChunkUntil :: ([a] -> Bool) -> [a] -> ([a], [a])
takeChunkUntil p = go []
    where go _ [] = ([], [])
          go gs xa@(x:xs) | not (p (x:gs)) = first (x:) (go (x:gs) xs)
                          | otherwise = ([], xa)

We here make the assumption that the order of the elements in the group is not relevant to the predicate (since we each time pass the list in reverse order). If that is relevant, we can use a difference list for example. I leave that as an exercise.

This works on an infinite list as well, for example:

Prelude Control.Arrow> take 10 (fst (takeChunkUntil (const False) (repeat 1)))
[1,1,1,1,1,1,1,1,1,1]
Instrumentality answered 25/10, 2019 at 18:35 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.