Why are difference lists not an instance of foldable?
Asked Answered
L

2

10

The dlist package contains the DList data type, which has lots of instances, but not Foldable or Traversable. In my mind, these are two of the most "list-like" type classes. Is there a performance reason that DList is not an instance of these classes?

Also, the package does implement foldr and unfoldr, but none of the other folding functions.

Levitt answered 23/3, 2013 at 17:3 Comment(0)
S
12

DList a is a newtype wrapper around [a] -> [a], which has an a in a contravariant position, so it cannot implement Foldable or Traversable, or even Functor directly. The only way to implement them is to convert to and from regular lists (see the foldr implementation), which defeats the performance advantage of difference lists.

Shonda answered 23/3, 2013 at 17:34 Comment(3)
Further to Sjoerd's answer, a DList is only efficient for building - if you have built your list and want to process it, you should covert it with toList then process the regular list.Environ
So why don't we simply define fold (DL f) = fold (f [])? We can forget about how DLists are implemented and simply view them as some representation of sequence of elements, and then implementing Foldable makes sense. Implementing Functor and Traversable in this way would probably have some pitfalls, but Foldable seems quite reasonable.Vaquero
Yeah, Foldable might not be too bad, the package has foldr already and that's enough after all. I guess the reason it isn't implemented is because the last update was in 2009, when Foldable was not a well known type class yet.Shonda
P
23

One alternative you should consider instead of DList is to use Church-encoded lists. The idea is that you represent a list as an opaque value that knows how to execute a foldr over a list. This requires using the RankNTypes extension:

{-# LANGUAGE RankNTypes #-}

import Prelude 
import Control.Applicative
import Data.Foldable (Foldable)
import qualified Data.Foldable as F
import Data.Traversable (Traversable)
import qualified Data.Traversable as T

-- | Laws:
--
-- > runList xs cons nil == xs
-- > runList (fromList xs) f z == foldr f z xs
-- > foldr f z (toList xs) == runList xs f z
newtype ChurchList a = 
    ChurchList { runList :: forall r. (a -> r -> r) -> r -> r }

-- | Make a 'ChurchList' out of a regular list.
fromList :: [a] -> ChurchList a
fromList xs = ChurchList $ \k z -> foldr k z xs

-- | Turn a 'ChurchList' into a regular list.
toList :: ChurchList a -> [a]
toList xs = runList xs (:) []

-- | We can construct an empty 'ChurchList' without using a @[]@.
nil :: ChurchList a 
nil = ChurchList $ \_ z -> z

-- | The 'ChurchList' counterpart to '(:)'.  Unlike 'DList', whose
-- implementation uses the regular list type, 'ChurchList' doesn't
-- rely on it at all.
cons :: a -> ChurchList a -> ChurchList a
cons x xs = ChurchList $ \k z -> k x (runList xs k z)

-- | Append two 'ChurchList's.  This runs in O(1) time.  Note that
-- there is no need to materialize the lists as @[a]@.
append :: ChurchList a -> ChurchList a -> ChurchList a
append xs ys = ChurchList $ \k z -> runList xs k (runList ys k z)

-- | Map over a 'ChurchList'.  No need to materialize the list.
instance Functor ChurchList where
    fmap f xs = ChurchList $ \k z -> runList xs (\x xs' -> k (f x) xs') z

-- | The 'Foldable' instance is trivial, given the 'ChurchList' law.
instance Foldable ChurchList where
    foldr f z xs = runList xs f z

instance Traversable ChurchList where
    traverse f xs = runList xs step (pure nil)
        where step x rest = cons <$> f x <*> rest

The downside to this is that there is no efficient tail operation for a ChurchList—folding a ChurchList is cheap, but taking repeated tails is costly...

Plagal answered 23/3, 2013 at 23:38 Comment(13)
the tail of a ChurchList can be computed, lazily, in constant time.Skardol
Note that I said "taking repeated tails"; if you're just taking the tail once, the simple churchTail = fromList . tail . toList doesn't look too bad. But now consider what happens with churchTail . churchTail: you get a ChurchList backed by a []-list that's constructed from a ChurchList backed by a []-list. The heart of the problem is that a ChurchList and its churchTail don't share structure like a []-list and its tail do. I don't believe that more sophisticated implementations of churchTail that don't use toList/fromList can avoid this either.Plagal
True, repeated tails are costly for other implementations as well. BTW I don't think that the append operation of a ChurchList is any better than that of a normal list, is it?Skardol
also, singleton x = cons x nil = ChurchList $ \k z -> k x (runList nil k z) = ChurchList $ \k z -> k x z. then, snoc xs x = append xs $ singleton x = ChurchList $ \k z -> runList xs k (runList (singleton x) k z) = ChurchList $ \k z -> runList xs k (k x z). Also O(1). @Skardol why? I think it is much more like the append for DList (i.e. function composition), not []. Recently this was discussed here https://mcmap.net/q/183524/-haskell-foldl-39-poor-performance-with and first, here https://mcmap.net/q/183523/-why-are-difference-lists-more-efficient-than-regular-concatenation-in-haskell. I think it applies here as well - everything is O(1) until the first toList.Splint
repeated tails is bad because functions are opaque. but maybe re-write rules can be employed to convert the inefficient churchTail . churchTail = fromList . tail . toList . fromList . tail . toList into the efficient fromList . tail . tail . toList by collapsing toList . fromList to id?Splint
@WillNess No, it's not. A bunch of appends on a ChurchList will have the same problem when left-associatively composed.Skardol
@Skardol could you give me a sample misbehaving expression please, to chew on?Splint
@Skardol gist.github.com/WillNess/5266509#file-churchlist-hs where am I wrong?Splint
@WillNess for example the expression (((a `append` b) `append` c) `append` d) to compute the whole list, the list a will have to be traversed 3 times just as with normal lists. Difference lists were implemented mainly to solve this problem.Skardol
@Skardol I disagree. I've updated the gist. the definition g = (((a `append` b) `append` c) `append` d) gets forced into WHNF on the first runList g k z and then rearranges itself into the right-nested runList a k (runList b k (runList c k (runList d k z))), just like the func-compo chains of DList do. So a is run only once. Do you agree? (and if we don't name them, the unnamed thunks will get created instead, and forced).Splint
@Skardol thanks for the confirmation. :) This means, that this answer does provide a Foldable alternative to DList, without loosing any of its benefits. (!)Splint
@WillNess has it been wrapped up in a package, with the rewrite rules?Halfhardy
@AlpMestanogullari I've no idea. :)Splint
S
12

DList a is a newtype wrapper around [a] -> [a], which has an a in a contravariant position, so it cannot implement Foldable or Traversable, or even Functor directly. The only way to implement them is to convert to and from regular lists (see the foldr implementation), which defeats the performance advantage of difference lists.

Shonda answered 23/3, 2013 at 17:34 Comment(3)
Further to Sjoerd's answer, a DList is only efficient for building - if you have built your list and want to process it, you should covert it with toList then process the regular list.Environ
So why don't we simply define fold (DL f) = fold (f [])? We can forget about how DLists are implemented and simply view them as some representation of sequence of elements, and then implementing Foldable makes sense. Implementing Functor and Traversable in this way would probably have some pitfalls, but Foldable seems quite reasonable.Vaquero
Yeah, Foldable might not be too bad, the package has foldr already and that's enough after all. I guess the reason it isn't implemented is because the last update was in 2009, when Foldable was not a well known type class yet.Shonda

© 2022 - 2024 — McMap. All rights reserved.