I really like the idea of working with catamorphisms/anamorphisms in a generic way, but it seems to me it has a significant performance drawback:
Suppose we want to work with a tree structure in the categorical way - to describe different folding using a generic catamorphism function:
newtype Fix f = Fix { unfix :: f (Fix f) }
data TreeT r = Leaf | Tree r r
instance Functor TreeT where
  fmap _ Leaf       = Leaf
  fmap f (Tree l r) = Tree (f l) (f r)
type Tree = Fix TreeT
catam :: (Functor f) => (f a -> a) -> (Fix f -> a)
catam f = f . fmap (catam f) . unfix
Now we can write functions like:
depth1 :: Tree -> Int
depth1 = catam g
  where
    g Leaf       = 0
    g (Tree l r) = 1 + max l r
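To see the plumbing in action, here is a small, self-contained sketch. The `leaf`/`node` smart constructors and the sample tree are illustrative assumptions added here, not part of the original code:

```haskell
newtype Fix f = Fix { unfix :: f (Fix f) }

data TreeT r = Leaf | Tree r r

instance Functor TreeT where
  fmap _ Leaf       = Leaf
  fmap f (Tree l r) = Tree (f l) (f r)

type Tree = Fix TreeT

catam :: Functor f => (f a -> a) -> Fix f -> a
catam f = f . fmap (catam f) . unfix

-- Hypothetical smart constructors, so we don't write Fix by hand:
leaf :: Tree
leaf = Fix Leaf

node :: Tree -> Tree -> Tree
node l r = Fix (Tree l r)

depth1 :: Tree -> Int
depth1 = catam g
  where
    g Leaf       = 0
    g (Tree l r) = 1 + max l r

main :: IO ()
main = print (depth1 (node (node leaf leaf) leaf))  -- prints 2
```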
Unfortunately, this approach has a significant drawback: during the computation, new TreeT Int values are created at every level by fmap, only to be immediately consumed by g. Compared to the classical definition
depth2 :: Tree -> Int
depth2 (Fix Leaf)       = 0
depth2 (Fix (Tree l r)) = 1 + max (depth2 l) (depth2 r)
our depth1 will always be slower, putting unnecessary strain on the GC. One solution would be to use hylomorphisms, combining the creation and the folding of a tree into a single pass. But often we don't want that: we may want to create a tree in one place and pass it somewhere else to be folded later, or to fold it several times with different catamorphisms.
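For completeness, the fusion idea can be sketched with the standard hylomorphism. The build coalgebra and depthAlg algebra below are illustrative assumptions, not code from the question; the point is that the depth of a complete tree is computed without the tree ever being materialized:

```haskell
data TreeT r = Leaf | Tree r r

instance Functor TreeT where
  fmap _ Leaf       = Leaf
  fmap f (Tree l r) = Tree (f l) (f r)

-- Standard hylomorphism: unfold with the coalgebra, fold with the algebra,
-- never building the intermediate Fix TreeT structure.
hylo :: Functor f => (f b -> b) -> (a -> f a) -> a -> b
hylo alg coalg = alg . fmap (hylo alg coalg) . coalg

-- Coalgebra (assumption): unfold a complete tree of the given depth.
build :: Int -> TreeT Int
build 0 = Leaf
build n = Tree (n - 1) (n - 1)

-- Algebra (assumption): measure depth.
depthAlg :: TreeT Int -> Int
depthAlg Leaf       = 0
depthAlg (Tree l r) = 1 + max l r

main :: IO ()
main = print (hylo depthAlg build 5)  -- prints 5
```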
Is there a way to make GHC optimize depth1? Something like inlining catam g and then fusing/deforesting the g . fmap ... composition inside?