(Disclaimer: This is outside my area of expertise. I believe I'm correct (with caveats provided at different points), but ... Verify it yourself.)
A catamorphism can be thought of as a function that replaces constructors of a data type with other functions.
(In this example, I will be using the following data types:
data [a] = [] | a : [a]
data BinTree a = Leaf a | Branch (BinTree a) (BinTree a)
data Nat = Zero | Succ Nat
)
For example:
length :: [a] -> Nat
length = catamorphism
[] -> 0
(_:) -> (1+)
(Sadly, the catamorphism {..}
syntax is not available in Haskell (I saw something similar in Pola). I've been meaning to write a quasiquoter for it.)
So, what is length [1,2,3]
?
length [1,2,3]
length (1 : 2 : 3 : [])
length (1: 2: 3: [])
1+ (1+ (1+ (0 )))
3
That said, for reasons that will become apparent later, it is nicer to define it as the trivially equivalent:
length :: [a] -> Nat
length = catamorphism
[] -> Zero
(_:) -> Succ
Let's consider a few more example catamorphisms:
map :: (a -> b) -> [a] -> b
map f = catamorphism
[] -> []
(a:) -> (f a :)
binTreeDepth :: Tree a -> Nat
binTreeDepth = catamorphism
Leaf _ -> 0
Branch -> \a b -> 1 + max a b
binTreeRightDepth :: Tree a -> Nat
binTreeRightDepth = catamorphism
Leaf _ -> 0
Branch -> \a b -> 1 + b
binTreeLeaves :: Tree a -> Nat
binTreeLeaves = catamorphism
Leaf _ -> 1
Branch -> (+)
double :: Nat -> Nat
double = catamorphism
Succ -> Succ . Succ
Zero -> Zero
Many of these can be nicely composed to form new catamorphisms. For example:
double . length . map f = catamorphism
[] -> Zero
(a:) -> Succ . Succ
double . binTreeRightDepth = catamorphism
Leaf a -> Zero
Branch -> \a b -> Succ (Succ b)
double . binTreeDepth
also works, but it is almost a miracle, in a certain sense.
double . binTreeDepth = catamorphism
Leaf a -> Zero
Branch -> \a b -> Succ (Succ (max a b))
This only works because double
distributes over max
... Which is pure coincidence. (The same is true with double . binTreeLeaves
.) If we replaced max
with something that didn't play as nicely with doubling... Well, let's define ourselves a new friend (that doesn't get along as well with the others). For a binary operators that double
doesn't distribute over, we'll use (*)
.
binTreeProdSize :: Tree a -> Nat
binTreeProdSize = catamorphism
Leaf _ -> 0
Branch -> \a b -> 1 + a*b
Let's try to establish sufficient conditions for two catamorphisms two compose. Clearly, any catamorphism will quite happily be composed with length
, double
and map f
because they yield their data structure without looking at the child results. For example, in the case of length
, you can just replace Succ
and Zero
with what ever you want and you have your new catamorphism.
- If the first catamorphism yields a data structure without looking at what happens to its children, two catamorphisms will compose into a catamorphism.
Beyond this, things become more complicated. Let's differentiate between normal constructor arguments and "recursive arguments" (which we will mark with a % sign). So Leaf a
has no recursive arguments, but Branch %a %b
does. Let's use the term "recursive-fixity" of a constructor to refer to the number of recursive arguments it has. (I've made up both these terms! I have no idea what proper terminology is, if there is one! Be wary of using them elsewhere!)
If the first catamorphism maps something into a zero recursive fixity constructor, everything is good!
a | b | cata(b.a)
===============================|=========================|================
F a %b %c .. -> Z | Z -> G a b .. | True
If we map children directly into a new constructor, we're also good.
a | b | cata(b.a)
===============================|=========================|=================
F a %b %c .. -> H %c %d .. | H %a %b -> G a b .. | True
If we map into a recursive fixity one constructor...
a | b | cata(b.a)
===============================|=========================|=================
F a %b %c .. -> A (f %b %c..) | A %a -> B (g %a) | Implied by g
| | distributes over f
But it isn't iff. For example, if there exist g1
g2
such that g (f a b..) = f (g1 a) (g2 b) ..
, that also works.
From here, the rules will just get messier, I expect.
h :: F1 A -> A
? – Cromagnon