Or to be specific, why do we use foldr to encode lists and iteration to encode numbers?
Sorry for the longwinded introduction, but I don't really know how to name the things I want to ask about so I'll need to give some exposition first. This draws heavily from this C.A.McCann post that just not quite satisfies my curiosity and I'll also be handwaving the issues with rank-n-types and infinite lazy things.
One way to encode datatypes as functions is to create a "pattern matching" function that receives one argument for each case, each argument being a function that receives the values corresponding to that constructor and all arguments returning a same result type.
This all works out as expected for non-recursive types
--encoding data Bool = true | False
type Bool r = r -> r -> r
true :: Bool r
true = \ct cf -> ct
false :: Bool r
false = \ct cf -> cf
--encoding data Either a b = Left a | Right b
type Either a b r = (a -> r) -> (b -> r) -> r
left :: a -> Either a b r
left x = \cl cr -> cl x
right :: b -> Either a b r
right y = \cl cr -> cr y
However, the nice analogy with pattern matching breaks down with recursive types. We might be tempted to do something like
--encoding data Nat = Z | S Nat
type RecNat r = r -> (RecNat -> r) -> r
zero = \cz cs -> cz
succ n = \cz cs -> cs n
-- encoding data List a = Nil | Cons a (List a)
type RecListType a r = r -> (a -> RecListType -> r) -> r
nil = \cnil ccons -> cnil
cons x xs = \cnil ccons -> ccons x xs
but we can't write those recursive type definitions in Haskell! The usual solution is to force the callback of the cons/succ case to be applied to all levels of recursion instead of just the first one (ie, writing a fold/iterator). In this version we use the return type r
where the recursive type would be:
--encoding data Nat = Z | S Nat
type Nat r = r -> (r -> r) -> r
zero = \cz cf -> cz
succ n = \cz cf -> cf (n cz cf)
-- encoding data List a = Nil | Cons a (List a)
type recListType a r = r -> (a -> r -> r) -> r
nil = \z f -> z
cons x xs = \z f -> f x (xs z f)
While this version works, it makes defining some functions much harder. For example, writing a "tail" function for lists or a "predecessor" function for numbers is trivial if you can use pattern matching but gets tricky if you need to use the folds instead.
So onto my real questions:
How can we be sure that the encoding using folds is as powerful as the hypothetical "pattern matching encoding"? Is there a way to take an arbitrary function definition via pattern matching and mechanically convert it to one using only folds instead? (If so, this would also help make tricky definitions such as tail or foldl in terms of foldr as less magical)
Why doesn't the Haskell type system allow for the recursive types needed in the "pattern matching" encoding?. Is there a reason for only allowing recursive types in datatypes defined via
data
? Is pattern matching the only way to consume recursive algebraic datatypes directly? Does it have to do with the type inferencing algorithm?
O(1)
(I think that's even more efficient than the binary version computers usually use.) – Benzol