Is it possible to create a type-level representation of generic ADTs?
Asked Answered
T

2

5

Using Church encoding, it is possible to represent any arbitrary algebraic datatype without using the built-in ADT system. For example, Nat can be represented (example in Idris) as:

-- Original type

data Nat : Type where
    natSucc : Nat -> Nat
    natZero : Nat

-- Lambda encoded representation

Nat : Type
Nat = (Nat : Type) -> (Nat -> Nat) -> Nat -> Nat

natSucc : Nat -> Nat
natSucc pred n succ zero = succ (pred n succ zero)

natZero : Nat
natZero n succ zero = zero

Pair can be represented as:

-- Original type
data Pair_ : (a : Type) -> (b : Type) -> Type where
    mkPair_ : (x:a) -> (y:b) -> Pair_ a b

-- Lambda encoded representation

Par : Type -> Type -> Type
Par a b = (t:Type) -> (a -> b -> t) -> t

pair : (ta : Type) -> (tb : Type) -> (a:ta) -> (b:tb) -> Par ta tb
pair ta tb a b k t = t a b

fst : (ta:Type) -> (tb:Type) -> Par ta tb -> ta
fst ta tb pair = pair ta (\ a, b => a)

snd : (ta:Type) -> (tb:Type) -> Par ta tb -> tb
snd ta tb pair = pair tb (\ a, b => b)

And so on. Now, writing those types, constructors, matchers is a very mechanical task. My question is: would it be possible to represent an ADT as a specification on the type level, then derive the types themselves (i.e., Nat/Par), as well as the constructors/destructors automatically from those specifications? Similarly, could we use those specifications to derive generics? Example:

NAT : ADT
NAT = ... some type level expression ...

Nat : Type
Nat = DeriveType NAT

natSucc : ConstructorType 0 NAT
natSucc = Constructor 0 NAT

natZero : ConstructorType 1 NAT
natZero = Constructor 1 NAT

natEq : EqType NAT
natEq = Eq NAT

natShow : ShowType NAT
natShow = Show NAT

... and so on
Tips answered 28/6, 2016 at 12:54 Comment(4)
Note that, as far as I know, these Church-encodings lack dependent eliminations. E.g. if Bool = (A : Type) -> A -> A -> A, and tru,fls are defined accordingly, you can't prove (b: Bool) -> (b = tru \/ b = fls), while you could with inductive types.Etruria
You can do that for a quite more expressive type system: full dependent types + inductive families. Derived Nat then is an eliminator Nat = (P : Nat -> Type) -> (forall n. P n -> P (succ n)) -> P 0 -> forall n. P n. I wrote a blog post about this. EqType is derivable too (an example). How much types do you want to cover? Just System F types and no GADTs?Torbert
@chi: indeed, this seems to be a source proving your claim: Induction Is Not Derivable in Second Order Dependent Type TheoryManager
@Cactus, you can recover eliminators of Church-encoded data types using internalized parametricity. No references, though.Torbert
T
6

Indexed descriptions are not harder than polynomial functors. Consider this simple form of propositional descriptions:

data Desc (I : Set) : Set₁ where
  ret : I -> Desc I
  π   : (A : Set) -> (A -> Desc I) -> Desc I
  _⊕_ : Desc I -> Desc I -> Desc I
  ind : I -> Desc I -> Desc I

π is like Emb followed by |*|, but it allows the rest of a description depend on a value of type A. _⊕_ is the same thing as |+|. ind is like Rec followed by |*|, but it also receives the index of a future subterm. ret finishes a description and specifies the index of a constructed term. Here is an immediate example:

vec : Set -> Desc ℕ
vec A = ret 0
      ⊕ π ℕ λ n -> π A λ _ -> ind n $ ret (suc n)

The first constructor of vec doesn't contain any data and constructs a vector of length 0, hence we put ret 0. The second constructor receives the length (n) of a subvector, some element of type A and a subvector and it constructs a vector of length suc n.

Constructing fixed points of descriptions is similar to those of polynomial functors too

⟦_⟧ : ∀ {I} -> Desc I -> (I -> Set) -> I -> Set
⟦ ret i   ⟧ B j = i ≡ j
⟦ π A D   ⟧ B j = ∃ λ x -> ⟦ D x ⟧ B j
⟦ D ⊕ E   ⟧ B j = ⟦ D ⟧ B j ⊎ ⟦ E ⟧ B j
⟦ ind i D ⟧ B j = B i × ⟦ D ⟧ B j

data μ {I} (D : Desc I) j : Set where
  node : ⟦ D ⟧ (μ D) j -> μ D j

Vec is simply

Vec : Set -> ℕ -> Set
Vec A = μ (vec A)

Previously it was adt Rec t = t, but now terms are indexed, hence t is indexed too (it's called B above). ind i D carries the index of a subterm i to which μ D is applied then. So when interpreting the second constructor of vectors, Vec A is applied to the length of a subvector n (from ind n $ ...), thus a subterm has type Vec A n.

In the final ret i case it's required that a constructed term has the same index (i) as expected (j).

Deriving eliminators for such data types is slightly more complicated:

Elim : ∀ {I B} -> (∀ {i} -> B i -> Set) -> (D : Desc I) -> (∀ {j} -> ⟦ D ⟧ B j -> B j) -> Set
Elim C (ret i)   k = C (k refl)
Elim C (π A D)   k = ∀ x -> Elim C (D x) (k ∘ _,_ x)
Elim C (D ⊕ E)   k = Elim C D (k ∘ inj₁) × Elim C E (k ∘ inj₂)
Elim C (ind i D) k = ∀ {y} -> C y -> Elim C D (k ∘ _,_ y)

module _ {I} {D₀ : Desc I} (P : ∀ {j} -> μ D₀ j -> Set) (f₀ : Elim P D₀ node) where
  mutual
    elimSem : ∀ {j}
            -> (D : Desc I) {k : ∀ {j} -> ⟦ D ⟧ (μ D₀) j -> μ D₀ j}
            -> Elim P D k
            -> (e : ⟦ D ⟧ (μ D₀) j)
            -> P (k e)
    elimSem (ret i)    z       refl    = z
    elimSem (π A D)    f      (x , e)  = elimSem (D x) (f  x) e
    elimSem (D ⊕ E)   (f , g) (inj₁ x) = elimSem D f x
    elimSem (D ⊕ E)   (f , g) (inj₂ y) = elimSem E g y
    elimSem (ind i D)  f      (d , e)  = elimSem D (f (elim d)) e

    elim : ∀ {j} -> (d : μ D₀ j) -> P d
    elim (node e) = elimSem D₀ f₀ e

I elaborated the details elsewhere.

It can be used like this:

elimVec : ∀ {n A}
        -> (P : ∀ {n} -> Vec A n -> Set)
        -> (∀ {n} x {xs : Vec A n} -> P xs -> P (x ∷ xs))
        -> P []
        -> (xs : Vec A n)
        -> P xs
elimVec P f z = elim P (z , λ _ -> f)

Deriving decidable equality is more verbose, but not harder: it's just a matter of requiring that every Set which π receives has decidable equality. If all non-recursive content of your data type has decidable equality, then your data type has it as well.

The code.

Torbert answered 30/6, 2016 at 8:24 Comment(0)
M
3

To get you started, here's some Idris code that represents polynomial functors:

infix 10 |+|
infix 10 |*|

data Functor : Type where
  Rec : Functor
  Emb : Type -> Functor
  (|+|) : Functor -> Functor -> Functor
  (|*|) : Functor -> Functor -> Functor

LIST : Type -> Functor
LIST a = Emb Unit |+| (Emb a |*| Rec)

TUPLE2 : Type -> Type -> Functor
TUPLE2 a b = Emb a |*| Emb b

NAT : Functor
NAT = Rec |+| Emb Unit

Here's a data-based interpretation of their fixed points (see e.g. 3.2 in http://www.cse.chalmers.se/~ulfn/papers/afp08/tutorial.pdf for further details)

adt : Functor -> Type -> Type
adt Rec t = t
adt (Emb a) _ = a
adt (f |+| g) t = Either (adt f t) (adt g t)
adt (f |*| g) t = (adt f t, adt g t)

data Mu : (F : Functor) -> Type where
  Fix : {F : Functor} -> adt F (Mu F) -> Mu F

and here's a Church representation-based interpretation:

Church : Functor -> Type
Church f = (t : Type) -> go f t t
  where
    go : Functor -> Type -> (Type -> Type)
    go Rec t = \r => t -> r
    go (Emb a) t = \r => a -> r
    go (f |+| g) t = \r => go f t r -> go g t r -> r
    go (f |*| g) t = go f t . go g t

So we can do e.g.

-- Need the prime ticks because otherwise clashes with Nat, zero, succ from the Prelude...
Nat' : Type
Nat' = Mu NAT

zero' : Nat'
zero' = Fix (Right ())

succ' : Nat' -> Nat'
succ' n = Fix (Left n)

but also

zeroC : Church NAT
zeroC n succ zero = (zero ())

succC : Church NAT -> Church NAT
succC pred n succ zero = succ (pred n succ zero)
Manager answered 29/6, 2016 at 3:1 Comment(3)
Can this be extended beyond fixpoints of polynomial functors? eg. using descriptions?Stooge
Insightful and much simpler than I thought. I second I'd love this to be expanded to explain descriptions in simple terms.Tips
@Benjamin Hodgson, I added a descriptions-related answer.Torbert

© 2022 - 2024 — McMap. All rights reserved.