Is it possible to write a data structure or data structures that represent only closed terms in Haskell or any other language?
Asked Answered
H

3

7

Using the De Bruijn notation, it is possible to define lambda terms as:

data BTerm = BVar Int | BLam BTerm | BApp BTerm BTerm

Or using the usual notation, data Term = Var String | Lam String Term | App Term Term

These two datatypes allow construction of both closed terms and terms containing free variables.

Is it possible to define a datatype that allows construction of only closed terms. ie only the terms such as: \x.x, \x. x x, \x.\y. x y, \x.\y. y, \x.\y.\z.z(x y)

Hangover answered 29/5, 2021 at 6:8 Comment(1)
You may be interested in Richard Eisenberg's Glambda, a de Bruijn-based typed lambda calculus using GADTs. There's also a related talk with Glambda around 1:02:03. In his implementation, the type Exp includes a type parameter representing the expression's typing context, so an Exp '[] t with an empty typing context represents a closed term (with t the type of the term itself). As a consequence, eval :: Exp '[] t -> Val t never has to worry about encountering an undefined variable.Radiosurgery
E
7

You can use GADTs to force the list of free variables to be empty. The free variables can be kept in a type-level list. Below, I chose to use De Bruijn indices to represent variables.

We start by defining how to append two type level lists:

{-# LANGUAGE KindSignatures, DataKinds, TypeFamilies, TypeOperators,
    GADTs, ScopedTypeVariables, TypeApplications #-}
{-# OPTIONS -Wall #-}

import GHC.TypeLits
import Data.Proxy

-- Type level lists append
type family (xs :: [Nat]) ++ (ys :: [Nat]) :: [Nat] where
   '[]       ++ ys = ys
   (x ': xs) ++ ys = x ': (xs ++ ys)

Then we compute the free variables of \ t given those of t.

-- Adjust Debuijn indices under a lambda:
-- remove zeros, decrement positives
type family Lambda (xs :: [Nat]) where
   Lambda '[]       = '[]
   Lambda (0 ': xs) = Lambda xs
   Lambda (x ': xs) = x-1 ': Lambda xs

Finally our GADT:

-- "BTerm free" represents a lambda term with free variables "free"
data BTerm (free :: [Nat]) where
   BVar :: KnownNat n => BTerm '[n]
   BLam :: BTerm free -> BTerm (Lambda free)
   BApp :: BTerm free1 -> BTerm free2 -> BTerm (free1 ++ free2)

A type for closed terms is now trivial to define:

-- Closed terms have no free variables
type Closed = BTerm '[]

We are done. Let's write some tests. We start from a Show instance to be able to actually print the terms.

showBVar :: forall n. KnownNat n => BTerm '[n] -> String
showBVar _ = "var" ++ show (natVal (Proxy @n))

instance Show (BTerm free) where
   show t@BVar       = showBVar t
   show (BLam t)     = "\\ " ++ show t
   show (BApp t1 t2) = "(" ++ show t1 ++ ")(" ++ show t2 ++ ")"

And here's a couple of tests:

-- \x. \y. \z. z (x y)
-- Output: \ \ \ (var0)((var2)(var1))
test1 :: Closed
test1 = BLam (BLam (BLam (BApp z (BApp x y))))
   where
   z = BVar @0
   y = BVar @1
   x = BVar @2

-- \x. \y. x y (\z. z (x y))
-- Output: \ \ ((var1)(var0))(\ (var0)((var2)(var1)))
test2 :: Closed
test2 = BLam (BLam (BApp (BApp x' y') (BLam (BApp z (BApp x y)))))
   where
   z  = BVar @0
   y  = BVar @1
   x  = BVar @2
   y' = BVar @0
   x' = BVar @1
Excommunicative answered 29/5, 2021 at 9:20 Comment(3)
Proxy, really? Why not BVar :: KnownNat n => BTerm '[n] and then z = BVar @0?Fleurdelis
@Fleurdelis Removed :) I hate proxies, but I had some issues in show BVar = ... since I can't easily catch the n. I had to use an external helper function. Do you know if it's possible to do without the helper?Excommunicative
Good question, I don't see a way to avoid an explicitly quantified definition there. But it's not really a big deal to do that just once for the purpose of show. (Speaking of which... show supposed to produce syntactically valid Haskell??)Fleurdelis
R
4

If you want to encode it in some form that looks close to arbitrary lambda expressions, I suspect it will require some type-level programming to track the current lambda depth. This will make it hard to combine terms without knowing at compile time what types those terms have.

But if you don't mind something equivalent that looks quite different, it is well known that SKI combinators are equivalent to lambda calculus. And since SKI offers no explicit lambdas or indeed variable references at all, there's no way to encode a non-closed term.

Reluct answered 29/5, 2021 at 6:48 Comment(0)
A
4

Sure, this is even implemented as a library in bound, which is a generalisation of a representation of De Bruijn indices using a (polymorphically) recursive data type:

data Var bound free
  = Bound bound
  | Free free

newtype Scope bound term a
  = Scope { runScope :: term (Var bound (term a)) }

data Term var
  = Var var
  | App (Term var) (Term var)
  | Lam (Scope () Term var)
  -- == Lam (Term (Var () (Term var)))

The type of terms is indexed by the type of variables in those terms, both bound and free. When defining a new nested scope in a lambda term, we change that type: for bound variables, we just add another level of nesting, here annotated with () to give plain De Bruijn indices; for free variables, we just pass the type along, but also add a level of nesting with Term. The nesting at the type level reflects the number of De Bruijn levels that indices may refer to.

Now Term Void is the type of a term with no free variables; it may still have bound variables, by virtue of the fact that the recursion is polymorphic: Lam (Scope (Var (Bound ()))) :: Term Void represents (λx. x).

This method works in plain Haskell 98 (i.e. without GADTs), although there are advantages to adding some fancy types, e.g. I regularly use them for typed ASTs and statically typed typecheckers. Ed Kmett gave a nice overview of the design of bound at School of Haskell. There are related libraries in this space such as unbound and unbound-generics.

Arminius answered 29/5, 2021 at 19:9 Comment(1)
Interesting. Compared to the GADT solution, this (roughly put) computes the types in the opposite direction. Instead of having constructors K :: T a -> T b -> T (F a b) we have K :: T (Fa x) -> T (Fb x) -> T x which only requires polymorphic recursion, and no GADTs. The GADT solution looks easier to use in construction, e.g. App t1 t2 works even if t1 and t2 do not share the same free vars, while this solution requires a weakening. I guess it's the other way around on elimination.Excommunicative

© 2022 - 2024 — McMap. All rights reserved.