How to specify the type for a heterogenous collection in a GADT formulated AST?
Asked Answered
N

1

6

I'd like to make a typed AST for a dynamic language. At present, I'm stuck on handling collections. Here's a representative code sample:

{-# LANGUAGE GADTs #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ExistentialQuantification #-}

data Box = forall s. B s

data BinOp = Add | Sub | Mul | Div
             deriving (Eq, Show)

data Flag = Empty | NonEmpty

data List :: Flag -> * -> * where
    Nil :: List Empty a
    Cons :: a -> List f a -> List NonEmpty a

data Expr ty where
    EInt :: Integer -> Expr Integer
    EDouble :: Double -> Expr Double
--    EList :: List -> Expr List

While I can construct instances of List well enough:

*Main> :t (Cons (B (EInt 1)) (Cons (B (EDouble 2.0)) Nil))
(Cons (B (EInt 1)) (Cons (B (EDouble 2.0)) Nil))
  :: List Box 'NonEmpty

I'm not at all sure how to encode this type in Expr for EList. Am I even on the right path here?

Nicknickel answered 14/12, 2015 at 4:21 Comment(2)
You have no heterogeneous collections - the type Box is useless. You cannot do anything with the value inside, other than call seq on it I guess.. It seems this question is not about heterogeneous collections in Haskell per se, but modelling a type system in Haskell which supports heterogeneous collections. However, if you mean "dynamic" as in "dynamically typed", then it doesn't make sense for the representation of a dynamically typed language to statically typed in the meta-language (Haskell).Wolverhampton
Possible duplicate of List of any DataKind in GADT.Saarinen
A
6

One way to approach this problem is to tag values with run-time type representatives. I'm channelling Stephanie Weirich, here. Let's have a small example. First, give a representation to some types. That's typically done with a singleton construction.

data Type :: * -> * where
  Int   :: Type Int
  Char  :: Type Char
  List  :: Type x -> Type [x]

So Type Int contains one value, which I've also called Int, because it acts as the run-time representative of the type Int. If you can see colour even in monochrome things, the Int left of the :: is red, and the Int after Type is blue.

Now we can do existential packaging, preserving utility.

data Cell :: * where
 (:::) :: x -> Type x -> Cell

A Cell is a value tagged with a run-time representative of its type. You can recover the utility of the value by reading its type tag. Indeed, as types are first-order structures, we can check them for equality in a useful way.

data EQ :: k -> k -> * where
  Refl :: EQ x x

typeEQ :: Type x -> Type y -> Maybe (EQ x y)
typeEQ Int Int = Just Refl
typeEQ Char Char = Just Refl
typeEQ (List s) (List t) = case typeEQ s t of
  Just Refl -> Just Refl
  Nothing -> Nothing
typeEQ _ _ = Nothing

A Boolean equality on type representatives is no use: we need the equality test to construct the evidence that the represented types can be unified. With the evidence-producing test, we can write

gimme :: Type x -> Cell -> Maybe x
gimme t (x ::: s) = case typeEQ s t of
  Just Refl -> Just x
  Nothing   -> Nothing

Of course, writing the type tags is a nuisance. But why keep a dog and bark yourself?

class TypeMe x where
  myType :: Type x

instance TypeMe Int where
  myType = Int

instance TypeMe Char where
  myType = Char

instance TypeMe x => TypeMe [x] where
  myType = List myType

cell :: TypeMe x => x -> Cell
cell x = x ::: myType

And now we can do things like

myCells :: [Cell]
myCells = [cell (length "foo"), cell "foo"]

and then get

> gimme Int (head myCells)
Just 3

Of course, it would all be so much tidier if we didn't have to do the singleton construction and could just pattern-match on such types as we might choose to retain at run-time. I expect we'll get there when the mythical pi quantifier becomes less mythical.

Aminoplast answered 14/12, 2015 at 14:28 Comment(2)
Since there is a dog anyway, we can write gimme :: TypeMe t => Cell -> Maybe t.Saarinen
@user3237465 Yes. Of course both explicit Type t -> and implicit TypeMe t => versions exist. It's not obvious to me that implicit is better than explicit in this respect. But it's certainly good to be light on our feet when choosing the variant we want. It's often good to write an explicit worker, then expose an implicit wrapper.Aminoplast

© 2022 - 2024 — McMap. All rights reserved.