Why doesn't your original code work?
This type cannot work:
translate :: BaseExpr -> Expr a
Remember what type variables mean in Haskell. This type for translate
is saying "for any type a
the caller of translate
chooses, it will take a BaseExpr
and turn it into an Expr a
".
So if I want to pass BaseExprVar "variableName"
to translate
and receive an Expr Nest
, I should be able to do that; I'll just choose a
to be Nest
. It doesn't matter that this particular BaseExpr
is just a variable, not a nested expression, and there is no value of type Expr Nest
that faithfully represents this. With a polymorphic GADT as a return type, that's not my problem, it's translate
's problem, since the type for translate
promised to be able to come up with such a value. translate
can't force me to choose Var
for a
.
That isn't the type you want. You don't want the type of Expr
that is returned to be chosen by the caller of translate
, you want it to be chosen by the implementation of translate
(so it can return whatever type is appropriate to the contents of the BaseExpr
).
To have the implementation of a function choose a type parameter, you need to either use an existential wrapper, or use higher-rank types and continuation passing.
Existential wrapper
data SomeExpr
where SomeExpr :: Expr a -> SomeExpr
translate :: BaseExpr -> SomeExpr
translate (BaseExprVar id) = Some (ExprVar id)
translate (BaseExprNest expr)
= case translate expr of
Some e -> Some (ExprNest e)
Notice that the type parameter of Expr
does not appear in the type for translate
, so a caller of translate
doesn't need to specify what it should be.
To use the output of translate
, you will need to pattern match on the Some
constructor, which will give you a value of type Expr a
for some unknown a
(as opposed to the original attempt at translate
, which gives you a value of type Expr a
for some particular a
that you choose). Inside the pattern match, you therefore have to handle any possible value of a
, and the resulting value cannot have a type that depends on a
(we're only allowed to "see" it inside the pattern match).
You can see an example of this above in the recursive call; translate expr
returned Some Expr
, but what we needed to wrap up in the ExprNest
constructor is Expr a
. By pattern matching on Some
, we get the contained Expr a
, which we can then put inside the ExprNest
constructor to get a value of type Expr Nest
. Then we hide that back in a Some
constructor, and finally have the right type to return from translate
.
You can of course also generalise the existential wrapper so that it works on more types than just Expr
, as shown in @Iceland_jack's answer. There's nothing special to the form I've used here; it is itself just another use of GADT syntax (to embed a value with a type parameter that isn't exposed in the overall type). A general one is nice, but often you'll want some type class constraints on the hidden type so that you have more things you can do with the value when you pattern match on it, so you'll need to get more complicated to make a type general enough to reuse any time you need an existential wrapper.
Higher-rank types
The other way to handle the requirement for translate
to return different types based on the data it is processing, is to use higher rank types. For this, you'll need the {-# LANGUAGE RankNTypes #-}
extension.
translate :: BaseExpr -> (forall a. Expr a -> r) -> r
translate (BaseExprVar id) handler = handler (ExprVar id)
translate (BaseExprNest expr) handler
= handler (translate expr ExprNest)
Here, instead of using an extra wrapper type to "hide" the type variable, we restructure translate
so that it takes a "handler function" for Expr
s. Now translate
doesn't return any form of Expr
at all, rather it makes one, passes it to handler
, and returns what the handler returns.
Why does this work? The trick is that handler
is required to be passed to translate
as a polymorphic function. The nested forall a.
inside the type of one of translate
's arguments means that the caller doesn't choose a
like it normally would, instead the caller is required to pass a function that works for any possible a
. The code of the handler function is similar to the "inside of the pattern match" in the existential wrapper version; it has to handle any possible a
and can't return anything whose type depends on a
.
Because the caller had to pass a handler
that works for any a
, the code for translate
can use it at any type it likes, including at different types on different branches. So the first case for translate
can pick Var
and use handler
as Expr Var -> r
, while the second case can pick Nest
and use handler as Expr Nest -> r
.
Note a subtlety in the second case. Now when translate
needs to recursively call itself, it needs to pass a "handler function". We could try passing the handler we received in from the top, but that's not right. We're supposed to be calling that handler only once on the complete Expr
, not calling it at every level of the BaseExpr
. And besides, that would only give us an r
, which is not what we need to wrap up in the ExprNest
constructor. Instead we need to realise that ExprNest
is itself a function of type forall a. Expr a -> Expr Nest
, which fits with the type we need for handler functions! Translating the nested expression with the ExprNest
constructor as the handler will translate whatever is in it, and then feed it to the constructor, returning us the complete Expr Nest
value we need to feed to the top level handler.
In this particular example the higher rank version ended up using less boilerplate than the existential wrapper version. That's often the case, but a possible downside is it makes you restructure some of your calling code into continuation passing form; you can't just call translate
and then use the resulting value, you have to construct a function for what you would do with a translated expression if you had one, and then pass that function to translate
. Sometimes that coding style is harder to read/write.
translate :: BaseExpr -> exists a. Expr a
:) – Isreal