In unextended Haskell, the declaration
data A = B
defines two new entities, one each at the computation and type level:
- At the type level, a new base type named
A
(of kind *
), and
- At the computation level, a new base computation named
B
(of type A
).
When you turn on DataKinds
, the declaration
data A = B
now defines four new entities, one at the computation level, two at the type level, and one at the kind level:
- At the kind level, a new base kind
A
,
- At the type level, a new base type
'B
(of kind A
),
- At the type level, a new base type
A
(of kind *
), and
- At the computation level, a new base computation
B
(of type A
).
This is a strict superset of what we had before: the old (1) is now (3), and the old (2) is now (4).
These new entities explain the following interactions you described:
:type Zero
Zero :: Nat
:kind 'Zero
'Zero :: Nat
:type 'Zero
Syntax error on 'Zero
I think it's clear how it explains the first two. The last one it explains because 'Zero
is a type-level thing -- you can't ask for the type of a type, only the type of a computation!
Now, in Haskell, every place where a name occurs, it is clear from the surrounding syntax whether that name is intended to be a computation-level name, a type-level name, or a kind-level name. For this reason, it is somewhat annoying to have to include the tick mark in 'B
at the type level -- after all, the compiler knows we're at the type level and therefore can't be referring to the unlifted computation-level B
. So for convenience, you are permitted to leave off the tick mark when that is unambiguous.
From now on, I will distinguish between the "back end" -- where there are only the four entities described above and which are always unambiguous -- and the "surface syntax", which is the stuff you type into a file and pass off to GHC that includes ambiguity but is more convenient. Using this terminology, in the surface syntax, one may write the following things, with the following meanings:
Surface syntax Level Back end
Name computation Name
Name type Name if that exists; 'Name otherwise
'Name type 'Name
Name kind Name
---- all other combinations ---- error
This explains the first interaction you had (and the only one left unexplained above):
:kind Zero
Zero :: Nat
Because :kind
must be applied to a type-level thing, the compiler knows the surface syntax name Zero
must be a type-level thing. Since there is no type-level back end name Zero
, it tries 'Zero
instead, and gets a hit.
How can it be ambiguous? Well, notice above that we defined two new entities at the type level with one declaration. For simplicity of introduction, I named the new entities on the left- and right-hand side of the equation different things. But let's see what happens if we just tweak the declaration slightly:
data A = A
We still introduce four new back end entities:
- Kind
A
,
- Type
'A
(of kind A
),
- Type
A
(of kind *
), and
- Computation
A
(of type A
).
Whoops! There is now both an 'A
and an A
at the type level. If you leave off the tick mark in the surface syntax, it will use (3), and not (2) -- and you can explicitly choose (2) with the surface syntax 'A
.
What's more, this doesn't have to happen all from a single declaration. One declaration may produce the ticked version and another the non-ticked version; for example
data A = B
data C = A
will introduce a type-level back end name A
from the first declaration and a type-level back end name 'A
from the second declaration.
:kind 'Zero
instead of:kind Zero
in type-level? Anytime the nameZero
appear in type-level, compiler replace it with'Zero
? – Glutenous