Confused on DataKinds extension
Asked Answered
G

2

8

I learn type programming of Haskell from Basic Type Level Programming in Haskell but when it introduce DataKinds extension, there are something seems confusing from the example:

{-# LANGUAGE DataKinds #-}
data Nat = Zero | Succ Nat

Now, Nat is promoted to Kind, it is okay. But how about Zero and Succ?

I try to get some information from GHCi, so I type:

:kind Zero

it gives

Zero :: Nat

that is okay, Zero is a type has kind Nat, right? and I try:

:type Zero

it still gives:

Zero :: Nat

that means Zero has type Nat, that is impossible, since Nat is a kind not type, right? Does Nat is both a type and a kind ??

And other confusing thing is, above blog also mentioned that, when create Nat kind, there are two new types: 'Zero and 'Succ are created automatically. When I try it from GHCi again:

:kind 'Zero

gives

'Zero :: Nat

and

:type 'Zero

gives

 Syntax error on 'Zero

Okay, it proves that 'Zero is a type. but what is the purpose of creating 'Zero and 'Succ'??

Glutenous answered 17/12, 2018 at 9:52 Comment(0)
V
14

In unextended Haskell, the declaration

data A = B

defines two new entities, one each at the computation and type level:

  1. At the type level, a new base type named A (of kind *), and
  2. At the computation level, a new base computation named B (of type A).

When you turn on DataKinds, the declaration

data A = B

now defines four new entities, one at the computation level, two at the type level, and one at the kind level:

  1. At the kind level, a new base kind A,
  2. At the type level, a new base type 'B (of kind A),
  3. At the type level, a new base type A (of kind *), and
  4. At the computation level, a new base computation B (of type A).

This is a strict superset of what we had before: the old (1) is now (3), and the old (2) is now (4).

These new entities explain the following interactions you described:

:type Zero
Zero :: Nat
:kind 'Zero
'Zero :: Nat
:type 'Zero
Syntax error on 'Zero

I think it's clear how it explains the first two. The last one it explains because 'Zero is a type-level thing -- you can't ask for the type of a type, only the type of a computation!

Now, in Haskell, every place where a name occurs, it is clear from the surrounding syntax whether that name is intended to be a computation-level name, a type-level name, or a kind-level name. For this reason, it is somewhat annoying to have to include the tick mark in 'B at the type level -- after all, the compiler knows we're at the type level and therefore can't be referring to the unlifted computation-level B. So for convenience, you are permitted to leave off the tick mark when that is unambiguous.

From now on, I will distinguish between the "back end" -- where there are only the four entities described above and which are always unambiguous -- and the "surface syntax", which is the stuff you type into a file and pass off to GHC that includes ambiguity but is more convenient. Using this terminology, in the surface syntax, one may write the following things, with the following meanings:

Surface syntax    Level           Back end
Name              computation     Name
Name              type            Name if that exists; 'Name otherwise
'Name             type            'Name
Name              kind            Name
---- all other combinations ----  error

This explains the first interaction you had (and the only one left unexplained above):

:kind Zero
Zero :: Nat

Because :kind must be applied to a type-level thing, the compiler knows the surface syntax name Zero must be a type-level thing. Since there is no type-level back end name Zero, it tries 'Zero instead, and gets a hit.

How can it be ambiguous? Well, notice above that we defined two new entities at the type level with one declaration. For simplicity of introduction, I named the new entities on the left- and right-hand side of the equation different things. But let's see what happens if we just tweak the declaration slightly:

data A = A

We still introduce four new back end entities:

  1. Kind A,
  2. Type 'A (of kind A),
  3. Type A (of kind *), and
  4. Computation A (of type A).

Whoops! There is now both an 'A and an A at the type level. If you leave off the tick mark in the surface syntax, it will use (3), and not (2) -- and you can explicitly choose (2) with the surface syntax 'A.

What's more, this doesn't have to happen all from a single declaration. One declaration may produce the ticked version and another the non-ticked version; for example

data A = B
data C = A

will introduce a type-level back end name A from the first declaration and a type-level back end name 'A from the second declaration.

Vaginitis answered 17/12, 2018 at 13:5 Comment(4)
Thanks for you answer, Does the tricky here is that compiler tries :kind 'Zero instead of :kind Zero in type-level? Anytime the name Zero appear in type-level, compiler replace it with 'Zero?Glutenous
@Glutenous No, not any time, as explained in my answer: the compiler first checks if Zero exists, and only if it does not exist will it check whether 'Zero exists instead.Vaginitis
@Glutenous I have added some precision about this question, distinguishing between the back end, where names are always unambiguous, and the surface syntax that we use to communicate with the compiler. See especially the new ASCII-art table that begins "Surface syntax Level Back end". In this terminology, the surface syntax Zero may correspond to the back end Zero or 'Zero, depending on what declarations are in scope.Vaginitis
Thank you very much! it is clear and now I eventually understand why add data Zero declaration in leftaroundabout's answer will cause name clashes if doesn't use 'Zeor instead of Zero.Glutenous
C
5

{-# LANGUAGE DataKinds #-} does not change what the data keyword usually does: it still creates a type Nat and two value constructors Zero and Succ. What this extension does do is, it allows you to use such types as if they were kinds, and values as if they were types.

Thus, if you use Nat in a type-level expressions, it'll just use it as a boring Haskell98 type. Only if you use it in an unabiguously kind-level expression, do you get the kind version.

This automatic lifting could sometimes cause name clashes, which is, I think, the reason for the ' syntax:

{-# LANGUAGE DataKinds #-}
data Nat = Zero | Succ Nat
data Zero

Now, Zero by itself is the plain (empty) data type Zero :: *, so

*Main> :k Zero
Zero :: *

In principle, thanks to DataKinds, Zero is now also a type lifted from the value constructor Zero :: Nat, but this is shadowed by the data Zero. Thus the quoting-syntax, which always refers to lifted types, never to directly defined ones:

*Main> :k 'Zero
'Zero :: Nat
Coretta answered 17/12, 2018 at 10:13 Comment(7)
The clash is also particularly evident in (a,b) :: * vs '(a,b) :: (*, *). Ditto for [a] :: * vs '[a] :: [*] (even if these are special syntactic cases).Bismuth
@Coretta Thank for your answer, Does it mean that when the context need Nat as a type then Nat is a type and Zero and Succ are constructor. And when the context need Nat as a kind, Nat become a kind, and Zero and Succ become a type? Otherwise, I still don't understand why we need 'Zero. In what situation the clashes you mentioned in the answer will occur?Glutenous
@Glutenous Yes, everything you asserted is correct. But your question puzzles me; doesn't the answer already give an explicit example where name clashes can occur?Vaginitis
@DanielWagner I don't understand the example in answer why the name clashes.Glutenous
@Glutenous it clashes because I defined both a data type called Zero and a value constructor called Zero. In normal Haskell2010, these wouldn't clash because type-level language and value-level language are completely separate, but -XDataKinds “copies” the value-level Zero into the type-level language, so Zero in a type-level context is now ambiguous. To prevent DataKinds from breaking any existing code, they made it default to the Haskell2010 meaning, i.e. Zero :: *. Thus to also get access to the new meaning of promoted-to-type value-level constructor, we need the ' syntax.Coretta
@Glutenous I started writing up an explanation of the name clashes, but it turned out to be a bit long for a comment. So I've made an answer with my attempt at a careful explanation of what's going on here.Vaginitis
@Coretta Thank for your further explanation, I still unable to comprehend it, maybe I need spend much time to study it.Glutenous

© 2022 - 2024 — McMap. All rights reserved.