Can someone explain the type syntax used in this OCaml program?
Asked Answered
E

2

5

The types below are taken from this question

(* contains an error, later fixed by the OP *)
type _ task =
| Success : 'a -> 'a task
| Fail : 'a -> 'a task
| Binding : (('a task -> unit) -> unit) -> 'a task
| AndThen : ('a -> 'b task) * 'a task -> 'b task
| OnError : ('a -> 'b task) * 'a task -> 'b task

type _ stack =
| NoStack : 'a stack
| AndThenStack : ('a -> 'b task) * 'b stack -> 'a stack
| OnErrorStack : ('a -> 'b task) * 'b stack -> 'a stack

type 'a process = 
{ root: 'a task 
; stack: 'a stack 
}

I'm relatively new to OCaml, but I've never seen : syntax used like that before. For example, I've seen polymorphic types defined like this, using the of syntax

type 'a expr =
    | Base  of 'a
    | Const of bool
    | And   of 'a expr list
    | Or    of 'a expr list
    | Not   of 'a expr

In the original question, It's not obvious to me how the variants are even constructed, since it looks like each one does not accept an argument. Take this simplified example

type 'a stack =
  | Foo : int stack
  | Bar : string stack
;;
type 'a stack = Foo : int stack | Bar : string stack

Try making an int stack using Foo

Foo 5;;
Error: The constructor Foo expects 0 argument(s),
       but is applied here to 1 argument(s)

However, without an argument

Foo;;
- : int stack = Foo

Ok, but where's the int? How to store data in this type?

In the OP's program below, he/she is matching on the types "normally", eg Success value -> ... or Fail value -> .... Again, how is this value constructed if the variant constructor doesn't accept an argument?

let rec loop : 'a. 'a process -> unit = fun proc ->
match proc.root with
| Success value -> 
    let rec step = function
    | NoStack -> ()
    | AndThenStack (callback, rest) -> loop {proc with root = callback value; stack = rest }
    | OnErrorStack (_callback, rest) -> step rest  <-- ERROR HERE
    in
    step proc.stack
| Fail value -> 
    let rec step = function
    | NoStack -> ()
    | AndThenStack (_callback, rest) -> step rest
    | OnErrorStack (callback, rest) -> loop {proc with root = callback value; stack = rest }
    in
    step proc.stack
| Binding callback -> callback (fun task -> loop {proc with root = task} )
| AndThen (callback, task) -> loop {root = task; stack = AndThenStack (callback, proc.stack)}
| OnError (callback, task) -> loop {root = task; stack = OnErrorStack (callback, proc.stack)}

Can someone help me filling in my knowledge gap?

Equatorial answered 29/5, 2018 at 19:41 Comment(3)
The constructors that you've defined don't accept values, but the ones in the original program do. For a constructor to accept arguments, it should have a function type (such as Foo : int -> int stack or even Foo : int * int stack -> int stack).Bustos
Interesting. Can you explain the syntax in more general terms? What is the feature called? Can you infer why the OP would be using this instead of the of form?Equatorial
In general these things are called GADTs (generalized algebraic data types). I am not aware of the good canonical tutorial introduction, but mads-hartmann.com/ocaml/2015/01/05/gadt-ocaml.html and blog.janestreet.com/why-gadts-matter-for-performance are both interesting reads.Ascanius
D
6

Those types are Generalized Algebraic Data Types. Also known as GADTs . GADTs make it possible to refine the relationship between constructors and types.

In your example, GADTs are used as a way to introduce existentially quantified type: removing irrelevant constructors, one could write

type 'a task =
| Done of 'a
| AndThen : ('a -> 'b task) * 'a task -> 'b task

Here, AndThen is a constructor that takes two arguments: a callback of type 'a -> 'b task and a 'a task and returns a task of type 'b task. A striking feature of this definition is that the type variable 'a only appears inside the arguments of the constructor. A natural question is then if I have a value AndThen(f,t): 'a task, what is the type of f?

And the answer is that the type of f is partially unknown, I only know that there is a type ty such that both f: ty -> 'a task and t: ty. But at this point all information beyond the mere existence of ty has been lost. For this reason, the type ty is called an existentially quantified type.

But here, this small information is still enough to manipulate meaningfully such value. I can define a function step

let rec step: type a. a task -> a task = function
| Done _ as x -> x
| AndThen(f,Done x) -> f x
| AndThen(f, t) -> AndThen(f, step t)

which try to apply the function f in the constructor AndThen if possible, using the information than the constructor AndThen always stores pair of callback and task that are compatible.

For instance

let x: int task = Done 0
let f: int -> float task =  fun x -> Done (float_of_int (x + 1))
let y: float task = AndThen(f,x)
;; step y = Done 1.
Disquietude answered 29/5, 2018 at 21:41 Comment(0)
E
4

I think comments and @octachron's answer provide enough details, but I would like to show my two favorite examples of GADTs power.

First of all you can write function which will return (kind of) different types!

type _ expression = Int : int -> int expression
                  | Bool : bool -> bool expression

let evaluate : type t. t expression -> t = function
  | Int i -> i
  | Bool b -> b

type t. t expression -> t means that function take some expression of some type and return a value of that type.

# evaluate (Int 42);;
- : int = 42
# evaluate (Bool true);;
- : bool = true

Clearly you can't achieve this with simple algebraic types.

The second point is that with GADTs compiler has enough information about the value you pass to the function, so you can "throw away" some senseless branch in pattern matching:

let negation : bool expression -> bool = function
  | Bool b -> not b

Because of bool expression -> bool OCaml know that negation works only for bool so there is no warnings about the absence of Int i branch. And if you try to call it with Int i you will see a type error:

# negation (Int 42);;
Characters 9-17:
  negation (Int 42);;
       ^^^^^^^^

Error: This expression has type int expression but an expression was expected of type bool expression Type int is not compatible with type bool

With simple algebraic types this example will look like this:

let negation = function
  | Bool b -> not b
  | Int i -> failwith "not a bool"

You not only have one useless branch, but also if you accidentally pass Int 42 it will fail only at runtime.

Escurial answered 30/5, 2018 at 7:37 Comment(1)
These examples are great! GADT is a topic that can easily throw off anybody looking into them for the first time, but these can be turned into a great introduction!Macro

© 2022 - 2024 — McMap. All rights reserved.