Below are alternative explanations with a varying amount of detail, depending on how much of a hurry you’re in. ;-)
I will use the following code (drawn from that other question) as a running example. Here, the type annotation on the definition of reduce
is actually required to make it typecheck.
(* The type [('a, 'c) fun_chain] represents a chainable list of functions, i.e.
* such that the output type of one function is the input type of the next one;
* ['a] and ['c] are the input and output types of the whole chain.
* This is implemented as a recursive GADT (generalized algebraic data type). *)
type (_, _) fun_chain =
| Nil : ('a, 'a) fun_chain
| Cons : ('a -> 'b) * ('b, 'c) fun_chain -> ('a, 'c) fun_chain
(* [reduce] reduces a chain to just one function by composing all
* functions of the chain. *)
let rec reduce : type a c. (a, c) fun_chain -> a -> c =
fun chain x ->
begin match chain with
| Nil -> x
| Cons (f, chain') -> reduce chain' (f x)
end
The short story
On let-definitions, an annotation like : 'a -> 'a
does not force polymorphism: the type-checker may refine the unification variable 'a
to something. This bit of syntax is misleading indeed, because the same annotation on a val-declaration i.e. in a module signature does enforce polymorphism.
: type a. …
is a type annotation with explicit (forced) polymorphism. You can think of this as the universal quantifier (∀ a, “for all a“). For instance,
let some : type a. a -> a option =
fun x -> Some x
means that “for all” type a, you can give an a to some
and then it will return an a option
.
The code at the beginning of this answer makes use of advanced features of the type system, namely, polymorphic recursion and branches with different types, and that leaves type inference at a loss. In order to have a program typecheck in such a situation, we need to force polymorphism like this. Beware that in this syntax, a
is a type name (no leading quote) rather than a type unification variable.
: 'a. …
is another syntax that forces polymorphism, but it is practically subsumed by : type a. …
so you will hardly need it at all.
The pragmatic story
: type a. …
is a short-hand syntax that combines two features:
- an explicitly polymorphic annotation
: 'a. …
- useful for ensuring a definition is as general as intended
- required when recursion is done with type parameters different from those of the initial call (“polymorphic recursion” i.e. recursion on “non-regular” ADTs)
- a locally abstract type
(type a) …
- required when different branches have different types (i.e. when pattern-matching on “generalized” ADTs)
- allows you to refer to type
a
from inside the definition, typically when building a first-class module (I won’t say more about this)
Here we use the combined syntax because our definition of reduce
falls under both situations in bold.
We have polymorphic recursion because Cons
builds a (a, c) fun_chain
from a (b, c) fun_chain
: the first type parameter differs (we say that fun_chain
is a “non-regular” ADT).
We have branches with different types because Nil
builds a (a, a) fun_chain
whereas Cons
builds a (a, c) fun_chain
(we say that fun_chain
is a “generalized” ADT, or GADT for short).
Just to be clear: : 'a. …
and : type a. …
produce the same signature for the definition. Choosing one syntax or the other only has an influence on how its body is typechecked. For most intents and purposes, you can forget about : 'a. …
and just remember the combined form : type a. …
. Alas, the latter does not completely subsume the former, there are rare situations where writing : type a. …
wouldn’t work and you would need : 'a. …
(see @octachron’s answer) but, hopefully, you won’t stumble upon them often.
The long story
Explicit polymorphism
OCaml type annotations have a dirty little secret: writing let f : 'a -> 'a = …
doesn’t force f
to be polymorphic in 'a
. The compiler unifies the provided annotation with the inferred type and is free to instantiate the type variable 'a
while doing so, leading to a less general type than intended. For instance let f : 'a -> 'a = fun x -> x+1
is an accepted program and leads to val f : int -> int
. To ensure the function is indeed polymorphic (i.e. to have the compiler reject the definition if it is not general enough), you have to make the polymorphism explicit, with the following syntax:
let f : 'a. 'a -> 'a = …
For a non-recursive definition, this is merely the human programmer adding a constraint which makes more programs be rejected.
In the case of a recursive definition however, this has another implication. When typechecking the body, the compiler will unify the provided type with the types of all occurrences of the function being defined. Type variables which are not marked as polymorphic will be made equal in all recursive calls. But polymorphic recursion is precisely when we recurse with differing type parameters; without explicit polymorphism, that would either fail or infer a less general type than intended. To make it work, we explicitly mark which type variables should be polymorphic.
Note that there is a good reason why OCaml cannot typecheck polymorphic recursion on its own: there is undecidability around the corner (see Wikipedia for references).
As an example, let’s do the job of the typechecker on this faulty definition, where polymorphism is not made explicit:
(* does not typecheck! *)
let rec reduce : ('a, 'c) fun_chain -> 'a -> 'c =
fun chain x ->
begin match chain with
| Nil -> x
| Cons (f, chain') -> reduce chain' (f x)
end
We start with reduce : ('a, 'c) fun_chain -> 'a -> 'c
and chain : ('a, 'c) fun_chain
for some type variables 'a
and 'c
.
In the first branch, chain = Nil
, so we learn that in fact chain : ('c, 'c) fun_chain
and 'a == 'c
. We unify both type variables. (That doesn’t matter right now, though.)
In the second branch, chain = Cons (f, chain')
so there exists an arbitrary type b
such that f : 'a -> b
and chain' : (b, 'c) fun_chain
. Then we must typecheck the recursive call reduce chain'
, so the expected argument type ('a, 'c) fun_chain
must unify with the provided argument type (b, 'c) fun_chain
; but nothing tells us that b == 'a
. So we reject this definition, preferably (as is tradition) with a cryptic error message:
Error: This expression has type ($Cons_'b, 'c) fun_chain
but an expression was expected of type ('c, 'c) fun_chain
The type constructor $Cons_'b would escape its scope
If now we make polymorphism explicit:
(* still does not typecheck! *)
let rec reduce : 'a 'c. ('a, 'c) fun_chain -> 'a -> 'c =
…
Then typechecking the recursive call is not a problem anymore, because we now know that reduce
is polymorphic with two “type parameters” (non-standard terminology), and these type parameters are instantiated independently at each occurrence of reduce
; the recursive call uses b
and 'c
even though the enclosing call uses 'a
and 'c
.
Locally abstract types
But we have a second problem: the other branch, for constructor Nil
, has made 'a
be unified with 'c
. Hence we end up inferring a less general type than what the annotation mandated, and we report an error:
Error: This definition has type 'c. ('c, 'c) fun_chain -> 'c -> 'c
which is less general than 'a 'c. ('a, 'c) fun_chain -> 'a -> 'c
The solution is to turn the type variables into locally abstract types, which cannot be unified (but we can still have type equations about them). That way, type equations are derived locally to each branch and they do not transpire outside of the match with
construct.