Variance and injectivity annotations give back some information about the relationship between an abstract type constructor type 'a t
and its argument. For instance, a type constructor could either
- produce or contain an
'a
:
type 'a t = 'a * 'a
type 'a t = 'a -> unit
- ignore its argument all together:
type 'a t = int
- contain a viewable and mutable reference to 'a
type 'a t = { get: unit -> 'a; store: 'a -> unit }
All those type constructors can be abstracted away with a signature to:
module Any: sig
type 'a t
end = struct
type 'a t = 'a * 'a
end
with this level of abstraction, we end up knowing nothing about 'a t
in the outside world. However, sometimes it is useful to know a little more than nothing.
For instance, if I have a producer type constructor 'a t
, for instance
type 'a t = 'a * 'a
, and two types that are a related by a subtyping relation, let's say type x = < x:int >
and type xy = <x:int; y:int >
t. I can deduce from xy <: x
that xy t <: y t
because it is fine to pretend that the stored object has fewer methods than the one that was really produced. Since the order of the relation <:
is preserved from
xy :> x
to xy t :> x t
, we say that the type constructor t
is covariant in its type parameter.
And I can expose the fact that the type constructor is covariant in an interface by adding a variance annotation:
type xy = <x:int; y:int>
type x = < x:int >
module Covariant: sig
type +'a t
val xy: xy t
end = struct
type +'a t = 'a * 'a
let xy = let c = object method x = 0 method y = 1 end in c, c
end
let xy = (Covariant.xy: xy Covariant.t :> x Covariant.t)
In a dual way, if I have a type that consumes an object with a method x
, for instance
type 'a t = 'a -> int
let x: x t = fun arg -> arg#x
it is fine to pretend that it requires in fact more methods. In other words, I can coerce a x t
into a xy t
reversing the relationship from xy:>x
to x t :> xy t
. And I can expose this information with a contravariant annotation
module Contravariant: sig
type -'a t
val x: x t
end = struct
type -'a t = 'a -> int
let x c = c#x
end
let xy = (Contravariant.x: x Contravariant.t :> xy Contravariant.t)
The use of +
and -
for contravariant reflects the rules that multiplying by a positive number preserve the ordinary order x < y
implies that 2 x <2 y
whereas multiplying by a negative number reverse the order:
x < y
implies that -2 y < -2x
.
Thus the variance annotation allows us to expose how the type constructor t
behaves with respect to subtyping.
For concrete type definitions, the typechecker will infer the variance by itself and there is nothing to do.
However, for abstract type constructor, it is the role of the author to expose (or not) the variance information.
This variance information is useful when dealing with objects, polymorphic variants or private types. But one may wonder if that matters much in OCaml, since objects are not used that much. And in fact, the main use for covariance annotation is relied to polymorphism and the value restriction.
The value restriction is a consequence of a destructive interference between polymorphism and mutability.
The simplest example would be a function that generates a pair of functions to store or get some value from
a reference
type 'a t = { get: unit -> 'a; store: 'a -> unit }
let init () =
let st = ref None in
let store x = st := Some x in
let get () = match !st with None -> raise Not_found | Some x -> x
in
{store; get}
Typically, I can use like this:
let st = init ()
let () = st.store 0
let y = st.get ()
However, what is the type of st
on the first line of the previous example. The type of
init
is unit -> 'a t
because I can store any kind of value in the generated reference.
However, the type of the concrete value st
cannot be polymorphic, because I should not be able
to store an integer and retrieve a function for instance:
let wrong = st.get () 0
Thus, the type of st
is a weakly polymorphic type : '_weak1 t'
where '_weak1
is a placeholder
for a concrete type that can be replaced only once. Thus at line 2, the type of st
, we learn that 'weak1=int
and the type of t
is updated to int t
. This is the value restriction at play, it tells us that we cannot generally infer that the result of a computation is polymorphic. However, the issue only appear because with a store, we could both read and write a value of type 'a
. If we can only read (or produce) a value of type 'a
, it is fine to generalize a polymorphic value produced from a computation. For instance, in this example,
let gen_none () = None
let none = gen_None ()
surprisingly the inferred type for none
is the fully polymorphic type 'a option'
.
Indeed, an option type 'a option
is immutable container type that can only produce a value of type 'a
.
It is thus fine to generalize the type of the computation none
is a from '_weak1 option
to 'a option
. And this where we meet again the variance annotation: being a container type that can only produce an 'a
is one way to describe a covariant type parameter. And it is possible to prove more generally that if a type variable only appears in covariant position, it is always fine to generalize the type of that computation. This is the relaxed value restriction.
However, this happens because for a covariant type constructor 'a t'
, the only possible way to produce a polymorphic value is to have some kind of empty container. It can be quite fun to check that with OCaml type system :
type 'a maybe = Nothing | Just of 'a
type empty = |
type poly_maybe = { x: 'a. 'a maybe}
let a_polymorphic_maybe_is_nothing {x} = match (x:empty maybe) with
| Nothing -> ()
| _ -> .
To conclude on variance annotation, they are useful for container libraries to give their user the ability to
- coerce the whole container to a subtype with no-runtime cost
- compute polymorphic "empty" values
The use case of injectivity is a lot more subtle. In brief, they only matter when checking the exhaustiveness of some pattern matching in presence of GADTs (which explains why they were only introduced in OCaml 4.12).
Indeed, if variance annotations are concerned with the preservation of subtyping relationship, injectivity annotation
are concerned with the preservation of inequality. Imagine that I have two distinguishable types, for instance int
and float
. Can I always distinguish an int t
or float t
? If I look at sum types
type !'a sum = A
or records
type !'a product = { x: 'a}
the answer is always yes. In other words, if I have 'a != 'b
then both 'a sum != 'b sum
and
'a product !='b product
. This preservation of inequality is called injectivity. And I can add an injectivity
annotation to check that the typechecker agrees with me (like variance annotations, injectivity annotations are inferred for concrete type definitions).
However, if the type parameter was in fact a phantom type parameter,
type 'a t = float
then I cannot guarantee that int t != float t
(because _ t
is always a float
).
For an example of a type constructor with a phantom parameter, I can define a type of float numbers with units with
type m = M
type s = S
module SI: sig
type 'a t
val m: m t
val s: s t
end = struct
type 'a t = float
let m = 1.
let s = 1.
end
Here, the type 'a SI.t
represents a real number with a unit encoded in the type.
So with this example, I have both injective and non-injective types. When does injectivity matters?
And the answer is that for injectivity to matter I need to deal with explicit type equations.
And explicit types equations are the domain of GADTs. The quintessential GADT is indeed the proof of equality
witness
type ('a,'b) eq = Refl: ('a,'a) eq
let conv (type a b) (Refl: (a,b) eq) (x:a) = (x:b)
having a value eq
of type ('a,'b) t
is a proof that two types are equal. For instance, I can use this type
to export the fact that m SI.t
and s SI.t
are secretly the same type
module SI: sig
type 'a t
val m: m t
val s: s t
val magic: (s t, m t) eq
end = struct
type 'a t = float
let m = 1.
let s = 1.
let magic = Refl
end
and I can then use this secret equality to convert seconds to meters (which is bad)
let wrong_conversion (x:s SI.t) : m SI.t =
conv SI.magic x
Thus, I can use GADTs to expose the fact that a type constructor was not injective. Similarly, I can use
an injectivity annotation to prove the other equivalent definition of injectivity: if 'a t = 'b t
and t
is injective in its first parameter then 'a = 'b
:
module Inj(T: sig type !'a t end) = struct
open T
let conv (type a b) (Refl: (a t, b t) eq): (a,b) eq = Refl
end
All this is rather theoretical, but this issue of injectivity appears on more practical problem.
Imagine that I have a container type vec
module Vec: sig
type 'a t
val v2: 'a -> 'a -> 'a t
end = struct
type 'a t = 'a * 'a
let v2 x y = x, y
end
type 'a vec = 'a Vec.t
(Note the missing injectivity annotation for now)
With GADTs, I can define functions that work with either , int vec
or float vec
by
defining the right GADT
type 'a monoid =
| Int_vec: int vec monoid
| Float_vec: float vec monoid
For instance, I can define a zero
function monoid by monoid:
let zero (type a) (kind:a monoid): a = match kind with
| Int_vec -> Vec.v2 0 0
| Float_vec -> Vec.v2 0. 0.
This works as expected. The trouble starts, once I want to define a function that works only on one of the possible monoids. For instance, only integer vectors have a finite number of elements of radius 1
let unit_circle (monoid:int vec monoid) = match monoid with
| Int_vec -> Vec.[v2 1 0; v2 (-1) 0; v2 0 1; v2 0 (-1)]
But then, we get an unexpected warning:
Warning 8 [partial-match]: this pattern-matching is not exhaustive.
Here is an example of a case that is not matched:
Float_vec
This warning may seem strange at a first look Float_vec
has type float vec monoid
, which does not
match with the int vec monoid
type , and trying to apply unit_circle
to Float_vec
yields
let error = unit_circle Float_vec
Error: This expression has type float vec monoid
but an expression was expected of type int vec monoid
Type float is not compatible with type int
Thus the typechecker knows that a float vec
is not compatible with an int vec
.
Why it is thus warning us about missing case in the pattern matching case?
This issue is a question of context: when analyzing the pattern matching for exhaustiveness,
the typechecker sees that the type 'a vec
is not injective, and thus it cannot assume
that there is not some hidden equations lying around that prove that in fact a float vec
is
the same as an int vec
. Contrarily, in the application case, the typechecker knows that there is no such equation
in the current context, and thus it ought to reject the application (even if there is such equation
lying around somewhere).
The easy solution in our case is thus to add the missing injectivity annotation (we can add the missing
variance too)
module Vec: sig
type +!'a t
val v2: 'a -> 'a -> 'a t
end = struct
type 'a t = 'a * 'a
let v2 x y = x, y
end
type 'a vec = 'a Vec.t
type 'a monoid =
| Int_vec: int vec monoid
| Float_vec: float vec monoid
With this missing piece of information, the typechecker can conclude that an int vec
is always different
from a float vec
and thus we don't have a warning anymore with
let unit_circle (monoid:int vec monoid) = match monoid with
| Int_vec -> Vec.[v2 1 0; v2 (-1) 0; v2 0 1; v2 0 (-1)]
| _ -> .
Indeed, since 'a vec
is injective, we know that we can deduce int vec != float vec
from the inequation
int != float
To conclude on injectivity, an injectivity annotation is a way for a library to let its user know that even if a type constructor is abstract it does preserve inequalities. This is quite small piece of information, that doesn't leak much information. At the same time, it is only useful for clients that manipulate explicitly type equations through the use of GADTs.