@leftroundabout’s answer is solid; here’s a more technical supplementary answer.
There is a sort of subtyping relationship at work in Haskell: the System F “generic instance” relation. This is what the compiler uses when checking the inferred type of a function against its signature. Basically, the inferred type of a function must be at least as polymorphic as its signature:
f :: (a -> a) -> a -> a
f g x = g x
Here, the inferred type of f
is forall a b. (a -> b) -> a -> b
, the same as the definition of g
you gave. But the signature is more restrictive: it adds the constraint a ~ b
(a
is equal to b
).
Haskell checks this by first replacing the type variables in the signature with Skolem type variables—these are fresh unique type constants that only unify with themselves (or type variables). I’ll use the notation $a
to represent a Skolem constant.
forall a. (a -> a) -> a -> a
($a -> $a) -> $a -> $a
You may see references to “rigid, Skolem” type variables when you accidentally have a type variable that “escapes its scope”: it’s used outside the forall
quantifier that introduced it.
Next, the compiler does a subsumption check. This is essentially the same as normal unification of types, where a -> b ~ Int -> Char
gives a ~ Int
and b ~ Char
; but because it’s a subtyping relationship, it also accounts for covariance and contravariance of function types. If (a -> b)
is a subtype of (c -> d)
, then b
must be a subtype of d
(covariant), but a
must be a supertype of c
(contravariant).
{-1-}(a -> b) -> {-2-}(a -> b) <: {-3-}($a -> $a) -> {-4-}($a -> $a)
{-3-}($a -> $a) <: {-1-}(a -> b) -- contravariant (argument)
{-2-}(a -> b) <: {-4-}($a -> $a) -- covariant (result)
The compiler generates the following constraints:
$a <: a -- contravariant
b <: $a -- covariant
a <: $a -- contravariant
$a <: b -- covariant
And solves them by unification:
a ~ $a
b ~ $a
a ~ $a
b ~ $a
a ~ b
So the inferred type (a -> b) -> a -> b
is at least as polymorphic as the signature (a -> a) -> a -> a
.
When you write xs = [f, g]
, normal unification kicks in: you have the two signatures:
forall a. (a -> a) -> a -> a
forall a b. (a -> b) -> a -> b
These are instantiated with fresh type variables:
(a1 -> a1) -> a1 -> a1
(a2 -> b2) -> a2 -> b2
Then unified:
(a1 -> a1) -> a1 -> a1 ~ (a2 -> b2) -> a2 -> b2
a1 -> a1 ~ a2 -> b2
a1 -> a1 ~ a2 -> b2
a1 ~ a2
a1 ~ b2
And finally solved & generalised:
forall a1. (a1 -> a1) -> a1 -> a1
So the type of g
has been made less general because it’s been constrained to have the same type as f
. The inferred type of xs
will therefore be [(a -> a) -> a -> a]
, so you’ll get the same type error writing [f (\x -> [x]) "foo" | f <- xs]
as you did writing f (\x -> [x]) "foo"
; even though g
is more general, you’ve hidden away some of that generality.
Now you might be wondering why you would ever give a more restrictive signature for a function than necessary. The answer is—to guide type inference and produce better error messages.
For example, the type of ($)
is (a -> b) -> a -> b
; but in fact this is a more restrictive version of id :: c -> c
! Just set c ~ a -> b
. So in fact you can write foo `id` (bar `id` baz quux)
instead of foo $ bar $ baz quux
, but having this specialised identity function makes it clear to the compiler that you expect to use it to apply functions to arguments, so it can bail out earlier and give you a more descriptive error message if you make a mistake.
f
andg
unify, under the substitutionb ~ a
, at the type(a -> a) -> a -> a
. Soxs :: [(a -> a) -> a -> a]
. – Weaponeer