Why Haskell treats a stuck application of a type family as a valid type?

A

3

6

Take this code snippet (I changed the type family to a closed one after the various comments and answers mentioned it):

-- a closed type family for which I won't define any instances
type family EmptyTypeFamily t where

-- 1. this type doesn't exist, since there's no `instance EmptyTypeFamily ()`
type NotAType = EmptyTypeFamily ()

-- 2. this value has a type that doesn't exist
untypedValue :: NotAType
untypedValue = undefined

main :: IO ()
main = do
  -- 3. the value with no type can be used in expressions
  return untypedValue
  return . id . fix $ \x -> untypedValue
  
  -- ERROR, finally! No instance for `Show (EmptyTypeFamily ())`
  print untypedValue

I find it quite counter intuitive that I can name a stuck application of a type family (1) and even define values for it (2) and use them in expressions (3). Of course I can't define any typeclass instances for it, since the type doesn't even exist. But shouldn't I get an error just by naming that non-type, let alone using it? This makes it harder to detect issues in my code. I read the GHC doc, which mentions "stuck" twice, but doesn't answer my question.

What's the reason why Haskell treats EmptyTypeFamily () as a valid type for which you can even define values and whatnot, when it isn't?

Alleyway answered 26/1, 2023 at 6:0 Comment(2)

Note that even if NotAType has no values, F NotAType might have values, especially if F is a GADT. It does not seem that useful to use stuck families in that way, though. – Marlea 26/1, 2023 at 9:20

Didn't know that. It's a further confirmation that GHC treats NotAType as a valid type. The answers explain why, but I'm still finding the concept confusing. – Alleyway 27/1, 2023 at 5:56

A

2

There are two factors that mean GHC needs to be able to contemplate stuck type families without immediately throwing an error and giving up.

The first is, as chi's answer pointed out, type families defined via separate type instance declarations are open. It's theoretically possible for your module to define EmptyTypeFamily, NotAType, and untypedValue and export them all, and for some other module to import your module and add a type instance EmptyTypeFamily () declaration that makes sense of this.

But there are also closed type families, where the set of instances are known statically to any module that has imported the type family at all. It's tempting to say these should at least immediately throw an error if they can't be resolved, and indeed that could conceivably be done in a case like yours where all of the arguments to the type family are fully concrete types. But the second reason GHC needs the concept of stuck type families that aren't an error is that type families are usually used in more complex cases with type variables in play. Consider this:

{-# LANGUAGE TypeFamilies #-}

type family TF t
  where TF Bool = Int
        TF Char = Bool

foo :: t -> TF t -> TF t
foo = flip const

The TF t in the type of foo is a stuck type family application; GHC can't resolve it here. Whether the declared type for foo is an error or not depends on the variable t, which depends on the context in which foo is used, not just its definition.

So GHC needs this ability to work with things with a type given by a type expression that it cannot actually resolve now. Given that, I'm not overly bothered by it not throwing errors eagerly in cases like the OP (assuming EmptyTypeFamily was rewritten to be a closed family; it definitely shouldn't with the open family). I don't know precisely, but I've always assumed it was either:

It's not possible to make a general procedure for GHC to be able to tell the difference between a stuck application that could potentially be unstuck with more information, and one that definitely can't.
It is possible, but the inconsistency and unpredictability is not considered desirable. Remember that some other GADTs or type families might be able to resolve for any type at all (much like a non-strict function not necessarily needing its argument to be defined), and be applied to a stuck family application; if GHC is distinguishing between "definitely stuck" applications that immediately throw an error and "potentially unstuck-able" applications, the programmer would need to be able to always make the same determination in their head to tell when complex cases are allowable. I'm genuinely unsure whether this could be made pleasant to work with.
It's possible implement and can be made nicely predictable, but it's a special case in the general handling of type families that no one has added yet. The original feature only had open families, I'm fairly sure, so the original guts of type family handling would be written without any real possibility of "definitely stuck" applications existing to be detected.

If you're using closed type families, one thing you can do is add a catch-all case that resolves to a TypeError. Like so:

{-# LANGUAGE DataKinds, TypeFamilies, UndecidableInstances #-}

-- These are in GHC.TypeError as of base 4.17, but the current
-- default on my system is base 4.16. GHC.TypeLits still exports
-- them for backwards compatibility so this will work, but might
-- eventually be deprecated?
import GHC.TypeLits ( TypeError
                    , ErrorMessage ( Text, ShowType, (:<>:) )
                    )
type family TF t
  where TF t = TypeError (Text "No TF instance for " :<>: ShowType t)

bad :: TF ()
bad = undefined

This helps, but it doesn't catch every possible use of a stuck TF application. It won't catch definitions like bad, because it's still "lazy" about actually evaluating TF (); you told it the type was TF (), and the implementation is undefined which can be any type so it happily unifies with TF () and passes the type check without the compiler ever needing to actually evaluate TF (). But it does catch more cases than a stuck type family application would because it's not stuck: it resolves to a type error. If you have any other binding where it has to infer a type and that depends on the type of bad, it seems to hit the error; even something like boom = bad where you'd think it would similarly be able to unify without any evaluation. And even asking for the type of bad in ghci with :t generates a type error.

At the very least it gives you better error messages than No instance for Show (EmptyTypeFamily ()) because it will resolve the type family to the type error, rather than go looking for an instance that could match a stuck type family and complain about the missing instance.

Amalita answered 27/1, 2023 at 2:40 Comment(0)

P

8

As chi remarked: it's not quite accurate to say your type family is empty, rather it merely happens to not contain any yet-known instances.
But I think that's missing the point of the question, so let's discuss instead the type family

type family EmptyTypeFamily t where {}

This one really is empty: you can't write any instances for it.

Still, this causes exactly the same behaviour as you observed. Why?

Well, what GHC does with types during compilation is in some ways similar to how we're dealing with values in all Haskell programs: lazy evaluation and pattern matching. It's well known how the program

f :: Int -> Int
f _ = 624

main :: IO ()
main = print $ f undefined

runs without problems, even though it “evaluates” undefined as the argument to f. It's ok because f doesn't actually need to know anything about its argument.

Similarly, if you write return untypedValue, then the type checker doesn't need to know anything about EmptyTypeFamily () – it just treats it as an abstract entity. It can even unify different type variables to EmptyTypeFamily (). This only requires the assumption that, whatever EmptyTypeFamily is under the hood, it must certainly be deterministic. That much is guaranteed, because the compiler would complain if you wrote two conflicting instances for it.

So, as long as you only use untypedValue with unconstraint-polymorphic functions, it simply doesn't matter that its type doesn't actually exist, because it never needs to be evaluated any further.

In other words, NotAType is purely symbolic. It's a bit like in maths, you can write a theorem starting with “let S be a set and x ∈ S, then bla bla” without actually pinning down what this set is or a value for x.

That changes only as you require additional constraints, like unifying it with a concrete type

f :: Int -> Int
f x = x + 1

main = print $ f untypedValue

    • Couldn't match type ‘EmptyTypeFamily ()’ with ‘Int’
      Expected type: Int
        Actual type: NotAType
    • In the first argument of ‘f’, namely ‘untypedValue’

...or, as you observed, use typeclass methods on it. In each case, the compiler has to start with performing an actual pattern match on the type-level value EmptyTypeFamily (). Well, in fact it's a more general process: what the compiler keeps track of are rather propositions. It has a database of knowledge that such and such types are equal, and such and such types match such and such class, and in this case GHC just determines that it doesn't have information which would allow deciding on any known Show instance.

Note that it is actually possible to have a hypothetical context in which that information is available, however nonsensical that is:

{-# LANGUAGE RankNTypes, FlexibleContexts #-}

hypothetically :: (Show NotAType => r) -> ()
hypothetically _ = ()

main = return $ hypothetically (print untypedValue)

In this case, responsibility for proving Show NotAType is deferred to the implementation of hypothetically. As the compiler tackles print untypedValue it just browses through the context and sees that Show NotAType will be proved by whatever code eventually uses the argument to hypothetically. ...Which of course never happens, but the compiler doesn't worry about that when typechecking main.

This, in the maths analogy, is like writing a proof starting with “let x ∈ ℝ such that x² = -1...” – this is perfectly valid and allows you to prove exciting things. Only, nobody will be able to use the theorem for computing something, because there exists no real number with the required property.

Professed answered 26/1, 2023 at 10:27 Comment(1)

I agree that Ben answered the question better. – Professed 27/1, 2023 at 11:53

M

2

since there's no instance EmptyTypeFamily ()

There is no such instance, yet.

When compiling a module, we do not (yet) know whether another module actually defines such an instance, hence we must be ready for that to happen. This is called the "open world assumption".

Haskell is meant to allow for separate compilation, i.e. compiling each module individually. Also, Haskell allows each module to add arbitrary instances. Coherently, we can never be sure an instance does not exist during compilation.

To be pedantic, Haskell also allows closed type families, which can not be extended in such a way, but IIRC it handles stuck close type families like stuck open ones. Perhaps it should reject those, after all. In closed type families one can however force rejection by adding an explicit TypeError ".." final case in the definition of the closed type family.

Marlea answered 26/1, 2023 at 9:13 Comment(0)

A

2

There are two factors that mean GHC needs to be able to contemplate stuck type families without immediately throwing an error and giving up.

The first is, as chi's answer pointed out, type families defined via separate type instance declarations are open. It's theoretically possible for your module to define EmptyTypeFamily, NotAType, and untypedValue and export them all, and for some other module to import your module and add a type instance EmptyTypeFamily () declaration that makes sense of this.

But there are also closed type families, where the set of instances are known statically to any module that has imported the type family at all. It's tempting to say these should at least immediately throw an error if they can't be resolved, and indeed that could conceivably be done in a case like yours where all of the arguments to the type family are fully concrete types. But the second reason GHC needs the concept of stuck type families that aren't an error is that type families are usually used in more complex cases with type variables in play. Consider this:

{-# LANGUAGE TypeFamilies #-}

type family TF t
  where TF Bool = Int
        TF Char = Bool

foo :: t -> TF t -> TF t
foo = flip const

The TF t in the type of foo is a stuck type family application; GHC can't resolve it here. Whether the declared type for foo is an error or not depends on the variable t, which depends on the context in which foo is used, not just its definition.

So GHC needs this ability to work with things with a type given by a type expression that it cannot actually resolve now. Given that, I'm not overly bothered by it not throwing errors eagerly in cases like the OP (assuming EmptyTypeFamily was rewritten to be a closed family; it definitely shouldn't with the open family). I don't know precisely, but I've always assumed it was either:

It's not possible to make a general procedure for GHC to be able to tell the difference between a stuck application that could potentially be unstuck with more information, and one that definitely can't.
It is possible, but the inconsistency and unpredictability is not considered desirable. Remember that some other GADTs or type families might be able to resolve for any type at all (much like a non-strict function not necessarily needing its argument to be defined), and be applied to a stuck family application; if GHC is distinguishing between "definitely stuck" applications that immediately throw an error and "potentially unstuck-able" applications, the programmer would need to be able to always make the same determination in their head to tell when complex cases are allowable. I'm genuinely unsure whether this could be made pleasant to work with.
It's possible implement and can be made nicely predictable, but it's a special case in the general handling of type families that no one has added yet. The original feature only had open families, I'm fairly sure, so the original guts of type family handling would be written without any real possibility of "definitely stuck" applications existing to be detected.

If you're using closed type families, one thing you can do is add a catch-all case that resolves to a TypeError. Like so:

{-# LANGUAGE DataKinds, TypeFamilies, UndecidableInstances #-}

-- These are in GHC.TypeError as of base 4.17, but the current
-- default on my system is base 4.16. GHC.TypeLits still exports
-- them for backwards compatibility so this will work, but might
-- eventually be deprecated?
import GHC.TypeLits ( TypeError
                    , ErrorMessage ( Text, ShowType, (:<>:) )
                    )
type family TF t
  where TF t = TypeError (Text "No TF instance for " :<>: ShowType t)

bad :: TF ()
bad = undefined

This helps, but it doesn't catch every possible use of a stuck TF application. It won't catch definitions like bad, because it's still "lazy" about actually evaluating TF (); you told it the type was TF (), and the implementation is undefined which can be any type so it happily unifies with TF () and passes the type check without the compiler ever needing to actually evaluate TF (). But it does catch more cases than a stuck type family application would because it's not stuck: it resolves to a type error. If you have any other binding where it has to infer a type and that depends on the type of bad, it seems to hit the error; even something like boom = bad where you'd think it would similarly be able to unify without any evaluation. And even asking for the type of bad in ghci with :t generates a type error.

At the very least it gives you better error messages than No instance for Show (EmptyTypeFamily ()) because it will resolve the type family to the type error, rather than go looking for an instance that could match a stuck type family and complain about the missing instance.

Amalita answered 27/1, 2023 at 2:40 Comment(0)

Recommended topics

Hot tags