Why does this code using UndecidableInstances compile, then generate a runtime infinite loop?
Asked Answered
B

2

28

When writing some code using UndecidableInstances earlier, I ran into something that I found very odd. I managed to unintentionally create some code that typechecks when I believed it should not:

{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE UndecidableInstances #-}

data Foo = Foo

class ConvertFoo a b where
  convertFoo :: a -> b

instance (ConvertFoo a Foo, ConvertFoo Foo b) => ConvertFoo a b where
  convertFoo = convertFoo . (convertFoo :: a -> Foo)

evil :: Int -> String
evil = convertFoo

Specifically, the convertFoo function typechecks when given any input to produce any output, as demonstrated by the evil function. At first, I thought that perhaps I managed to accidentally implement unsafeCoerce, but that is not quite true: actually attempting to call my convertFoo function (by doing something like evil 3, for example) simply goes into an infinite loop.

I sort of understand what’s going on in a vague sense. My understanding of the problem is something like this:

  • The instance of ConvertFoo that I have defined works on any input and output, a and b, only limited by the two additional constraints that conversion functions must exist from a -> Foo and Foo -> b.
  • Somehow, that definition is able to match any input and output types, so it would seem that the call to convertFoo :: a -> Foo is picking the very definition of convertFoo itself, since it’s the only one available, anyway.
  • Since convertFoo calls itself infinitely, the function goes into an infinite loop that never terminates.
  • Since convertFoo never terminates, the type of that definition is bottom, so technically none of the types are ever violated, and the program typechecks.

Now, even if the above understanding is correct, I am still confused about why the whole program typechecks. Specifically, I would expect the ConvertFoo a Foo and ConvertFoo Foo b constraints to fail, given that no such instances exist.

I do understand (at least fuzzily) that constraints don’t matter when picking an instance—the instance is picked based solely on the instance head, then constraints are checked—so I could see that those constraints might resolve just fine because of my ConvertFoo a b instance, which is about as permissive as it can possibly be. However, that would then require the same set of constraints to be resolved, which I think would put the typechecker into an infinite loop, causing GHC to either hang on compilation or give me a stack overflow error (the latter of which I’ve seen before).

Clearly, though, the typechecker does not loop, because it happily bottoms out and compiles my code happily. Why? How is the instance context satisfied in this particular example? Why doesn’t this give me a type error or produce a typechecking loop?

Burtis answered 23/5, 2016 at 22:37 Comment(11)
I like this question! I propose a slightly simpler question kernel: how come compilation of class Default a where def :: a; instance Default a => Default a where def = def; main = def terminates?Shabuoth
@DanielWagner Is the key to both of these that Haskell can equate infinite types? And perhaps more practically, if that is not the behavior I want, is there anything I can do to avoid it to avoid producing bottom? I’d like this to only typecheck if those two instances (ConvertFoo a Foo and ConvertFoo Foo b) are actually defined.Burtis
No infinite types here. And those two instances are actually defined! The declaration instance ... => Convert a b says that there is an instance for every stupid pair of types you care to come up with, including if one of them happens to be a Foo.Shabuoth
@DanielWagner Yes, I do understand that, but won’t that prompt recursive evaluation of those constraints? Even if I had defined an instance on any two types, verifying that instance is valid would require recursively checking each pair of types against those same constraints, right? Or does Haskell not check those constraints recursively?Burtis
That's a great question, and one I whole-heartedly upvoted. =)Shabuoth
I find it interesting to note that the Scala translation of this, using implicits, is rejected by scalac as a "diverging implicit expansion".Sophister
@phg that is interesting, provided Scala implicits do in fact handle some of the correct cases of proper UndecidableInstances, without the compiler also rejecting them as “diverging”. Does it?Lyon
And FTR: GHC would have put forth more resistance against accepting this instance, if it hadn't been the only one. Any further instance you write in addition to (ConvertFoo a Foo, ConvertFoo Foo b) => ConvertFoo a b will be overlapping, not just undecidable, and OverlappingInstances are well-known to be risky beyond compile-time.Lyon
@Lyon I have the feeling that Scala implicits allow for strictly more cases, but I'm not sure. Could you point me to a short "proper usage example" to translate? Here's at least the specification of the termination checking.Sophister
@phg: class Foo x and class Bar x with instance Bar (x,y) => Foo (x,y) is undecidable yet safe.Lyon
@Lyon Seems to work.Sophister
P
23

I wholeheartedly agree that this is a great question. It speaks to how our intuitions about typeclasses differ from the reality.

Typeclass surprise

To see what is going on here, going to raise the stakes on the type signature for evil:

data X

class Convert a b where
  convert :: a -> b

instance (Convert a X, Convert X b) => Convert a b where
  convert = convert . (convert :: a -> X)

evil :: a -> b
evil = convert

Clearly the Covert a b instance is being chosen as there is only one instance of this class. The typechecker is thinking something like this:

  • Convert a X is true if...
    • Convert a X is true [true by assumption]
    • and Convert X X is true
      • Convert X X is true if...
        • Convert X X is true [true by assumption]
        • Convert X X is true [true by assumption]
  • Convert X b is true if...
    • Convert X X is true [true from above]
    • and Convert X b is true [true by assumption]

The typechecker has surprised us. We do not expect Convert X X to be true as we have not defined anything like it. But (Convert X X, Convert X X) => Convert X X is a kind of tautology: it is automatically true and it is true no matter what methods are defined in the class.

This might not match our mental model of typeclasses. We expect the compiler to gawk at this point and complain about how Convert X X cannot be true because we have defined no instance for it. We expect the compiler to stand at the Convert X X, to look for another spot to walk to where Convert X X is true, and to give up because there is no other spot where that is true. But the compiler is able to recurse! Recurse, loop, and be Turing-complete.

We blessed the typechecker with this capability, and we did it with UndecidableInstances. When the documentation states that it is possible to send the compiler into a loop it is easy to assume the worst and we assumed that the bad loops are always infinite loops. But here we have demonstrated a loop even deadlier, a loop that terminates – except in a surprising way.

(This is demonstrated even more starkly in Daniel's comment:

class Loop a where
  loop :: a

instance Loop a => Loop a where
  loop = loop

.)

The perils of undecidability

This is the exact sort of situation that UndecidableInstances allows. If we turn that extension off and turn FlexibleContexts on (a harmless extension that is just syntactic in nature), we get a warning about a violation of one of the Paterson conditions:

...
Constraint is no smaller than the instance head
  in the constraint: Convert a X
(Use UndecidableInstances to permit this)
In the instance declaration for ‘Convert a b’

...
Constraint is no smaller than the instance head
  in the constraint: Convert X b
(Use UndecidableInstances to permit this)
In the instance declaration for ‘Convert a b’

"No smaller than instance head," although we can mentally rewrite it as "it is possible this instance will be used to prove an assertion of itself and cause you much anguish and gnashing and typing." The Paterson conditions together prevent looping in instance resolution. Our violation here demonstrates why they are necessary, and we can presumably consult some paper to see why they are sufficient.

Bottoming out

As for why the program at runtime infinite loops: There is the boring answer, where evil :: a -> b cannot but infinite loop or throw an exception or generally bottom out because we trust the Haskell typechecker and there is no value that can inhabit a -> b except bottom.

A more interesting answer is that, since Convert X X is tautologically true, its instance definition is this infinite loop

convertXX :: X -> X
convertXX = convertXX . convertXX

We can similarly expand out the Convert A B instance definition.

convertAB :: A -> B
convertAB =
  convertXB . convertAX
  where
    convertAX = convertXX . convertAX
    convertXX = convertXX . convertXX
    convertXB = convertXB . convertXX

This surprising behavior, and how constrained instance resolution (by default without extensions) is meant to be as to avoid these behaviors, perhaps can be taken as a good reason for why Haskell's typeclass system has yet to pick up wide adoption. Despite its impressive popularity and power, there are odd corners to it (whether it is in documentation or error messages or syntax or maybe even in its underlying logic) that seem particularly ill fit to how we humans think about type-level abstractions.

Plimsoll answered 24/5, 2016 at 5:16 Comment(3)
First of all, thank you for your very detailed answer! Unfortunately, I seem to remain a tiny bit unsatisfied. I think I understood most of these things, especially the key fact that Convert X X already exists, given that we have defined Convert a b, but the description of the typechecker’s “thought process” confuses me. If it determines that “Convert X X is true if Convert X X is true”, it has not proven anything—that is just a tautology! If that sufficed as a proof, we could prove that any instance exists. Why does the recursion make things special?Burtis
Take a constraint that we know not to be true. For example, Num String. We could define instance Num String. We could also define an instance Num a => Num a. Now Num String also holds! This is oh-so-slightly different from a true tautology, which I agree would be useless, because instance Num a => Num a is equivalent to instance Num a, which actually does do something. // Of course we would have to implement some of the Num methods in instance Num a with bottom ... exactly what we did here!Plimsoll
Ahh, I think I see now, yes. One more nuance to add on top of my ever-growing mental picture of Haskell typeclasses. I suppose it’s worth asking one more thing, though if it needs to be a whole separate question, perhaps it should be: is there any way to encode the thing I actually want into Haskell’s type system? That is, I want an instance of ConvertFoo from a to b if, and only if, ConvertFoo a Foo and ConvertFoo Foo b instances exist? Or is that really not currently possible with how typeclass resolution works?Burtis
L
9

Here's how I mentally process these cases:

class ConvertFoo a b where convertFoo :: a -> b
instance (ConvertFoo a Foo, ConvertFoo Foo b) => ConvertFoo a b where
  convertFoo = ...

evil :: Int -> String
evil = convertFoo

First, we start by computing the set of required instances.

  • evil directly requires ConvertFoo Int String (1).
  • Then, (1) requires ConvertFoo Int Foo (2) and ConvertFoo Foo String (3).
  • Then, (2) requires ConvertFoo Int Foo (we already counted this) and ConvertFoo Foo Foo (4).
  • Then (3) requires ConvertFoo Foo Foo (counted) and ConvertFoo Foo String (counted).
  • Then (4) requires ConvertFoo Foo Foo (counted) and ConvertFoo Foo Foo (counted).

Hence, we reach a fixed point, which is a finite set of required instances. The compiler has no trouble with computing that set in finite time: just apply the instance definitions until no more constraint is needed.

Then, we proceed to provide the code for those instances. Here it is.

convertFoo_1 :: Int -> String
convertFoo_1 = convertFoo_3 . convertFoo_2
convertFoo_2 :: Int -> Foo
convertFoo_2 = convertFoo_4 . convertFoo_2
convertFoo_3 :: Foo -> String
convertFoo_3 = convertFoo_3 . convertFoo_4
convertFoo_4 :: Foo -> Foo
convertFoo_4 = convertFoo_4 . convertFoo_4

We get a bunch of mutually recursive instance definitions. These, in this case, will loop at runtime, but there's no reason to reject them at compile time.

Lymph answered 24/5, 2016 at 8:28 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.