typing recursive modules
Asked Answered
L

1

15

In Leroy's paper on how recursive modules are typed in OCaml, it is written that modules are checked in an environment made of approximations of module types:

module rec A = ... and B = ... and C = ...

An environment {A -> approx(A); B -> approx(B); C -> approx(C) } is first built, and then used to compute the types of A, B and C.

I noticed that, in some cases, approximations are not good enough, and typechecking fails. In particular, when putting compilation units sources in a recursive module definition, typechecking can fail whereas the compiler was able to compile the compilation units separately.

Coming back to my first example, I found that a solution would be to type A in the initial approximated environment, but then to type B in that initial environment extended with the new computed type of A, and to type C in the previous env with the new computed type of B, and so on.

Before investigating more, I would like to know if there is any prior work on this subject, showing that such a compilation scheme for recursive modules is either safe or unsafe ? Is there a counter-example showing an unsafe program correctly typed with this scheme ?

Lipson answered 21/2, 2012 at 10:32 Comment(2)
A very interesting question, indeed. Note that your proposed solution ignores that the type of A may depend on that of B and C, if this were not so, there would be no reason to type check them together (as opposed to dependency order).Unbodied
No, actually, the type of A can depend on B and C, but I assume the approximation of B and C is enough in that case. My question comes from a real example, I solved it by writing a patch with this solution in the compiler, and the program was safe (because composed of compilation units with recursive types), but I want to know if the patch is safe in the general case.Lipson
H
16

First, note that Leroy (or Ocaml) does not allow module rec without explicit signature annotations. So it's rather

module rec A : SA = ... and B : SB = ... and C : SC = ...

and the approximative environment is {A : approx(SA), B : approx(SB), C : approx(SC)}.

It is not surprising that some modules type-check when defined separately, but not when defined recursively. After all, the same is already true for core-language declarations: in a 'let rec', recursive occurrences of the bound variables are monomorphic, while with separated 'let' declarations, you can use previous variables polymorphically. Intuitively, the reason is that you cannot have all the knowledge that you'd need to infer the more liberal types before you have actually checked the definitions.

Regarding your suggestion, the problem with it is that it makes the module rec construct unsymmetric, i.e. order would matter in potentially subtle ways. That violates the spirit of recursive definitions (at least in ML), which should always be indifferent to ordering.

In general, the issue with recursive typing is not so much soundness, but rather completeness. You don't want a type system that is either undecidable in general, or whose specification is dependent on algorithmic artefacts (like checking order).

On a more general note, it is well-known that Ocaml's treatment of recursive modules is rather restrictive. There has been work, e.g. by Nakata & Garrigue, that pushes its limits further. However, I am convinced that ultimately, you won't be able to get as liberal as you'd like (and that applies to other aspects of its type module system as well, e.g. functors) without abandoning Ocaml's purely syntactic approach to module typing. But then, I'm biased. :)

Hepato answered 21/2, 2012 at 12:25 Comment(4)
Thanks for the pointer to Nakata & Garrigue's work, I didn't know about it. It is true that order does not matter for let rec, but it does matter in other part of the language let x = x + 2 in let x = x * 3 in ... does depend on the order. So, why not introduce a module incremental rec where order would matter, and allow typing of programs that are not typable now ? (now, I must read the paper...)Lipson
Well, a nested let isn't quite comparable. You could compare to a parallel let x = a and y = b in ..., where in fact, order does not matter for typing. Why no incremental rec? Well, IMHO that would be an ugly hack, and one that only covers a few specific use cases anyway. ;) Wouldn't you rather want a more general solution?Hepato
I would want a more general solution, but when there is none (I still need to read more about recursive modules...), I am comfortable with a hackish solution. And I don't think it is for "a few specific use cases", I think my problem could come up rather frequently.Lipson
It's worth pointing out that the author of this response authored MixML (people.mpi-sws.org/~rossberg/mixml) which is arguably the current state of the art for module systems in ML that handle recursion.Isabelisabelita

© 2022 - 2024 — McMap. All rights reserved.