Breakpoints in the argument-passing scheme of OCaml

Asked 22/3, 2015 at 5:55 Answered 22/3, 2015 at 23:7

Today, I was going through the source code of Jane Street's Core_kernel module and I came across the compose function:

(* The typical use case for these functions is to pass in functional arguments
   and get functions as a result. For this reason, we tell the compiler where
   to insert breakpoints in the argument-passing scheme. *)
let compose f g = (); fun x -> f (g x)

I would have defined the compose function as:

let compose f g x = f (g x)

The reason they give for defining compose the way they did is “because compose is a function which takes functions f and g as arguments and returns the function fun x -> f (g x) as a result, they defined compose the way they did to tell the compiler to insert a breakpoint after f and g but before x in the argument-passing scheme.”

So I have two questions:

Why do we need breakpoints in the argument-passing scheme?
What difference would it make if we defined compose the normal way?

Coming from Haskell, this convention doesn't make any sense to me.

Suhail answered 22/3, 2015 at 5:55 Comment(3)

A similar thing occurs in Haskell when influencing the inliner. The following map definition is better optimized than a more naive one: map f = go where { go [] = []; go (a : as) = f a : go as }. – Froemming 22/3, 2015 at 16:7

@J.Abrahamson That's only useful because map is recursive. Hence, go as is more efficient than map f as. I don't see how the same concept would apply to compose considering that it is not recursive. – Suhail 22/3, 2015 at 16:47

There are non-recursive instances, too. My point is that the inliner only runs when all of the variables in your definition are filled and thus behaves differently between f x = \y -> ... and f x y = ... despite them being semantically identical. – Froemming 22/3, 2015 at 17:44

This is an efficiency hack to avoid the cost of a partial application in the expected use case indicated in the comment.

OCaml compiles curried functions into fixed-arity constructs, using a closure to partially apply them where necessary. This means that calls of that arity are efficient - there's no closure construction, just a function call.

There will be a closure construction within compose for fun x -> f (g x), but this will be more efficient than the partial application. Closures generated by partial application go through a wrapper caml_curryN which exists to ensure that effects occur at the correct time (if that closure is itself partially applied).

The fixed arity that the compiler chooses is based on a simple syntactic analysis - essentially, how many arguments are taken in a row without anything in between. The Jane St. programmers have used this to select the arity that they desire by injecting () "in between" arguments.

In short, let compose f g x = f (g x) is a less desirable definition because it would result in the common two-argument case of compose f g being a more expensive partial application.

Semantically, of course, there is no difference at all.

Apheliotropic answered 22/3, 2015 at 6:43 Comment(5)

Wouldn't a closure be built to return fun x -> f (g x) anyway? Would building this closure be cheaper wrt building the one associated to partial application? I can't see an obvious reason for that. – Salvo 22/3, 2015 at 9:30

The closures built by partial application involve an additional indirection - they go through caml_curryN (which is necessary for the correct evaluation order) rather than straight to the code. – Apheliotropic 22/3, 2015 at 11:6

Does the compiler not treat let compose f g = fun x -> f (g x) and let compose f g x = f (g x) differently? I wouldn't have thought to interject a (). – Froemming 22/3, 2015 at 16:8

No, those are handled the same. let f x y = ... is just sugar for let f = fun x -> fun y -> ...: when expanded, the two are identical. – Apheliotropic 22/3, 2015 at 16:18

Gotcha. In Haskell they're treated slightly differently with regard to the inliner. It makes a lot of sense to use the (); E trick in OCaml too, though. – Froemming 22/3, 2015 at 17:46

It's worth noting that compilation of partial application has improved in OCaml, and this performance hack is no longer necessary.

Correggio answered 22/3, 2015 at 23:7 Comment(0)

Recommended topics

Hot tags