G-machine, (non-)strict contexts - why case expressions need special treatment
Asked Answered
D

1

6

I'm currently reading Implementing functional languages: a tutorial by SPJ and the (sub)chapter I'll be referring to in this question is 3.8.7 (page 136).

The first remark there is that a reader following the tutorial has not yet implemented C scheme compilation (that is, of expressions appearing in non-strict contexts) of ECase expressions.
The solution proposed is to transform a Core program so that ECase expressions simply never appear in non-strict contexts. Specifically, each such occurrence creates a new supercombinator with exactly one variable which body corresponds to the original ECase expression, and the occurrence itself is replaced with a call to that supercombinator.
Below I present a (slightly modified) example of such transformation from 1

t a b = Pack{2,1} ;
f x = Pack{2,2} (case t x 7 6 of
    <1> -> 1;
    <2> -> 2) Pack{1,0} ;
main = f 3

== transformed into ==>

t a b = Pack{2,1} ;
f x = Pack{2,2} ($Case1 (t x 7 6)) Pack{1,0} ;
$Case1 x = case x of
    <1> -> 1;
    <2> -> 2 ;
main = f 3

I implemented this solution and it works like charm, that is, the output is Pack{2,2} 2 Pack{1,0}.
However, what I don't understand is - why all that trouble? I hope it's not just me, but the first thought I had of solving the problem was to just implement compilation of ECase expressions in C scheme. And I did it by mimicking the rule for compilation in E scheme (page 134 in 1 but I present that rule here for completeness): so I used

E[[case e of alts]] p = E[[e]] p ++ [Casejump D[[alts]] p]

and wrote

C[[case e of alts]] p = C[[e]] p ++ [Eval] ++ [Casejump D[[alts]] p]

I added [Eval] because Casejump needs an argument on top of the stack in weak head normal form (WHNF) and C scheme doesn't guarantee that, as opposed to E scheme.

But then the output changes to enigmatic: Pack{2,2} 2 6.
The same applies when I use the same rule as for E scheme, i.e.

C[[case e of alts]] p = E[[e]] p ++ [Casejump D[[alts]] p]

So I guess that my "obvious" solution is inherently wrong - and I can see that from outputs. But I'm having trouble stating formal arguments as to why that approach was bound to fail.
Can someone provide me with such argument/proof or some intuition as to why the naive approach doesn't work?

Disaffect answered 30/12, 2014 at 19:49 Comment(0)
L
7

The purpose of the C scheme is to not perform any computation, but just delay everything until an EVAL happens (which it might or might not). What are you doing in your proposed code generation for case? You're calling EVAL! And the whole purpose of C is to not call EVAL on anything, so you've now evaluated something prematurely.

The only way you could generate code directly for case in the C scheme would be to add some new instruction to perform the case analysis once it's evaluated.

But we (Thomas Johnsson and I) decided it was simpler to just lift out such expressions. The exact historical details are lost in time though. :)

Lit answered 30/12, 2014 at 20:29 Comment(2)
Ah, yes, I admit that putting Eval into C scheme was a bad idea. Hmmm, now as I'm trying to think what exactly was the root of the problem I've come to a conclusion that actually the culprit is Casejump instruction, because it needs an argument in WHNF - is that right? So, theoretically, if one devised MyCasejump that would not need its argument in WHNF, then the whole problem would be gone. And as to your suggestion of introducing new instruction - would it somehow correspond to MyCasejump? Or - could you sketch out how that instruction should perform case analysis?Disaffect
Yes, it would have to be something like MyCaseJump which will evaluate its argument and choose an alternative when you evaluate it rather than at once.Lit

© 2022 - 2024 — McMap. All rights reserved.