How is Forth LEAVE ... LOOP implemented, since number of LEAVEs is not known beforehand?
Asked Answered
S

5

9

The word LOOP is described as "Resolve the destination of all unresolved occurrences of LEAVE". (emphasis mine)

Unlike IF ... ELSE ... THEN, where the number of forward references is always one, LOOP has no constraints on the quantity of LEAVE. How to implement it then?

One approach I thought of is to always keep the number of LEAVEs on top of the stack. Each LEAVE increments this counter and puts itself under it. LOOP reads the counter from the top and resolves that many references. But it seems like a cheap trick.

How do real Forth systems implement this kind of loop? I don't need teh codez (been implementing Forth as a learning experience), just the concepts.

Selfesteem answered 9/10, 2019 at 12:21 Comment(0)
S
5

In SP-Forth the loop control parameters comprise the index, limit, and address after LOOP. So, no need to resolve LEAVE at compile time, it knows the address from the loop control parameters at run-time.

Another approach is to store the control-flow stack depth on DO, place the unresolved forward reference on the control-flow stack under all other already placed values (using the stored depth) on LEAVE, and then resolve all the placed forward references on LOOP.

See my high-level implementation of DO LOOP on the basis of BEGIN UNTIL and AHEAD THEN (spoiler warning).

Soho answered 9/10, 2019 at 18:11 Comment(2)
There seem to be two problems here. a: an unknown number of LEAVES and b: that the LEAVE can jump out of the middle of another control structure - such as DO...IF LEAVE THEN LOOP, that combination seems to create over-complex and therefore fragile, solutions ?Redan
@MitraArdron, my solution covers these both problems, so it's robust. And it can be tested in any rich enough standard Forth system. The only substantial environmental dependency is that the control-flow stack should be implemented using the data stack. I.e., otherwise the code should be adjusted to deal with the separate control-flow stack (or use cs-roll instead for portability).Soho
I
5

As another approach, you can thread a singly-linked list through the unresolved address "holes". I used this when I implemented counted loops in my FORTH:

( We can now define DO, ?DO, LOOP, +LOOP and LEAVE. It would be much easier
  if LEAVE didn't exist, but oh well. Because storing LEAVE's data on the stack
  would interfere with other control flow inside the loop, let's store it in a variable. )

VARIABLE LEAVE-PTR

( Let's consider the base case: only one LEAVE in the loop. This can be trivially
  handled by storing the address we need to patch in the variable.

  This would also work quite well with nested loops. All we need to do is store
  the old value of the variable on the stack when opening a loop.

  Finally, we can extend this to an arbitrary number of LEAVEs by threading
  a singly-linked list through the branch target address holes. )

\ ...

: LEAVE, ( -- )
  HERE
  LEAVE-PTR @ ,
  LEAVE-PTR !
;

: LEAVE POSTPONE BRANCH LEAVE, ; IMMEDIATE

: DO ( -- old-leave-ptr loop-beginning )
  LEAVE-PTR @
  0 LEAVE-PTR !
  POSTPONE 2>R
  HERE
; IMMEDIATE

\ SOME-LOOP is the common code between LOOP and +LOOP
: SOME-LOOP ( old-leave-ptr loop-beginning -- )
  POSTPONE 0BRANCH ,
  LEAVE-PTR @
  BEGIN
    ?DUP
  WHILE
    DUP @ >R
    HERE SWAP !
    R>
  REPEAT
  POSTPONE UNLOOP
  LEAVE-PTR !
;
Inward answered 14/2, 2020 at 19:2 Comment(1)
This is also how taliforth worksArmalda
S
1

Since somebody already posted a great high-level solution I thought it might help to address things from a different perspective. I've recently written a Forth library called Shi in ARM-Thumb2 assembly.

In case you're comfortable with reading assembly code the source of leave can be found here.

The way it works is almost like you described it. I've used a single byte to count the nesting levels of do...loop constructs and a dedicated stack pointer which I called "control-flow stack pointer". This special stack pointer points to the end of the data stack and can push forward references in reverse order. Pushing things in from the other side has the benefit that everything else on the stack stays untouched.

The nesting level and some pointer arithmetic then allows me to resolve all potential forward references a leave might have left regardless of how deeply nested the loop was or how many leaves there were.

Sulfamerazine answered 13/11, 2019 at 13:44 Comment(0)
K
1

I found this massively difficult when developing my Forth (like you, as a learning experience).

What I did in the end was this:

  1. each loop has a branch compiled from the top of the loop to the bottom, that's a necessity for ?DO but even for DO, I compile it anyway.
  2. each LEAVE works by unconditionally branching to the TOP of the loop (well, slightly below the top - right to the opening branch), which then all on its own branches back to the clean-up code at the bottom.
  3. That way, there's nothing for LOOP/+LOOP to resolve, it has all been done already.

Beats me how I came up with that wheeze - I read the code again this evening and I still don't understand how I ever got it to work!

Karlenekarlens answered 17/11, 2019 at 21:45 Comment(2)
How does a LEAVE unconditionally branch back to the top of the loop when the LEAVE could be nested, ie. how does it find that loop e.g. in DO..IF..LEAVE .. THEN ... LOOPRedan
Good point. I forgot to mention that (at the time I was doing it like this) I had a separate control stack for loops - stopped it getting muddled up with all the other control structures. So though I never knew where the bottom of the loop was, I could always find the top of the innermost loop.Karlenekarlens
L
0

In fig-Forth, DO and friends have a quite small and simple implementation.

It has a set of primitive words that implement the runtime behavior. These are all implemented in machine code.

(DO)   ( limit start -- ) move the two values to the return stack
I      ( -- index )       fetch the current value of the index variable
LEAVE  ( -- )             set the current index=limit
(LOOP) ( -- )             checks index<limit and branch back if true.

So all the compiler has to do is track the single branch address back to the start of the loop and compile the offset just after (LOOP). These are implemented in high level code.

: BACK  HERE - , ;
: DO  COMPILE (DO) HERE ; IMMEDIATE
: LOOP  COMPILE (LOOP) BACK ; IMMEDIATE

So, LEAVE in this definition doesn't do a branch at all. It just adjusts the stored index value so that LOOP will stop looping once execution gets there.

Longhorn answered 25/9, 2020 at 22:26 Comment(1)
This unfortunately won't match the ANS spec for LEAVE - the code between LEAVE and (LOOP) has to be skipped - and its resolving those branches that is hard.Redan

© 2022 - 2024 — McMap. All rights reserved.