How do block params work in WebAssembly (with multivalue support)?
Asked Answered
M

3

6

Blocks can consume values from the stack and push results back to it, which is cool, but I cannot find any detail on how this works.

The spec for loop says (at 5) that the values are popped from the stack (before any code within the block gets executed), and the other block instructions (block and if) are described similarly. They all pop the params before executing the block:

loop blocktype instr* end

  1. Assert: due to validation, expand𝐹(blocktype) is defined.
  2. Let [π‘‘π‘š1]β†’[𝑑𝑛2] be the function type expand𝐹(blocktype).
  3. Let 𝐿 be the label whose arity is π‘š and whose continuation is the start of the loop.
  4. Assert: due to validation, there are at least π‘š values on the top of the stack.
  5. Pop the values valπ‘š from the stack.
  6. Enter the block valπ‘š instrβˆ— with label 𝐿.

Where are block params popped to? Blocks do not have their own locals, so I have no idea how to actually use the params (presumably they are not just thrown away).

The spec does not say anything about loop blocks popping params on each iteration, but I thought that was how they work.

I also struggled to establish whether loop blocks push results to the stack on each iteration, or only once, when the loop exits. Presumably, it's once, but with my ignorance on the other points, I'm not confident about that either.

I found this code in the proposal that may make things clearer (to somebody):

(func $fac (param i64) (result i64)
    (i64.const 1) (get_local 0)
    (loop $l (param i64 i64) (result i64)
        (pick 1) (pick 1) (i64.mul)
        (pick 1) (i64.const 1) (i64.sub)
        (pick 0) (i64.const 0) (i64.gt_u)
        (br_if $l)
        (pick 1) (return)
    )
)
Muddle answered 26/3, 2021 at 2:45 Comment(0)
O
2

I am new to WebAssembly, but here is how I was able to use it (replace loop by block if needed):

Following .wat adds 1 to input param

(module
    (func (export "myFunc") (param i32) (result i32)
        get_local 0
        (loop (param i32) (result i32)
            i32.const 1
            i32.add
        )
    )
)

index.html

<div id="out"></div>
<script>
const out = document.getElementById("out");
WebAssembly.instantiateStreaming(fetch('module.wasm'))
    .then(obj => {
        out.innerHTML = obj.instance.exports.myFunc(3);
    });
</script>

Alternatively, without loop param you could do the following

(module
    (func (export "myFunc") (param i32) (result i32)
        (loop (result i32)
            get_local 0
            i32.const 1
            i32.add
        )
    )
)

There are really not many examples I could find, so I am using unit tests as material

Oxblood answered 29/3, 2021 at 15:30 Comment(1)
Thanks Yoz. I'm still confused though. So, it seems that the block cannot be entered unless the stack (only) contains the types that the block takes as params (none by default). That makes sense, and the block's params are just left on the stack. But, the spec says the values are popped from the stack, and I'm pretty sure the Multivalue proposal said loops pop values on each iteration. I'll update my question to quote the official spec. – Muddle
S
2

Step 6 you quote ("Enter the block val^π‘š instr* with label 𝐿"), pushes the values back to the stack, but only after it has pushed the label L to it. Hence, after this, the label marks the point on the stack up to which operands must be consumed before leaving the block. Due to validation, that usually is guaranteed to happen by construction, except with branches: a branch may "throw away" parts of the stack. So, to specify the semantics of branches, we need this marker in the right place.

For example:

(i32.const 1)                        ;; stack: 1
(i32.const 2)                        ;; stack: 2 1
(block $l (param i32) (result i32)   ;; stack: 2 $l 1
  (i32.const 3)                      ;; stack: 3 2 $l 1
  (i32.const 4)                      ;; stack: 4 3 2 $l 1
  (br $l)
)                                    ;; stack: 4 1

This example breaks out of the block, throwing away the constants 2 and 3. In contrast, the outside value 1 remains on the stack, and so does 4 as the block's result. The only way the execution rule for br knows this is by finding the label in the respective position on the stack.

Loops work analogously:

(i32.const 1)          ;; stack: 1
(i32.const 2)          ;; stack: 2 1
(loop $l (param i32)   ;; stack: 2 $l 1 (4 $l 1 after branch)
  (i32.const 3)        ;; stack: 3 2 $l 1 (3 4 $l 1 after branch)
  (i32.const 4)        ;; stack: 4 3 2 $l 1 (4 3 4 $l 1 after branch)
  (br $l)
)                      ;; never reached

Here, when branching back to the loop entry, the original parameter 2 and the local 3 are dropped, while the outside 1 remains on the stack and 4 becomes the new parameter upon reentry. (The loop never terminates, so its result type is irrelevant in this example.)

There would of course be other ways to model this, e.g., by introducing a separate stack of labels, where each label entry records the height of the operand stack it needs to go back to. But that would be somewhat more complicated structurally.

Of course, all this is just an abstract model, and not how it will work in a real implementation. Real implementations usually don't use an operand stack at all but store intermediate values in registers. The stack-based instruction set is just an abstraction to make the code format more compact, by referencing the required registers implicitly.

FWIW, the parameterisation can perhaps be understood more easily by thinking of blocks as local functions that are implicitly called when entering the respective block. For example, in the first example, replace the block with (call $block_$l) to the following function:

(func $block_$l (param i32) (result i32)
  (local.get 0)  ;; re-push parameter
  (i32.const 3)
  (i32.const 4)
  (return)       ;; br becomes return
)

Similarly, loops are (tail-)recursive functions:

(func $loop_$l (param i32)
  (local.get 0)           ;; re-push parameter
  (i32.const 3)
  (i32.const 4)
  (return_call $loop_$l)  ;; br becomes return_call
)

Of course, this analogy only works for branches to the innermost block/loop.

Steib answered 15/6 at 13:23 Comment(2)
Thanks for the answer! Can you add an explanation for the original example of a loop with parameters, and possibly update your example to show how the forward jump works with a block with a return type? – Ritornello
Done. Also added explanation in terms of functions. – Steib
E
1

I too am struggling to understand the baroque way that blocks are defined in WebAssembly. This is more or less what the loop spec says:

  • pop the m values val_m from the stack
  • enter the block val_m instr* with label L

This last step means: push label L; push val_m; and start executing the instruction stream.

What they mean here is that the program must behave as if this procedure had been followed. The effect is to insert the label L (a local return address, if you like) into the stack, underneath the block parameters.

Of course no real-world implementation would do this; it is just a way of specifying the required behaviour, which people are free to implement as they see fit.

Extradition answered 1/4, 2021 at 15:52 Comment(3)
I see. It makes sense that they would define the process in abstract terms. So, if I understand correctly, the effect of (block-instruction (param i32) ... is to (effectively) pop an i32, push the label for the block, then push the popped i32 back to the stack, so we can have params on top of the label? So, in practice (IIUC), this means that the stack no longer needs to be empty when we enter the block?? – Muddle
Does that mean that loop instructions do not really pop anything on each iteration, except in the same abstract terms as when the block is entered initially (just under the hood stack operations that ensure the label is in the right place)? I'm still pretty vague on how this all works, but your answer was helpful. I don't have enough rep to upvote anything yet, but will come back and fix it when I have. – Muddle
Yes, that's my understanding of the process. (What I'm currently struggling with is how validation proceeds when a frame has become unreachable. The test cases in Yoz's link have many examples of this, and it seems that you have to be strict about things.) – Extradition

© 2022 - 2024 β€” McMap. All rights reserved.