What is the intended/correct way to handle interrupts and use the WFI RISC-V CPU instruction?

I am very new to bare-metal programming and have never dealt with interrupts before, but I've been learning on a dev board powered by the RISC-V FE310-G002 SoC.

I've been reading about the RISC-V WFI (Wait For Interrupt) instruction, and from the manuals it doesn't sound like you can rely on it to actually sleep the core. Instead, it only hints to the system that execution can be halted, and the instruction should be treated more like a NOP. However, this seems rather useless to me. Consider the following ASM program snippet:

wfi_loop:
WFI
J wfi_loop

The loop would be necessary since WFI cannot be depended on to stall. However, upon MRET from the interrupt handler, you would still be caught in the loop. So you would have to make it conditional on a global variable whose value is updated in the interrupt handler. This seems very messy.

Also, if your implementation does in fact honor the WFI instruction and the interrupt is triggered just prior to the execution of the WFI instruction, the entire core will stall until some other interrupt is triggered since it will return prior to the WFI instruction.

It seems that the only correct usage of the instruction would be inside of a kernel scheduler when there is no work to be done. But even then, I don't think you would ever want to return from the interrupt handler into such code, but rather restart the scheduler algorithm from the start. But that would be a problem too since you would somehow have to roll back the stack, etc....

I keep going round and round with this in my head and I can't seem to figure out a safe use. Maybe if you atomically enable interrupts with CSRRSI and then immediately call WFI, like this:

CSRRSI zero, mie, 0x80
wfi_loop:
WFI
J wfi_loop
NOP
NOP

Then make sure to increment the mepc register by 8 bytes before calling MRET from the interrupt handler. The interrupt would also have to be disabled again in the mie register inside of the interrupt handler before returning. This solution would only be safe if WFI, J, and NOP are all encoded as 4 byte instructions, regardless of whether compressed instructions are used. It also depends on the program counter reaching the WFI instruction before it is possible for the interrupt to be triggered, after being enabled by the CSRRSI instruction. This would then allow the interrupt to be triggered in a safe place in code and to return in such a way that it breaks out of the loop that was waiting for it.

I guess I am just trying to understand what behavior I can expect from the hardware and, therefore, how to correctly call and return from interrupts and use the WFI instruction?

Enlistment answered 19/8, 2020 at 9:41 Comment(5)

It's the kind of thing you want to use when your solution is event driven. It is not for when you enable one interrupt and then call wfi to wait for it; in that situation, don't interrupt the processor, poll the peripheral. Basically, use it when you don't care about the foreground and are waiting for whatever interrupt comes along next, however long that takes. – Peskoff 19/8, 2020 at 12:42

In some architectures this is how you reduce power: be event driven, wait for an interrupt or event to happen, handle it, then go back to waiting. If there is no foreground code other than setup, then you would just put wfi in an infinite loop. – Peskoff 19/8, 2020 at 12:43

I don't know RISC-V but this instruction seems a lot like x86's hlt. Which "halts" until the next hardware interrupt request (IRQ) occurs. It is often misinterpreted as "halting" the machine forever though. So wfi is a better mnemonic choice. – Disfigure 19/8, 2020 at 12:44

That said, here is my idling function which has sti \ hlt as the last attempt to idle the machine. It is called here when waiting for key input, after having detected none available yet. We do not care which interrupt request occurs, the only need is that it will resume if a key is pressed (IRQ 1). It is essentially polling with sleep while waiting. (86-DOS applications assume that they are the sole foreground task.) – Disfigure 19/8, 2020 at 13:32

The sleep that is caused by hlt can be short (especially if running in a virtual machine or such), but what is short to us is already very long for the machine to burn if busy-looping. Idling properly allows the CPU to underclock and sleep for easily >95% of the time. I imagine wfi is the same as hlt for these purposes. – Disfigure 19/8, 2020 at 13:35

There should be one task/thread/process that is for idling, and it ought to look like your first bit of code.

Since the idle thread is set up to have the lowest priority, if the idle thread is running, that means that either there are no other threads to run or that all other threads are blocked.

When an interrupt happens that unblocks some other thread, the interrupt service routine should resume that blocked thread instead of the interrupted idle thread.

Note that a thread that blocks on IO is itself also interrupted — it is interrupted via its own use of ecall. That exception is a request for IO and causes this thread to block — it cannot be resumed until the IO request is satisfied.

Thus, a thread that is blocked on IO is suspended just the same as if it was interrupted — and a clock interrupt or IO interrupt is capable of resuming a different process than the one immediately interrupted, which will happen in the case that the idle process was running and some event that a process was waiting for happens.

What I do is use the scratch CSR to point to the context block for the currently running process/thread. On interrupt, I save the fewest registers necessary to (start to) service the interrupt. If the interrupt results in some other process/thread becoming runnable, then when resuming from the interrupt, I check process priorities and may choose a context switch instead of resuming whatever was interrupted. If I resume what was interrupted, it's a quick restore. To switch contexts, I finish saving the interrupted thread's CPU context, then resume another process/thread, switching the scratch register.

(For nested interrupts, I don't allow context switches on resume, but on interrupts after saving current context, I do set up the scratch csr to an interrupt stack of context blocks before re-enabling higher priority interrupts. Also, as a very minor optimization we can assume that a custom written idle thread doesn't need anything but its pc saved/restored.)
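The "check priorities on resume" step described above can be sketched in plain C. This is only an illustration, not the answerer's actual code: `struct context`, its fields, `pick_next`, and the "larger number means more urgent" convention are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread context block; the scratch CSR would point at
 * the one belonging to the currently running thread. */
struct context {
    uintptr_t pc;        /* saved program counter (mepc at interrupt time) */
    uintptr_t regs[31];  /* saved x1..x31 */
    int       priority;  /* assumed convention: larger means more urgent */
    bool      runnable;  /* false while blocked, e.g. waiting for I/O */
};

/* Decide which context to resume when leaving the interrupt handler.
 * The interrupted context is kept unless it is blocked or some other
 * runnable context has strictly higher priority. The idle thread is
 * expected to be in the table and always runnable, so a runnable
 * context is always found. */
struct context *pick_next(struct context *interrupted,
                          struct context *all, size_t n)
{
    struct context *best = interrupted;
    for (size_t i = 0; i < n; i++) {
        if (all[i].runnable &&
            (!best->runnable || all[i].priority > best->priority)) {
            best = &all[i];
        }
    }
    return best;
}
```

If `pick_next` returns the interrupted context, the handler does the quick restore; otherwise it would finish saving the interrupted context, point the scratch CSR at the returned block, and restore from there before `mret`.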

Roana answered 19/8, 2020 at 11:22 Comment(2)

This should give me a direction to go in. I'll need to implement a thread/process handling system, but it should be an interesting sub-project. Thanks for the help. I think I was too deep in the weeds to see the right way to use it. – Enlistment 19/8, 2020 at 17:46

This is a good answer, but it should perhaps be pointed out that it's not the only way to do it. If I get time, I'll post my own alternate answer here. – Revoice 23/9, 2023 at 10:8

So you would have to make it conditional against a global variable whose value is updated in the interrupt handler.

You have to do that regardless of the implementation of wfi as you don't know what event caused the hart to wake up.
You may have n interrupts enabled when executing wfi and any of them may have been raised.
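One way to find out which event woke the hart is for every handler to record its event in a shared bitmask that the foreground code reads after waking. A minimal sketch using C11 atomics follows; the event names and the `post_event`/`take_events` helpers are hypothetical, and it assumes the target supports atomic read-modify-write operations (the A extension on RISC-V).

```c
#include <stdatomic.h>

/* Hypothetical event bits, one per interrupt source of interest. */
enum {
    EV_TIMER  = 1u << 0,
    EV_UART   = 1u << 1,
    EV_BUTTON = 1u << 2
};

/* Shared between interrupt handlers and foreground code. */
static atomic_uint pending_events;

/* Called from an interrupt handler: record that this event happened. */
void post_event(unsigned ev)
{
    atomic_fetch_or(&pending_events, ev);
}

/* Called by foreground code after every wake-up: fetch and clear the
 * whole set in one atomic step, so an event posted in between is
 * neither lost nor handled twice. */
unsigned take_events(void)
{
    return atomic_exchange(&pending_events, 0u);
}
```

A return value of 0 from `take_events` means the wake-up was spurious or unrelated, and the caller simply goes back to waiting.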

wfi is an optimization: it saves power until something happens. As you noted, the OS scheduler may find itself in a state where no thread is schedulable (e.g. they all wait for IO, or there simply are none). In that case it has to do something like this (with all the necessary visibility and atomicity semantics):

while ( ! is_there_a_schedulable_thread());

That's just waiting.
But rather than spinning a tight loop (which may hurt performance and power) the scheduler can use:

while ( ! is_there_a_schedulable_thread())
{
  __wfi();
}

At worst it is just like the tight loop; at best it will pause the hart until an external interrupt happens (meaning that potentially an IO was completed and thus a thread may be free to run).

Even in the case of no threads, waking up every x microseconds (due to a timer interrupt) is better than wasting power looping.
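The "visibility and atomicity" caveat matters because an interrupt can arrive between checking the condition and executing wfi. A common pattern, sketched below under stated assumptions, is to make the check with interrupts globally disabled: as I read the privileged spec, wfi still resumes when an interrupt enabled in mie becomes pending even while mstatus.MIE is clear, and the handler runs once interrupts are re-enabled. The `work_ready` flag, the helper names, and the host-side stand-ins (which fake an interrupt on the third wake-up so the sketch can run off-target) are all hypothetical.

```c
#include <stdbool.h>

/* Hypothetical flag; on real hardware an interrupt handler sets it. */
static volatile bool work_ready = false;

#if defined(__riscv)
/* mstatus.MIE is bit 3, so the 5-bit CSR immediate 8 selects it. */
static inline void irq_disable(void) { __asm__ volatile("csrci mstatus, 8" ::: "memory"); }
static inline void irq_enable(void)  { __asm__ volatile("csrsi mstatus, 8" ::: "memory"); }
static inline void cpu_wfi(void)     { __asm__ volatile("wfi" ::: "memory"); }
#else
/* Host stand-ins (assumption, for running off-target): the third
 * wake-up pretends that the interrupt handler ran and set the flag. */
static unsigned fake_irq_countdown = 3;
static inline void irq_disable(void) {}
static inline void irq_enable(void)  {}
static inline void cpu_wfi(void)     { if (--fake_irq_countdown == 0) work_ready = true; }
#endif

/* Wait until work_ready is set; returns how many times wfi returned. */
unsigned wait_for_work(void)
{
    unsigned wakeups = 0;

    irq_disable();              /* close the check-then-sleep window */
    while (!work_ready) {
        cpu_wfi();              /* may stall, return early, or act as a NOP */
        wakeups++;
        irq_enable();           /* a pending interrupt is taken here */
        irq_disable();          /* re-check the flag with interrupts off */
    }
    work_ready = false;         /* consume the event */
    irq_enable();
    return wakeups;
}
```

Because the flag is only tested while interrupts are off, an interrupt that fires just before wfi stays pending and makes wfi fall through instead of stalling the hart.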

wfi can also be useful in embedded programming if all of the work happens in interrupt handlers (e.g. when a button is pushed or similar).
In that case, the main function would simply loop forever, just like the scheduler but without an exit condition.
On implementations that actually stall the hart, a wfi instruction in that loop can greatly improve battery life.

You can't use wfi everywhere, though, or you may find yourself waiting for an interrupt that never happens (in fact, it's a privileged instruction).

Think of it as an optimization for coordinating with the hardware.

In particular, it was not designed as a way to be sure an interrupt was fired:

void wait_for_int(int int_num)
{
   //Leave only interrupt int_num enabled
   enable_only_int(int_num);
   __wfi();
   restore_interrupts();
}

It could be used that way given a specific implementation of RISC-V but as you see from the pseudo-code, it's not really that convenient.
Disabling all but one interrupt is generally something that an OS cannot afford.
An embedded application could, though.
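For an embedded application that does want this, the two helpers in the pseudo-code could amount to saving and restoring the mie CSR. The sketch below is only one possible reading: `mie_swap` and the host-side `fake_mie` variable are made up for illustration, and the bit positions assume the standard machine-mode layout (MSIE bit 3, MTIE bit 7, MEIE bit 11).

```c
#include <stdint.h>

#if defined(__riscv)
/* Write a new value to mie and return the previous one (csrrw swaps). */
static inline uintptr_t mie_swap(uintptr_t v)
{
    uintptr_t old;
    __asm__ volatile("csrrw %0, mie, %1" : "=r"(old) : "r"(v) : "memory");
    return old;
}
static inline void cpu_wfi(void) { __asm__ volatile("wfi" ::: "memory"); }
#else
/* Host stand-ins (assumption, for running off-target):
 * 0x888 = MSIE | MTIE | MEIE, i.e. all three machine interrupts enabled. */
static uintptr_t fake_mie = 0x888;
static inline uintptr_t mie_swap(uintptr_t v)
{
    uintptr_t old = fake_mie;
    fake_mie = v;
    return old;
}
static inline void cpu_wfi(void) {}
#endif

/* Leave only interrupt int_num enabled, wait, then restore the previous
 * enables. Returns the mie value that was saved and restored. */
uintptr_t wait_for_int(unsigned int_num)
{
    uintptr_t saved = mie_swap((uintptr_t)1 << int_num);
    cpu_wfi();          /* may still return without any interrupt */
    mie_swap(saved);
    return saved;
}
```

Since wfi is allowed to return early, a caller that must be sure the interrupt really fired would still need to check a flag set by the handler (or the pending bit) after `wait_for_int` returns.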

Assurgent answered 19/8, 2020 at 11:22 Comment(0)
