Interrupting instruction in the middle of execution

Suppose a CPU is executing an assembly instruction, say FOO, that takes several clock cycles (e.g. 10).

An interrupt request arrives in the middle of executing FOO, and the processor needs to service it. Does it wait until the instruction has finished executing, or is FOO aborted and restarted later? Does the behaviour differ depending on the type or priority of the interrupt?

Sternutatory answered 8/12, 2018 at 21:32 Comment(3)
Yes, it waits. See the documentation, in particular Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 3: System Programming Guide, section 6.6 PROGRAM OR TASK RESTART, which says "All interrupts are guaranteed to be taken on an instruction boundary." - Raglan
See stackoverflow.com/questions/8902132/… -- when the interrupt signal is received, the instructions in the pipeline are nearly always flushed (thrown away). - Phonics
REP-type instructions such as REP MOVSB or REP STOSB do get interrupted, and then continue with the updated register values when the interrupt returns. - Lonesome

The CPU can do either: it gets to decide at which point in the original instruction stream the interrupt is handled.

Instructions that have been issued, but not yet dispatched to an execution unit, are cancelled in current implementations from AMD and Intel. (See the related question "When an interrupt occurs, what happens to instructions in the pipeline?")

With out-of-order execution, typically dozens of instructions are in flight, and more than one can literally be in the middle of executing in an ALU at once.

But it's an interesting question whether low-latency instructions like add or imul that have started executing but not yet retired are allowed to complete and update the architectural state that the interrupt handler sees.

If not, it's probably because of the difficulty of building the logic for detecting how many more contiguous instructions will be ready to retire "soon", beyond the current retirement state. Interrupts are rare (one per thousands of instructions at worst, or one per millions of instructions with low I/O load), so the benefit of squeezing a bit more throughput out of the code around interrupt handling is low. And any potential cost in interrupt latency would be a downside.


Some instructions, especially micro-coded ones, have mechanisms for being interrupted without having to restart from scratch. For example:

  • rep movsb can leave RSI, RDI, and RCX updated to part-way through a copy (so it will finish the copy on restart). The other REP-string instructions can similarly be interrupted. Only a single count of the operation is atomic with respect to interrupts.

    Even when single-stepping in a debugger (by setting TF), the CPU breaks after each count, so from an interrupt PoV it really is repeating a separate movsb instruction RCX times.

  • AVX2 gathers like vpgatherdd have an input mask vector that shows which elements to gather vs. ignore. It clears mask elements after successfully gathering the corresponding index. On an exception (e.g. page fault), the faulting element is the right-most element with its mask still set (gather order is not guaranteed, but fault order is, see Intel's manual entry).

This makes it possible for a gather to succeed without needing all the relevant pages to be mapped at the same time. Evicting an already-gathered element while paging in another can't lead to an infinite loop, even in a memory-pressure corner case. Forward progress is guaranteed.

On an async interrupt, the hardware could similarly leave the gather partially done, using the mask to record progress. IDK if any hardware actually does that, but the ISA design leaves that option open.

Anyway, this is why you need to keep creating a fresh all-ones mask inside the loop for every gather.
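To make the mask mechanism concrete, here is a minimal software model in C. It is only a sketch, not how any real hardware is implemented: `gather_model`, `LANES` and the `present` array (a hypothetical stand-in for the page tables) are names made up for illustration, and the model simply walks the lanes from lowest to highest.

```c
#include <stdbool.h>
#include <stdint.h>

#define LANES 8  /* eight 32-bit elements, like a 256-bit vpgatherdd */

/* Software model only, not real hardware.
 * Gathers dst[lane] = base[idx[lane]] for every lane whose mask entry is set,
 * clearing the mask entry after each successful element.
 * present[j] == false is a hypothetical stand-in for "touching base[j]
 * would page-fault".
 * Returns -1 when the gather completes, or the lane that faulted. */
int gather_model(int32_t dst[LANES], const int32_t *base,
                 const int32_t idx[LANES], bool mask[LANES],
                 const bool *present)
{
    for (int lane = 0; lane < LANES; lane++) {
        if (!mask[lane])
            continue;            /* already gathered, or masked off by the caller */
        if (!present[idx[lane]])
            return lane;         /* fault: progress so far stays recorded in mask */
        dst[lane] = base[idx[lane]];
        mask[lane] = false;      /* mark this lane as done */
    }
    return -1;
}
```

In this model, calling `gather_model` again after the "page" has been made present resumes at the faulting lane instead of starting over. Once it returns -1 the mask is all-false, so calling it again with the same mask gathers nothing, which is the reason for re-creating the mask before each gather.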

AVX512 gathers and scatters have the same mechanism, but with a mask register instead of a vector register. http://felixcloutier.com/x86/VPSCATTERDD:VPSCATTERDQ:VPSCATTERQD:VPSCATTERQQ.html
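Going back to the rep movsb case from the first bullet, the "interrupted part-way, then resumed" behaviour can be sketched with a similar software model in C. Again this is only an illustration under stated assumptions: `struct regs`, `rep_movsb_model` and the `iters_until_irq` parameter are hypothetical names, the direction flag is assumed to be clear, and real hardware may copy in larger chunks internally.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register file for the model; names mirror the x86 registers. */
struct regs {
    const uint8_t *rsi;  /* source pointer */
    uint8_t *rdi;        /* destination pointer */
    size_t rcx;          /* remaining byte count */
};

/* Software model only: run "rep movsb" (DF=0) until it finishes, or until
 * iters_until_irq iterations have run and a pending interrupt is taken.
 * Returns 1 if the copy finished (RCX == 0), 0 if it was interrupted. */
int rep_movsb_model(struct regs *r, size_t iters_until_irq)
{
    while (r->rcx != 0) {
        if (iters_until_irq == 0)
            return 0;           /* interrupt taken between two iterations */
        *r->rdi++ = *r->rsi++;  /* one movsb: atomic with respect to interrupts */
        r->rcx--;
        iters_until_irq--;
    }
    return 1;
}
```

After an interrupted call, `rsi`, `rdi` and `rcx` describe exactly the part of the copy that remains, so calling the function again with the same `struct regs` (the equivalent of returning to the same instruction) finishes the copy without redoing any bytes.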


Very slow instructions without a mechanism for being interrupted and restarted include wbinvd (sync all caches to main memory and invalidate them). Intel's manual mentions that wbinvd does delay interrupts:

"As a consequence, the use of the WBINVD instruction can have an impact on logical processor interrupt/event response time."

This is probably why it's a privileged instruction. There's lots of stuff that user-space can do to make the system slow (e.g. use up lots of memory bandwidth), but it can't increase interrupt latency too dramatically. (Stores that have retired from the ROB but not yet committed to L1d can increase interrupt latency because they have to happen and can't be aborted. But creating a pathological case of lots of scattered cache-miss stores in flight is harder, and the store buffer size is small.)


Fidgety answered 8/12, 2018 at 22:27 Comment(3)
A few instructions (hlt, mwait) must be interruptible to make sense, and there's a special case for "instruction after move to SS". It's probably better (for interrupt handling, not optimisation) to think of the behaviour of rep as "re-execute the instruction N times", so that the behaviour of (e.g.) mov ss,ax; rep movsd; is easy to guess (IRQs inhibited for the first movsd but not for subsequent/repeated iterations). - Paleography
There's also a "nasty case" involving syscall that might be worth mentioning. An SMI, NMI or machine check can occur after syscall but before the next instruction, causing the CPU to try to start the interrupt handler with "CPL=0 with undefined/CPL=3's stack" if IST isn't used (and causing a "not re-entrant" disaster if IST is used). - Paleography
My testing indicates that the instruction the interrupt returns to is the instruction after the oldest unretired instruction at the time the interrupt is received. This means that, except for this one instruction, all unretired instructions are thrown away. That one instruction does get to complete, however, so an executing instruction is sometimes allowed to complete. Even if other, newer instructions adjacent to this one are ready to retire, they will not be retired: a full block of 4 instructions is not retired at the interrupt, just one. - Full
