Golang goroutine preemption

I was wondering how Go preempts goroutines since version 1.14, when the scheduler became non-cooperative. I studied the source code, but my knowledge isn't enough to understand what is happening and why. The Go version I used is 1.22.1. Here is what I've figured out so far:

  1. Go has a sysmon function, which runs on its own thread and does runtime management. That thread is responsible for triggering preemption. (go/src/runtime/proc.go)

  2. sysmon periodically (every 10ms, it seems) calls the retake function, which iterates over all Ps (processors; the structure seems to be an abstraction for a logical CPU core) and calls preemptone for each P that has been running or stuck in a syscall for too long.

  3. preemptone calls signalM (via preemptM), which sends SIGURG to the M (the OS thread currently running on that P).

  4. The initsig function in go/src/runtime/signal_unix.go installs the signal handler sighandler, which seems to be wrapped in the assembly function sigtramp. Here is its source code for amd64 (go/src/runtime/sys_linux_amd64.s):

     // Called using C ABI.
     TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME|NOFRAME,$0
         // Transition from C ABI to Go ABI.
         PUSH_REGS_HOST_TO_ABI0()

         // Set up ABIInternal environment: g in R14, cleared X15.
         get_tls(R12)
         MOVQ    g(R12), R14
         PXOR    X15, X15

         // Reserve space for spill slots.
         NOP SP      // disable vet stack checking
         ADJSP   $24

         // Call into the Go signal handler
         MOVQ    DI, AX  // sig
         MOVQ    SI, BX  // info
         MOVQ    DX, CX  // ctx
         CALL    ·sigtrampgo<ABIInternal>(SB)

         ADJSP   $-24

         POP_REGS_HOST_TO_ABI0()
         RET
    
  5. The sighandler function then calls doSigPreempt, which does some odd-looking stack and instruction pointer manipulation to call the function asyncPreempt:

     if ok, newpc := isAsyncSafePoint(gp, ctxt.sigpc(), ctxt.sigsp(), ctxt.siglr()); ok {
         // Adjust the PC and inject a call to asyncPreempt.
         ctxt.pushCall(abi.FuncPCABI0(asyncPreempt), newpc)
     }
    
  6. Here it seems that ctxt.pushCall adjusts the rsp and rip registers so that rip points to asyncPreempt and newpc (which, I believe, is the return address in the current signal handler) is pushed onto the stack:

     func (c *sigctxt) pushCall(targetPC, resumePC uintptr) {
         // Make it look like we called target at resumePC.
         sp := uintptr(c.rsp())
         sp -= goarch.PtrSize
         *(*uintptr)(unsafe.Pointer(sp)) = resumePC
         c.set_rsp(uint64(sp))
         c.set_rip(uint64(targetPC))
     }
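
To make step 6 concrete, here is a toy model of what pushCall appears to do. It is a sketch, not runtime code: the machine and sigContext types, the word-indexed stack, and the addresses 0x1000 and 0x2000 are all hypothetical, and it assumes resumePC is the PC at which the interrupted code should continue once the injected function returns.

```go
package main

import "fmt"

// sigContext is a hypothetical stand-in for the register state saved when a
// signal is delivered; only the two registers pushCall touches are modeled.
type sigContext struct {
	rsp uint64 // index into machine.stack (a toy "address")
	rip uint64
}

// machine is a toy model with a word-indexed stack that grows downwards.
type machine struct {
	stack []uint64
	ctx   sigContext
}

// pushCall mirrors the shape of the runtime function quoted above: it makes
// the saved context look as if the interrupted code had just executed a
// CALL to targetPC with resumePC as the return address.
func (m *machine) pushCall(targetPC, resumePC uint64) {
	m.ctx.rsp--                   // sp -= goarch.PtrSize (one word in this model)
	m.stack[m.ctx.rsp] = resumePC // store the return address
	m.ctx.rip = targetPC          // execution continues at the target
}

// ret models the RET at the end of the injected function: it pops the
// return address back into rip.
func (m *machine) ret() {
	m.ctx.rip = m.stack[m.ctx.rsp]
	m.ctx.rsp++
}

// simulate interrupts made-up code at PC 0x1000, injects a call to 0x2000,
// and then lets the injected function return.
func simulate() (afterPush, afterRet sigContext) {
	m := &machine{stack: make([]uint64, 16), ctx: sigContext{rsp: 8, rip: 0x1000}}
	m.pushCall(0x2000, m.ctx.rip)
	afterPush = m.ctx
	m.ret()
	afterRet = m.ctx
	return
}

func main() {
	a, b := simulate()
	fmt.Printf("after pushCall: rip=%#x rsp=%d\n", a.rip, a.rsp) // rip=0x2000 rsp=7
	fmt.Printf("after RET:      rip=%#x rsp=%d\n", b.rip, b.rsp) // rip=0x1000 rsp=8
}
```

In this model the interrupted code ends up exactly where it was, with the same stack pointer, after the injected function returns.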
    

The asyncPreempt function is also implemented in assembly; its source code for x64 is located in go/src/runtime/preempt_amd64.s:

    TEXT ·asyncPreempt(SB),NOSPLIT|NOFRAME,$0-0
        PUSHQ BP
        MOVQ SP, BP
        // Save flags before clobbering them
        PUSHFQ
        // obj doesn't understand ADD/SUB on SP, but does understand ADJSP
        ADJSP $368
        // But vet doesn't know ADJSP, so suppress vet stack checking
        NOP SP
        MOVQ AX, 0(SP)
        MOVQ CX, 8(SP)
        MOVQ DX, 16(SP)
        MOVQ BX, 24(SP)
        MOVQ SI, 32(SP)
        MOVQ DI, 40(SP)
        MOVQ R8, 48(SP)
        MOVQ R9, 56(SP)
        MOVQ R10, 64(SP)
        MOVQ R11, 72(SP)
        MOVQ R12, 80(SP)
        MOVQ R13, 88(SP)
        MOVQ R14, 96(SP)
        MOVQ R15, 104(SP)
        #ifdef GOOS_darwin
        #ifndef hasAVX
        CMPB internal∕cpu·X86+const_offsetX86HasAVX(SB), $0
        JE 2(PC)
        #endif
        VZEROUPPER
        #endif
        MOVUPS X0, 112(SP)
        MOVUPS X1, 128(SP)
        MOVUPS X2, 144(SP)
        MOVUPS X3, 160(SP)
        MOVUPS X4, 176(SP)
        MOVUPS X5, 192(SP)
        MOVUPS X6, 208(SP)
        MOVUPS X7, 224(SP)
        MOVUPS X8, 240(SP)
        MOVUPS X9, 256(SP)
        MOVUPS X10, 272(SP)
        MOVUPS X11, 288(SP)
        MOVUPS X12, 304(SP)
        MOVUPS X13, 320(SP)
        MOVUPS X14, 336(SP)
        MOVUPS X15, 352(SP)
        CALL ·asyncPreempt2(SB)
        MOVUPS 352(SP), X15
        MOVUPS 336(SP), X14
        MOVUPS 320(SP), X13
        MOVUPS 304(SP), X12
        MOVUPS 288(SP), X11
        MOVUPS 272(SP), X10
        MOVUPS 256(SP), X9
        MOVUPS 240(SP), X8
        MOVUPS 224(SP), X7
        MOVUPS 208(SP), X6
        MOVUPS 192(SP), X5
        MOVUPS 176(SP), X4
        MOVUPS 160(SP), X3
        MOVUPS 144(SP), X2
        MOVUPS 128(SP), X1
        MOVUPS 112(SP), X0
        MOVQ 104(SP), R15
        MOVQ 96(SP), R14
        MOVQ 88(SP), R13
        MOVQ 80(SP), R12
        MOVQ 72(SP), R11
        MOVQ 64(SP), R10
        MOVQ 56(SP), R9
        MOVQ 48(SP), R8
        MOVQ 40(SP), DI
        MOVQ 32(SP), SI
        MOVQ 24(SP), BX
        MOVQ 16(SP), DX
        MOVQ 8(SP), CX
        MOVQ 0(SP), AX
        ADJSP $-368
        POPFQ
        POPQ BP
        RET
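
The $368 in ADJSP matches the layout in the listing: 14 general-purpose registers at 8 bytes each, followed by 16 XMM registers at 16 bytes each. A small sketch that recomputes the offsets (frameLayout and the gprs slice are names made up for this illustration, not part of the runtime):

```go
package main

import "fmt"

const (
	gprSize = 8  // bytes per 64-bit general-purpose register
	xmmSize = 16 // bytes per 128-bit XMM register
)

// gprs lists the general-purpose registers in the order the listing stores them.
var gprs = []string{"AX", "CX", "DX", "BX", "SI", "DI", "R8", "R9", "R10", "R11", "R12", "R13", "R14", "R15"}

// frameLayout returns the stack offset of every saved register and the
// total number of bytes needed to hold them all.
func frameLayout() (map[string]int, int) {
	offsets := make(map[string]int)
	off := 0
	for _, r := range gprs {
		offsets[r] = off
		off += gprSize
	}
	for i := 0; i < 16; i++ {
		offsets[fmt.Sprintf("X%d", i)] = off
		off += xmmSize
	}
	return offsets, off
}

func main() {
	offsets, total := frameLayout()
	// Prints: R15 at 104 X0 at 112 X15 at 352 total 368
	fmt.Println("R15 at", offsets["R15"], "X0 at", offsets["X0"], "X15 at", offsets["X15"], "total", total)
}
```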

The function seems to save the registers to the stack and then call asyncPreempt2, a function written in Go, which in turn calls preemptPark via mcall, a function implemented in assembly:

    // mcall switches from the g to the g0 stack and invokes fn(g),
    // where g is the goroutine that made the call.
    // mcall saves g's current PC/SP in g->sched so that it can be restored later.
    // It is up to fn to arrange for that later execution, typically by recording
    // g in a data structure, causing something to call ready(g) later.
    // mcall returns to the original goroutine g later, when g has been rescheduled.
    // fn must not return at all; typically it ends by calling schedule, to let the m
    // run other goroutines.
    //
    // mcall can only be called from g stacks (not g0, not gsignal).
    //
    // This must NOT be go:noescape: if fn is a stack-allocated closure,
    // fn puts g on a run queue, and g executes before fn returns, the
    // closure will be invalidated while it is still executing.
    func mcall(fn func(*g))

And here's the assembly implementation for x64:

    // func mcall(fn func(*g))
    // Switch to m->g0's stack, call fn(g).
    // Fn must never return. It should gogo(&g->sched)
    // to keep running g.
    TEXT runtime·mcall<ABIInternal>(SB), NOSPLIT, $0-8
        MOVQ    AX, DX  // DX = fn
    
        // Save state in g->sched. The caller's SP and PC are restored by gogo to
        // resume execution in the caller's frame (implicit return). The caller's BP
        // is also restored to support frame pointer unwinding.
        MOVQ    SP, BX  // hide (SP) reads from vet
        MOVQ    8(BX), BX   // caller's PC
        MOVQ    BX, (g_sched+gobuf_pc)(R14)
        LEAQ    fn+0(FP), BX    // caller's SP
        MOVQ    BX, (g_sched+gobuf_sp)(R14)
        // Get the caller's frame pointer by dereferencing BP. Storing BP as it is
        // can cause a frame pointer cycle, see CL 476235.
        MOVQ    (BP), BX // caller's BP
        MOVQ    BX, (g_sched+gobuf_bp)(R14)
    
        // switch to m->g0 & its stack, call fn
        MOVQ    g_m(R14), BX
        MOVQ    m_g0(BX), SI    // SI = g.m.g0
        CMPQ    SI, R14 // if g == m->g0 call badmcall
        JNE goodm
        JMP runtime·badmcall(SB)
    goodm:
        MOVQ    R14, AX     // AX (and arg 0) = g
        MOVQ    SI, R14     // g = g.m.g0
        get_tls(CX)     // Set G in TLS
        MOVQ    R14, g(CX)
        MOVQ    (g_sched+gobuf_sp)(R14), SP // sp = g0.sched.sp
        PUSHQ   AX  // open up space for fn's arg spill slot
        MOVQ    0(DX), R12
        CALL    R12     // fn(g)
        POPQ    AX
        JMP runtime·badmcall2(SB)
        RET

And here I'm completely lost. I know it's quite a long story, and sorry for that, but maybe someone could clarify what's going on here so I can understand at least part of it.

Why is sighandler wrapped in the assembly function sigtramp?

What does this line do? ctxt.pushCall(abi.FuncPCABI0(asyncPreempt), newpc) I see that it adjusts rip and rsp, but what is it trying to achieve? Won't pointing rip at another address just make the CPU execute whatever is stored at that address and continue from there? And if so, what is the point, since it looks like the same effect could be achieved by just calling asyncPreempt? I'm clearly missing something.

Where exactly are the interrupted goroutine's rip, stack, and registers stored when the signal handler is called? It seems like asyncPreempt prepares storage for the registers and mcall saves the pushed data to gobuf here:

MOVQ    BX, (g_sched+gobuf_sp)(R14)

But aren't the registers modified by Go code before asyncPreempt is called? How can it be guaranteed that they are unchanged and still belong to the goroutine?

It also saves rip there from the stack:

MOVQ    BX, (g_sched+gobuf_pc)(R14)

But I fail to understand what placed rip on the stack in the first place. Was it done by the kernel before calling the signal handler? I know that on x86_64 the return address should be at the top of the stack frame (rbp + 8, basically), but I fail to track down where the return address of the current goroutine was stored before the signal handler was called and what happened to it afterwards.
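
For anyone who wants to reproduce the path described above, here is a minimal sketch of a program that should depend on it. It assumes Go 1.14 or later on a platform with signal-based preemption and that GODEBUG=asyncpreemptoff=1 is not set; spin and resumeAfterSpin are names made up for this example. If asynchronous preemption were unavailable, the call would be expected to hang instead of returning.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// spin loops forever without calling any function, so it never reaches a
// cooperative preemption point. It is never stopped; process exit ends it.
func spin() {
	for {
	}
}

// resumeAfterSpin restricts the program to a single P, starts the spinner,
// and reports how long the calling goroutine waited before running again.
func resumeAfterSpin() time.Duration {
	runtime.GOMAXPROCS(1)
	go spin()
	start := time.Now()
	time.Sleep(10 * time.Millisecond) // gives the only P to the spinner
	return time.Since(start)
}

func main() {
	// Reaching this print means the spinner was taken off the P without
	// its cooperation.
	fmt.Println("main resumed after", resumeAfterSpin())
}
```

Running it with GODEBUG=asyncpreemptoff=1 should be a quick way to compare behaviour with the signal path disabled.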

Coniferous answered 19/4 at 11:23
