I was wondering how Go preempts goroutines since version 1.14, where the scheduler became non-cooperative, so I studied the source code, but it seems my knowledge is not enough to comprehend what's happening and why. Here is what I've figured out so far.
The Go version I used is 1.22.1.
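For context, this is the behaviour I'm trying to understand. A minimal sketch, assuming Go 1.19+ (for atomic.Bool) on a platform with signal-based preemption; the function name resumeLatencyMillis is mine, and I'm assuming the atomic load gets inlined so the loop has no real call in it:

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// resumeLatencyMillis limits the scheduler to a single P, starts a goroutine
// that spins in a tight loop, and reports how many milliseconds passed before
// the calling goroutine ran again after sleeping for sleepMs.
// If the spinner could not be preempted, this function would never return.
func resumeLatencyMillis(sleepMs int) int64 {
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev)

	var stop atomic.Bool
	done := make(chan struct{})
	go func() {
		for !stop.Load() {
			// Tight loop: nothing here yields to the scheduler voluntarily.
		}
		close(done)
	}()

	start := time.Now()
	// Sleeping parks this goroutine, so the spinner takes over the only P.
	time.Sleep(time.Duration(sleepMs) * time.Millisecond)
	elapsed := time.Since(start).Milliseconds()

	stop.Store(true) // only reachable because the spinner was preempted
	<-done
	return elapsed
}

func main() {
	fmt.Println("main resumed after (ms):", resumeLatencyMillis(50))
}
```

As far as I understand, before 1.14 (or with GODEBUG=asyncpreemptoff=1) a program like this could hang, because the spinning goroutine never reaches a point where it yields.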
Golang has a sysmon function, which is basically a thread that does runtime management. That thread is responsible for triggering preemption (go/src/runtime/proc.go). Sysmon periodically (every 10 ms, it seems) calls the retake function, which iterates over all Ps (processors; the structure seems to be an abstraction for a logical CPU core) and calls the preemptone function for each P that has been running or stuck in a syscall for too long. The preemptone function calls signalM, which sends SIGURG to the M (the OS thread that is currently running on the P).
The function initsig in go/src/runtime/signal_unix.go installs the signal handler function sighandler, which seems to be wrapped in the assembly function sigtramp. Here is its source code for amd64 (go/src/runtime/sys_linux_amd64.s):
// Called using C ABI.
TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME|NOFRAME,$0
// Transition from C ABI to Go ABI.
PUSH_REGS_HOST_TO_ABI0()
// Set up ABIInternal environment: g in R14, cleared X15.
get_tls(R12)
MOVQ g(R12), R14
PXOR X15, X15
// Reserve space for spill slots.
NOP SP // disable vet stack checking
ADJSP $24
// Call into the Go signal handler
MOVQ DI, AX // sig
MOVQ SI, BX // info
MOVQ DX, CX // ctx
CALL ·sigtrampgo<ABIInternal>(SB)
ADJSP $-24
POP_REGS_HOST_TO_ABI0()
RET
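To check my reading of the sysmon/retake step described above, here is how I picture the "running for too long" test. This is a simplified model, not the runtime's code: pState and its field names are made up, syscall handling is left out, and the only thing taken from the runtime is the 10 ms forcePreemptNS constant from proc.go.

```go
package main

import "fmt"

// forcePreemptNS mirrors the 10 ms time slice constant in runtime/proc.go.
const forcePreemptNS = 10 * 1000 * 1000

// pState is a hypothetical stand-in for the per-P bookkeeping used by sysmon.
type pState struct {
	schedtick uint32 // bumped by the scheduler every time the P switches goroutines
	seenTick  uint32 // schedtick value sysmon saw on its previous pass
	seenWhen  int64  // time in ns at which sysmon first saw seenTick
}

// retake returns the indexes of the Ps whose current goroutine has kept the P
// for at least forcePreemptNS. The real runtime would call preemptone for them.
func retake(ps []*pState, now int64) []int {
	var preempt []int
	for i, p := range ps {
		if p.seenTick != p.schedtick {
			// Another goroutine was scheduled since the last pass: restart the clock.
			p.seenTick = p.schedtick
			p.seenWhen = now
		} else if p.seenWhen+forcePreemptNS <= now {
			// Same goroutine for a whole time slice: ask for preemption.
			preempt = append(preempt, i)
		}
	}
	return preempt
}

func main() {
	ps := []*pState{{schedtick: 1}, {schedtick: 1}}
	fmt.Println(retake(ps, 0)) // first pass only records the ticks

	ps[1].schedtick++ // P1 switched goroutines in the meantime, P0 did not
	fmt.Println(retake(ps, forcePreemptNS))
}
```

In this model only P0 is reported on the second pass, because P1 changed its schedtick and therefore gets a fresh time slice.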
The sighandler function will then call doSigPreempt, which does some weird stack and instruction pointer manipulation to call the function asyncPreempt:
if ok, newpc := isAsyncSafePoint(gp, ctxt.sigpc(), ctxt.sigsp(), ctxt.siglr()); ok {
// Adjust the PC and inject a call to asyncPreempt.
ctxt.pushCall(abi.FuncPCABI0(asyncPreempt), newpc)
}
Here it seems that ctxt.pushCall adjusts the rsp and rip registers so that rip points to asyncPreempt, and newpc (which, I believe, is the return address in the current signal handler) is pushed onto the stack:
func (c *sigctxt) pushCall(targetPC, resumePC uintptr) {
// Make it look like we called target at resumePC.
sp := uintptr(c.rsp())
sp -= goarch.PtrSize
*(*uintptr)(unsafe.Pointer(sp)) = resumePC
c.set_rsp(uint64(sp))
c.set_rip(uint64(targetPC))
}
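To see what pushCall does to the saved registers, I wrote a toy model of it. It assumes the values in ctxt are a saved copy of the registers that gets restored when the handler returns. fakeCtxt, the map used as memory, the ret helper and all addresses are made up for illustration; only the body of pushCall follows the runtime code above.

```go
package main

import "fmt"

const ptrSize = 8 // goarch.PtrSize on amd64

// fakeCtxt is a hypothetical stand-in for sigctxt: two saved registers plus
// a fake memory in which the stack lives.
type fakeCtxt struct {
	rip, rsp uint64
	mem      map[uint64]uint64 // address -> 8-byte word
}

// pushCall follows the runtime version: push resumePC, then point rip at targetPC.
func (c *fakeCtxt) pushCall(targetPC, resumePC uint64) {
	sp := c.rsp - ptrSize
	c.mem[sp] = resumePC
	c.rsp = sp
	c.rip = targetPC
}

// ret models the RET at the end of the injected function:
// pop the word at the top of the stack into rip.
func (c *fakeCtxt) ret() {
	c.rip = c.mem[c.rsp]
	c.rsp += ptrSize
}

func main() {
	// Made-up addresses for the interrupted code and for asyncPreempt.
	const interruptedPC, asyncPreemptPC = 0x401000, 0x45f000

	c := &fakeCtxt{rip: interruptedPC, rsp: 0x7ffc0000, mem: map[uint64]uint64{}}
	c.pushCall(asyncPreemptPC, interruptedPC)
	fmt.Printf("after pushCall: rip=%#x rsp=%#x\n", c.rip, c.rsp)

	c.ret()
	fmt.Printf("after RET:      rip=%#x rsp=%#x\n", c.rip, c.rsp)
}
```

In the model, after the injected function returns, rip and rsp are back at their original values, as if a CALL instruction had been sitting at the interrupted location.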
The asyncPreempt function is also implemented in assembly; its source code for x64 is located in go/src/runtime/preempt_amd64.s:
TEXT ·asyncPreempt(SB),NOSPLIT|NOFRAME,$0-0
PUSHQ BP
MOVQ SP, BP
// Save flags before clobbering them
PUSHFQ
// obj doesn't understand ADD/SUB on SP, but does understand ADJSP
ADJSP $368
// But vet doesn't know ADJSP, so suppress vet stack checking
NOP SP
MOVQ AX, 0(SP)
MOVQ CX, 8(SP)
MOVQ DX, 16(SP)
MOVQ BX, 24(SP)
MOVQ SI, 32(SP)
MOVQ DI, 40(SP)
MOVQ R8, 48(SP)
MOVQ R9, 56(SP)
MOVQ R10, 64(SP)
MOVQ R11, 72(SP)
MOVQ R12, 80(SP)
MOVQ R13, 88(SP)
MOVQ R14, 96(SP)
MOVQ R15, 104(SP)
#ifdef GOOS_darwin
#ifndef hasAVX
CMPB internal∕cpu·X86+const_offsetX86HasAVX(SB), $0
JE 2(PC)
#endif
VZEROUPPER
#endif
MOVUPS X0, 112(SP)
MOVUPS X1, 128(SP)
MOVUPS X2, 144(SP)
MOVUPS X3, 160(SP)
MOVUPS X4, 176(SP)
MOVUPS X5, 192(SP)
MOVUPS X6, 208(SP)
MOVUPS X7, 224(SP)
MOVUPS X8, 240(SP)
MOVUPS X9, 256(SP)
MOVUPS X10, 272(SP)
MOVUPS X11, 288(SP)
MOVUPS X12, 304(SP)
MOVUPS X13, 320(SP)
MOVUPS X14, 336(SP)
MOVUPS X15, 352(SP)
CALL ·asyncPreempt2(SB)
MOVUPS 352(SP), X15
MOVUPS 336(SP), X14
MOVUPS 320(SP), X13
MOVUPS 304(SP), X12
MOVUPS 288(SP), X11
MOVUPS 272(SP), X10
MOVUPS 256(SP), X9
MOVUPS 240(SP), X8
MOVUPS 224(SP), X7
MOVUPS 208(SP), X6
MOVUPS 192(SP), X5
MOVUPS 176(SP), X4
MOVUPS 160(SP), X3
MOVUPS 144(SP), X2
MOVUPS 128(SP), X1
MOVUPS 112(SP), X0
MOVQ 104(SP), R15
MOVQ 96(SP), R14
MOVQ 88(SP), R13
MOVQ 80(SP), R12
MOVQ 72(SP), R11
MOVQ 64(SP), R10
MOVQ 56(SP), R9
MOVQ 48(SP), R8
MOVQ 40(SP), DI
MOVQ 32(SP), SI
MOVQ 24(SP), BX
MOVQ 16(SP), DX
MOVQ 8(SP), CX
MOVQ 0(SP), AX
ADJSP $-368
POPFQ
POPQ BP
RET
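The offsets in the listing above can be recomputed by hand: 14 general-purpose registers at 8 bytes each, followed by 16 XMM registers at 16 bytes each, gives 14*8 + 16*16 = 368, which is the ADJSP value. A small sketch that rebuilds the layout (the slot type and frameLayout name are mine, the register order is copied from the listing):

```go
package main

import "fmt"

// slot records where one register is spilled, relative to SP.
type slot struct {
	reg    string
	offset int
}

// frameLayout rebuilds the spill offsets used by asyncPreempt on amd64:
// 8 bytes per general-purpose register, then 16 bytes per XMM register.
func frameLayout() (slots []slot, size int) {
	gp := []string{"AX", "CX", "DX", "BX", "SI", "DI",
		"R8", "R9", "R10", "R11", "R12", "R13", "R14", "R15"}
	for _, r := range gp {
		slots = append(slots, slot{r, size})
		size += 8
	}
	for i := 0; i < 16; i++ {
		slots = append(slots, slot{fmt.Sprintf("X%d", i), size})
		size += 16
	}
	return slots, size
}

func main() {
	slots, size := frameLayout()
	for _, s := range slots {
		fmt.Printf("%-3s %d(SP)\n", s.reg, s.offset)
	}
	fmt.Println("frame size:", size)
}
```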
The function seems to save the registers to the stack and then call asyncPreempt2, a function written in Go, which in turn calls preemptPark via the mcall function, implemented in assembly:
// mcall switches from the g to the g0 stack and invokes fn(g),
// where g is the goroutine that made the call.
// mcall saves g's current PC/SP in g->sched so that it can be restored later.
// It is up to fn to arrange for that later execution, typically by recording
// g in a data structure, causing something to call ready(g) later.
// mcall returns to the original goroutine g later, when g has been rescheduled.
// fn must not return at all; typically it ends by calling schedule, to let the m
// run other goroutines.
//
// mcall can only be called from g stacks (not g0, not gsignal).
//
// This must NOT be go:noescape: if fn is a stack-allocated closure,
// fn puts g on a run queue, and g executes before fn returns, the
// closure will be invalidated while it is still executing.
func mcall(fn func(*g))
And here's the assembly implementation for x64:
// func mcall(fn func(*g))
// Switch to m->g0's stack, call fn(g).
// Fn must never return. It should gogo(&g->sched)
// to keep running g.
TEXT runtime·mcall<ABIInternal>(SB), NOSPLIT, $0-8
MOVQ AX, DX // DX = fn
// Save state in g->sched. The caller's SP and PC are restored by gogo to
// resume execution in the caller's frame (implicit return). The caller's BP
// is also restored to support frame pointer unwinding.
MOVQ SP, BX // hide (SP) reads from vet
MOVQ 8(BX), BX // caller's PC
MOVQ BX, (g_sched+gobuf_pc)(R14)
LEAQ fn+0(FP), BX // caller's SP
MOVQ BX, (g_sched+gobuf_sp)(R14)
// Get the caller's frame pointer by dereferencing BP. Storing BP as it is
// can cause a frame pointer cycle, see CL 476235.
MOVQ (BP), BX // caller's BP
MOVQ BX, (g_sched+gobuf_bp)(R14)
// switch to m->g0 & its stack, call fn
MOVQ g_m(R14), BX
MOVQ m_g0(BX), SI // SI = g.m.g0
CMPQ SI, R14 // if g == m->g0 call badmcall
JNE goodm
JMP runtime·badmcall(SB)
goodm:
MOVQ R14, AX // AX (and arg 0) = g
MOVQ SI, R14 // g = g.m.g0
get_tls(CX) // Set G in TLS
MOVQ R14, g(CX)
MOVQ (g_sched+gobuf_sp)(R14), SP // sp = g0.sched.sp
PUSHQ AX // open up space for fn's arg spill slot
MOVQ 0(DX), R12
CALL R12 // fn(g)
POPQ AX
JMP runtime·badmcall2(SB)
RET
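As far as I can follow the assembly above, mcall does three things: it stores the caller's PC, SP and BP in g.sched, it makes g0 the current g, and it calls fn with the original g. Below is a toy model of just that bookkeeping. The structs are heavily reduced, the running field stands in for the "current g" kept in R14/TLS, and unlike the real fn, the callback here returns normally; all of that is my simplification.

```go
package main

import (
	"errors"
	"fmt"
)

// gobuf, g and m are reduced, hypothetical versions of the runtime structs.
type gobuf struct{ sp, pc, bp uintptr }

type g struct {
	name  string
	sched gobuf
	m     *m
}

type m struct {
	g0      *g // goroutine that owns the scheduler stack
	running *g // model of the "current g" the runtime keeps in R14 and TLS
}

// mcall models the bookkeeping of the assembly version: save the caller's
// state in gp.sched, refuse to run on g0, switch the current g to g0 and
// call fn(gp). The real one also switches SP to g0's stack.
func mcall(gp *g, callerPC, callerSP, callerBP uintptr, fn func(*g)) error {
	gp.sched = gobuf{sp: callerSP, pc: callerPC, bp: callerBP}
	if gp == gp.m.g0 {
		return errors.New("mcall called on g0 (badmcall)")
	}
	gp.m.running = gp.m.g0
	fn(gp)
	return nil
}

func main() {
	thread := &m{g0: &g{name: "g0"}}
	thread.g0.m = thread
	user := &g{name: "user", m: thread}
	thread.running = user

	// Made-up PC/SP/BP values for the calling goroutine.
	err := mcall(user, 0x4010a0, 0x7f00, 0x7f10, func(parked *g) {
		fmt.Printf("now on %s, parking %s, saved pc=%#x sp=%#x\n",
			thread.running.name, parked.name, parked.sched.pc, parked.sched.sp)
	})
	fmt.Println("err:", err)
}
```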
And here I'm completely lost. I know it's quite a long story, and sorry for that, but maybe someone could clarify what's going on here so I can understand at least part of it.
Why is sighandler wrapped in the assembly function sigtramp?
What does this line do? ctxt.pushCall(abi.FuncPCABI0(asyncPreempt), newpc)
I see that it adjusts rip and rsp, but what is it trying to achieve? Won't setting rip to another address just cause the CPU to execute whatever is stored at that address and continue from there? And if so, what is the point, since it looks like the same effect could be achieved by simply calling asyncPreempt? I'm clearly missing something.
Where exactly are the interrupted goroutine's rip, stack and registers stored when the signal handler is called?
It seems like asyncPreempt does some preparation for storing the registers, and mcall saves the pushed data to gobuf here:
MOVQ BX, (g_sched+gobuf_sp)(R14)
but aren't the registers modified by Go code before asyncPreempt is called? How can it be guaranteed that they are unchanged and still belong to the goroutine?
It also saves rip there from the stack:
MOVQ BX, (g_sched+gobuf_pc)(R14)
but I fail to understand what placed rip on the stack in the first place. Was it done by the kernel before calling the signal handler? I know that on x86_64 the return address should be at the top of the stack frame (rbp + 8, basically), but I fail to track down where the return address of the current goroutine was stored before the signal handler was called and what happened to it afterwards.