While reading the Go source code, I came across something I don't understand in src/sync/once.go:
func (o *Once) Do(f func()) {
    // Note: Here is an incorrect implementation of Do:
    //
    //    if atomic.CompareAndSwapUint32(&o.done, 0, 1) {
    //        f()
    //    }
    //
    // Do guarantees that when it returns, f has finished.
    // This implementation would not implement that guarantee:
    // given two simultaneous calls, the winner of the cas would
    // call f, and the second would return immediately, without
    // waiting for the first's call to f to complete.
    // This is why the slow path falls back to a mutex, and why
    // the atomic.StoreUint32 must be delayed until after f returns.
    if atomic.LoadUint32(&o.done) == 0 {
        // Outlined slow-path to allow inlining of the fast-path.
        o.doSlow(f)
    }
}

func (o *Once) doSlow(f func()) {
    o.m.Lock()
    defer o.m.Unlock()
    if o.done == 0 {
        defer atomic.StoreUint32(&o.done, 1)
        f()
    }
}
Why is atomic.StoreUint32 used, rather than, say, o.done = 1? Are these not equivalent? What are the differences?

Must we use the atomic operation (atomic.StoreUint32) to make sure that other goroutines can observe the effect of f() before o.done is set to 1 on a machine with a weak memory model?
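
For reference, the failure mode described in the quoted source comment can be reproduced deterministically. The following is a minimal sketch, not the standard library's code: casOnce and loserReturnsEarly are hypothetical names introduced here, and channels are used only to hold f open long enough to observe the second caller returning early.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// casOnce is a hypothetical type that mirrors the incorrect
// implementation described in the source comment.
type casOnce struct {
	done uint32
}

// Do runs f only for the caller that wins the compare-and-swap.
// Every other caller returns immediately, even if f is still running.
func (o *casOnce) Do(f func()) {
	if atomic.CompareAndSwapUint32(&o.done, 0, 1) {
		f()
	}
}

// loserReturnsEarly reports whether a second call to Do returned
// while the first call's f was still in progress.
func loserReturnsEarly() bool {
	var o casOnce
	var finished uint32
	started := make(chan struct{})
	release := make(chan struct{})
	exited := make(chan struct{})

	go func() {
		defer close(exited)
		o.Do(func() {
			close(started) // f is now running
			<-release      // hold f open until the main goroutine allows it to finish
			atomic.StoreUint32(&finished, 1)
		})
	}()

	<-started
	o.Do(func() {}) // loses the CAS and returns at once
	early := atomic.LoadUint32(&finished) == 0

	close(release)
	<-exited
	return early
}

func main() {
	fmt.Println("second caller returned before f finished:", loserReturnsEarly())
	// prints "second caller returned before f finished: true"
}
```

With the real sync.Once, the second call would instead block on the mutex in doSlow until f had returned.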
atomic.LoadUint32 and o.done = 1 – Brightness

a:=1 in the body of f(); 2) after the o.done=1 is executed by goroutine A, another goroutine B observed that o.done is 1 by using atomic.LoadUint32(&o.done), but B still cannot observe that a is 1 yet, because a normal assignment o.done=1 cannot guarantee that caches in other CPUs would be flushed before o.m.Unlock() is executed. atomic.StoreUint32(&o.done, 1) can make sure a:=1 is coherent in all CPUs' caches before o.done is 1. – Yvor
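
The scenario in the comment above (f sets a to 1, and another goroutine then sees done as 1 through the fast path) can be sketched as follows. This is an illustrative copy of the pattern from the question, not the library source: miniOnce and observe are hypothetical names. One point that can be stated without reference to any particular CPU: under the Go memory model, a plain write such as o.done = 1 that is concurrent with the fast-path atomic.LoadUint32 would be a data race, because the fast path reads done without holding the mutex. Using atomic.StoreUint32 makes both accesses atomic, so a goroutine that loads 1 is also guaranteed to observe everything f did.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// miniOnce is a hypothetical copy of the pattern shown in the question.
type miniOnce struct {
	done uint32
	m    sync.Mutex
}

func (o *miniOnce) Do(f func()) {
	// Fast path: this read happens without holding o.m.
	if atomic.LoadUint32(&o.done) == 0 {
		o.doSlow(f)
	}
}

func (o *miniOnce) doSlow(f func()) {
	o.m.Lock()
	defer o.m.Unlock()
	if o.done == 0 {
		// The atomic store pairs with the atomic load in the fast path.
		// A plain `o.done = 1` here could run concurrently with that
		// load in another goroutine, which would be a data race.
		defer atomic.StoreUint32(&o.done, 1)
		f()
	}
}

// observe starts n goroutines that all call Do with the same f.
// It returns how many times f ran and how many goroutines failed
// to see a == 1 after their call to Do returned.
func observe(n int) (calls, stale int32) {
	var o miniOnce
	var a int
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			o.Do(func() {
				a = 1
				atomic.AddInt32(&calls, 1)
			})
			// Safe to read a: Do has returned, so f has finished and
			// its writes are ordered before this read.
			if a != 1 {
				atomic.AddInt32(&stale, 1)
			}
		}()
	}
	wg.Wait()
	return calls, stale
}

func main() {
	calls, stale := observe(64)
	fmt.Printf("f ran %d time(s), stale reads: %d\n", calls, stale)
	// prints "f ran 1 time(s), stale reads: 0"
}
```

Swapping the deferred atomic store for a plain assignment and running the program with `go run -race` is one way to check whether the race detector flags the access on a given toolchain.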