Why is the default alignment 8 bytes for int64_t (e.g. long long) in 32 bit x86 ABIs? 4 byte alignment would appear to be fine, because it can only be accessed as two 4B halves.
Interesting point: If you only ever load it as two halves into 32bit GP registers, then 4B alignment means those operations will happen with their natural alignment.
However, it's probably best if both halves of the variable are in the same cache line, since almost all accesses will read / write both halves. Aligning to the natural alignment of the whole thing takes care of that, even ignoring the other reasons below.
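To make the cache-line argument concrete, here is a minimal sketch that counts how many start offsets within one cache line would make an 8-byte object span two lines. The 64-byte line size and the helper names (`kCacheLine`, `straddles_cache_line`, `count_straddling_offsets`) are assumptions for illustration, not anything taken from an ABI document.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed cache-line size: 64 bytes is typical for modern x86, but not guaranteed.
constexpr std::uintptr_t kCacheLine = 64;

// True if an object of `size` bytes starting at `addr` touches two cache lines.
constexpr bool straddles_cache_line(std::uintptr_t addr, std::size_t size) {
    return (addr / kCacheLine) != ((addr + size - 1) / kCacheLine);
}

// Walks every start offset in one cache line that is a multiple of `alignment`
// and counts how many of them make an 8-byte object cross into the next line.
inline int count_straddling_offsets(std::size_t alignment) {
    int count = 0;
    for (std::uintptr_t addr = 0; addr < kCacheLine; addr += alignment) {
        if (straddles_cache_line(addr, 8)) {
            ++count;
        }
    }
    return count;
}
```

With 4-byte alignment, one of the 16 possible offsets (offset 60) splits the object across two lines; with 8-byte alignment, none do.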
32bit x86 can load 64bit integers in a single 64bit load using MMX or SSE2 movq. Handling 64bit add/sub/shift and bitwise booleans using vector instructions is more efficient (single instruction), as long as you don't need immediate constants or mul or div. The vector instructions with 64b elements are still available in 32b mode.
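As a rough illustration of the vector approach, the sketch below adds two 64-bit integers with SSE2 intrinsics, which map to movq loads/stores and paddq. The helper name `add64` is hypothetical, and the scalar fallback is only there so the code still builds where SSE2 is not enabled; a compiler is free to pick different instructions than the comments suggest.

```cpp
#include <cstdint>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics
#endif

// Hypothetical helper: adds two 64-bit integers that live in memory.
std::int64_t add64(const std::int64_t* a, const std::int64_t* b) {
#if defined(__SSE2__)
    // Each load reads all 8 bytes at once (movq xmm, m64).
    __m128i va = _mm_loadl_epi64(reinterpret_cast<const __m128i*>(a));
    __m128i vb = _mm_loadl_epi64(reinterpret_cast<const __m128i*>(b));
    // One 64-bit add (paddq) instead of an add/adc pair on 32-bit halves.
    __m128i sum = _mm_add_epi64(va, vb);
    std::int64_t result;
    // Store the low 64 bits back to memory (movq m64, xmm).
    _mm_storel_epi64(reinterpret_cast<__m128i*>(&result), sum);
    return result;
#else
    // Fallback when SSE2 is not enabled at compile time.
    return *a + *b;
#endif
}
```

Adding 0xFFFFFFFF and 1 is a quick check that the carry propagates from the low half into the high half.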
Atomic 64bit compare-and-exchange is also available in 32bit mode (lock CMPXCHG8B m64 works just like 64bit mode's lock CMPXCHG16B m128, using implicit register pairs: edx:eax for the expected value and ecx:ebx for the replacement). IDK what kind of penalty it has for crossing a cache-line boundary.
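In portable C++ the same operation is usually written with std::atomic rather than the instruction itself. The sketch below is one way to express a 64-bit compare-and-exchange; `cas64` is a hypothetical wrapper name, and the claim that a 32-bit x86 build turns this into lock cmpxchg8b is an expectation about typical compilers, not something the standard guarantees.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical wrapper: if `target` currently holds `expected_value`,
// atomically replace it with `new_value` and return true; otherwise
// leave `target` unchanged and return false.
bool cas64(std::atomic<std::int64_t>& target,
           std::int64_t expected_value,
           std::int64_t new_value) {
    // compare_exchange_strong overwrites its first argument on failure;
    // that only touches our local copy of `expected_value` here.
    return target.compare_exchange_strong(expected_value, new_value);
}
```

A caller would typically loop: read the current value, compute the new one, and retry `cas64` until it succeeds.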
Modern x86 CPUs have essentially no penalty for misaligned loads/stores unless they cross cache-line boundaries, which is why I'm only saying that, and not saying that misaligned 64b would be bad in general. See the links in the x86 wiki, esp. Agner Fog's guides.
[...] int32 should be aligned on a 32-bit boundary, an int64 on a 64-bit boundary, and so on. A char will fit just fine anywhere." A long long is 64 bits in size, so it is best aligned using 8 byte alignment. – Nathan
[...] int64_t, then you'd need special, new alignment rules for std::atomic<int64_t>. – Desist