Why does GCC 6 assume data is 16-byte aligned?

(Sorry in advance for not having managed to reduce my problem to a simple failing test case...)

I ran into a problem after upgrading to GCC 6.3.0 to build our codebase (relevant flags: -O3 -m32).

Specifically, my application segfaults within a struct ctor call because of GCC optimizations.

In this ctor, GCC emits movaps:

movaps %xmm0,0x30a0(%ebx)

movaps requires its memory operand to be 16-byte aligned. But at this point, %ebx points to my object, which is not necessarily 16-byte aligned. From the glibc documentation:

"The address of a block returned by malloc or realloc in GNU systems is always a multiple of eight (or sixteen on 64-bit systems)."

Hence the segfault (when built with -O3 -m32).
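
For illustration, here is a minimal sketch of the check that movaps implicitly relies on. The helper names is_aligned and assert_aligned are hypothetical and not part of the original codebase; the alignment is assumed to be a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true when the address `p` is a multiple of `alignment`.
// Assumption: `alignment` is a power of two (8, 16, 64, ...).
inline bool is_aligned(const void* p, std::size_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Aborts in builds without NDEBUG when `p` does not satisfy `alignment`.
inline void assert_aligned(const void* p, std::size_t alignment) {
    assert(is_aligned(p, alignment));
}
```

Calling assert_aligned(this, 16) at the top of the ctor should turn the segfault into an assertion failure that is easier to diagnose.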

Why does GCC seem to assume that the allocated object is 16-byte aligned? Am I misunderstanding something?

Notes:

  • No alignment hints or attributes on this struct
  • Object has been initialized via default new operator
  • Depends on the level of optimization:
    • PASS: -m32 -O2
    • FAIL: -m32 -O2 -ftree-slp-vectorize
    • PASS: -m32 -O3 -fno-tree-slp-vectorize
    • FAIL: -m32 -O3

This other project seems to have hit a similar issue: https://github.com/godotengine/godot/issues/4623

Their investigation points to -fvect-cost-model=dynamic. In my codebase, the evidence points to -ftree-slp-vectorize instead.

Punish answered 16/2, 2017 at 10:43 Comment(6)
Using alignof on your object would tell you what alignment the compiler thinks it needs. Sounds like it should not be ≥16, but it wouldn't hurt to check. - Skink
It wouldn't be an error to have alignof==16, but that would mean you couldn't use glibc's malloc. Then again, the Standard restricts the implementation's malloc, not the OS's malloc. GCC might need to wrap glibc. (Which I think should be done anyway.) - Orthostichy
alignof returns 64. alignof(max_align_t) returns 8 (expected). There is no user requirement on alignment on either the object or its members; why would alignof be 64? I am not using malloc directly but assumed that new was using it. - Punish
Well well well... A few levels deep within the struct, I just found a member that requires cache-line alignment. This is the reason for alignof==64 on the whole thing. And it also means, as both @Skink and @Orthostichy pointed out, that I cannot use 8-byte aligned new. It used to work by chance with previous versions and lower optimization levels because GCC was not taking advantage of the overall alignment... - Punish
(Feel free to make your comments answers for me to validate.) - Punish
It's a pretty well-known problem that GCC's vectorizer occasionally screws up alignment. Though I wasn't aware it was still a problem as of 6.3. - M

The compiler may have a reason to think the object has an alignment ≥ 16 bytes. You can find out what alignment the compiler assumes by using the alignof operator (C++11). GCC also provides the __alignof__ extension, which is available in C and in earlier C++ versions.

A structure's alignment is the highest alignment of anything in it, recursively. There could be something in there with higher alignment than expected.
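
As a sketch of how that propagates, the hypothetical types below (Counter, Inner, and Outer are made-up names, and 64 stands in for a cache-line size) bury one over-aligned member two levels deep. The outer struct carries no alignment attribute of its own, yet alignof reports 64 for it.

```cpp
#include <cstddef>

// Hypothetical member padded to an assumed 64-byte cache line.
struct alignas(64) Counter {
    int value;
};

// No alignment attribute here: the requirement is inherited from Counter.
struct Inner {
    char tag;
    Counter hits;
};

// Still no attribute, but the 64-byte requirement reaches the top level.
struct Outer {
    int id;
    Inner inner;
};

static_assert(alignof(Counter) == 64, "explicitly requested");
static_assert(alignof(Inner) == 64, "inherited from its Counter member");
static_assert(alignof(Outer) == 64, "inherited recursively through Inner");
static_assert(sizeof(Outer) % 64 == 0, "size is padded to a multiple of the alignment");
```

Adding a static_assert on alignof for the real type is a cheap way to catch this kind of surprise at compile time.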

C++11 guarantees that memory returned by new is suitably aligned for any object with a "fundamental alignment requirement", that is, an alignment no greater than alignof(std::max_align_t). This covers standard types and objects made of them. A type that uses C++11 alignas or the GCC extension __attribute__((aligned(x))) to request a higher alignment is over-aligned, and it might exceed what new provides.
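
One way to spot such types is to compare alignof(T) against alignof(std::max_align_t). The sketch below uses a hypothetical helper is_over_aligned and two made-up types; it assumes alignof(std::max_align_t) is below 64, which holds on common x86 targets but is not guaranteed by the standard.

```cpp
#include <cstddef>

// True when T needs more alignment than a fundamental-alignment
// allocation is guaranteed to provide.
template <typename T>
constexpr bool is_over_aligned() {
    return alignof(T) > alignof(std::max_align_t);
}

// Hypothetical example types.
struct Plain {
    double d;
    int i;
};

struct alignas(64) CacheLine {
    char bytes[64];
};

static_assert(!is_over_aligned<Plain>(), "made only of standard types");
static_assert(is_over_aligned<CacheLine>(), "assumes alignof(std::max_align_t) < 64");
```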

A solution is to obtain aligned memory from aligned_alloc() (C11; available as std::aligned_alloc() since C++17) or posix_memalign() (POSIX-only, but usable before C++11). Then either construct the object in that memory with the placement form of the new operator, or give the class its own operator new and operator delete overloads.
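
Here is a rough sketch of both variants on a POSIX system, using posix_memalign. Particle, Block, make_aligned, and destroy_aligned are hypothetical names, and 64 again stands in for a cache-line size; treat it as an illustration rather than a drop-in fix.

```cpp
#include <cstddef>
#include <new>        // std::bad_alloc, placement new
#include <stdlib.h>   // posix_memalign, free (POSIX)

// Variant 1: class-specific operator new/delete.
// Every `new Particle` is routed through posix_memalign.
struct alignas(64) Particle {
    float data[16];

    static void* operator new(std::size_t size) {
        void* p = nullptr;
        if (posix_memalign(&p, alignof(Particle), size) != 0) throw std::bad_alloc();
        return p;
    }
    static void operator delete(void* p) noexcept { free(p); }
};

// Variant 2: aligned raw memory plus placement new, for types you cannot modify.
struct alignas(64) Block {
    char bytes[64];
};

template <typename T>
T* make_aligned() {
    // posix_memalign requires a power-of-two multiple of sizeof(void*).
    const std::size_t align = alignof(T) < sizeof(void*) ? sizeof(void*) : alignof(T);
    void* raw = nullptr;
    if (posix_memalign(&raw, align, sizeof(T)) != 0) throw std::bad_alloc();
    try {
        return ::new (raw) T();   // construct in the aligned storage
    } catch (...) {
        free(raw);                // do not leak if the ctor throws
        throw;
    }
}

template <typename T>
void destroy_aligned(T* p) noexcept {
    if (p) {
        p->~T();   // placement new needs an explicit destructor call
        free(p);
    }
}
```

The class-specific overloads leave existing new/delete call sites unchanged but do not cover new Particle[n], which would also need operator new[] and operator delete[]. The placement variant requires callers to pair make_aligned with destroy_aligned instead of delete.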

Skink answered 17/2, 2017 at 23:33 Comment(1)
I also want to point out that in the C++17 standard, dynamic allocation will be made to honor even over-aligned data. open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0035r1.html - Punish
