Misaligned address using virtual inheritance
The following apparently valid code produces a misaligned-address runtime error when run under UndefinedBehaviorSanitizer.

#include <memory>
#include <functional>

struct A{
  std::function<void()> data; // seems to occur only if data is a std::function
} ;

struct B{
  char data; // occurs only if B contains a member variable
};

struct C:public virtual A,public B{

};

struct D:public virtual C{

};

void test(){
  std::make_shared<D>();
}

int main(){
  test();
  return 0;
}

Compiling and running it on a MacBook with clang++ -fsanitize=undefined --std=c++11 ./test.cpp && ./a.out produces the output: runtime error: constructor call on misaligned address 0x7fe584500028 for type 'C', which requires 16 byte alignment [...].

I would like to understand how and why the error occurs.

Denti answered 28/9, 2017 at 16:38 Comment(3)
Cannot reproduce with clang 5.0.0. Which version are you using?Jeannettejeannie
I have reproduced the error using clang 6.0.0 in Wandbox. See here.Denti
I ran into a similar issue recently. This answer may be relevant, the author also filed this Clang bug report.Scarberry

Since std::function<void()> here has an alignment of 16 and a size of 48, let's simplify. This code has the same behavior but is easier to understand:

struct alignas(16) A
{ char data[48]; };

struct B
{ char data; };

struct C : public virtual A, public B
{};

struct D : public virtual C
{};

int main()
{
    D();
}

We have the following alignments and sizes:

                     |__A__|__B__|__C__|__D__|
 alignment (bytes):  |  16 |  1  |  16 |  16 |
      size (bytes):  |  48 |  1  |  64 |  80 |

Now let's see what this looks like in memory. More explanation can be found in this great answer.

  • A: char[48] + no padding == 48B
  • B: char[1] + no padding == 1B
  • C: A* + B + A + 7 bytes of padding (align to 16) == 64B
  • D: C* + C + 8 bytes of padding (align to 16) == 80B

Now it is easy to see that the offset of C inside D is 8 bytes, while C requires 16-byte alignment. Hence the error, which is helpfully accompanied by this pseudo-graphic:

00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00
             ^ 

Here each zero is 1 byte.

UPDATE: Where and how to place padding is up to the C++ compiler; the standard does not specify it. It looks like, with the amount of padding it has, clang is unable to align everything in D. One way to mitigate the misalignment is to design your classes carefully so that they all have the same alignment (e.g., 8 bytes).

Reedreedbird answered 28/9, 2017 at 17:54 Comment(13)
Great explanation and reference! So then why doesn't the compiler already notice the mismatch and align C correctly inside D? Is there a way to prevent such mistakes, for example if the alignment of a member such as std::function is not known?Denti
@Lars I think I am partially wrong about how exactly things are laid out in memory (the compiler can freely move padding between any elements). The compiler has a set of constraints: the objects' alignments. It also has some padding space that it can move around. In this case there is no way to place the padding so that all objects keep their alignment. If you put alignas(16) on B, the error goes away, as the compiler has more padding to play with. I will try to update the post tomorrow.Reedreedbird
@Lars as for preventing misalignment in general, I don't know the answer. But this is definitely an interesting question. Side-note: misalignment is not dangerous on most architectures as it only causes performance penalties.Reedreedbird
Adding alignas attributes with the base classes as arguments (as in struct alignas(A) alignas(B) C : public virtual A, public B{ [...] };) fixed these alignment issues for me. I'm not really sure if it is a general and portable solution though.Denti
@Lars one portable solution is to use a unique_ptr to std::function<void()> in A. This should make all your classes 8-byte aligned.Reedreedbird
@Reedreedbird "misalignment is not dangerous on most architectures" I believe that on almost all architectures (except the most popular one for personal computing), an unaligned access will terminate the program, unless there is some special handling code installed.Valve
@Valve do you have any good reference on that? I can add it to the post.Reedreedbird
"The Load Register instruction performs a rotated load" LDR on ARM v5 and earlier STR: "The Store Register instruction treats the bottom two bits as zero, so writing a word to &8002 will actually write it to &8000"Valve
also: How does the ARM Compiler support unaligned accesses "ARM11 and Cortex-A/R processors can deal with unaligned accesses in hardware, removing the need for software routines. Support for unaligned accesses is limited to a sub-set of load/store instructions"Valve
See Aligned and unaligned memory accesses?Valve
See "The original 68000 was a processor with two-byte granularity and lacked the circuitry to cope with unaligned addresses. When presented with such an address, the processor would throw an exception. The original Mac OS didn't take very kindly to this exception, and would usually demand the user restart the machine. Ouch" from Data alignment: Straighten up and fly rightValve
"A*" if clang is like GCC (which is necessary for ABI compatibility), there is no pointer to the base class subobject hereValve
If you're using SIMD, this totally breaks everything: clang will generate a movaps for SIMD instructions because it thinks the data is aligned, and the program will crash hard.Hangeron
