how does malloc understand alignment?

Asked 6/1, 2012 at 2:5 Answered 25/5, 2022 at 14:38

The following is excerpted from here:

pw = (widget *)malloc(sizeof(widget));
allocates raw storage. Indeed, the malloc call allocates storage that's big enough and suitably aligned to hold an object of type widget

Also see "fast pImpl" from Herb Sutter, where he says:

Alignment. Any memory that's allocated dynamically via new or malloc is guaranteed to be properly aligned for objects of any type, but buffers that are not allocated dynamically have no such guarantee

I am curious about this: how does malloc know the alignment of a custom type?

Yasukoyataghan answered 6/1, 2012 at 2:5 Comment(1)

new and malloc, by default, align addresses to 8 bytes (x86) or 16 bytes (x64) on common implementations, which is optimal for most complex data. It is also sizeof()'s duty to report the correct struct size, including internal padding for alignment, if necessary. – Gilbertson 1/5, 2016 at 2:54

Alignment requirements are recursive: The alignment of any struct is simply the largest alignment of any of its members, and this is understood recursively.

For example, and assuming that each fundamental type's alignment equals its size (this is not always true in general), the struct X { int; char; double; } has the alignment of double, and it will be padded to be a multiple of the size of double (e.g. 4 (int), 1 (char), 3 (padding), 8 (double)). The struct Y { int; X; float; } has the alignment of X, which is the largest and equal to the alignment of double, and Y is laid out accordingly: 4 (int), 4 (padding), 16 (X), 4 (float), 4 (padding).

(All numbers are just examples and could differ on your machine.)

Therefore, by breaking it down to the fundamental types, we only need to know a handful of fundamental alignments, and among those there is a well-known largest. C++11 even defines a type, std::max_align_t, whose alignment is at least as strict as that of every scalar type.

All malloc() needs to do is to pick an address that's a multiple of that value.

Misjudge answered 6/1, 2012 at 2:34 Comment(11)

The key thing to point out is that this doesn't include custom align directives to the compiler that might over-align data. – Hedge 28/8, 2013 at 4:55

Although if you use these you are already outside the scope of the standard, please note that memory allocated in this way probably won't meet the alignment requirements for built-in types such as __m256 that are available as extensions on some platforms. – Libelee 28/8, 2013 at 7:25

What happens when you specify a custom alignment via alignas that is larger than the largest alignment of a primitive datatype? – Haugh 9/2, 2017 at 15:25

@Curious: Support for extended alignment is implementation-defined. – Misjudge 10/2, 2017 at 1:27

std::max_align_t is largest alignment of scalar types, so a struct or a class can potentially have stricter alignment requirement than std::max_align_t. – Tildatilde 14/6, 2018 at 14:45

@MikhailVasilyev: Yes, but only if given an alignas, right? Otherwise a UDT's alignment is just made up recursively of the alignment of the members. – Misjudge 14/6, 2018 at 22:42

@KerrekSB Yes, but some widely-used classes from the standard library might turn out to be over-aligned on some platforms. F.ex. see this issue. Also virtual classes include a pointer to a virtual table so their alignment is determined not only by alignment of the members. – Tildatilde 15/6, 2018 at 11:49

malloc has no information on the type it is allocating for; the only parameter is the size of the allocated memory. The man page states it correctly: the allocated memory is aligned such that it can be used for any data type, i.e. the alignment is the same for all types. – Constantina 14/9, 2019 at 23:20

@Massimo: I'm not sure there's a contradiction here. I first explained how alignment of a type is defined, and then how malloc can return fundamentally aligned memory. I never said that malloc knows about alignment of any one type. As long as your types don't have extended alignment, malloc gives you suitably aligned memory. – Misjudge 19/5, 2020 at 17:51

@KerrekSB You talk about struct alignment and you conclude with "All malloc() needs to do is to pick an address that's a multiple of that value." Show us how you pass the struct alignment to malloc()... – P 19/5, 2020 at 18:32

@Massimo: I think the "that" refers to the "largest alignment" from the previous paragraph, which is exactly the alignment guarantee malloc provides, isn't it? – Misjudge 20/5, 2020 at 10:35

I think the most relevant part of the Herb Sutter quote is the part I've marked in bold:

Alignment. Any memory that's allocated dynamically via new or malloc is guaranteed to be properly aligned for objects of any type, but buffers that are not allocated dynamically have no such guarantee

It doesn't have to know what type you have in mind, because it's aligning for any type. On any given system, there's a maximum alignment size that's ever necessary or meaningful; for example, a system with four-byte words will likely have a maximum of four-byte alignment.

This is also made clear by the malloc(3) man-page, which says in part:

The malloc() and calloc() functions return a pointer to the allocated memory that is suitably aligned for any kind of variable.

Lettuce answered 6/1, 2012 at 2:13 Comment(10)

What is the meaning of "any kind of variable"? It doesn't answer my question. Does it mean malloc will always use the maximum alignment size on any given system? – Yasukoyataghan 6/1, 2012 at 2:29

@Chang: effectively, yes. Also note, the quote is wrong. new is only guaranteed to have "any" alignment when allocating char or unsigned char. For others, it may have a smaller alignment. – Squamosal 6/1, 2012 at 2:33

@Chang: Right, the maximum alignment size. "Suitably aligned for any kind of variable" means "suitably aligned for an int and suitably aligned for a pointer and suitably aligned for any struct and . . .". – Lettuce 6/1, 2012 at 2:35

@MooingDuck: new char[16] does not guarantee any alignment at all. (In general, new T[n] returns a pointer aligned for any type X where sizeof(X)<=sizeof(T).) – Irbm 28/8, 2013 at 5:14

@aschepler: That's not true. See the C++11 spec, section 5.3.4, clause 10; new char[16] is specified in a way that's assumed to guarantee that it's suitably aligned for any type X where sizeof(X)<=16. – Lettuce 28/8, 2013 at 5:33

@aschepler: No, new T[n] is only aligned for type T, unless T is (possibly signed/unsigned) char, then it's aligned for any type X where sizeof (X) <= n. – Drye 28/8, 2013 at 5:33

Um, yes. I somehow got several very wrong ideas from looking at the very same Standard paragraph. Reading comprehension fail. – Irbm 28/8, 2013 at 13:3

@BenVoigt: I think the "magic alignment" is only for char and unsigned char, but NOT for signed char. The C++ spec treats char and unsigned char as "byte" types, but does not consider signed char a "byte" type. (Implicitly, the spec doesn't actually say "byte types" as such.) – Squamosal 28/8, 2013 at 16:27

@MooingDuck: Looks like you're right, but I think that may be a defect in the Standard, since the accompanying note talks about generic character arrays which include all three: "this constraint on array allocation overhead permits the common idiom of allocating character arrays into which objects of other types will later be placed" (And that note in turn should probably say character sequences, since character arrays include wide character types as well) – Drye 28/8, 2013 at 16:36

@BenVoigt: I would posit that was the defect, since unsigned char is used in byte-like ways in §3.8/5-6, §3.9/2-4, §3.10/10, and §5.3.4/10, and in none of those is signed char mentioned or implied. Also, §3.9/1 calls out unsigned char specifically with "all possible bit patterns of the value representation represent numbers". – Squamosal 28/8, 2013 at 16:51

The only information that malloc() can use is the size of the request passed to it. In general, it might do something like round up the passed size to the nearest greater (or equal) power of two, and align the memory based on that value. There would likely also be an upper bound on the alignment value, such as 8 bytes.

The above is a hypothetical discussion, and the actual implementation depends on the machine architecture and runtime library that you're using. Maybe your malloc() always returns blocks aligned on 8 bytes and it never has to do anything different.

Charmainecharmane answered 6/1, 2012 at 2:10 Comment(5)

In summary then, malloc uses the 'worst case' alignment because it doesn't know any better. Does that mean that calloc can be smarter because it takes two args, the number of objects and the size of a single object? – Larrisa 6/1, 2012 at 2:15

Maybe. Maybe not. You'd have to look at your runtime library source to find out. – Charmainecharmane 6/1, 2012 at 2:16

-1, sorry. Your answer includes the truth, but it also includes disinformation. It's not a "maybe, maybe not" thing; it's specifically documented to work in a way that doesn't depend on the size. (Dunno why not, though. It seems like it would make perfect sense for it to do so.) – Lettuce 6/1, 2012 at 2:18

The answer to my own question is No. I found this: "The malloc() and calloc() functions return a pointer to the allocated memory that is suitably aligned for any kind of variable." Seems like the memalign function is potentially useful though: wwwcgi.rdg.ac.uk:8081/cgi-bin/cgiwrap/wsi14/poplog/man/3C/… – Larrisa 6/1, 2012 at 2:22

see ruakh's reply, so malloc will always use maximum alignment size in any given system, right? – Yasukoyataghan 6/1, 2012 at 2:26

1) Align to the least common multiple of all alignments. e.g. if ints require 4 byte alignment, but pointers require 8, then allocate everything to 8 byte alignment. This causes everything to be aligned.

2) Use the size argument to determine the correct alignment. For small sizes you can infer the type: malloc(1), for example, is always a char (assuming no other type has size 1). C++ new has the benefit of being type safe and so can always make alignment decisions this way.

Whacking answered 6/1, 2012 at 2:18 Comment(2)

Can you expand the acronym LCM? I can guess, but I shouldn't have to. – Squamosal 6/1, 2012 at 2:34

Also, there are other types in C++ that can be 1 byte. However, your implication is correct, it can still align based on the size of the type. – Squamosal 6/1, 2012 at 2:36

Prior to C++11, alignment was handled fairly simply: wherever the exact requirement was unknown, the largest alignment was used, and malloc/calloc still work this way. This means a malloc allocation is correctly aligned for any type.

Wrong alignment may result in undefined behavior according to the standard but I have seen x86 compilers being generous and only punishing with lower performance.

Note that you can also tweak alignment via compiler options or directives (#pragma pack in Visual Studio, for example).

But when it comes to placement new, C++11 brings us new keywords called alignof and alignas. Here is some code that shows the effect when the compiler's maximum alignment is greater than 1. The first placement new below is automatically good, but not the second.

#include <cstdint>
#include <cstdlib>
#include <iostream>
#include <new>
using namespace std;
int main()
{
        struct A { char c; };
        struct B { int i; char c; };

        unsigned char * buffer = (unsigned char *)malloc(1000000);
        if (!buffer)
            return 1;
        // uintptr_t can hold a pointer value; long cannot on 64-bit Windows
        uintptr_t mp = (uintptr_t)buffer;

        // First placement new
        uintptr_t alignofA = alignof(A) - 1;    // bit mask for A's alignment
        cout << "alignment of A: " << std::hex << (alignofA + 1) << endl;
        cout << "placement address before alignment: " << std::hex << mp << endl;
        if (mp&alignofA)
        {
            // round up to the next multiple of alignof(A)
            mp |= alignofA;
            ++mp;
        }
        cout << "placement address after alignment : " << std::hex <<mp << endl;
        A * a = new((unsigned char *)mp)A;
        mp += sizeof(A);

        // Second placement new
        uintptr_t alignofB = alignof(B) - 1;    // bit mask for B's alignment
        cout << "alignment of B: " <<  std::hex << (alignofB + 1) << endl;
        cout << "placement address before alignment: " << std::hex << mp << endl;
        if (mp&alignofB)
        {
            // round up to the next multiple of alignof(B)
            mp |= alignofB;
            ++mp;
        }
        cout << "placement address after alignment : " << std::hex << mp << endl;
        B * b = new((unsigned char *)mp)B;
        mp += sizeof(B);

        // A and B are trivially destructible, so releasing the buffer is enough
        (void)a;
        (void)b;
        free(buffer);
}

I guess performance of this code can be improved with some bitwise operations.

EDIT: Replaced expensive modulo computation with bitwise operations. Still hoping that somebody finds something even faster.

Bistre answered 28/8, 2013 at 4:42 Comment(4)

It's not actually the compiler, it's the hardware itself. On x86 a misaligned memory access simply forces the processor to fetch the two sides of the memory boundary and piece the result together, so it's always "correct" if slower. On e.g. some ARM processors, you would get a bus error and a program crash. This is a bit of a problem because many programmers are never exposed to anything else than x86, and so may not know that the behaviour is actually undefined instead of merely decreasing performance. – Cheeks 6/9, 2014 at 15:29

You are correct, it's the hardware or CPU microcode, not the actual compiler, that saves you on the x86 architecture. I really wonder why there is no more convenient API to handle this. As if C/C++ designers wanted developers to step into the trap. Reminds me of the std::numeric_limits<double>::min() trap. Anyone got that one right the first time? – Bistre 6/9, 2014 at 23:56

Well, once you know what is going on, it's not too hard to change your programming style from all sorts of crazy type-punning to well-typed code, fortunately. The C type system makes it fairly easy to preserve type alignment as long as you don't go doing insane bit manipulation stuff without paying attention. Now pointer-aliasing-free code on the other hand has some much tougher semantics... – Cheeks 7/9, 2014 at 12:39

I do not understand. You have the problem whenever you have your own little heap that you manage yourself. What use of placement new are you thinking about in your comment? – Bistre 7/9, 2014 at 14:23

malloc has no knowledge of what it is allocating for because its parameter is just total size. It just aligns to an alignment that is safe for any object.

Basenji answered 8/9, 2018 at 20:52 Comment(0)

You can find out how many low address bits your malloc() implementation always leaves zero, i.e. its alignment, with this small C program:

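#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    uintptr_t find = 0;
    size_t size;
    /* OR together the addresses of many small allocations: a low bit that is
       zero in every address stays zero in the result. The blocks are leaked
       on purpose so that every call returns a distinct address. */
    for( unsigned i = 1000000; i--; )
        if( (size = rand() & 127) != 0 )
            find |= (uintptr_t)malloc( size );
    /* Count the trailing zero bits; the observed alignment is 2^bits bytes. */
    int bits = 0;
    for( ; find && !(find & 1); find >>= 1, ++bits );
    printf( "%d\n", bits );
}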

Brant answered 25/5, 2022 at 14:38 Comment(0)
