What's the difference between sizeof and alignof?

Asked 8/7, 2012 at 21:40 Answered 26/6, 2021 at 11:29

#include <iostream>

#define SIZEOF_ALIGNOF(T) std::cout<< sizeof(T) << '/' << alignof(T) << std::endl

int main(int, char**)
{
        SIZEOF_ALIGNOF(unsigned char);
        SIZEOF_ALIGNOF(char);
        SIZEOF_ALIGNOF(unsigned short int);
        SIZEOF_ALIGNOF(short int);
        SIZEOF_ALIGNOF(unsigned int);
        SIZEOF_ALIGNOF(int);
        SIZEOF_ALIGNOF(float);
        SIZEOF_ALIGNOF(unsigned long int);
        SIZEOF_ALIGNOF(long int);
        SIZEOF_ALIGNOF(unsigned long long int);
        SIZEOF_ALIGNOF(long long int);
        SIZEOF_ALIGNOF(double);
}

will output

1/1
1/1
2/2
2/2
4/4
4/4
4/4
4/4
4/4
8/8
8/8
8/8

I think I don't get what the alignment is...?

Andrea answered 8/7, 2012 at 21:40 Comment(3)

try this again with structs instead of native types. – Cowage 8/7, 2012 at 21:40

Returns alignment in bytes (an integer power of two) required for any instance of the given type - en.cppreference.com/w/cpp/language/alignof. sizeof just gives the size, in bytes, of course. – Unspoiled 8/7, 2012 at 21:41

Maybe worth mentioning - sizeof is always a multiple of alignof – Picco 8/7, 2012 at 22:3

Well, "memory" is basically a huge array of bytes. However, most larger things like integers need more than 1 byte to store them -- a 32 bit value, for example, would use 4 consecutive bytes of memory.

Now, the memory modules in your computer aren't usually "bytes"; they are also organized with a few bytes "in parallel", like blocks of 4 bytes.

For a CPU, it's much easier = more efficient = better performance to not "cross" such block-borders when reading something like an integer:

memory byte    0 1 2 3     4 5 6 7       8 9 10 11
 integer       goooood
                   baaaaaaaaad

This is what the "alignment" says: an alignment of 4 means that data of this type should (or must, depends on the CPU) be stored starting at an address that is a multiple of 4.

Your observation that sizeof == alignof does not hold in general; try structures. Structures are also aligned (because their individual members need to end up at correctly aligned addresses), but their size can be much larger than their alignment.
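To make the "try structures" hint concrete, here is a minimal sketch. The Triple type and the is_aligned helper are names I made up for illustration; they are not part of the question or of any library. It assumes a typical ABI where a struct of three ints has the alignment of a single int.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical example type: three ints, so the size is three ints
// while the alignment typically stays that of a single int.
struct Triple {
    int x, y, z;
};

// The size is always a whole multiple of the alignment, so that every
// element of an array of Triple stays correctly aligned.
static_assert(sizeof(Triple) % alignof(Triple) == 0,
              "size is a multiple of alignment");
// Assumption: holds on common ABIs, where alignof(Triple) == alignof(int).
static_assert(sizeof(Triple) > alignof(Triple),
              "size and alignment differ for this struct");

// Returns true if address p is a multiple of `alignment`
// (which must be a non-zero power of two).
inline bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

Calling is_aligned(&obj, alignof(Triple)) on any properly created Triple object should return true, while sizeof(Triple) tells you how far apart two neighbouring array elements are.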

Garnishee answered 8/7, 2012 at 21:48 Comment(4)

Extra point - although x86 will do unaligned reads and writes (slowly but correctly) for most things, some architectures require all operations to be aligned, and even in x86 there's some special cases that must be aligned (SIMD instructions, I think). – Picco 8/7, 2012 at 21:56

@Andrea - If this or any other answer correctly answers your question, consider marking it correct. (Of course this is purely your choice. Don't be forced into accepting answers because people say so (e.g. like I'm saying now)) :) – Laius 9/7, 2012 at 8:46

@Steve314: IIRC, SIMD instructions (at least the floating-point ones) have two variants --- aligned and unaligned. I may be wrong, though. As for architectures requiring all operations to be aligned --- most require that (including the popular MIPS and ARM); x86 being the exception here. In fact, the C standard states that unaligned access is UB. – Blinders 11/2, 2015 at 18:10

"Structures will also be aligned"-> which means there will be unused memory between 2 struct to maintain alignment requirements. This space could then be used by the compiler to place global variables? – Eatage 15/2, 2018 at 2:54

For the answers provided, there seems to be some confusion about what alignment actually is. The confusion probably arises because there are 2 kinds of alignment.

1. Member alignment

This is a quantitative measure that spells out how large an instance is in number of bytes for a specific ordering of the members within the structure/class type. Generally, compilers can compact structure/class instances if the members are ordered by their byte-size in descending order (i.e. largest members first, smallest members last) within the structure. Consider:

struct A
{
  char c; float f; short s;
};

struct B
{
  float f; short s; char c;
};

Both structures contain exactly the same information. For the sake of this example, the float type takes 4 bytes, the short type takes 2 bytes and the char type takes 1 byte. However, the first structure A has its members in arbitrary order, while the second structure B orders its members by byte size (this may differ on certain architectures; I'm assuming an x86 Intel CPU architecture with 4-byte alignment in this example). Now consider the size of the structures:

printf("size of A: %zu", sizeof(A)); // size of A: 12
printf("size of B: %zu", sizeof(B)); // size of B: 8

If you expected the size to be 7 bytes, you were assuming that the members are packed into the structure using a 1-byte alignment. While some compilers allow this, in general most compilers use 4-byte or even 8-byte alignments for historic reasons (most CPUs work with DWORD (double-word) or QWORD (quad-word) general purpose registers).

There are 2 padding mechanisms at work to achieve the packing.

First, each member that has a byte size smaller than the byte-alignment is 'merged' with the next member(s) if the resulting byte size is smaller than or equal to the byte-alignment. In structure B, members s and c can be merged in this way; their combined size is 2 bytes for s + 1 byte for c == 3 bytes <= 4-byte alignment. For structure A, no such merging can occur, and each member effectively consumes 4 bytes in the structure's packing.
Second, the total size of the structure is padded so that the next structure can start at the alignment boundary. In example B the total number of bytes would be 7. The next 4-byte boundary lies at byte 8, hence the structure is padded with 1 byte to allow array allocations as a tight sequence of instances.
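The two padding mechanisms can be observed directly with offsetof. The following is only a sketch: Unordered and Ordered mirror structures A and B above under names I picked, and padding_bytes and kMemberBytes are hypothetical helpers. On a typical x86 build I would expect sizes of 12 and 8 as quoted above, but the exact numbers depend on the ABI.

```cpp
#include <cassert>
#include <cstddef>  // offsetof, std::size_t

// Same members as A and B above, under illustrative names.
struct Unordered { char c; float f; short s; };
struct Ordered   { float f; short s; char c; };

// Sum of the raw member sizes, without any padding.
constexpr std::size_t kMemberBytes =
    sizeof(char) + sizeof(float) + sizeof(short);

// Number of padding bytes the compiler added to a layout:
// total size minus the bytes actually occupied by members.
inline std::size_t padding_bytes(std::size_t total_size,
                                 std::size_t member_bytes) {
    assert(total_size >= member_bytes);
    return total_size - member_bytes;
}
```

For example, padding_bytes(sizeof(Unordered), kMemberBytes) reports the padding in the first layout, and offsetof(Unordered, f) shows where the padding before f ends. Both values come straight from the compiler, so they reflect whatever ABI you build for.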

Note that Visual C++ / GCC allow different packing alignments of 1 byte, 2 bytes and higher powers of 2. Understand that this works against your compiler's ability to produce optimal code for your architecture. Indeed, in the following example, each byte would be read as a single byte using a single-byte instruction for each read operation. In practice, the hardware would still fetch the entire memory line that contains each byte read into the cache, and execute the instruction 4 times, even if the 4 bytes sit in the same DWORD and could be loaded into a CPU register in 1 instruction.

#pragma pack(push,1)
struct Bad
{
  char a,b,c,d;
};
#pragma pack(pop)
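To see what the pragma changes, a struct with mixed member sizes is more telling than four chars. This is a sketch with made-up names (Natural, Packed, read_value), assuming a compiler that supports #pragma pack in the usual way, as GCC, Clang and Visual C++ do.

```cpp
#include <cassert>
#include <cstddef>  // offsetof

// Default layout: the compiler may pad after `tag` so that `value`
// lands on an int boundary.
struct Natural { char tag; int value; };

#pragma pack(push, 1)
// Packed layout: no padding, so `value` may sit at a misaligned offset.
struct Packed { char tag; int value; };
#pragma pack(pop)

static_assert(alignof(Packed) == 1, "packing lowers the alignment to 1");
static_assert(sizeof(Packed) == sizeof(char) + sizeof(int),
              "packed size is the plain sum of the member sizes");
static_assert(sizeof(Natural) >= sizeof(Packed),
              "the natural layout is never smaller");

// Reading the member through the struct is fine: the compiler knows the
// field may be misaligned and emits suitable code for the access.
inline int read_value(const Packed& p) {
    assert(offsetof(Packed, value) == sizeof(char));  // no padding before value
    return p.value;
}
```

What you should avoid is taking the address of a packed member and using it as an ordinary int*, because that pointer no longer carries the information that the field may be misaligned.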

2. Allocation alignment

This is closely related to the 2nd padding mechanism explained in the previous section. However, allocation alignments can be specified in variants of the malloc() / memalloc() allocation functions, e.g. std::aligned_alloc(). Hence, it is possible to allocate an object at a different (typically a higher power of 2) alignment boundary than the structure/object type's byte-alignment suggests.

size_t blockAlignment = 4*1024;  // 4K page block alignment
void* block = std::aligned_alloc(blockAlignment, sizeof(T) * count);

The code will place the block of count instances of type T at an address that is a multiple of 4096.

The reasons for using such allocation alignments are again purely architectural. For instance, reading and writing blocks from page-aligned addresses is faster because the range of addresses fits nicely into the cache layers. Ranges that are split over different 'pages' thrash the cache when crossing the page boundary. Different media (bus architectures) have different access patterns and may benefit from different alignments. Generally, alignments of 4, 16, 32 and 64 K page sizes are not uncommon.

Note that the language version and platform usually provide their own variant of such aligned allocation functions. E.g. the POSIX (Unix/Linux) posix_memalign() function returns the memory through its memptr argument and returns a non-zero error value on failure.

int posix_memalign(void **memptr, size_t alignment, size_t size); // POSIX (Linux/UX)
void *aligned_alloc(size_t alignment, size_t size); // C11
void *std::aligned_alloc(size_t alignment, size_t size); // C++17
void *_aligned_malloc(size_t size, size_t alignment); // Microsoft VS2019
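Since the functions listed above differ per platform, one option that stays within standard C++17 is the aligned form of operator new. This is a sketch, not the approach of the answer itself; allocate_aligned and free_aligned are hypothetical wrapper names, and it assumes a C++17 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>  // std::align_val_t, aligned operator new/delete

// Allocates `bytes` bytes at an address that is a multiple of `alignment`
// (a non-zero power of two). Throws std::bad_alloc on failure.
inline void* allocate_aligned(std::size_t bytes, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ::operator new(bytes, static_cast<std::align_val_t>(alignment));
}

// Releases memory obtained from allocate_aligned.
// The same alignment value must be passed again.
inline void free_aligned(void* p, std::size_t alignment) noexcept {
    ::operator delete(p, static_cast<std::align_val_t>(alignment));
}
```

A call such as allocate_aligned(sizeof(T) * count, 4096) would give a page-aligned block on systems with 4K pages. Memory obtained this way must go back through the matching aligned operator delete, not through free().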

Abyssinia answered 1/12, 2016 at 11:14 Comment(5)

Do you mean calloc? – Cerotype 12/4, 2018 at 8:30

Yikes! No, calloc() does not do alignment like that. The return value above will be aligned for any type (most likely an alignment of 8 or 16 or 64 on modern CPUs), and this is completely independent of the second parameter. What you're actually doing above is allocating room for count * 4 * 1024 instances of T, e.g., 4096 times as many as you presumably want. Remember that calloc(x, y) allocates the exact same amount of memory malloc(x * y). The second parameter of calloc is not for specifying alignment!!! – Overlord 27/7, 2019 at 16:15

Thanks @ToddLehman for pointing this out. Yes It's wrong and I should have double checked. On Windows, Microsoft has _aligned_malloc() while the C++ reference indicates aligned_alloc() and C++17 has std::aligned_alloc (interestingly, the alignment and size arguments are swapped around in the latter 2) I was originally aiming for a variant that also works pre C++11. – Abyssinia 28/7, 2019 at 22:3

@Abyssinia — Cool. BTW, on the Unix side, there's also posix_memalign as an alternative to malloc. – Overlord 29/7, 2019 at 16:58

Thanks I added the reference to posix_memalign. – Abyssinia 31/7, 2019 at 10:15

The two operators do fundamentally different things. sizeof gives the size of a type (how much memory it takes), whereas alignof gives the number of bytes a type must be aligned to. It just so happens that the primitives you tested have an alignment requirement equal to their size (which makes sense if you think about it).

Think about what happens if you have a struct instead:

struct Foo {
     int a;
     float b;
     char c;
};

alignof(Foo) will return 4.

Knut answered 8/7, 2012 at 21:45 Comment(1)

@Knut you said alignof(Foo) will return 4. But it's depending on the target ABI. So this could be true on ia32 (x86) but not on ARM, MIPS, PowerPC, etc. – Sims 15/1, 2013 at 10:3

Old question (although not marked as answered..) but thought this example makes the difference a bit more explicit in addition to Christian Stieber's answer. Also Meluha's answer contains an error as sizeof(S) output is 16 not 12.

// c has to occupy 8 bytes so that d (whose size is 8) starts on an 8-byte boundary
//            | 8 bytes |  | 8 bytes  |    | 8 bytes |
struct Bad  {   char c;      double d;       int i;     }; 
cout << alignof(Bad) << " " << sizeof(Bad) << endl;   // 8 24

//             | 8 bytes |   |   8 bytes    |    
struct Good {   double d;     int i; char c;          };
cout << alignof(Good) << " " << sizeof(Good) << endl; // 8 16

It also demonstrates that it is best to order members by size, largest first (double in this case), as the other members are constrained by that member.

Resh answered 11/12, 2014 at 12:52 Comment(1)

This is indirectly a really good tip! Thank you. – Staggard 4/10, 2023 at 12:28

The alignof value is the same as the value for sizeof for basic types.

The difference lies in user-defined data types such as structs; for example:

typedef struct { int a; double b; } S;
//cout<<alignof(S);                              output: 8;
//cout<<sizeof(S);                               output: 12;

hence the sizeof value is the total size required for the given data type; and alignof value is the alignment requirement of the largest element in the structure.

Use of alignof : allocate memory on a particular alignment boundary.
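As a small illustration of that use, alignas together with sizeof can reserve raw storage that is suitable for placement new. This is a sketch only: Storage and construct_in are names I invented, and S repeats the struct from the example above.

```cpp
#include <cassert>
#include <cstdint>
#include <new>  // placement new

struct S { int a; double b; };

// Raw byte storage that is large enough for one S and aligned like an S.
struct Storage {
    alignas(S) unsigned char bytes[sizeof(S)];
};

// Constructs an S inside the storage and returns a pointer to it.
// S is trivially destructible, so no explicit destructor call is needed.
inline S* construct_in(Storage& st, int a, double b) {
    // The alignas specifier guarantees this; the assert just documents it.
    assert(reinterpret_cast<std::uintptr_t>(st.bytes) % alignof(S) == 0);
    return ::new (static_cast<void*>(st.bytes)) S{a, b};
}
```

Without the alignas specifier the byte array would only be guaranteed an alignment of 1, and constructing an S in it could place the double at a misaligned address.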

Feudalism answered 28/8, 2012 at 7:24 Comment(1)

The example is wrong, as sizeof always follows the constraint sizeof(Type) % alignof(Type) == 0. If the alignment is indeed 8 bytes, then the size is at least 16 bytes. That padding is guaranteed to happen. – Carriole 4/7, 2019 at 14:34

What's the difference between sizeof and alignof?

Both are operators. Both yield a value of type size_t.

sizeof is the size in "bytes" of the object - the memory space needed to encode it.

alignof is the address alignment requirement in "bytes" of the object. A value of 1 implies no alignment restriction. 2 implies the address should be an even address. 4 implies the address should be a quad address. etc.

When an object reference is attempted that does not meet the alignment requirement, the result is undefined behavior.
Examples:
. The access may work, only slower.
. The access attempt may kill the program.

// Assume alignof(int) --> 2
char a[4];          // It is not known whether `a` begins on an odd or even address
int *p = (int *)a;  // conversion may fail
int d = *p;  // *p is UB.
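A common way to sidestep this undefined behavior is to copy the bytes instead of dereferencing a converted pointer. The sketch below uses hypothetical helper names (load_int, store_int) and relies only on std::memcpy, which has no alignment requirement.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>  // std::memcpy

// Writes `value` into the buffer starting at byte `offset`,
// whatever the alignment of that address is.
inline void store_int(unsigned char* buffer, std::size_t buffer_size,
                      std::size_t offset, int value) {
    assert(offset + sizeof value <= buffer_size);
    std::memcpy(buffer + offset, &value, sizeof value);
}

// Reads an int back from byte `offset`; no aligned access is required
// because the bytes are copied into a properly aligned local variable.
inline int load_int(const unsigned char* buffer, std::size_t buffer_size,
                    std::size_t offset) {
    assert(offset + sizeof(int) <= buffer_size);
    int value;
    std::memcpy(&value, buffer + offset, sizeof value);
    return value;
}
```

Compilers generally turn such a small fixed-size memcpy into a plain load or store where the target allows it, so this usually costs nothing compared to the cast.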

Example extension and output of OP's code.

  SIZEOF_ALIGNOF(double);
  SIZEOF_ALIGNOF(complex double);
  SIZEOF_ALIGNOF(div_t);
  SIZEOF_ALIGNOF(max_align_t);

8/8
16/8
8/4
32/16

Spray answered 27/4, 2019 at 17:46 Comment(0)

Data is arranged in a specific order in memory to make it easier for the CPU to access it.

alignof tells at which addresses an object may be located in memory
sizeof says how big the object is, taking into account where its parts are located

Basic types have the same result for alignof and sizeof since they have as part only themselves: eg. short takes 2 bytes and starts from an address multiple of 2 (good for CPU). For user-defined data types look at their parts:

class Foo {
  char c1; // first member (imagine adr=0, which is multiple of 1, CPU is happy)
  int i; // adr=1, but adr has to be a multiple of 4, so reserve helper bytes for CPU (padding) before `i` = 3bytes. **so adr=4**
  short s; // adr=8 (adr of `i` + its size), 8 % 2 == 0, good
  double d; // adr=10 (adr of `s` + its size), 10 % 8 != 0, need 6 more, **so adr=16**
  char c2; // adr=24 (adr of `d` + its size), 24 % 1 == 0, ok

  // and surprise, after c2 padding is also needed! why?
  // imagine you have an array (which is sequentially data in memory) of 2 elements,
  // what is adr `d` of the second element? (let me use index=2 here)  
  // arr[2].c1 adr=25, arr[2].i adr=29 (?do you remember about padding=3bytes?, and this field already feels bad), ..., arr[2].d adr=41
  // must be padding after `c2` to satisfy everyone
  // calc p (padding), where (adr`c2` + size`c2` + p) % max(alignof of every member type) = (24+1+p) % 8 == 0; p=7 (minimum value that satisfies)   
  // YEAH! Now all members of all elements will be aligned!
};

As said above alignof(type) is a preference of CPU where to put data, in the example: alignof(Foo)==alignof(double)=8 [otherwise some member will be sad].
And sizeof(Foo)==(sum of every member size + paddings)=32
ps. allowable simplifications are made in favor of understanding the idea:)
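The address walk in the comments above can be written as a tiny layout calculator. This is a simplified sketch under the same simplifications as the answer: Member, Layout, round_up and compute_layout are made-up names, and real compilers follow their platform ABI, which may differ in details.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Member { std::size_t size; std::size_t align; };

struct Layout {
    std::vector<std::size_t> offsets;  // offset of each member
    std::size_t size;                  // total size including tail padding
    std::size_t align;                 // largest member alignment
};

// Rounds n up to the next multiple of align.
inline std::size_t round_up(std::size_t n, std::size_t align) {
    assert(align != 0);
    return (n + align - 1) / align * align;
}

// Places members in declaration order, padding before each one so that
// it starts at a multiple of its alignment, then pads the end so that
// the total size is a multiple of the largest alignment.
inline Layout compute_layout(const std::vector<Member>& members) {
    Layout out{{}, 0, 1};
    std::size_t cursor = 0;
    for (const Member& m : members) {
        cursor = round_up(cursor, m.align);
        out.offsets.push_back(cursor);
        cursor += m.size;
        if (m.align > out.align) out.align = m.align;
    }
    out.size = round_up(cursor, out.align);
    return out;
}
```

Feeding it the members of Foo as {1,1}, {4,4}, {2,2}, {8,8}, {1,1} gives the offsets 0, 4, 8, 16, 24, a size of 32 and an alignment of 8, which matches the walk-through above.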

Dieselelectric answered 26/6, 2021 at 11:29 Comment(0)

The sizeof operator gives you the size in bytes of an actual type or instance of a type.

The alignof operator gives you the alignment in bytes required for any instance of the given type.

Merras answered 8/7, 2012 at 21:43 Comment(0)
