Why do some types (e.g. Float80) have a memory alignment bigger than word size?

To make it specific, I only want to know why, on my 64-bit Mac, the Swift compiler says the alignment of some types like Float80 is 16. To check the memory alignment requirement of a type, I use the alignof function.

sizeof(Float80) // ~> 16 bytes, it only needs 10 bytes, but because of hardware design decisions it has to be a power of 2
strideof(Float80) // ~> 16 bytes, clear because it is exact on a power of 2, struct types with Float80 in it, can be bigger
alignof(Float80) // ~> 16 bytes, why not 8 bytes, like String ?

I understand that aligning types that are smaller than or equal to the word size is beneficial.

sizeof(String) // ~> 24 bytes, clear because 24 is multiple of 8
strideof(String) // ~> 24 bytes, clear because 24 is multiple of 8
alignof(String) // ~> 8 bytes, clear because something greater or equal to 8 bytes should align to 8 bytes

Many types with a bigger memory footprint, like String (with a size of 24), have a memory alignment requirement of 8 bytes. I expect that this is the width of the CPU/RAM bus in use, because I have a 64-bit Mac and OS. I check the size of a type without trailing padding using the sizeof function, and with trailing padding using the strideof function. (strideof is more helpful for arrays of structs: Swift adds bytes to the end of each element to reach the next multiple of the alignment requirement.)
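In current Swift versions the free functions sizeof, strideof and alignof have been replaced by the MemoryLayout type. The following is a minimal sketch of the size/stride relationship described above; the struct Sample and the helper roundUp are hypothetical names introduced only for this illustration, and the Float80 line is guarded because that type only exists on some x86 targets.

```swift
// Hypothetical struct for illustration: an 8-byte Int64 followed by a 1-byte Int8.
struct Sample {
    var wide: Int64 = 0
    var narrow: Int8 = 0
}

// Round `value` up to the next multiple of `alignment` (alignment must be > 0).
func roundUp(_ value: Int, toMultipleOf alignment: Int) -> Int {
    return (value + alignment - 1) / alignment * alignment
}

let sampleSize = MemoryLayout<Sample>.size            // bytes used, without trailing padding
let sampleStride = MemoryLayout<Sample>.stride        // distance between consecutive array elements
let sampleAlignment = MemoryLayout<Sample>.alignment  // addresses must be a multiple of this

print(sampleSize, sampleStride, sampleAlignment)

// The stride is the size rounded up to the next multiple of the alignment.
print(sampleStride == roundUp(sampleSize, toMultipleOf: sampleAlignment))

#if arch(x86_64) && !os(Windows) && !os(Android)
// Float80 is only available on some x86 targets, hence the guard.
print(MemoryLayout<Float80>.size, MemoryLayout<Float80>.stride, MemoryLayout<Float80>.alignment)
#endif
```

The second print shows why stride matters for arrays: every element starts at a multiple of the alignment only if the spacing between elements is itself such a multiple.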

I understand that padding is necessary for types smaller than or equal to 8 bytes.

But I don't understand why it is advantageous to have a memory alignment requirement bigger than 8 bytes on my 64-bit Mac.

Float80 needs 80 bits for its value, which is 10 bytes, plus 6 filler bytes.

Here is an image to make clearer what I mean. The green positions are allowed for a Float80, the red positions are not. The memory is shown in 8-byte chunks in this picture.

[Image: byte and word Float80 memory alignment in Swift]

Decidua answered 6/3, 2015 at 11:56 Comment(4)
Your picture seems to be misleading. If one row stands for 8 bytes then a Float80 (including padding) can only take two rows, not 16. And if one row stands for 1 byte then Float80 in the 3rd column is not on an aligned address. – Chanticleer
@MartinR you are absolutely correct, changed the picture. – Decidua
24 is an exact power of 2?? – Typewriter
@SlippD.Thompson You are right, that was a mistake in that line. Thanks for the hint. I corrected it. 8 is the word size and 24 is a multiple of 8 is what was intended. – Decidua
D
4

With the help of Martin R's link and the hint that it is a processor design decision, I found the reason why.

Cache lines.

Cache lines are the small fixed-size blocks in which the processor moves data between RAM and its cache. On 64-bit Intel processors a cache line is typically 64 bytes, which is a multiple of 16.

As seen in the picture in the question, there is a difference between the dotted and the bold lines. The bold lines mark the 16-byte boundaries, and every cache line boundary falls on one of them. You don't want to load two cache lines for one value if a little extra memory avoids it. If a 16-byte type is only allowed to start at a multiple of 16, it can never straddle a cache line boundary, so one cache line read is always enough. With an alignment of only 8, a 16-byte value could start 8 bytes before the end of a cache line and spill into the next one. As you can see in the picture, only the red blocks cross a bold line (so they are not allowed by design).
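The boundary-crossing argument can be checked with a little arithmetic. This is only a sketch: crossesCacheLine and straddlingPlacements are hypothetical helper names, and the line size is a parameter (64 bytes is a typical value on x86-64 processors, so check your own hardware).

```swift
// True if the byte range [address, address + size) touches more than one cache line.
func crossesCacheLine(address: Int, size: Int, lineSize: Int) -> Bool {
    let firstLine = address / lineSize
    let lastLine = (address + size - 1) / lineSize
    return firstLine != lastLine
}

// Count the start addresses in the first 1024 bytes, at the given alignment,
// at which a value of `size` bytes would straddle a cache line boundary.
func straddlingPlacements(size: Int, alignment: Int, lineSize: Int) -> Int {
    return stride(from: 0, to: 1024, by: alignment)
        .filter { crossesCacheLine(address: $0, size: size, lineSize: lineSize) }
        .count
}

let lineSize = 64  // assumed cache line size in bytes

// A 16-byte value aligned to 16 never straddles a line.
print(straddlingPlacements(size: 16, alignment: 16, lineSize: lineSize))  // 0

// Aligned to only 8, it straddles once per cache line (at offset 56 within each line).
print(straddlingPlacements(size: 16, alignment: 8, lineSize: lineSize))   // 16
```

The same holds for any line size that is a multiple of 16: a 16-aligned 16-byte value always fits inside a single line.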

See the link attached for more info.

Cache effects

Decidua answered 8/3, 2015 at 8:51 Comment(0)
C
7

All "primitive data types" (the term may be wrong, what I mean is the data types that are used by the processor) have a "natural boundary", and the compiler will align them in memory accordingly. The alignment depends on the processor (e.g. x86 or ARM) and the programming environment (e.g. 32-bit vs 64-bit). Some processors allow misaligned data (perhaps at a lower speed), and some do not allow it.

For the 64-Bit Intel architecture, the requirements are listed in Data Alignment when Migrating to 64-Bit Intel® Architecture:

The 64-bit environment, however, imposes more-stringent requirements on data items. Misaligned objects cause program exceptions.
[...]

  • Align 8-bit data at any address
  • Align 16-bit data to be contained within an aligned four-byte word
  • Align 32-bit data so that its base address is a multiple of four
  • Align 64-bit data so that its base address is a multiple of eight
  • Align 80-bit data so that its base address is a multiple of sixteen
  • Align 128-bit data so that its base address is a multiple of sixteen

So the alignment is not necessarily equal to the "word size", it can be less or more. Float80 corresponds to the "Extended Precision" floating point type of the x86 processor, and its alignment is required to be 16 bytes.

Composite types like a C struct are laid out in memory such that each member is on its natural boundary (and padding is inserted in between if necessary). The alignment of the struct itself is the largest alignment among its members.

The memory layout of a Swift struct is not documented officially (as far as I know) but it is probably similar to that of a C struct. Here is a simple example:
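The layout rule can be sketched as a small calculator. This is an illustration of the rule as stated here, not the compiler's actual implementation; Member, Layout, alignUp and computeLayout are hypothetical names, and the example sizes for a Bool-like and a Float80-like member are the figures from this discussion.

```swift
// One member of a hypothetical struct: its size and required alignment in bytes.
struct Member {
    let name: String
    let size: Int
    let alignment: Int
}

struct Layout {
    var offsets: [String: Int] = [:]
    var size = 0        // bytes up to the end of the last member (no trailing padding)
    var alignment = 1   // largest alignment among the members
    var stride = 0      // size rounded up to a multiple of the alignment
}

// Round `value` up to the next multiple of `alignment` (alignment must be > 0).
func alignUp(_ value: Int, to alignment: Int) -> Int {
    return (value + alignment - 1) / alignment * alignment
}

func computeLayout(_ members: [Member]) -> Layout {
    var layout = Layout()
    var offset = 0
    for member in members {
        // Insert padding so the member starts on its natural boundary.
        offset = alignUp(offset, to: member.alignment)
        layout.offsets[member.name] = offset
        offset += member.size
        layout.alignment = max(layout.alignment, member.alignment)
    }
    layout.size = offset
    layout.stride = max(1, alignUp(offset, to: layout.alignment))
    return layout
}

let layout = computeLayout([
    Member(name: "flag", size: 1, alignment: 1),     // like a Bool
    Member(name: "value", size: 16, alignment: 16),  // like Float80 with its padding
])
print(layout.offsets["value"]!, layout.size, layout.alignment, layout.stride)  // 16 32 16 32
```

In this model a 1-byte member followed by a 16-byte-aligned member costs 15 bytes of padding. C's sizeof corresponds to the stride here, since it includes trailing padding.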

struct AStruct {
    var a = Int32(0)
    var b = Int8(0)
    var c = Int16(0)
    var d = Int8(0)
}
println(sizeof(AStruct))     // 9
println(alignof(AStruct))    // 4
println(strideof(AStruct))   // 12

The memory layout (probably) is (* = padding):

aaaab*ccd

Here the alignment is 4 because that is the required alignment for Int32. The struct occupies 9 bytes, but the "stride" is 12: This guarantees that in an array of structs all elements satisfy the same alignments.
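The guessed layout can be checked in recent Swift versions with MemoryLayout, which replaced sizeof, alignof and strideof. This is a sketch that assumes the compiler keeps stored properties in declaration order, which Swift does not formally guarantee for structs.

```swift
struct AStruct {
    var a = Int32(0)
    var b = Int8(0)
    var c = Int16(0)
    var d = Int8(0)
}

// offset(of:) returns an optional because not every key path refers to
// directly addressable stored storage; it is non-nil for these stored properties.
let offsetA = MemoryLayout<AStruct>.offset(of: \AStruct.a)
let offsetB = MemoryLayout<AStruct>.offset(of: \AStruct.b)
let offsetC = MemoryLayout<AStruct>.offset(of: \AStruct.c)
let offsetD = MemoryLayout<AStruct>.offset(of: \AStruct.d)

// Byte offsets of a, b, c, d; the layout aaaab*ccd corresponds to 0, 4, 6, 8.
print(offsetA ?? -1, offsetB ?? -1, offsetC ?? -1, offsetD ?? -1)

// Size, alignment and stride, matching the 9 / 4 / 12 reported above.
print(MemoryLayout<AStruct>.size,
      MemoryLayout<AStruct>.alignment,
      MemoryLayout<AStruct>.stride)
```

The gap between b (offset 4) and c (offset 6) is the single padding byte needed to put the Int16 on a 2-byte boundary.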

(Note that the Swift strideof() corresponds to the C sizeof operator; this is explained in https://devforums.apple.com/message/1086107#1086107.)

The declaration of a Swift String is shown as

struct String {
    init()
}

but the actual members are not visible to us mere mortals. In the debugger it looks like this:

[Image: debugger view of a String value]

which indicates that its members are a pointer, an unsigned word and another pointer. All these types have a size and alignment of 8 bytes on 64-bit. This would explain the size (24 bytes) and alignment (8 bytes) of struct String.

Chanticleer answered 7/3, 2015 at 16:46 Comment(5)
You did not answer my question. I know the required alignment; the question is why I need an alignment bigger than 8 bytes on a 64-bit machine. I don't see the advantage. Specifically, why does Float80 (10 + 6 bytes in size) have an alignment requirement of 16 bytes and not 8 bytes? A String (24 bytes) has a requirement of 8 bytes. A Bool with a size of 1 byte adds 15 bytes of padding to a struct when a Float80 follows it; why aren't 7 bytes enough? The minimum is the word size because of read performance (examples are on Wikipedia etc.), but it should never be bigger than the word size. – Decidua
@ViktorLexington: Then I misunderstood your question. I guess the short answer is "because Intel designed it like that". If your question is about processor design then I cannot help. – Note that String is a Struct and therefore different from a "primitive" type. – Chanticleer
Then, with String being a struct made of Chars and Ints, each with an alignment of 8 bytes, it makes sense that the String struct also has an alignment of 8. Good to know. – Decidua
@ViktorLexington: I have added some info about the String type. – Chanticleer
Thanks for pointing me in the right direction, found the answer :-) – Decidua
