Is it really well defined to check pointer alignment using the pointer's integer value?

Asked 13/9, 2023 at 20:35 Answered 13/9, 2023 at 21:22

Solved c++ pointers c++17 language-lawyer undefined-behavior

Is there a guaranteed (not implementation-defined!) way to check for pointer alignment?

The most common way to query pointer alignment seems to be:

convert to integer
check whether the integer is a multiple of the alignment:

bool is_aligned(void const *ptr, size_t alignment) {
  return reinterpret_cast<intptr_t>(ptr) % alignment == 0;
}

For example, this is how Boost.Align checks alignment.

However, in C++17 at least, basic.compound#3.4 says:

The value representation of pointer types is implementation-defined.

Furthermore, expr.reinterpret.cast#4 says:

A pointer can be explicitly converted to any integral type large enough to hold it. The mapping function is implementation-defined.

It seems it'd be legal to (for example) represent pointers as integers with reversed bit order, in which case the simple arithmetic above would not work.

AFAICT, the only guaranteed way that we can check alignment is using std::align, which if read liberally (I'm sure this is an abuse of std::align) could be used like so:

bool is_aligned(void const *ptr, size_t alignment) {
  void *mut_ptr = const_cast<void *>(ptr);
  size_t space = alignment;
  return std::align(alignment, alignment, mut_ptr, space) == ptr;
}

However, on the overwhelming majority of platforms, pointers are just fancy integers, or I'd expect Boost to have a code path for it. Are there any platforms (other than ds9k :P) where pointers aren't just fancy integers?

What would we lose by standardizing "reinterpret_cast<intptr_t>(ptr) shall have the same object representation as ptr" or "reinterpret_cast<intptr_t>(ptr) % alignment == 0 if ptr is aligned to alignment"? It seems neither of those would exclude segmented memory, or pointers with a trap representation, which are two atypical cases I can think of.

If nothing else, is there a reason not to standardize std::is_aligned()?

EDIT: my specific non-toy use case is that I have std::byte const *ptr;. The only thing I know about ptr is that it contains the object representations of 1024 ints. How can I check whether it's safe to reinterpret_cast ptr or use assume_aligned on it?

Threemaster answered 13/9, 2023 at 20:35 Comment(14)

It seems it'd be legal to (for example) represent pointers as integers with reversed bit order Nice! That is going into my DS9K implementation! (Better than segmented memory pointers. But I'm still going to keep banked memory pointers.) – Bertine 13/9, 2023 at 20:38

Is this extra work necessary? Can I just use operator & or operator % to determine alignment? My justification is that if you are using pointers, and need to know alignment, you are probably not aiming for portability and can use platform specifics. – Lustrate 13/9, 2023 at 20:45

For example, for an optimized memcpy(uint8_t *, uint8_t*), I can copy 4 bytes at a time if both pointers are 32-bit aligned. – Lustrate 13/9, 2023 at 20:46

Not attempting to answer the question (thus posting as a comment rather than an answer), but just a note that alignof, alignas, std::aligned_storage and std::aligned_union are all closely related to the question. – Assyria 13/9, 2023 at 20:53

Lovely question! I think your std::align solution may have a problem in a situation where mut_ptr + space would be outside the original memory segment and aligning the pointer would fall off the edge. Possibly... Hypothetically... – Quadrangular 13/9, 2023 at 20:57

Sorry, I should've used reinterpret_cast. I've added that – Threemaster 13/9, 2023 at 21:4

"The value representation of pointer types is implementation-defined." Implementation-defined does not mean undefined. If Boost were ported to a different implementation that defines the value representation of pointers differently, presumably that code would also be changed. – Pellerin 13/9, 2023 at 21:10

@Eljay: You don't need a DeathStation 9000 compiler for that. Bit-reversed memory access patterns are used in FFT, and some DSP architectures have direct support for bit-reversed pointers. At least dsPIC does – Zoophilia 13/9, 2023 at 21:56

@BenVoigt A non-DS9000 compiler ought to transform the pointer representation to be an integer of the expected order – Silverstein 13/9, 2023 at 22:37

This looks like one of those purely theoretical issues that don't get addressed because they don't get in the way, like the previous lack of std::launder. Casting to uintptr_t will probably work on every platform. – Georginageorgine 14/9, 2023 at 6:58

@Eljay, I used to have a representation where pointers and integers are XOR 0x20000000, because address 0 is valid RAM. – Chyle 14/9, 2023 at 11:35

@M.M: Although it's likely that such an architecture provides a way to do the bit-reversal efficiently when making a copy of the pointer, it's still an extra step that doesn't support any of the behaviors specified in the language. – Zoophilia 14/9, 2023 at 14:23

The Standard I have access to says: 'A pointer can be explicitly converted to any integral type large enough to hold all values of its type. The mapping function is implementation-defined. [Note: It is intended to be unsurprising to those who know the addressing structure of the underlying machine. — end note]' While that's an intent and not a requirement it would be unhelpful if you can't infer memory alignment from the low bits of an address (as arithmetic type) on platforms where alignment is a thing. So not guaranteed but good implementations would need a good reason not to. – Chickweed 15/9, 2023 at 12:20

Interestingly, the concept of alignment in C++ does not mean that addresses must be multiples of some size, but only restricts the differences between addresses. So if I am right, checking pointer alignment makes no sense. en.cppreference.com/w/cpp/language/object#Alignment – Assassinate 19/9, 2023 at 20:3

It seems it'd be legal to (for example) represent pointers as integers with reversed bit order, in which case the simple arithmetic above would not work.

Yes, nothing forbids that.
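To make the failure concrete, here is a toy model rather than any real implementation: to_integer is a hypothetical stand-in for an implementation-defined mapping that reverses the 64 address bits, and addresses are modelled as plain integers. Under that assumption the usual modulo check goes wrong in both directions.

```cpp
#include <cstdint>

// Hypothetical mapping: reverse the bit order of a 64-bit address.
// No real compiler is claimed to do this; it only models what the
// "implementation-defined" wording would permit.
constexpr std::uint64_t to_integer(std::uint64_t address) {
    std::uint64_t result = 0;
    for (int bit = 0; bit < 64; ++bit) {
        result = (result << 1) | ((address >> bit) & 1u);
    }
    return result;
}

// The common check, applied to the mapped integer instead of the address.
constexpr bool naive_is_aligned(std::uint64_t address, std::uint64_t alignment) {
    return to_integer(address) % alignment == 0;
}
```

In this model, address 1 is reported as 16-byte aligned (its only set bit moves to the top), while address 0xF000000000000000, which really is a multiple of 16, is reported as misaligned.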

Even worse, the mapping can be type-dependent. The only implementation-independent requirement is that a round-trip reinterpret_cast back to the original pointer type yields the original pointer value. And a type-dependent mapping may not be far-fetched: for types with high alignment you don't really need to store the zero bits at the end, so the pointer type might be smaller. Then a simple reinterpret_cast mapping might not provide any information that could be used to check alignment for these pointer types. See e.g. Do all pointers have the same size in C++? for discussion on pointer sizes and Exotic architectures the standards committees care about for exotic architectures, at least with regards to C, which also includes examples for this (Cray T90).

Of course, in your use you are using void* specifically, which avoids this issue.
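As a small illustration of the one property that is portable, the sketch below checks the round trip through an integer. The helper name round_trips is made up for this example, and it assumes the implementation provides std::uintptr_t, which is optional.

```cpp
#include <cstdint>

// Hypothetical helper. Assumes std::uintptr_t exists on this implementation.
template <typename T>
bool round_trips(T* ptr) {
    // The integer value itself is implementation-defined ...
    const auto as_integer = reinterpret_cast<std::uintptr_t>(ptr);
    // ... but converting it back to the same pointer type must give ptr again.
    return reinterpret_cast<T*>(as_integer) == ptr;
}
```

Note that this says nothing about what the low bits of as_integer look like, which is exactly the gap the question is about.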

It is also possible that no integer type large enough to represent all pointer values exists, in which case intptr_t shouldn't be available and reinterpret_cast to any integer type will be ill-formed. That case is specifically considered in the C standard, so either such C implementations existed or they were considered plausible when (u)intptr_t were added to C.

However, in C++17 at least, basic.compound#3.4 says:

Note that the value representation of the pointer is irrelevant. It isn't required that reinterpret_cast leaves the bytes in memory unchanged. So the implementation-defined mapping by reinterpret_cast could be completely distinct from how integer and pointer object representations relate.

For example, on 64bit systems often the actual available address space is smaller (e.g. only 48bit). Then there are additional bits in the object representation. I could imagine the compiler/CPU using these extra bits for various purposes. Depending on that purpose the bits may or may not be part of the value representation and they may or may not be stripped or modified by reinterpret_cast. Similarly, if a different pointer type instead of void* was used, the compiler/CPU could use the bits that are guaranteed to be zero due to alignment requirements of the type for these purposes.

AFAICT the only guaranteed way that we can check alignment is using std::align, which if read liberally (I'm sure this is abuse of std::align) could be used like so:

The way you wrote it, the function has undefined behavior if the storage the input pointer points to isn't at least alignment contiguous bytes long. So the size/space arguments should probably be 1 (or 0, see LWG 2421) instead. I see nothing in the specification that would require the size to be at least as large as the alignment.
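A minimal sketch of that adjustment, passing 1 for both size and space, might look like the following. It assumes alignment is a power of two and that ptr points to at least one byte of storage, and it inherits any doubts about whether std::align is specified tightly enough to be relied on this way.

```cpp
#include <cstddef>
#include <memory>

// Sketch only: alignment must be a power of two and ptr must point to at
// least one byte of storage; otherwise the behavior is undefined.
bool is_aligned(void const* ptr, std::size_t alignment) {
    void* mut_ptr = const_cast<void*>(ptr);
    std::size_t space = 1;  // claim only a single byte of storage
    // With size == space == 1 there is no room to adjust the pointer, so
    // std::align can only succeed if mut_ptr is already suitably aligned.
    return std::align(alignment, 1, mut_ptr, space) != nullptr;
}
```

Checking the result against nullptr, rather than comparing it with ptr, avoids relying on the adjusted pointer at all.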

The specification of std::align does however seem broken to me anyway, so I'm not sure whether that is intended to work. For example, it fails to specify to which (or one-past which) object the resulting pointer points, pretending that it is possible to describe a pointer value just by the address it represents, but that's not possible since C++17.

I think it is also not in line with the definition of alignment in [basic] which doesn't assume any integer representation of addresses and only gives restrictions on when an address satisfies a supported alignment requirement. An implementation that only supports alignment 1 would e.g. be valid and then there is no definition for when an address is aligned for some other power-of-two.

If nothing else, is there a reason not to standardize std::is_aligned()?

Not that I am aware of. std::align seems to have that functionality already as discussed above. A user-friendly wrapper around it would be simple to do, assuming the specification of std::align is really as intended and supposed to be guaranteed to work on every implementation.

EDIT: my specific non-toy use case is that I have std::byte const *ptr;. The only thing I know about ptr is that it contains the object representations of 1024 ints. How can I check whether it's safe to reinterpret_cast ptr or use assume_aligned on it?

Even if you know that ptr is correctly aligned and that the std::byte array contains valid object representations for int, then it is still not guaranteed that reinterpret_cast (which must be followed by std::launder either way) will be valid. You also need to make sure that, prior to copying the int object representation into the storage, int objects have been created. That can happen implicitly, e.g. if you memcpy the object representations into the array or use std::bit_cast to obtain the filled array into which ptr points, but you can't assume it generally. Otherwise you might have an aliasing violation (and/or precondition violation of std::launder). In particular if the object representations were copied into the std::byte array by a simple per-byte assignment in a loop, then this will result in an aliasing violation.
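If making a copy is acceptable, one way to sidestep both the alignment question and the aliasing question is to memcpy the bytes into real int objects instead of casting. This is an alternative to the reinterpret_cast approach, not a way to validate it; load_ints is a hypothetical helper, and it assumes ptr points to at least N * sizeof(int) readable bytes holding valid int object representations.

```cpp
#include <array>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies N ints out of a byte buffer.
// Assumed precondition: ptr points to at least N * sizeof(int) readable
// bytes that hold valid int object representations.
template <std::size_t N>
std::array<int, N> load_ints(std::byte const* ptr) {
    std::array<int, N> out;
    // memcpy has no alignment requirement on its source, and the
    // destination is an array of genuine int objects.
    std::memcpy(out.data(), ptr, sizeof(int) * N);
    return out;
}
```

For the use case in the question this would be called as load_ints<1024>(ptr), at the cost of copying the data once.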

Corelation answered 13/9, 2023 at 21:22 Comment(4)

A practical example of modern architecture where intptr_t (and uintptr_t) are more than a memory address is CHERI. In CHERI, the pointer is 128-bits, with a 64-bits opaque value and a 64-bits memory address, hence intptr_t is 128 bits to satisfy the round-trip requirement. I expect that the lower 8 bytes of that are the memory address... but I'm not sure there's any guarantee. – Wilds 14/9, 2023 at 10:31

I do want to note that being able to represent the memory address itself in a 64-bit integer doesn't imply that the memory address is stored in the lower bits of the 128-bit integer. Of course, in practice, the goal is to make CHERI practical, and that implies trying to make it as easy as possible to integrate into existing codebases, and thus having the memory address in the lower bits is easier for everyone involved. – Wilds 14/9, 2023 at 14:20

Interesting. std::uintptr_t is 128bit there, but it is still possible to represent the pointer value in a 64bit integer, which (assuming the integers behave standard-conforming) is indeed the lower bits of the equivalent 128bit integer and pointer representation, at least on the implementation that the following compiler explorer link uses: cheri-compiler-explorer.cl.cam.ac.uk/z/aWYx8W. So the standard implementation shown in the question still works there (for supported alignments, definitively not for alignments > 2^63) – Corelation 14/9, 2023 at 14:21

Clang has a __builtin_is_aligned function for this reason: clang.llvm.org/docs/LanguageExtensions.html While I was porting the CMocka test library to support CHERI, I actually used Clang's __builtin_align_down to align a pointer value properly: gitlab.com/cmocka/cmocka/-/commit/… It's not standard C, but only Clang supports CHERI currently, so it's not too big of a deal if it only works on the Clang compiler. – Vins 18/9, 2023 at 20:23
