Is a misaligned load due to a cast undefined behavior?
Asked Answered
S

2

8

Is a misaligned load due a a cast from void* undefined behavior?


Here's what I am seeing with Clang and its sanitizers:

bufhelp.h:146:29: runtime error: load of misaligned address 0x7fff04fdd0e1 for type 'const uintptr_t' (aka 'const unsigned long'), which requires 8 byte alignment
0x7fff04fdd0e1: note: pointer points here
 00 00 00  66 66 6f 6f 62 61 72 34  32 46 4f 4f 42 41 52 31  37 66 6f 6f 62 61 72 34  33 46 4f 4f 42
              ^ 

And here's where the cast comes into play:

buf_xor(void *_dst, const void *_src1, const void *_src2, size_t len)
{
  ...
  ldst = (uintptr_t *)(void *)dst;
  lsrc1 = (const uintptr_t *)(const void *)src1;
  lsrc2 = (const uintptr_t *)(const void *)src2;

  for (; len >= sizeof(uintptr_t); len -= sizeof(uintptr_t))
    *ldst++ = *lsrc1++ ^ *lsrc2++;

  ...
}

Related, but I don't believe answers the question above:

Safar answered 6/3, 2015 at 6:17 Comment(3)
It would entirely depend on what you show into that function as parameters. Without seeing the calling code, it is not really possible to answer the question.Purree
@Purree - is that the case? Clang already alerted the address is 0x7fff04fdd0e1. Can 0xXXXXXXX1 ever be an uintptr_t* used in the loop? Or am I parsing Clang's output incorrectly?Safar
Yes but why is it XXX1? You must be passing a char pointer or something to the function. There's nothing in the code posted which would cause a misalignment, since all the pointer arithmetic you do there is based on the correct type.Purree
N
9

The conversion to a wrongly aligned pointer itself is undefined, not only a load through that pointer (C11 (n1570) 6.3.2.3 p7):

A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned [...] for the referenced type, the behavior is undefined.

The code shown also breaks strict aliasing, as the pointed-to object is unlikely to be declared as uintptr_t (the address would be correctly aligned otherwise).

To be standard conforming, unsigned char can be used instead.

If uintptr_t-sized chunks shall be copied for performance reasons, unsigned char can be used until the address is properly aligned, followed by another loop copying uintptr_t. This should be done through a union or via memcpy to avoid aliasing issues (Gcc can optimize memcpy calls out if the size is constant). The last bytes may need to be copied via unsigned char again to avoid out-of-bounds access (a read sizeof(uintptr_t)-1 bytes past the array shouldn't cause problems (Glibc does this in several places), but the write through dst may write into another object). It may help to restrict-qualify the pointers used.

Normalcy answered 6/3, 2015 at 9:0 Comment(3)
Edited the out-of-bounds access part. I shouldn't give advice to ignore UB before my first coffee...Normalcy
Fun fact: some compilers do define the behaviour for creating misaligned pointers, even when they don't define the behaviour of dereferencing. Notably Intel's SIMD intrinsics API requires creating misaligned __m128i* that can't be dereferenced directly, so any compiler supporting that API must de-facto support casting to misaligned pointers: Is `reinterpret_cast`ing between hardware SIMD vector pointer and the corresponding type an undefined behavior?Grugru
Also related: re: alternatives to memcpy as an alignment and strict-aliasing safe load / store: GNU C typedef uint64_t aliasing_unaligned_u64 __attribute__((aligned(1),may_alias)); as mentioned in Why does unaligned access to mmap'ed memory sometimes segfault on AMD64? (interesting example of misaligned loads being a problem in practice on an ISA that allows that in asm) or Why does glibc's strlen need to be so complicated to run quickly?Grugru
M
1

I think the answer to your specific question is "yes" -- although I'm not sure it's specific to the cast itself, but to unaligned pointers in general. The code internal to buf_xor() looks mostly OK to me, so I'd take a look at what the addresses are that are passed in.

It looks to me like you don't want to be calling buf_xor() with unaligned addresses. If that's not it (if you're calling but_xor() everywhere with aligned addresses), then I'd make sure that uintptr_t is defined to be the same thing (and specifically 64 bits wide, based on your output) where buf_xor() is compiled, as well as where it's called.

One final personal opinion note is that because your buf_xor() implementation requires aligned pointers as parameters on some processor/compiler implementations, you could probably save some future hassle by changing the signature to reflect that (change void * to uintptr_t *) -- or change the implementation itself to deal gracefully, 'manually', with unaligned addresses on all architectures.

Malpighiaceous answered 6/3, 2015 at 6:52 Comment(1)
This is actually GnuPG code that I am analyzing and acceptance testing. I wanted to fix the problem with unions or aligned pointers in the callers. But its done in so many places, it became easier to fix the function with a memcpy into a temporary. I'll file the bug report to alert them to the problem, though (see Issue 1881: Undefined behavior when running make check under Clang sanitizers).Safar

© 2022 - 2024 — McMap. All rights reserved.