AFAIK, there are three situations where aliasing is OK:
- Types that differ only by qualifier or signedness can alias each other.
- struct or union types can alias the types of the members contained inside them.
- Casting T* to char* and accessing the object through it is OK. (The opposite is not allowed.)
These make sense when reading simple examples from John Regehr's blog posts, but I'm not sure how to reason about aliasing-correctness for larger examples, such as malloc-like memory arrangements.
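To make sure I have the three cases right, here is a minimal sketch of how I understand them. The function and struct names are made up for illustration and do not come from any of the code discussed below.

```c
#include <stddef.h>

/* Case 1: types that differ only by signedness may alias.
   Reading an int object through an unsigned int lvalue is permitted. */
unsigned read_as_unsigned(int *p) {
    return *(unsigned *)p;
}

/* Case 2: a struct lvalue may alias objects of its member types.
   The store through *s may change the int that `member` points at,
   so the compiler has to reload *member afterwards. */
struct pair { int a; int b; };

int overwrite_and_read(struct pair *s, int *member) {
    *member = 1;
    *s = (struct pair){7, 8};
    return *member;
}

/* Case 3: any object may be inspected through a character type. */
unsigned char first_byte(const float *f) {
    return *(const unsigned char *)f;
}

/* Not covered by any of the three cases: reading a float object
   through an int lvalue, e.g. *(int *)f. That is the kind of access
   the strict-aliasing rule forbids. */
```

Calling `overwrite_and_read(&s, &s.a)` is the interesting one: the two pointers have different types, yet they are allowed to refer to overlapping storage.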
I'm reading Per Vognsen's re-implementation of Sean Barrett's stretchy buffers. It uses a malloc-like scheme where a buffer has associated metadata stored just before it.
typedef struct BufHdr {
    size_t len;
    size_t cap;
    char buf[];
} BufHdr;
The metadata is accessed by subtracting an offset from a pointer b:
#define buf__hdr(b) ((BufHdr *)((char *)(b) - offsetof(BufHdr, buf)))
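As a sanity check of the pointer arithmetic, here is a small sketch that allocates a header plus payload and recovers the header from the payload pointer. `demo_alloc` and `demo_free` are hypothetical helpers I wrote for this illustration; they are not part of the original code.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct BufHdr {
    size_t len;
    size_t cap;
    char buf[];
} BufHdr;

#define buf__hdr(b) ((BufHdr *)((char *)(b) - offsetof(BufHdr, buf)))

/* Hypothetical helper: allocate a header followed by `cap` payload
   bytes and hand out only the payload pointer, like malloc would. */
void *demo_alloc(size_t cap) {
    BufHdr *hdr = malloc(sizeof(BufHdr) + cap);
    if (!hdr) {
        return NULL;
    }
    hdr->len = 0;
    hdr->cap = cap;
    return hdr->buf;
}

/* Hypothetical helper: step back to the header and free the whole block. */
void demo_free(void *b) {
    if (b) {
        free(buf__hdr(b));
    }
}
```

Given `void *b = demo_alloc(32);`, the expression `buf__hdr(b)->cap` reads back 32, because `buf__hdr` undoes exactly the offset that `hdr->buf` added.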
Here's a somewhat simplified version of the original buf__grow function, which extends the buffer and returns buf as a void*.
void *buf__grow(const void *buf, size_t new_size) {
    // ...
    BufHdr *new_hdr; // (1)
    if (buf) {
        new_hdr = xrealloc(buf__hdr(buf), new_size);
    } else {
        new_hdr = xmalloc(new_size);
        new_hdr->len = 0;
    }
    new_hdr->cap = new_cap;
    return new_hdr->buf;
}
Usage example (buf__grow is hidden behind macros, but here it's in the open for clarity):
int *ip = NULL;
ip = buf__grow(ip, 16);
ip = buf__grow(ip, 32);
After these calls, we have a memory area of 32 + sizeof(BufHdr) bytes on the heap. We have ip pointing into that area, and we have new_hdr and the result of buf__hdr pointing into it at various points in the execution.
Questions
Is there a strict-aliasing violation here? AFAICT, ip and some variable of type BufHdr shouldn't be allowed to point to the same memory.
Or does the fact that buf__hdr does not create an lvalue mean that it isn't aliasing the same memory as ip? And does the fact that new_hdr is contained within buf__grow, where ip isn't "live", mean that those aren't aliasing either?
If new_hdr were in global scope, would that change things?
Does the C compiler track the type of storage, or only the types of variables? If there is storage, such as the memory area allocated in buf__grow, that doesn't have any variable pointing to it, then what is the type of that storage? Are we free to reinterpret that storage as long as there is no variable associated with that memory?
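To make the last question concrete, here is a sketch of my current reading of the "effective type" wording in C11 6.5p6: allocated storage has no declared type, and a store through a non-character lvalue gives it the type of that lvalue. I may be misreading the standard, so treat this as an assumption rather than a fact. The function name is made up for illustration.

```c
#include <stdlib.h>

/* Stores an int into fresh heap storage, then reuses the same bytes
   for a float. Returns the float plus the int that was read earlier,
   or -1.0f if the allocation fails. */
float reuse_storage(void) {
    size_t n = sizeof(int) > sizeof(float) ? sizeof(int) : sizeof(float);
    void *mem = malloc(n);
    if (!mem) {
        return -1.0f;
    }

    int *ip = mem;
    *ip = 42;          /* the store gives the storage effective type int */
    int seen = *ip;    /* read through a matching lvalue type */

    float *fp = mem;
    *fp = 1.5f;        /* a new store changes the effective type to float */
    float result = *fp + (float)seen;

    /* Reading *ip at this point would access float storage through an
       int lvalue, which is what the aliasing rule forbids. */

    free(mem);
    return result;
}
```

If this reading is right, what matters is the type of the last store to the storage and the type of the lvalue used for each access, not which pointer variables happen to exist.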
buf is never accessed, but is only used to calculate addresses; there's no object of type char there. Still, it would be somewhat cleaner to never declare buf as a member, but simply use new_hdr+1 as the starting address. (This is about C rules; C++ ones may be different.) – Huff
buf might be a problem. malloc()-like functions are expected to return a max-aligned object, and buf should be such a one. – Hedger
buf is declared char[], but ip, where it is assigned, is int*. – Brawl
ip is completely irrelevant, as there is no lvalue access with type int* anywhere. You can declare ip as bananas_t and it wouldn't matter. – Deeann
struct X {char a; char b[]}, then struct X* x = malloc(sizeof(struct X) + sizeof(int)); int * ip = (int*)x->b, might not generate a correctly aligned pointer to int, because there might not be the necessary padding between x->a and x->b. – Brawl
int. – Deeann
bananas_t requires a 128 bit alignment; malloc()-like functions are expected to return memory with at least such an alignment. But buf[] within the object might be only 64 bit aligned (when e.g. size_t is 32 bit), and because the object itself is at least 128 bit aligned (allocated by malloc()), buf[] is misaligned. – Hedger
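The alignment concern raised in the comments can be checked for a given platform. The following is only a sketch, assuming a C11 compiler (`max_align_t`, `_Alignof`, `_Alignas`); `PaddedHdr` and both function names are hypothetical and not part of the original code.

```c
#include <stddef.h>

typedef struct BufHdr {
    size_t len;
    size_t cap;
    char buf[];
} BufHdr;

/* Returns 1 if buf starts at a multiple of the strictest fundamental
   alignment, given that the header itself came from malloc. The result
   depends on the platform's size_t and max_align_t. */
int buf_payload_is_max_aligned(void) {
    return offsetof(BufHdr, buf) % _Alignof(max_align_t) == 0;
}

/* Hypothetical variant: ask the compiler to pad the header so that
   the payload is max-aligned on every platform. */
typedef struct PaddedHdr {
    size_t len;
    size_t cap;
    _Alignas(max_align_t) char buf[];
} PaddedHdr;

int padded_payload_is_max_aligned(void) {
    return offsetof(PaddedHdr, buf) % _Alignof(max_align_t) == 0;
}
```

On a platform where `size_t` is 4 bytes and `max_align_t` needs 16-byte alignment, the first check would fail, which is the situation described in the last comment; the padded variant trades a few bytes of header for a payload that is aligned like a fresh malloc result.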