I've been trying to understand a particular aspect of strict aliasing recently, and I think I have made the smallest possible interesting piece of code. (Interesting for me, that is!)
Update: Based on the answers so far, it's clear I need to clarify the question. The first listing here is "obviously" defined behaviour, from a certain point of view. The real issue is to follow this logic through to custom allocators and custom memory pools. If I malloc a large block of memory at the start, and then write my own my_malloc and my_free that use that single large block, then is it UB on the grounds that it doesn't use the official free?
I'll stick with C, somewhat arbitrarily. I get the impression that it is easier to talk about, and that the C standard is a bit clearer.
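    #include <stdint.h>
    #include <stdlib.h>

    int main(void) {
        /* First allocation: written as a 32-bit integer. */
        uint32_t *p32 = malloc(4);
        *p32 = 0;
        free(p32);

        /* Second allocation: may reuse the same address, written as 16-bit integers. */
        uint16_t *p16 = malloc(4);
        p16[0] = 7;
        p16[1] = 7;
        free(p16);
    }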
It is possible that the second malloc will return the same address as the first malloc (because it was freed in between). That means that it is accessing the same memory with two different types, which violates strict aliasing. So surely the above is undefined behaviour (UB)?
(For simplicity, let's assume that malloc always succeeds. I could add checks for the return value of malloc, but that would just clutter the question.)
If it's not UB, why? Is there an explicit exception in the standard, which says that malloc and free (and calloc/realloc/...) are allowed to "delete" the type associated with a particular address, allowing further accesses to "imprint" a new type on the address?
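To make the "imprint" idea concrete, here is a minimal sketch without any free in between. It relies on my reading of C11 6.5p6, which says that an allocated object has no declared type and that a store through a non-character lvalue sets its effective type for that store and for later reads. The function name retype_allocated_block is made up for illustration, and the sketch says nothing about how any particular compiler treats the pattern.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: writes one allocated block through two different types.
 * Returns 14 on success, or -1 if malloc fails. */
static int retype_allocated_block(void) {
    void *block = malloc(4);
    if (block == NULL) {
        return -1;
    }

    /* First store: the block's effective type becomes uint32_t. */
    uint32_t *p32 = block;
    *p32 = 0;

    /* New stores through uint16_t: on my reading of 6.5p6 these give the
     * storage a new effective type, because the block has no declared type. */
    uint16_t *p16 = block;
    p16[0] = 7;
    p16[1] = 7;

    /* Reads go through the same type that performed the most recent stores. */
    int sum = p16[0] + p16[1];

    free(block);
    return sum;
}
```

The point of the sketch is that every read uses the type of the latest write; p32 is never dereferenced again after the uint16_t stores.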
If malloc/free are special, then does that mean I cannot legally write my own allocator which clones the behaviour of malloc? I'm sure there are plenty of projects out there with custom allocators - are they all UB?
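For reference, this is the kind of allocator I have in mind: a minimal fixed-size-block pool carved out of one large malloc. The names pool_init, pool_destroy, BLOCK_SIZE and BLOCK_COUNT are made up for this sketch, as is the free-list design; only my_malloc and my_free come from the question. It illustrates the pattern being asked about and does not settle whether the type change on reuse is defined.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_SIZE  16u  /* hypothetical fixed block size, in bytes */
#define BLOCK_COUNT 64u  /* hypothetical number of blocks in the pool */

/* A free block stores the link to the next free block in its own bytes. */
typedef struct free_node {
    struct free_node *next;
} free_node;

static unsigned char *pool;   /* the single large block obtained from malloc */
static free_node *free_list;  /* blocks currently available for reuse */

/* Returns 0 on success, -1 if the underlying malloc fails. */
static int pool_init(void) {
    pool = malloc((size_t)BLOCK_SIZE * BLOCK_COUNT);
    if (pool == NULL) {
        return -1;
    }
    free_list = NULL;
    for (size_t i = 0; i < BLOCK_COUNT; i++) {
        free_node *node = (free_node *)(pool + i * BLOCK_SIZE);
        node->next = free_list;
        free_list = node;
    }
    return 0;
}

/* Hands out one block, or NULL if the request is too big or the pool is empty. */
static void *my_malloc(size_t size) {
    if (size > BLOCK_SIZE || free_list == NULL) {
        return NULL;
    }
    free_node *node = free_list;
    free_list = node->next;
    return node;
}

/* Pushes the block back on the free list; the official free is never called. */
static void my_free(void *ptr) {
    if (ptr == NULL) {
        return;
    }
    free_node *node = ptr;
    node->next = free_list;  /* this store itself writes a third type into the block */
    free_list = node;
}

/* Releases the whole pool in one go. */
static void pool_destroy(void) {
    free(pool);
    pool = NULL;
    free_list = NULL;
}
```

Because the free list is last-in-first-out, a my_malloc that directly follows a my_free returns the same address, so the uint32_t-then-uint16_t sequence from the first listing is guaranteed to hit the same bytes here rather than merely being allowed to.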
Custom allocators
If we decide, therefore, that such custom allocators must be defined behaviour, then it means the strict aliasing rule is essentially "incorrect". I would update it to say that it is possible to write (not read) through a pointer of a different ('new') type as long as you don't use pointers of the old type any more. This wording could be quietly-ish changed if it were confirmed that all compilers have essentially obeyed this new rule anyway.
I get the impression that gcc and clang essentially respect my (aggressive) reinterpretation. If so, perhaps the standards should be edited accordingly? My 'evidence' regarding gcc and clang is difficult to describe. It uses memmove with an identical source and destination (which is therefore optimized out) in such a way that it blocks any undesirable optimizations, because it tells the compiler that future reads through the destination pointer will alias the bit pattern that was previously written through the source pointer. I was able to block the undesirable interpretations accordingly. But I guess this isn't really 'evidence'; maybe I was just lucky. UB clearly means that the compiler is also allowed to give me misleading results!
( ... unless, of course, there is another rule that makes memcpy and memmove special in the same way that malloc may be special. That they are allowed to change the type to the type of the destination pointer. That would be consistent with my 'evidence'. )
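On the memcpy/memmove point, C11 6.5p6 does single these functions out, though as I read it the copied-into allocated object takes the effective type of the source object rather than the type of the destination pointer. Here is a minimal sketch of the case that wording seems to cover; the function name copy_into_block is made up for illustration.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: overwrites an allocated block with memcpy.
 * Returns 14 on success, or -1 if malloc fails. */
static int copy_into_block(void) {
    void *block = malloc(4);
    if (block == NULL) {
        return -1;
    }

    /* The block is first written as a uint32_t. */
    *(uint32_t *)block = 0;

    /* The source has a declared type (array of uint16_t). On my reading of
     * 6.5p6, the copy carries that effective type over to the block. */
    const uint16_t src[2] = {7, 7};
    memcpy(block, src, sizeof src);

    /* Reading as uint16_t now matches the effective type set by the copy. */
    const uint16_t *p16 = block;
    int sum = p16[0] + p16[1];

    free(block);
    return sum;
}
```

This differs from the identical-source-and-destination memmove described above: there the source is the block itself, so on the same reading the copy would carry over whatever effective type the block already had.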
Anyway, I'm rambling. I guess a very short answer would be: "Yes, malloc (and friends) are special. Custom allocators are not special and are therefore UB, unless they maintain separate memory pools for each type. And, further, see example X for an extreme piece of code where compiler Y does undesirable stuff precisely because compiler Y is very strict in this regard and is contradicting this reinterpretation."
Follow up: what about non-malloced memory? Does the same thing apply? (Local variables, static variables, ...)
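For the non-malloced case, the one thing I'm confident of is that a local or static variable has a declared type, and that type is its effective type no matter what is stored into it. A minimal sketch of reinterpreting such an object without reading it through a pointer of another type is to copy its bytes into a second object of the wanted type; the function name split_local_word is made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterprets a local uint32_t as two uint16_t values
 * by copying bytes, instead of casting &word to uint16_t *. */
static int split_local_word(void) {
    uint32_t word = 0x00070007u;  /* both 16-bit halves hold 7 on any byte order */
    uint16_t halves[2];

    /* halves has a declared type, so reading it as uint16_t is always fine. */
    memcpy(halves, &word, sizeof halves);

    return halves[0] + halves[1];
}
```

The value 0x00070007 is chosen so that the result does not depend on endianness.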
[...] freed pointer. – Valenza
[...] malloc I write to the data before trying to read it. Which is surely exactly what everyone does with malloc all the time! – Valenza
[...] free allowed to end the lifetime of an object? Is my_free, which reuses blocks of memory for a single large memory block, also allowed to end the lifetime of the "same object"? – Valenza
[...] free ends the object lifetime, so this is very much clear-cut. It is a bit more complicated with custom allocators that reuse memory. The standard does imply that storage of different objects may overlap. This happens in unions for example. What it forbids for such overlapping objects is storing a value through an lvalue of one type and then retrieving a value through an lvalue of another, incompatible type. Thus custom allocators are allowed, you just have to stop using an object after custom-deallocating it (exactly like with built-in allocation functions). – Oxheart
[...] struct SomeStruct s1, *p = malloc(sizeof *p);, the statement s1 = *p; will have well-defined behavior even though the contents of *p are indeterminate, because structure types are forbidden from having trap representations. There is no standard-defined mechanism via which a custom allocator can achieve similar semantics without having to physically overwrite all memory that gets recycled. – Behest