C memory allocator and strict aliasing
I

4

11

Even after reading quite a bit about the strict-aliasing rules I am still confused. As far as I understand them, it is impossible to implement a sane memory allocator that follows these rules, because malloc could never reuse freed memory: the same memory could be used to store a different type at each allocation.

Clearly this cannot be right. What am I missing? How do you implement an allocator (or a memory pool) that follows strict-aliasing?

Thanks.

Edit: Let me clarify my question with a stupid simple example:

#include <stdlib.h>

// s == 0 frees the pool
void *my_custom_allocator(size_t s) {
    static void *pool = NULL;   // a static initializer must be a constant, so allocate lazily
    static int in_use = 0;
    if( s == 0 ) {              // handle the release request before the in_use check
        in_use = 0;
        return NULL;
    }
    if( pool == NULL ) pool = malloc(1000);
    if( pool == NULL || in_use || s > 1000 ) return NULL;
    in_use = 1;
    return pool;
}

int main(void) {
    int *i = my_custom_allocator(sizeof(int));
    //use int
    my_custom_allocator(0);
    float *f = my_custom_allocator(sizeof(float)); //not allowed...
}
Incoming answered 7/10, 2011 at 12:15 Comment(4)
How is paxdiablo's answer an answer? – Parachute
@curiousguy, I agree! This is a fascinating issue. The standard talks about " .. explicitly deallocated" (in paxdiablo's quote from the standard). But, if free has a monopoly on "deallocating", then that means free is very special. The code in the question does not call free, therefore this special behaviour is not available. Therefore, the question is: does "deallocation" occur in the questioner's code - i.e. is the "pseudo-deallocation" in the questioner's code sufficient to bring the object lifetime to an end and to allow a new type to be written to the existing location? – Nathanielnathanil
@AaronMcDaid: The idea that free() should have a monopoly on such behavior is particularly absurd on freestanding implementations which aren't required to support that function. – Sharasharai
The idea of the free (delete for C++) monopoly seems relatively new. It is a very silly idea that was never floated in these committees before. – Parachute
N
11

I don't think you're right. Even the strictest of strict aliasing rules would only count when the memory is actually allocated for a purpose. Once an allocated block has been released back to the heap with free, there should be no references to it and it can be given out again by malloc.

And the void* returned by malloc is not subject to the strict aliasing rule since the standard explicitly states that a void pointer can be cast into any other sort of pointer (and back again). C99 section 7.20.3 states:

The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object and then used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated).


In terms of your update (the example) where you don't actually return the memory back to the heap, I think your confusion arises because allocated objects are treated specially. If you refer to 6.5/6 of C99, you see:

The effective type of an object for an access to its stored value is the declared type of the object, if any (footnote 75: Allocated objects have no declared type).

Re-read that footnote, it's important.

If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value.

If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one.

For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.

In other words, the allocated block contents will become the type of the data item that you put in there.

If you put a float in there, you should only access it as a float (or compatible type). If you put in an int, you should only process it as an int (or compatible type).

The one thing you shouldn't do is to put a specific type of variable into that memory and then try to treat it as a different type - one reason for this being that objects are allowed to have trap representations (which cause undefined behaviour) and these representations may occur due to treating the same object as different types.

So, if you were to store an int in there before the deallocation in your code, then reallocate it as a float pointer, you should not try to use the float until you've actually put one in there. Up until that point, the type of the allocated memory is not yet float.
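As a rough illustration of the paragraphs above, here is a minimal sketch that reuses one malloc'ed block first for an int and then for a float. The function name reuse_block_demo is made up for this example, and the comments follow the usual reading of the quoted 6.5/6 text rather than any official interpretation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical demo: one allocated block holds an int, then a float. */
int reuse_block_demo(void) {
    size_t size = sizeof(int) > sizeof(float) ? sizeof(int) : sizeof(float);
    void *block = malloc(size);   /* allocated storage has no declared type */
    if (block == NULL) return -1;

    int *ip = block;
    *ip = 42;                     /* store: the effective type becomes int */
    int seen = *ip;               /* read as int: fine */

    float *fp = block;            /* converting the pointer accesses nothing */
    *fp = 1.5f;                   /* store: the effective type is now float */
    float f = *fp;                /* read as float: fine */
    /* Reading *ip at this point would access a float through an int lvalue. */

    assert(seen == 42);
    assert(f == 1.5f);
    free(block);
    return 0;
}
```

The point of the sketch is the order of operations: each read uses the type of the most recent store, and the float is only read after one has been written.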

Nonferrous answered 7/10, 2011 at 12:19 Comment(6)
But doesn't that mean the compiler must have special knowledge of the free-call so that a pointer [after a free and a new malloc] might alias to the formerly freed pointer? – Incoming
@Sebastian, I've updated the answer with the relevant standards info. Allocated memory is treated specially. – Nonferrous
@Sebastian: accessing anything through the formerly-freed pointer is UB anyway, so the compiler is free to assume what it likes in terms of aliasing. For example, int *ptr = malloc(sizeof(int)); *ptr = 1; free(ptr); long *ptr2 = malloc(sizeof(long)); *ptr2 = 2; if (ptr == ptr2) *ptr; has UB in the case where the second allocation just so happens to have equal address to the first. So the compiler doesn't have to track the fact that ptr has been freed, it can safely just continue to apply strict aliasing rules, and assume that it is not aliased. – Bluh
That said, as long as the compiler treats malloc and free as calls to unknown code in another TU, then none of the usual optimizations that rely on strict aliasing are going to be applied anyway, since for all the compiler knows ptr is aliased somewhere else in the program, and malloc and free modify its contents. – Bluh
"since objects are allowed to have trap representations" No. This issue has nothing to do with trap representation. Even if you know that there isn't a trap representation, you are still not allowed to break the aliasing rules. – Parachute
curiousguy, I stated "... one reason" was to do with trap representations, I didn't say it was the only reason. For example, allowing aliasing will remove quite a bit of power for the compiler to optimise, but the trap representations are also a limit on treating memory content as a different type to what was stored there. – Nonferrous
W
3

I post this answer to test my understanding of strict aliasing:

Strict aliasing matters only when actual reads and writes happen. Just as using multiple members of different types of a union simultaneously is undefined behavior, the same applies to pointers: you cannot use pointers of different types to access the same memory, for the same reason you cannot do it with a union.

If you consider only one of the pointers as live, then it's not a problem.

  • So if you write through an int* and read through an int*, it is OK.
  • If you write using an int* and read through a float*, it is bad.
  • If you write using an int* and later you write again using float*, then read it out using a float*, then it's OK.
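When the goal really is to look at the bytes of one type as another (the "bad" case in the list), a commonly used alternative is to memcpy the bytes into an object that has the target type as its declared type. The sketch below is only an illustration: the helper names are invented here, and it assumes float and uint32_t have the same size, which the static assertion checks.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

/* Hypothetical helper: copy a float's bytes into a declared uint32_t object. */
uint32_t float_to_bits(float value) {
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Hypothetical helper: the reverse copy, into a declared float object. */
float bits_to_float(uint32_t bits) {
    float value;
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* Round trip: the same bytes go out and come back, so the value is unchanged. */
void punning_demo(void) {
    assert(bits_to_float(float_to_bits(1.5f)) == 1.5f);
}
```

Both destinations are ordinary local variables with a declared type, so no object is ever read through an lvalue of a different type.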

In non-trivial allocators you have a large buffer, which you typically hold through a char*. You do some pointer arithmetic to calculate the address you want to hand out, and then dereference it through the allocator's header structs. It doesn't matter which pointer types you use for the arithmetic; only the pointer type you dereference the area through matters. Since an allocator always does that via its own header struct, it won't trigger undefined behavior.
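To make that description concrete, here is a minimal sketch of a fixed-size-chunk pool. Every name in it (struct pool, struct free_node, pool_init, pool_alloc, pool_free, pool_destroy, CHUNK_SIZE, CHUNK_COUNT) is hypothetical, and the backing buffer comes from malloc so that it has no declared type. It follows the reading described above, where the allocator only touches free chunks through its own header type; it is not a claim about how any real allocator is written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header the allocator stores in every free chunk. */
struct free_node { struct free_node *next; };

struct pool {
    unsigned char *buffer;        /* byte pointer, used only for arithmetic */
    struct free_node *free_list;
};

enum { CHUNK_SIZE = 64, CHUNK_COUNT = 4 };
_Static_assert(CHUNK_SIZE >= sizeof(struct free_node), "chunk must hold a header");
_Static_assert(CHUNK_SIZE % _Alignof(max_align_t) == 0, "chunks must stay aligned");

int pool_init(struct pool *p) {
    p->free_list = NULL;
    p->buffer = malloc((size_t)CHUNK_SIZE * CHUNK_COUNT);
    if (p->buffer == NULL) return -1;
    for (size_t i = 0; i < CHUNK_COUNT; i++) {
        /* Address computed on bytes; the store goes through the header type. */
        struct free_node *node = (struct free_node *)(p->buffer + i * CHUNK_SIZE);
        node->next = p->free_list;
        p->free_list = node;
    }
    return 0;
}

void *pool_alloc(struct pool *p) {
    struct free_node *node = p->free_list;
    if (node == NULL) return NULL;
    p->free_list = node->next;    /* read matches the last store to this chunk */
    return node;                  /* the caller's first store picks the new type */
}

void pool_free(struct pool *p, void *chunk) {
    assert(chunk != NULL);
    struct free_node *node = chunk;
    node->next = p->free_list;    /* store: the chunk holds a header again */
    p->free_list = node;
}

void pool_destroy(struct pool *p) {
    free(p->buffer);
    p->buffer = NULL;
    p->free_list = NULL;
}
```

A caller could take a chunk, store an int in it, hand it back with pool_free, and later receive the same chunk and store a float in it. In each phase the chunk is only read through the type that was most recently written.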

Wileywilfong answered 28/11, 2015 at 17:52 Comment(4)
There's some other nastiness too. Even if "int" and "long" have the same representation, using memcpy to copy data from a "long" to allocated storage and then reading the memcpy'd memory as an "int" will yield Undefined Behavior. – Sharasharai
@Sharasharai long is a 64 bit integer on 64 bit Linux while int is 32 bit. – Wileywilfong
Substitute "long" and "long long" then. My point is that using memcpy to move data between malloc-allocated arrays of different types is UB even when the types have the same representation. – Sharasharai
Union aliasing is allowed in C – Palliasse
S
3

Standard C does not define any efficient means by which a user-written memory allocator can safely take a region of memory that has been used as one type and make it safely available as another. Structures in C are guaranteed not to have trap representations, a guarantee which would have little purpose if it didn't make it safe to copy structures with fields containing Indeterminate Value.

The difficulty is that given a structure and function like:

struct someStruct {unsigned char count; unsigned char dat[7]; };
void useStruct(struct someStruct s); // Pass by value

it should be possible to invoke it like:

struct someStruct *p = malloc(sizeof *p);
p->count = 1;
p->dat[0] = 42;
useStruct(*p);

without having to write all of the fields of the allocated structure first. Although malloc will guarantee that the allocation block it returns may be used by any type, there is no way for user-written memory-management functions to enable such reuse of storage without either clearing it in bytewise fashion (using a loop or memset) or else using free() and malloc() to recycle the storage.
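As a sketch of the bytewise-clearing option mentioned above: the helper recycle_storage and the by-value function sum_first are invented for this example (sum_first stands in for useStruct so the result can be checked). Whether clearing is strictly required by the Standard is the point under discussion in this thread; the code only shows what the approach looks like.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct someStruct { unsigned char count; unsigned char dat[7]; };

/* Hypothetical stand-in for useStruct: receives the struct by value. */
unsigned sum_first(struct someStruct s) {
    return (unsigned)s.count + s.dat[0];
}

/* Hypothetical helper: clear every byte before the storage is reused. */
void *recycle_storage(void *storage, size_t size) {
    assert(storage != NULL);
    return memset(storage, 0, size);
}

unsigned recycle_demo(void) {
    size_t size = sizeof(long) > sizeof(struct someStruct)
                      ? sizeof(long) : sizeof(struct someStruct);
    void *block = malloc(size);
    if (block == NULL) return 0;

    long *lp = block;
    *lp = 123456L;                       /* first use of the storage: a long */

    struct someStruct *p = recycle_storage(block, size);
    p->count = 1;
    p->dat[0] = 42;
    unsigned result = sum_first(*p);     /* dat[1..6] are zero, not leftovers */

    free(block);
    return result;
}
```

The cost the answer points at is visible here: the memset touches every byte of the chunk even though the caller only cares about two fields.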

Sharasharai answered 8/1, 2017 at 22:59 Comment(0)
K
0

Within the allocator itself, refer to your memory buffers only as (void *). When that module is optimized, the compiler shouldn't apply strict-aliasing optimizations (because the module has no idea what types are stored there). When that object gets linked into the rest of the system, it should be left well enough alone.

Hope this helps!

Kibitka answered 7/10, 2011 at 12:22 Comment(1)
Things could still break, especially with fixed-size-chunk allocators, if whole-program optimization can see into them but is willfully blind to pointer type conversions. – Sharasharai
