malloc-free-malloc and strict-aliasing
Asked Answered
V

3

6

I've been trying to understand a particular aspect of strict aliasing recently, and I think I have made the smallest possible interesting piece of code. (Interesting for me, that is!)

Update: Based on the answers so far, it's clear I need to clarify the question. The first listing here is "obviously" defined behaviour, from a certain point of view. The real issue is to follow this logic through to custom allocators and custom memory pools. If I malloc a large block of memory at the start, and then write my own my_malloc and my_free that uses that single large block, then is it UB on the grounds that it doesn't use the official free?

I'll stick with C, somewhat arbitrarily. I get the impression it is easier to talk about, that the C standard is a bit clearer.

int main() {
    uint32_t *p32 = malloc(4);
    *p32 = 0;
    free(p32);

    uint16_t *p16 = malloc(4);
    p16[0] = 7;
    p16[1] = 7;
    free(p16);
}

It is possible that the second malloc will return the same address as the first malloc (because it was freed in between). That means that it is accessing the same memory with two different types, which violates strict aliasing. So surely the above is undefined behaviour (UB)?

(For simplicity, let's assume the malloc always succeeds. I could add in checks for the return value of malloc, but that would just clutter the question)

If it's not UB, why? Is there an explicit exception in the standard, which says that malloc and free (and calloc/realloc/...) are allowed to "delete" the type associated with a particular address, allowing further accesses to "imprint" a new type on the address?

If malloc/free are special, then does that mean I cannot legally write my own allocator which clones the behaviour of malloc? I'm sure there are plenty of projects out there with custom allocators - are they all UB?

Custom allocators

If we decide, therefore, that such custom allocators must be defined behaviour, then it means the strict aliasing rule is essentially "incorrect". I would update it to say that it is possible to write (not read) through a pointer of a different ('new') type as long as you don't use pointers of the old type any more. This wording could be quietly-ish changed if it was confirmed that all compilers have essentially obeyed this new rule anyway.

I get the impression that gcc and clang essentially respect my (aggressive) reinterpretation. If so, perhaps the standards should be edited accordingly? My 'evidence' regarding gcc and clang is difficult to describe, it uses memmove with an identical source and destination (which is therefore optimized out) in such a way that it blocks any undesirable optimizations because it tells the compiler that future reads through the destination pointer will alias the bit pattern that was previously written through the source pointer. I was able to block the undesirable interpretations accordingly. But I guess this isn't really 'evidence', maybe I was just lucky. UB clearly means that the compiler is also allowed to give me misleading results!


( ... unless, of course, there is another rule that makes memcpy and memmove special in the same way that malloc may be special. That they are allowed to change the type to the type of the destination pointer. That would be consistent with my 'evidence'. )


Anyway, I'm rambling. I guess a very short answer would be: "Yes, malloc (and friends) are special. Custom allocators are not special and are therefore UB, unless they maintain separate memory pools for each type. And, further, see example X for an extreme piece of code where compiler Y does undesirable stuff precisely because compiler Y is very strict in this regard and is contradicting this reinterpretation."


Follow up: what about non-malloced memory? Does the same thing apply. (Local variables, static variables, ...)

Valenza answered 21/7, 2015 at 9:4 Comment(11)
Please pick a language. C and C++ have totally different aliasing rules.Cauterant
OK, I'm sticking with C. Let's hope that others aren't tempted to add C++ :-). If this question leads to an interesting discussion, then maybe I'll ask again about C++.Valenza
@Dayalrai, I'm not accessing a freed pointer.Valenza
There's no aliasing issues there, the object behind the first malloc is totally "gone" when you free it.Southeaster
This behavior is probably as undefined as calling malloc alone, which returns an array of uninitialized values.Crackpot
@YvesDaoust, after each malloc I write to the data before trying to read it. Which is surely exactly what everyone does with malloc all the time!Valenza
If you claim the rules are violated, quote the exact clause that is violated. "Same memory" is a vague notion that is hard to define precisely. The standard probably talks about "same object".Oxheart
@n.m., good point. But then the question really is: why is free allowed to end the lifetime of an object? Is my_free, which reuses blocks of memory for a single large memory block, also allowed to end the lifetime of the "same object"?Valenza
@AaronMcDaid: that's what I mean. You free the first pointer and will stop using it, so the undefined behavior is purely speculative.Crackpot
The standard says that free ends the object lifetime, so this is very much clear-cut. It is a bit more complicated with custom allocators that reuse memory. The standard does imply that storage of different objects may overlap. This happens in unions for example. What it forbids for such overlapping objects is storing a value through an lvalue of one type and then retrieving a value through an lvalue of another, incompatible type. Thus custom allocators are allowed, you just have to stop using an object after custom-deallocating it (exactly like with built-in allocation functions).Oxheart
@n.m.: Given struct SomeStruct s1, *p = malloc(sizeof *p);, the statement struct SomeStruct = *p; will have well-defined behavior even though the contents of *p are indeterminate, because structure types are forbidden from having trap representations. There is no standard-defined mechanism via which a custom allocator can achieve similar semantics without having to physically overwrite all memory that gets recycled.Behest
O
5

Here are the C99 strict aliasing rules in (what I hope is) their entirety:

6.5
(6) The effective type of an object for an access to its stored value is the declared type of the object, if any. If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.

(7) An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.

These two clauses together prohibit one specific case, storing a value via an lvalue of type X and then subsequently retrieving a value via an lvalue of type Y incompatible with X.

So, as I read the standard, even this usage is perfectly OK (assuming 4 bytes are enough to store either an uint32_t or two uint16_t).

int main() {
    uint32_t *p32 = malloc(4);
    *p32 = 0;
    /* do not do this: free(p32); */

    /* do not do this: uint16_t *p16 = malloc(4); */
    /* do this instead: */
    uint16_t *p16 = (uint16_t *)p32;

    p16[0] = 7;
    p16[1] = 7;
    free(p16);
}

There's no rule that prohibits storing an uint32_t and then subsequently storing an uint16_t at the same address, so we're perfectly OK.

Thus there's nothing that would prohibit writing a fully compliant pool allocator.

Oxheart answered 21/7, 2015 at 10:45 Comment(5)
6.5.6 seems very relevant, but I'm still a bit nervous about this example. All of those sentences talk about storing "into an object having no declared type". Are you saying that p32 (or, more precisely, *p32) has "no declared type"? If so, how can this be true? Surely, it's type is uint32_t*? Unless, of course, we're saying that anything from malloc has 'no declared type'?Valenza
The standard says in the footnote #73 that allocated objects have no declared type.Oxheart
To clarify, anything malloc-ed (and maybe calloc-ed) has "no declared type"? Also, there is no other way to create objects of "no declared type"? If so, then any custom allocators must be built on malloc/calloc-ed data; as using a global static array of char would have a declared type and would not be appropriate?Valenza
This is an interesting question that I'm not qualified to answer. The footnote is not normative so there may or may not be more examples of objects with no declared type.Oxheart
@n.m.: If an implementation would allow a compiler to call a function void *get_new_object(size_t size); that it knew nothing about beyond the signature, the compiler would have to treat the pointer thus returned as identifying an object that might not have a declared type. There is, however, no requirement that implementations allow a program to call functions thy know nothing about.Behest
A
1

Your code is correct C and does not invoke undefined behaviour (except that you do not test malloc return value) because :

  • you allocate a bloc of memory, use it and free it
  • you allocate another bloc of memory, use it and free it.

What is undefined is whether p16 will receive same value as p32 had at a different time

What would be undefined behaviour, even if value was the same would be to access p32 after it has been freed. Examples :

int main() {
    uint32_t *p32 = malloc(4);
    *p32 = 0;
    free(p32);

    uint16_t *p16 = malloc(4);
    p16[0] = 7;
    p16[1] = 7;
    if (p16 == p32) {         // whether p16 and p32 are equal is undefined
        uint32_t x = *p32;  // accessing *p32 is explicitely UB
    }
    free(p16);
}

It is UB because you try to access a memory block after it has been freed. And even when it does point to a memory block, that memory block has been initialized as an array of uint16_t, using it as a pointer to another type is formally undefined behaviour.


Custom allocation (assuming a C99 conformant compiler) :

So you have a big chunk of memory and want to write custom free and malloc functions without UB. It is possible. Here I will not go to far into the hard part of management of allocated and free blocs, and just give hints.

  1. you will need to know what it the strictest alignement for the implementation. stdlib malloc knows it because 7.20.3 §1 of C99 language specification (draft n1256) says : The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object. It is generally 4 on 32 bits systems and 8 on 64 bits systems, but might be greater or lesser ...
  2. you memory pool must be a char array because 6.3.2.3 §7 says : A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned for the pointed-to type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer. When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object. : that means that provided you can deal with the alignement, a character array of correct size can be converted to a pointer to an arbitrary type (and is the base of malloc implementation)
  3. You must make your memory pool start at an address compatible with the system alignement :

    intptr_t orig_addr = chunk;
    int delta = orig_addr % alignment;
    char *pool = chunk + alignement - delta; /* pool in now aligned */
    

You now only have to return from your own pool addresses of blocs got as pool + n * alignement and converted to void * : 6.3.2.3 §1 says : A pointer to void may be converted to or from a pointer to any incomplete or object type. A pointer to any incomplete or object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.

It would be cleaner with C11, because C11 explicitely added _Alignas and alignof keywords to explictely deal with it and it would be better than the current hack. But it should work nonetheless

Limits :

I must admit that my interpretation of 6.3.2.3 §7 is that a pointer to a correctly aligned char array can be converted to a pointer of another type is not really neat and clear. Some may argue that what is said is just that if it originally pointed to the other type, it can be used as a char pointer. But as I start from a char pointer it is not explicitely allowed. That's true, but it is the best that can be done, it is not explicely marked as undefined behaviour ... and it is what malloc does under the hood.

As alignement is explicitely implementation dependant, you cannot create a general library usable on any implementation.

Adermin answered 21/7, 2015 at 9:24 Comment(7)
So, imagine I malloc a single large block at the start of my program, and then use my_malloc and my_free to implement a custom memory pool. I would then be using my_free as if it "deallocated" the memory. But I wouldn't actually be using the 'real' free, and therefore this would be a problem. I ask this because there are plenty of custom memory pool libraries out there that do exactly this. Are they all UB?Valenza
@AaronMcDaid 6.3.2.3 § 7 says : A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned68) for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer. When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object ... That means that you can implement a custom memory pool with defined behaviour. But that would need a new question.Adermin
Serge, thanks for the quote. I'm also looking at quotes in this answer to another question. So, let's assume I am careful with alignment. Your quote talks about converting to a character type. But the real question question is what happens if I then convert from the character type to a non-character type (different from the original non-character type). This is necessary for a fully functional memory pool.Valenza
Nice update. I see that alignment is an issue, but that can be resolved by using malloc initially to provide the backing store for our custom allocator. Also, I see that pointer arithmetic is an issue as we can't always be sure we've added the correct number of bytes to give a pointer that can be cast to the non-char type, but I think that can be resolved by treating the start-pointer (from malloc) as an array of the non-char type, and using this to skip forward and backwards accordingly to suitably-aligned pointers. Is this OK so far?Valenza
@AaronMcDaid : Using the initial pointer as an array of the non char type will be enough to mimic calloc but not malloc, since malloc does not know (and do not care for) the type that will be used. And the allocated ans freed blocks management could be tougher.Adermin
The way the aliasing rules are written, it is impossible to recycle memory for use as arbitrary unknown types without physically overwriting it, since recycled memory can be read as a structure type after a free/malloc cycle without UB (it's unclear whether the copy holds Indeterminate or Unspecified values, but the read would not invoke UB in any case) but reading recycled memory as the wrong structure type after using any user-code allocator would invoke UB unless the allocator overwrote the memory with a character type prior to recycling it.Behest
To be clear, p16 == p32 causes undefined behaviour (demons may fly out your nose etc.) , it is not just an "undefined whether they are equal"Emendation
R
0

The actual rules regarding aliasing are laid out in standard section 6.5, paragraph 7. Note the wording:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

(emphasis mine)

Aliasing includes the notion of objects, not just general memory. For malloc to have returned the same address on a second use requires the original object to have been deallocated. Even if it has the same address, it is not considered the same object. Any attempts to access the first object through dangling pointers leftover after free are UB for completely different reasons, so there's no aliasing because any continued use of the first pointer p32 is invalid anyway.

Rabassa answered 21/7, 2015 at 9:26 Comment(4)
"object to have been deallocated" That seems reasonable. But that's a slippery concept. Is the 'official' free the only function that has deallocation rights? If I write a custom my_malloc and my_free that provides access to a large memory pool that is malloc-ed once at the start of the program, then is that UB? Sure it is, if free has monopoly on deallocation. But that would mean all the custom memory pool code libraries are UB?Valenza
By a literal reading of 6.5, yeah, seems like pool allocators built on top of malloc that don't limit themselves to a fixed set of types per-pool are invalid. (NB that most people doing this probably are limiting the types of object allocated in each pool)Rabassa
as per the answer of n.m., I'm beginning to think that 6.5.7 should never be quoted without 6.5.6, as 6.5.6 allows the "effective type" of memory to change via writing to it, as long as the original allocation has "no declared type". I think. So 6.5.7 on its own is misleading as it implies that types never change between allocation and deallocation.Valenza
@Leushenko: I'm not sure why you would expect that people writing custom allocators would never use them in type-agnostic fashion. If I'm writing a program which uses five different structure types that are all 32 bytes each, is there any reason why I should want to use five separate pools to manage them?Behest

© 2022 - 2024 — McMap. All rights reserved.