What precautions should I take to make a memory pool that does not invoke undefined behavior?

Asked 22/9, 2016 at 13:35 Answered 22/9, 2016 at 16:18

c language-lawyer undefined-behavior c11 strict-aliasing

My initial problem is that I have, on a project, several objects which share a lifetime (i.e., once I free one of them, I'll free them all), then I wanted to allocate a single block of memory. I have arrays of three different object types, struct foo, void *, and char. At first I wanted to malloc() a block like this:

// +---------------+---------+-----------+---------+---------+
// | struct foo[n] | padding | void *[m] | padding | char[o] |
// +---------------+---------+-----------+---------+---------+

But then... how could I accomplish this without invoking undefined behavior? I.e., respecting type aliasing rules, aligment... How to properly calculate the memory block size, declare the memory block (with its effective type), and how to properly get pointers to all three sections within it portably?

(I do understand I could malloc() 3 blocks, which would result in three free(), but I'd like to know how to do it with a single block while still well-behaved.)

I'd like to extend my problem to a more general question: what precautions should one take to implement a memory pool for objects with arbitrary sizes and alignment while keeping the program well-behaved? (Assuming it is possible to implement it without invoking undefined behavior.)

Knar answered 22/9, 2016 at 13:35 Comment(6)

Why the vote to close? Could you explain how I could improve my question, or is it inherently inadequate? – Knar 22/9, 2016 at 13:45

I see 5 questions here - maybe narrow it. – Etiquette 22/9, 2016 at 14:27

@chux I see 1 question: how to correctly put objects of different types inside a malloced memory. Personally I think it's an interesting and on-topic question. – Soricine 22/9, 2016 at 14:29

@Soricine Agree on-topic and an interesting question. The "like to extend my problem" part goes beyond asking about malloced memory. IAC, not my DV, only offered an improvement idea. – Etiquette 22/9, 2016 at 15:54

I think questions that boil down to "How can I implement X" are generally too broad, and therfore off-topic. It would've been better to have your implementation and then try to have a question to address actual problems you've run into. – Ulberto 22/9, 2016 at 22:24

@ray, I'm not asking "how can I implement X", which I agree would be too broad. My question is regarding the rules defined in the standard: do they allow such things? Would it be possible to put different types of objects on the same memory block? And, if so, how to properly calculate offsets? I've never asked for a particular piece of code. – Knar 23/9, 2016 at 1:35

However hard you try, it's not possible to implement malloc in pure C.

You always end up violating strict aliasing at some point. For the avoidance of doubt, using a char buffer that doesn't have dynamic storage duration will also violate strict aliasing rules. You would also have to make sure that any pointer returned has an appropriate alignment.

If you're happy to tie yourself down to a particular platform then you may as well turn to that particular implementation of malloc for inspiration.

But why not consider writing a stub function which calls malloc and also builds up a table of other allocated objects? You could even implement some kind of observer / notify framework. Another starting point could be well-known garbage collectors that have been written in C.

Loo answered 22/9, 2016 at 13:39 Comment(12)

What about the particular case of three types of objects, in order, in a single memory block? Would it be possible to calculate the alignment and size, and get pointers to the data without violating strict aliasing? – Knar 22/9, 2016 at 13:42

Hum. Without looking at the code, it's hard to say. I'd be inclined to adopt the ideas in my new final paragraph. – Loo 22/9, 2016 at 13:45

What I was trying to achieve is allocating three arrays of three different types, of sizes n, m, and o, on a single malloc'd block, as I "drew" above. I can get sizeof(struct foo[n]) and sizeof(void *[m]), as well as their _Alignof, but then could I calculate an appropriate size for them together, and effectively declare the memory block such that I wouldn't violate strict aliasing while getting pointers to the start of the three arrays? Also, do you have any reference that argues that malloc cannot be implemented in pure compliant C? (Though, intuitively, I would agree.) – Knar 22/9, 2016 at 13:59

See #38515679 – Loo 22/9, 2016 at 14:4

using a char buffer will also violate alignment rules. The alignment isn't the only problem, and it might not even be a problem. Using an object of type char, that doesn't have allocated storage duration, as arbitrary memory with a non character type will always violate strict aliasing. – Disputatious 22/9, 2016 at 15:36

@2501: Since any access to a char[] is going to be done via ` char*

, I see no reason to believe that the authors of the Standard didn't expect that compilers would naturally treat accesses to a

char[]` (as distinct from accesses to standalone variables of type char) as being capable of aliasing anything else as a consequence of that, without need for a separate rule to allow such usage. – Laynelayney 22/9, 2016 at 18:59

@Laynelayney Since any access to a char[] is going to be done via ` char* No. This whole post is about implementing malloc. This carries an implication that it is going to be used with any type. Thus the char buffer is not going to be only accessed using char. – Disputatious 23/9, 2016 at 5:45

@2501: Aliasing rules are about conflicts between accesses to an object by its own type, and via other types. Accesses to a char[] as its own type would be via char*, and thus could not conflict with anything else. – Laynelayney 23/9, 2016 at 14:35

@Laynelayney You didn't address my comment at all: This whole post is about implementing malloc. This carries an implication that it is going to be used with any type. And it is more that just an implication, OP is directly asking for different, non-character, types: I have arrays of three different object types, struct foo, void *, and char – Disputatious 23/9, 2016 at 15:2

@2501: The rules of aliasing were written to cover the scenario where an object of type T is used as a T, then accessed using a U*, and then used again as a T. If the object of type T is never used as a type T, there is no aliasing conflict. In the case of an object declared as char[], any "natural" access would be via char*, so there woudl be no confict even if the object was accessed using its normal type. – Laynelayney 23/9, 2016 at 15:25

@Laynelayney The whole time you're talking about something different that the topic my first comment started. I have tried to explain it over and over again, but alas, you keep ignoring it. I don't understand why did you engage me in conversation if you decided to ignore my comments. This pointless 'debate' is over. – Disputatious 23/9, 2016 at 15:43

@2501: By my understanding, the issue is that while C89 Standard explicitly allowed things with any declared type to be accessed via char*, it did not include a reciprocal guarantee with regard to things accessed as char[]. Is that the point of contention? I do not believe the authors of the Standard intended the aliasing rule to forbid use of space within a char[] to hold things of arbitrary type (though alignment might pose problems) but expected compilers would naturally allow such uses whether mandated or not. – Laynelayney 23/9, 2016 at 17:6

As said in another answer, you can't reimplement malloc within C itself. The reason is that you can't generate objects that don't have an effective type without malloc.

But for your application you don't need this, you can use malloc or similar, see below, to have one large block of memory without problems.

If you have such a large block you must know how to place the objects inside this block. The major problem here is alignment, you have to place all your objects on boundaries that correspond to their minimal alignment requirements.

Since C11, the alignment of types can be obtained with the _Alignof operator, and overaligned memory can be requested with aligned_alloc.

Put all this together this reads:

compute the lcm of all alignments of your types
with aligned_alloc request enough memory that is aligned at that value
place all your objects on multiples of that alignment

Aliasing then isn't a problem if your are starting with a typeless object that you receive through a void* pointer. Each part of that large object has the effective type of with which you have written into, see my recent blog entry.

The relevant part of the C standard is 6.5 p6:

The effective type of an object for an access to its stored value is the declared type of the object, if any.87) If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.

Here the "object with no declared type" is an object (or subobject) allocated by malloc or similar. It clearly says that such objects can be written to with any type at any time and that this than changes the effective type to the desired.

Luella answered 22/9, 2016 at 15:21 Comment(8)

"Each part of that large object has the effective type of with which you have written into", so, if I understood correctly, once I know the correct offsets, I could save the malloc'd block into a character pointer, then get different areas of the memory block and give different effective types to it (e.g., char *mem = malloc(n); struct foo *f = mem; void **p = mem + x;)? Could you please clarify this? – Knar 22/9, 2016 at 15:57

The only way the specification for memcpy works is if malloc returns a single object, which may either be an array of some type, or a struct with a flexible array member. The language of the Standard does not recognize the idea that "an object" can have multiple types, nor that memcpy can work on areas of storage which are adjoining but are not part of the same "object". – Laynelayney 22/9, 2016 at 19:3

@paulotorrens, basically yes. But it is not the assignment of the pointers that gives the memory region the effective type but only copying an object into it, something as *f = toto or memcpy(f, &toto, sizeof *f) where toto is an object of type struct foo. – Luella 22/9, 2016 at 20:31

@supercat, the standard has subobjects of larger objects all over the place. What I am proposing here is nothing else than mapping an implicit struct type over a malloced object. – Luella 22/9, 2016 at 20:39

@JensGustedt: If one casts a pointer received from malloc to a structure type, one can then regard the storage as being either an instance of that structure type or an array thereof (depending upon size), and the rules for "memcpy" will work just fine. If one allocates a pointer "p", stores an "int" in the first 4 bytes and a "float" in the next four, the only way memcpy could copy the first eight bytes of the allocated region would be if both were considered to be part of the same array object. – Laynelayney 22/9, 2016 at 20:51

@supercat, no, you are wrong. Please see my edit. It is not the conversion of the pointer that provides the effective type, but writing into the object. And there is nothing that says that it can't be done with a different type later on. The only provision that the standard makes is that the type of a read must be consistent with the one of the last preceeding write. – Luella 22/9, 2016 at 20:55

@JensGustedt, but since converting from a pointer to uintptr_t uses implementation defined semantics, how could I be sure I'm getting a pointer that is a multiple of some alignment? – Knar 23/9, 2016 at 1:54

@JensGustedt: The description of memcpy says that it copies bytes from one "object" to another object. It does not allow for the possibility that the range of memory being copied might encompass multiple objects. If the thing returned by malloc is considered to be one object for purposes of memcpy, regardless of how stuff is written into it, that would not pose a problem, but it would either require that the Effective Type rules use a different definition of "object" from memcpy, or else that a malloc region can only contain things of a single type. – Laynelayney 30/11, 2016 at 20:51

By knowing the size of the union of the 3 types a more efficient allocation can occur.

union common {
  struct foo f;
  void * ptr;
  char ch;
};

void *allocate3(struct foo **f, size_t m, void **ptr, size_t n, char **ch,
    size_t o) {
  size_t u_sz = sizeof (union common);
  size_t f_sz = sizeof *f * m;
  size_t f_cnt = (f_sz + u_sz - 1)/u_sz;
  size_t p_sz = sizeof *ptr * n;
  size_t p_cnt = (p_sz + u_sz - 1)/u_sz;
  size_t c_sz = sizeof *ch * o;
  size_t c_cnt = (c_sz + u_sz - 1)/u_sz;
  size_t sum = f_cnt + p_cnt + c_cnt;
  union common *u = malloc(sum * u_sz);
  if (u) {
    *f = &u[0].f;
    *ptr = &u[f_cnt].ptr;
    *ch = &u[f_cnt + c_cnt].ch;
  }
  return u;
}

This way, each of the 3 array begin on a union boundary , so alignment issues are met. By adjusting the space of each array to be a multiple of size of the union, less wasted space than first answer yet meets OP posted goals.

A tad wasteful is struct foo is large, yet o is small. Could use following as further improvement. There is no need for padding after the last array.

malloc((f_cnt + p_cnt) * u_sz + c_cz);

Further thought on squeezing the allocation. Each subsequent "count-of-union elements" can use a different union that omits the earlier types and so on. When reaching the end - that is the gist of the idea just above, the last array only needs to depend on the last type. This makes code more complicated (prone to error) yet does gain increased space efficiency without aliments issues, etc. Some coding ideas follow

union common_last2 {
  // struct foo f;
  void * ptr;
  char ch;
};

size_t u2_sz = sizeof (union common_last2);
size_t p_cnt = (p_sz + u2_sz - 1)/u2_sz;

... malloc(f_cnt*usz + p_cnt*u2_sz + c_cz);

*ch = tbd;

Etiquette answered 22/9, 2016 at 16:18 Comment(0)

First off, be sure to use -fno-strict-aliasing or whatever the equivalent is on your compiler. Otherwise even if all alignments are satisfied a compiler may use aliasing rules to overlap different uses of the same memory block.

I doubt this is consistent with the intention of the Standard's authors, but given optimizers can be so aggressive that the only way to implement type-agnostic memory pools safely is to disable type-based aliasing analysis. The authors of the Standard wanted to avoid branding as non-compliant some compilers that used type-based aliasing. Further, they figured they could defer to compiler writers' judgment about how to recognize and handle cases where aliasing was likely. They identified cases where compiler writers might not think it was necessary to recognize aliasing (e.g. between signed and unsigned types) but otherwise expected compiler writers to exercise reasonable judgment. I see no evidence that they intended their list of allowable cases to be regarded as exhaustive even on platforms where other forms of aliasing would be useful.

Further, no matter how carefully one abides by the Standard, there's no guarantee that compilers apply breaking "optimizations" anyway. At least as of gcc 6.2 there are aliasing bugs that will break code that uses storage as type X, writes it as Y, reads it as Y, writes that same value as X, and reads the storage as X--behavior which is 100% defined under the Standard.

If aliasing is taken care of (e.g. using the indicated flag), and you know the worst-case alignment requirement for your system, defining storage for the pool is easy:

union
{
   char [POOL_BLOCK_SIZE] dat;
   TYPE_WITH_WORST_ALIGNMENT align;
} memory_pool[POOL_BLOCK_COUNT];

Unfortunately, the Standard provides no way to avoid type-based aliasing problems even if all platform-dependent alignment issues are taken care of.

Laynelayney answered 22/9, 2016 at 15:21 Comment(9)

Your answer is not usefull at all. The problem has nothing to do with aliasing and only with alignment. The compiler option that you mention (and that is gcc specific) doesn't help to align the objects on the correct boundaries. – Luella 22/9, 2016 at 15:45

@JensGustedt: Alignment can be dealt with easily with unions. Aliasing, however, is apt to be a source of Heisenbugs unless one can force the compiler to implement a dialect of C that allows memory reuse. – Laynelayney 22/9, 2016 at 16:3

Why would you use unions for alignment when you now have the tools in C that where exactly introduce for this purpose? Aliasing is not the deal here, effective types are. And memory that has been allocated through malloc and friends has special rules, here. It has the effective type of the last object that was copied into it (when using memcpy) or of the last lvalue access (when using *f = toto). Every C compiler has to implement this "reuse" of memory for malloced objects. – Luella 22/9, 2016 at 20:43

@JensGustedt: The way the rules are written, avoiding aliasing issues would require either that a memory pool scrub memory before reallocating it, or that all consumers of a memory pool guarantee that they will copy a structure from that memory without having written every byte thereof, without regard for whether anything would ever use the values contained therein. Neither practice is conducive to efficiency. – Laynelayney 22/9, 2016 at 20:49

This is wrong. Nothing in the standard says so for objects that don't have a declared type. They basically receive their effective type each time you write into them. – Luella 22/9, 2016 at 21:17

@JensGustedt: Since structs are forbidden from having trap representations, it is not necessary to write to all struct fields before making a copy of the struct as a whole, but Effective Type rules sink that. – Laynelayney 23/9, 2016 at 14:37

@JensGustedt: Some data structures can be initialized most efficiently if the system can guarantee that a read of an uninitialized value will have no side-effects beyond yielding some Unspecified value which may differ each time it's read. That guarantee should cost almost nothing to provide (basically ensure that any writes as the new type cannot get sequenced beyond writes of the old type), but some compilers may use the fact that the new type is read before it's written as a signal that writes of the old type can be moved beyond any subsequent writes of the new type. – Laynelayney 23/9, 2016 at 15:42

@JensGustedt: What would help programmers and compilers alike would be a set of directives to (1) indicate that an object's storage will no longer be used as its own type, but the bits must be left for reinterpretation as other types; (2) indicate that an object's storage will no longer be used as its own type, and may be left holding arbitrary values; (3) formerly-untyped storage will be used as some type which will expect to reinterpret the bits thereof; (4) formerly-untyped storage will be used as some type, but may hold arbitrary storage. – Laynelayney 23/9, 2016 at 15:49

@JensGustedt: If data gets written as an old type, gets read as a new type by code which doesn't care about the value (it can cross-check the value read against other information and notice if it's invalid), and then written as the new type, the only problem scenario would be having the compiler move the old write past the new write; since omitting the write altogether should be even cheaper than deferring it, directives saying "If you haven't performed an old-type write before doing this one, don't do it at all" should cost less than nothing. – Laynelayney 23/9, 2016 at 16:2

To answer one of OP's questions

how could I accomplish this (wanted to malloc() a block like this) without invoking undefined behavior?

A space inefficient approach. Allocate a union of the types. Reasonable if the size needed of the smaller types is not too large.

union common {
  struct foo f;
  void * ptr;
  char ch;
};

void *allocate3(struct foo **f, size_t m, void **ptr, size_t n, char **ch,
    size_t o) {
  size_t sum = m + n + o;
  union common *u = malloc(sizeof *u * sum);
  if (u) {
    *f = &u[0].f;
    *ptr = &u[m].ptr;
    *ch = &u[m + n].ch;
  }
  return u;
}

void sample() {
  struct foo *f;
  void *ptr;
  char *ch;
  size_t m, n, o;
  void *base = allocate3(&f, m, &ptr, n, &ch, o);
  if (base) {
    // use data
  }
  free(base);
}

Etiquette answered 22/9, 2016 at 14:44 Comment(7)

@Disputatious Agree OP wants to allocate all three at the same time. That is what this code does - only 1 malloc() call. If m, n, o are constant, then perhaps a struct would work. This answer does not rely on m, n, o as constants. – Etiquette 22/9, 2016 at 15:43

Yeah, that's the point: as @chux said, m, n and o are not constants. If they were, a simple struct would suffice. GCC allows structs inside methods with variable sizes (e.g., int x = 10; struct { int y[x]; } z;), though sadly this is far from being portable. Sadly this solution won't be really useful, since it would be better to make three allocations instead of wasting such space. – Knar 22/9, 2016 at 15:49

This code doesn't work. You are allowed to allocate more than one element, of a certain type using either ofm,n,o and assign it to the respective pointer, but that pointer obviously won't work as an array of that type as it doesn't point to an array of that type but to an array of a union. – Disputatious 22/9, 2016 at 15:57

@Disputatious "but that pointer obviously won't work as an array of that type." -->> Hmm. The address is aligned right, to the matcing type and the memory space is enough. What won't work? – Etiquette 22/9, 2016 at 16:0

Yes, you're right, it will be aligned and it can be used freely. – Disputatious 22/9, 2016 at 16:3

You could even minimize the memory waste to be less than the size of the union. Simply calculate the ratio of the sizeof the union and the sizeof the type in the union times m, and allocate that many unions. – Disputatious 22/9, 2016 at 16:10

@Disputatious Agreed - I think were on the same track - I posted an alternative answer using that as the similar idea came up with me too. – Etiquette 22/9, 2016 at 16:20

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags