Is casting to (void**) well-defined?

Suppose A is a struct and I have a function to allocate memory

f(size_t s, void **x)

I call f to allocate memory as follows.

struct A* p;
f(sizeof(struct A), (void**)&p);

I wonder whether (void**)&p here is a well-defined cast. I know that in C it is well-defined to convert a pointer to void* and back. However, I am not sure about void**. I found a document stating that we should not cast to a pointer type with a stricter alignment requirement. Does void** have a stricter or looser alignment requirement?

Subhuman answered 13/7, 2022 at 9:17 Comment(3)
The cast itself might be legal, but writing to the dereferenced pointer isn't.Torrefy
You should pass a pointer to an actual void* and then you can convert the void* to a struct A* after the function returns. To avoid the need for a temporary variable of type void* you could return the void* from the function similar to malloc and realloc.Athanor
Stop using out parameters, problem solved. Now you have void * f(std::size_t); even in C. And in C++ you could much improve the type safety with template <typename T> T * f(); ... A *a = f<A>();. Or just use new A();, that's what it's there for.Bifocals

The conversion is not defined by the C standard, and, even if it were, code in f that assigned to it via the void ** type would not be defined by the C standard.

C 2018 6.3.2.3 7 says a pointer to an object type may be converted to a pointer to a different object type. This covers (void **) &p, since &p is a pointer to the object p, and void ** is a pointer to the object type void *. However, this paragraph only tells us the conversion may be performed. It does not fully define what the result is. It says:

  • “If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined.” This is generally not a problem; in common C implementations, the alignment requirements of void * and struct A * will be the same, and this is easily checked.

  • “Otherwise, when converted back again, the result shall compare equal to the original pointer.” This is all the paragraph tells us about the result of the conversion: It is a pointer you can convert back to struct A * to get the original pointer or its equivalent. It does not tell us the pointer can be used for anything else while it is in the void ** type.

  • “When a pointer to an object is converted to a pointer to a character type,…” This part of the paragraph does not apply, since we are not converting to a pointer to a character type.

So, suppose the function f has some code that uses its parameter x like this:

*x = malloc(…);

Because the standard did not define what will happen if x is used as a void ** for any purpose other than converting it back to struct A *, we do not know what *x will do.

A typical expectation is that *x will access the same memory p is in, but it will access it as a void * instead of as a struct A *. A technical problem here is that the C standard does not guarantee that a void * is represented in memory in the same way that a struct A * is represented in memory. As far as the standard is concerned, void * could use eight bytes while struct A * uses four bytes, or void * could use a flat byte address while struct A * uses a segment-and-offset address scheme. However, as with alignment, in common C implementations, different types of pointers have the same representation in memory, and this can be checked.
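The check mentioned above can be sketched as follows. This is a minimal illustration, not a proof: the definition of struct A and the helper names same_bytes and check_pointer_layout are assumptions made for the example, and equal size, alignment, and bytes on one implementation say nothing about what the C standard guarantees.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical definition; the question does not show struct A. */
struct A { int value; };

/* Compile-time checks: the build fails if the two pointer types
   differ in size or alignment on this implementation. */
_Static_assert(sizeof(void *) == sizeof(struct A *),
               "void * and struct A * differ in size");
_Static_assert(_Alignof(void *) == _Alignof(struct A *),
               "void * and struct A * differ in alignment");

/* Run-time spot check: store the same address in both pointer types
   and compare the bytes of the two pointer objects.
   Returns 1 if they match, 0 otherwise. */
int same_bytes(struct A *a)
{
    void *v = a;  /* implicit conversion to void * */
    return memcmp(&v, &a, sizeof v) == 0;
}

/* Example use: stop a debug build early if the bytes differ. */
void check_pointer_layout(void)
{
    struct A obj = { 0 };
    assert(same_bytes(&obj));
}
```

Even when all of these checks pass, the aliasing rule discussed next still applies.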

But then we arrive at the aliasing rule. Even if void * and struct A * have the same representation in memory, C 2018 6.5 7 says:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

— a type compatible with the effective type of the object,

The list continues with several other categories of types, and none of them match the struct A * type of p. That is, this paragraph in the standard tells us the object p shall have its stored value accessed (“accessed” in the C standard includes both reading and writing) only by an expression that has one of the listed types. The expression used to access p in *x = malloc(…); is *x, and its type is void *, and void * is not compatible with struct A *, and void * is also not any of the other types listed in the paragraph.

So the code *x = malloc(…); breaks that rule. Violating a “shall” rule means the behavior of the code is not defined by the C standard.

Some compilers support breaking this rule when a switch is used to ask them to support aliasing objects through different types (for example, -fno-strict-aliasing in GCC and Clang). Using such a switch prevents some optimizations by the compiler. In particular, given two pointers x and y that point to different types not matching the aliasing rule, the compiler may assume they point to different objects, so it can reorder accesses to *x and *y in whatever way is efficient, because a store to one cannot change the value in the other.

So, if you verify that void * and struct A * have the same representation and alignment requirement and that your compiler supports aliasing, then the behavior will be defined for the specific C implementation you check. However, it is not defined by the C standard generally.
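For comparison, here is a sketch of the two alternatives suggested in the comments on the question, both of which avoid the cast entirely. The body of f, its void return type, the definition of struct A, and the names f_ret, make_a_via_temp, and make_a_via_return are assumptions for illustration; the original code is not shown.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical definition; the question does not show struct A. */
struct A { int value; };

/* Out-parameter version with the question's parameter list.
   The body is an assumed stand-in. */
void f(size_t s, void **x)
{
    assert(x != NULL);
    *x = malloc(s);  /* *x has type void *, so x must point to a real void * object */
}

/* Return-value version, shaped like malloc. */
void *f_ret(size_t s)
{
    return malloc(s);
}

/* Alternative 1: pass the address of an actual void * and convert afterwards. */
struct A *make_a_via_temp(void)
{
    void *tmp = NULL;
    f(sizeof(struct A), &tmp);  /* &tmp already has type void **, no cast needed */
    return tmp;                 /* void * converts to struct A * implicitly */
}

/* Alternative 2: no out parameter at all. */
struct A *make_a_via_return(void)
{
    return f_ret(sizeof(struct A));
}
```

In both versions the only object written through a void * lvalue is a genuine void *, so neither the representation question nor the aliasing rule arises.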

Churrigueresque answered 13/7, 2022 at 9:38 Comment(17)
As a side question, will (struct A*)(*x) = malloc(…); make it valid?Durable
@Durable No, that will not even compile.Athanor
@IanAbbott oops, my mistake.Durable
@Afshin: * (struct A **) x = malloc(…); would be fine as long as the alignment requirements (void * has at least the same alignment requirement as struct A *) are met, which they are in common C implementations.Churrigueresque
@EricPostpischil Why would the alignment of void * come into play? You aren't writing any void* here.Bifocals
@GoswinvonBrederlow: Suppose the alignment requirement of struct A * is two bytes and p is at address 102. Further suppose the alignment requirement of void * is four bytes and the compiler represents void ** as a number of four-byte words, so 13 in the bytes of a void ** means address 52. Then (void **) &p will convert the address of p, 102, to void ** by dividing 102 by four, yielding 25, with the remainder discarded. When this is later converted to struct A **, it is multiplied by four, yielding 100, which is wrong; it is not the address of p, 102.Churrigueresque
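The arithmetic in this comment can be simulated with plain integers. This is only a model of the hypothetical word-addressed machine described above, not real pointer code; WORD_SIZE and the function names are made up for the illustration.

```c
#include <assert.h>

/* Hypothetical machine: a void ** is stored as a count of four-byte words. */
enum { WORD_SIZE = 4 };

/* Byte address -> word number; the remainder is discarded. */
unsigned to_word_pointer(unsigned byte_address)
{
    return byte_address / WORD_SIZE;
}

/* Word number -> byte address. */
unsigned to_byte_pointer(unsigned word_number)
{
    return word_number * WORD_SIZE;
}

/* Models converting struct A ** to void ** and back again. */
unsigned round_trip(unsigned byte_address)
{
    return to_byte_pointer(to_word_pointer(byte_address));
}

void word_address_demo(void)
{
    assert(to_byte_pointer(13) == 52);    /* word 13 is byte address 52 */
    assert(to_word_pointer(102) == 25);   /* 102 / 4, remainder 2 dropped */
    assert(round_trip(102) == 100);       /* not the original address 102 */
    assert(round_trip(100) == 100);       /* an aligned address survives */
}
```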
Suppose we have two structs A and B with the same alignment requirement. And we have a variable a of type A. Does struct B *pb = (struct B *)&a have the representation problem mentioned above please?Subhuman
@user18676624: Pointers to structures are special in that the C standard requires all pointers to structure types to have the same representation as each other.Churrigueresque
@EricPostpischil That isn't a problem of the alignment but of representation. The alignment difference would seem to allow using different representations though. But have you ever seen such a C implementation? Seems like a problem only relevant to the language-lawyer tag.Bifocals
@GoswinvonBrederlow: That is a problem of alignment; there is a violation of the rule about alignment during conversion. The reason for that rule is representation: A pointer to a type with an alignment requirement of X is only required to represent addresses that are multiples of X. Therefore, when another address is converted to that type, if it is not aligned correctly, the type might not be able to represent it. Therefore the conversion rule in the C standard must have a limitation on the alignment. I know of C implementations with different pointer representations for different types.Churrigueresque
@EricPostpischil Why are you arguing when I'm agreeing with you?Bifocals
The question was about struct A** vs void**. struct A* can be cast to void* and back, as can char and struct { char c; } pointers. So I would argue void* must use the same representation as a struct pointer with byte granularity. So struct A** and void** would have the same size and alignment. Why would their representation ever differ? Note: struct A** might very well only store multiples of 4, but then void** would for the same reasons. It doesn't make sense to save bits in one but not the other.Bifocals
void * is required to be able to represent any address. A pointer to a struct is not required to be able to represent any address. Therefore the demands on void * and struct foo * are different, so they could have different representations. In a word-oriented C implementation, struct foo * might consist of only a word number, whereas void * would have extra bits for the byte-within-the-word. And since struct foo * and void * would be different, the implementors might make different choices about the properties of struct foo ** and void **.Churrigueresque
Sorry, I have another question. If I cast struct foo* to void *, is it well-defined to use the resulting pointer of type void * for something other than converting it back to struct foo*? I see that the parameter of the free function is a void *, so it seems that the answer is yes.Subhuman
@user18676624: A pointer that originated as a struct foo * can be converted to a pointer to the type of the first member of the structure and used to access that member. It can also be converted to a pointer to a character type and used to access the bytes that represent the structure.Churrigueresque
@EricPostpischil Thanks a lot. Could you please show me the part of the standard that covers this point?Subhuman
@user18676624: C 2018 6.7.2.1 15 says “… A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa…” 6.3.2.3 7 says “… When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object…,” and accessing those bytes is defined by 6.5 7.Churrigueresque
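The two uses described in the last comments can be sketched like this. The layout of struct foo and the names read_first and sum_bytes are assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout; the thread does not define struct foo. */
struct foo { int first; double second; };

/* 6.7.2.1 15: a pointer to a structure, suitably converted,
   points to its initial member. */
int read_first(struct foo *p)
{
    assert(p != NULL);
    void *v = p;   /* struct foo * -> void * */
    int *ip = v;   /* void * -> int *, the type of the initial member */
    return *ip;    /* accesses p->first through an int lvalue */
}

/* 6.3.2.3 7 and 6.5 7: a pointer to a character type may be used
   to read the bytes that represent the object. */
unsigned sum_bytes(const struct foo *p)
{
    const unsigned char *bytes = (const unsigned char *)p;
    unsigned sum = 0;
    for (size_t i = 0; i < sizeof *p; i++)
        sum += bytes[i];
    return sum;
}
```

Both accesses fit the categories listed in 6.5 7: the first uses the member's own type, the second uses a character type.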
