Does nullptr_t break type punning or pointer conversions?
C

2

10

Consider this union:

typedef union
{
  void*      vptr;
  nullptr_t  nptr;
} pun_intended;

nullptr_t is supposedly compatible with void* 1). Ok so what if we initialize the void* to some non-zero value?

pun_intended foo = { .vptr = (void*)42 }; 
  • This conversion is supposedly legit (impl.defined) as per C23 6.3.2.3 §4, or at least it was until nullptr_t was introduced.
  • And what about union type punning? Also supposedly legit.
  • And what about inspecting any type's internal representation through a pointer to character type? That was well-defined before C23, per 6.3.2.3 §7.

Full example:

#include <stdio.h>
#include <inttypes.h>
#include <stddef.h>

typedef union
{
  void*      vptr;
  nullptr_t  nptr;    
} pun_intended;

int main(void)
{
  pun_intended foo = { .vptr = (void*)42 };
  printf("Value: %" PRIuPTR "\n", (uintptr_t)foo.vptr);

  if(foo.nptr != (void*)42)
  {
    puts("It does not have value 42.");
    if(foo.nptr == nullptr)  
      puts("Because it's a nullptr.");
    else
      puts("But it's not a nullptr.");

    unsigned int val = *(unsigned char*)&foo; // reads the first byte only: little endian assumption here
    printf("And it has value %u.\n", val);

    if(foo.vptr != nullptr)
    {
      puts("foo.vptr is however not a nullptr.");
    }
  }
}

Output on clang 16 -std=c2x:

Value: 42
It does not have value 42.
Because it's a nullptr.
And it has value 42.
foo.vptr is however not a nullptr.

Output on gcc 13.2 -std=c2x:

Value: 42
It does not have value 42.
But it's not a nullptr.
And it has value 42.
foo.vptr is however not a nullptr.

My question: Is any of the above (which was previously well-defined or impl.defined) now undefined/unspecified behavior? If so, where is that stated? Or are these scenarios simply not considered in C23 - a defect?


1) Source: C23 n3096 draft 7.21.2

The size and alignment of nullptr_t is the same as for a pointer to character type. An object representation of the value nullptr is the same as the object representation of a null pointer value of type void*.

Coinsure answered 15/8, 2023 at 13:39 Comment(11)
nullptr_t is different from all pointer or arithmetic types ... and has exactly one value nullptr. Can this be squared with a nullptr_t holding the value 42?Cripps
@adabsurdum I think the bottom line is that no value "nullptr" exists in the real world where computers live. They have to name the exact representation of it, such as "a pointer of size sizeof(void*) with all bytes set to zero". Otherwise I don't think this type is any improvement over NULL, quite the contrary as it seems to make C less suitable for hardware-related programming.Coinsure
Similarly if a nullptr_t must have representation nullptr, then what exactly is "not nullptr"? A trap representation?Coinsure
From C23 7.21.2p2 regarding nullptr_t: "It has only a very limited use in contexts where this type is needed to distinguish nullptr from other expression types", so it's probably not intended to be used in this way.Zooplasty
Re “nullptr_t is supposedly compatible with void*”: Who supposes that and why? “Compatible” has a specific meaning in C. Largely, it means that two types can be completed to be the same type. The fact that one type can be converted to another implicitly does not mean they are compatible. The fact that two types have the same size and representation does not mean they are compatible. nullptr_t is not compatible with void *.Dielle
The problem seems similar to that of performing an lvalue conversion on a _Bool object that has somehow (for example by union type-punning) been set to a value other than 0 or 1, which as far as I can tell would result in UB. That's probably already been asked about in another question.Hydrant
@EricPostpischil C23 7.21.2 says so. Source added to the question.Coinsure
@IanAbbott I recall this fun little display of UB: _Bool type and strict aliasing. It is quite similar to this indeed; both problems happen when you apply low-level programming to artificial types without well-defined representations. The kind of academic people who come up with these kinds of artificial representations never had to implement CRC, walk patterns, wear leveling etc. to guarantee memory coherency.Coinsure
Having the same object representation does not make two types compatible. void * and char * have the same object representation but are not compatible. short and int may have the same object representation but are not compatible in C implementations in which they do.Dielle
@EricPostpischil Having the same representation ought to mean that you can memcpy them into each other seamlessly, except some compilers seem to have added diagnostics if you try to pass a nullptr_t to certain functions. If C chose to make artificial restrictions regarding assignment on top of that, that's another story.Coinsure
@Lundin, yes, you can memcpy them. That is not what “compatible” means. “Compatible” has a technical definition in the C standard. It is about type semantics, not representation. Largely, two types are compatible if they are the same except for any missing parts. E.g. int [] and int [17] are compatible because they are both arrays of int and differ only in that one is missing the size.Dielle
D
8

Ok so what if we initialize the void* to some non-zero value?

C 2023 N3096 7.21.2 3 explicitly answers this. After telling us that the representation of a nullptr value in the nullptr_t type is the same as for a null pointer value in the void * type, it tells us what happens if there is a different sequence of byte values in a nullptr_t object:

… if the object representation is different, the behavior is undefined.

Dielle answered 15/8, 2023 at 14:37 Comment(4)
ugh. Just what C needs: syntactic sugar that adds the wholly-new concept of an immutable type to C and opens the door to weird corner-case UB and bugs. Straight from the too-complex-to-understand, must-be-everything-to-everybody font of craziness that is C++.Curtis
@AndrewHenle: It's nothing special. The UB mentioned in this answer is the exact same UB you get if you perform an lvalue conversion of a non-value representation of any type, not just nullptr_t, and non-value representations (previously called trap representations) have been around for ages. You're still free to write invalid representations to an object of type nullptr_t, or to have an invalid representation in such an object before you initialize it.Poddy
nullptr_t isn't really immutable, or introducing any new concepts. It's just a type that happens to only have one value. const types are closer to immutable, and those have been around for ages.Poddy
Just found this part now. Can union type punning always be said to be an lvalue conversion though?Coinsure
G
2

IMO it is UB:

6.3.2.3

If a null pointer constant or a value of the type nullptr_t (which is necessarily the value nullptr)...

6.5.4 §4

A pointer type shall not be converted to any floating type. A floating type shall not be converted to any pointer type. The type nullptr_t shall not be converted to any type other than void, bool or a pointer type. No type other than nullptr_t shall be converted to nullptr_t.

Here you convert void * to nullptr_t via the union, which is explicitly prohibited.

You can convert nullptr_t to void * via this union, but not void * to nullptr_t.

IMO nullptr_t exists only to introduce the nullptr constant, and I do not see any practical use for this type.

Gehman answered 15/8, 2023 at 14:12 Comment(3)
Although... if I can convert from nullptr_t to void*, and void* is explicitly a type through which any other object pointer conversion is possible, I should also be able to convert from void* to nullptr_t. Or otherwise nullptr_t also breaks void pointers as a generic type, in addition to all other oddities shown in the question.Coinsure
@Coinsure "No type other than nullptr_t shall be converted to nullptr_t" sounds quite definitive. And you are right, I agree, but it does not change the wording used. BTW nullptr_t exists only to introduce nullptr and I do not see any practical use of this type.Gehman
@Lundin: Re “Or otherwise nullptr_t also breaks void pointers as a generic type”: It does not break pointers because nullptr_t is not a pointer type. Pointer types are derived types; they are derived from object or function types per C 2023 N3096 6.2.5 25. nullptr_t is a new type. Even if it were a pointer type, it would not break the rules about conversions between pointers to object types because it would not be a pointer to an object type.Dielle
