When are type-punned pointers safe in practice?

Asked 15/6, 2018 at 6:59 Answered 16/6, 2018 at 19:54

A colleague of mine is working on C++ code that works with binary data arrays a lot. In certain places, he has code like

char *bytes = ...
T *p = (T*) bytes;
T v = p[i]; // UB

Here, T is sometimes short and sometimes int (assume 16 and 32 bits respectively).

Now, unlike my colleague, I belong to the "no UB if at all possible" camp, while he is more along the lines of "if it works, it's OK". I am having a hard time trying to convince him otherwise.

Given that:

bytes really come from somewhere outside this compilation unit, being read from some binary file.
It's safe to assume that the array really contains integers in the native endianness.

In practice, given mainstream C++ compilers like MSVC 2017 and gcc 4.8, and Intel x64 hardware, is such a thing really safe? I know it wouldn't be if T was, say, float (got bitten by it in the past).

Moppet answered 15/6, 2018 at 6:59 Comment(23)

This blog post blog.qt.io/blog/2011/06/10/type-punning-and-strict-aliasing has some references to weird bugs people had doing type puns on used-in-practice compilers (gcc for example). Your colleague really ought to write the memcpy implementation; it'll generate the same code on optimising compilers but it doesn't carry the risk of spontaneously breaking – Paperback 15/6, 2018 at 7:10

GCC 4.8 is rather old. I wouldn't call that one "mainstream". – Zagreb 15/6, 2018 at 7:13

@JamesPicone I read that before asking. Unfortunately, it deals with more complicated cases than simply reading a binary array. – Moppet 15/6, 2018 at 7:17

@JesperJuhl ~3 years is not that old. – Hornswoggle 15/6, 2018 at 7:17

Hello! I do not have any comments on whether the given code is safe or not, but I have a sad experience with alignment. I was working with code for some i.MX28 processor. I believed that if I have some packed data as char*, I can get the desired value with something like memcpy (&float_variable, char_pointer + offset, sizeof (float)). This assumption has led to a hard-to-find bug that my colleagues and I managed to resolve after several hours of digging into the code. – Golfer 15/6, 2018 at 7:18

@JesperJuhl gcc 4.8 is installed on RHEL 7, which is going to be supported for a while, and we need to support it. – Moppet 15/6, 2018 at 7:18

@Sergey Tachenov You don't have to be stuck with an ancient compiler just to target RHEL7. You can do what we do at my workplace and install devtoolset-7 and you'll get gcc 7.3.1, fully supported by RH. – Zagreb 15/6, 2018 at 7:24

@SergeyTachenov doesn't really matter how complicated the case is. Compilers take advantage of alias analysis and the strict aliasing rule, as demonstrated by them breaking code that abuses type punning. Compilers that take advantage of the strict aliasing rule will sometimes compile to not what you expect. Sometimes they'll do what you expect. godbolt.org/g/HYRNzi suggests that clang and gcc will reason their way around 'simple' aliasing issues, but your colleague has no idea which minor changes to the code might break it. – Paperback 15/6, 2018 at 7:25

@JamesPicone If only I were able to convince my colleague in that. He reasons that optimizations can't break anything here because the compiler has no idea where that data came from. – Moppet 15/6, 2018 at 7:30

If they are pointers to different types, the compiler assumes they can't alias, regardless of where the data comes from. – Impotence 15/6, 2018 at 7:31

@Impotence yes, but it has got to do something. If it has no idea what's there, how can it possibly read anything other than what's really there? I know it can, like it happened with float, but a real-life example with integer types, where a certain compiler would go wrong in this case, would be useful. – Moppet 15/6, 2018 at 7:37

@SergeyTachenov Can he read assembly? This is a good example for why you should never underestimate a compiler's ability to outsmart you: godbolt.org/g/tsZqzS . Short summary: Compiler acts as if a particular function that is never called was called, because otherwise the program would contain undefined behaviour – Paperback 15/6, 2018 at 7:38

@JamesPicone I'm struggling a bit to believe that clang's optimization, as demonstrated in your example, is legal. I would have expected a nullptr-call in main() (which might end in a crash or something else). How can code take effect (across function "boundaries") before it is called, or without ever being called? – Pumphrey 15/6, 2018 at 11:42

@JesperJuhl "there is no safe type punning in C++." Really, not even with memcpy? With volatile? – Kakemono 15/6, 2018 at 12:26

@Scheff How do you "call" a null pointer? – Kakemono 15/6, 2018 at 15:22

@Kakemono This way: void (*fct)() = nullptr; int main() { fct(); return 0; }. Well, I would expect that program to crash somehow... Live demo on coliru – Pumphrey 15/6, 2018 at 15:50

@Scheff A "crash" is one possible behavior of an execution with UB (Undefined Behavior). Why would it be the one and only behavior? – Kakemono 15/6, 2018 at 15:56

@Kakemono Thinking twice, I came to a similar conclusion: Calling a null function pointer is, of course, Undefined Behavior. IMHO, the same holds for the example of James Picone. In his case, the UB manifests as valid function call where (in my feeling) should not be one - yet another kind of UB. – Pumphrey 15/6, 2018 at 17:27

I am finally at a desktop PC with Internet access where I can look at that example. It's awesome, and is really a good illustration to anyone who thinks that UB is not that bad. After a bit of reasoning, I even understood how it works. The compiler simply sees that there is only one value ever assigned to the pointer, not counting nullptr, and as it's UB the compiler is free not to consider it at all. And if there's just one possible value, why not to replace the indirect call with a direct one and then even inline it? And that's exactly what it did. Brilliant. – Moppet 15/6, 2018 at 18:4

@SergeyTachenov there's a complexity in that the function pointer is declared static, so we know all accesses to it must be local to the translation unit, so the compiler knows when viewing the translation unit that there can be no other writes to that pointer. If it wasn't static, it could potentially be written to in a different TU – Paperback 17/6, 2018 at 9:6
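The linked example is not reproduced on this page. The following is only a minimal sketch of the pattern these comments describe, with hypothetical names (Do, Harmless, NeverCalled, callThroughPointer) that are assumptions rather than the contents of the link: a static function pointer that is written in exactly one place, by a function that may never be called.

```cpp
#include <cassert>

typedef int (*Function)();

// Static, so every write to it is visible in this translation unit.
// It is zero-initialized, i.e. it starts out as a null pointer.
static Function Do;

// Hypothetical stand-in for whatever the only assigned function does.
static int Harmless() { return 42; }

// The only place in the translation unit that ever writes to Do.
void NeverCalled() { Do = Harmless; }

// Calling through a null function pointer is UB, so an optimizer may assume
// Do != nullptr here. The only non-null value it can ever hold is Harmless,
// so the indirect call may be replaced by a direct (even inlined) call.
int callThroughPointer() {
    return Do();
}
```

Calling NeverCalled() before callThroughPointer() is well-defined and yields the value from Harmless. The surprising part discussed above concerns a program that skips NeverCalled(): it has undefined behaviour, so a compiler is allowed to behave as if the assignment had happened anyway.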

@curiousguy: The fact that the Standard imposes no requirements on how some action behaves does not mean that implementations cannot or should not define a predictable behavior. An implementation intended for low-level programming should recognize the possibility that a programmer may know useful things about the platform that the compiler does not, and behave "in a documented fashion characteristic of the environment" in cases where the environment would have a natural characteristic behavior and there's no compelling reason to do anything else. – Kahlil 18/6, 2018 at 18:42

@Tyker: The Standard allows for the existence of implementations that are conforming, but of such low quality as to be useless. The fact that the Standard does not require that a compiler recognize that doSomething(&someAggregate.member) might access an object of someAggregate's type in no way implies that quality implementations should not be expected to do so. Given that the Standard doesn't even require compilers to acknowledge that an access to someAggregate.member might access someAggregate, C would be useless without implementations going beyond the mandates of the Standard. – Kahlil 18/6, 2018 at 18:55

Have you read What is the strict aliasing rule? – Sudatory 15/10, 2018 at 22:24

char* can alias other entities without breaking strict aliasing rule.

Your code would be UB only if p + i didn't originally point to a T.

char* bytes = (char*) floats;
int *p = (int*) bytes;
int v = p[i]; // UB

but

char* bytes = (char*) floats;
float *p = (float*) bytes;
float v = p[i]; // OK

If the origin of bytes is "unknown", the compiler cannot exploit the UB for optimization; it has to assume we are in the valid case and generate code accordingly. But how do you guarantee that it is unknown? Even outside the TU, something like Link-Time Optimization might expose the hidden information.
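For the asker's case (a buffer that really did arrive as raw bytes), the memcpy approach suggested in the comments on the question avoids forming a T* into the buffer at all. A minimal sketch, assuming the buffer holds at least (i + 1) * sizeof(T) bytes in native endianness; load_at is a hypothetical helper name, not something from the original code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical helper: reads the i-th T out of a raw byte buffer by copying
// its bytes into a real T object, instead of casting the char* to T*.
template <typename T>
T load_at(const char* bytes, std::size_t i) {
    // std::is_trivial is stricter than memcpy needs, but it covers the
    // short/int case from the question.
    static_assert(std::is_trivial<T>::value,
                  "load_at is only meant for trivial types");
    assert(bytes != nullptr);
    T value;
    // Copying bytes into an existing T is well-defined and has no
    // alignment requirement on the source pointer.
    std::memcpy(&value, bytes + i * sizeof(T), sizeof(T));
    return value;
}
```

With this, T v = p[i]; becomes T v = load_at<T>(bytes, i);. As noted in the comments, optimising compilers typically turn a small fixed-size memcpy like this into a plain load, though that is worth confirming for the specific compiler and flags in use.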

Hexateuch answered 15/6, 2018 at 8:31 Comment(15)

The original array comes as bytes, so it's definitely UB. The problem is, even if the compiler can't make any optimizations, it is still allowed to generate code that wouldn't do the right thing. Only I can't come up with an example to demonstrate it. – Moppet 15/6, 2018 at 8:41

Compilers have to know it is UB to generate invalid code, so in practice, currently, I think only LTO could expose the problem; but I think compilers are relatively conservative about char* aliasing. – Hexateuch 15/6, 2018 at 8:52

I think LTO is the best argument I've seen so far. It both gives a good warning about all the unimaginable things that could happen and explains that memcpy isn't as slow as one may think. – Moppet 15/6, 2018 at 9:8

I think Jarod42 is quite right. If the compiler doesn't know where the pointer comes from, it has to assume we are doing the right thing with the cast. As long as we don't introduce two different pointers it should just work, since there is nothing it could alias with. The following might be wrong, but there is still an issue with object lifetime which I don't know how to resolve. Even for POD types, if you just interpret binary data through a pointer of that type, you never actually created an object of that type according to the standard; its lifetime never started. Or am I wrong? – Reinaldoreinaldos 15/6, 2018 at 10:24

"But how do you guaranty it is unknown ?" asm("")? – Kakemono 15/6, 2018 at 12:27

@phön Lifetime of trivial type is a controversial language lawyer issue. The standard seems very divided about many essential questions regarding lifetime. There is a split between the churches and C++ is in fact many semantics under one name. – Kakemono 15/6, 2018 at 15:28

@Hexateuch "Compilers have to know it is UB to generate invalid code" If your compiler vendor insists that allocating aligned memory isn't sufficient to create all trivial objects that could possibly fit, you should reconsider your trust in your compiler vendor. – Kakemono 15/6, 2018 at 16:5

@phön: Unfortunately, gcc and clang can generate bogus code even if there aren't any references that could alias in the code as written. Given the sequence {T1 *p1=unionArr[i]->v1; useT1(&p1);}{T2 *p2=unionArr[i]->v2; useT2(&p2);}{T1 *p3=unionArr[i]->v1; useT1(&p3);} accesses made via p1, p2, and p3 don't alias each other as written because their lifetimes are totally disjoint. Both clang and gcc, however, extend the lifetime of p1 across the use of p2 and use that in place of p3, thus causing p1 and p2 to alias even though they don't do so in the code as written. – Kahlil 15/6, 2018 at 19:52

@curiousguy: It's reasonable for a compiler to assume that during a particular execution of a function or complete loop, an object won't be accessed by seemingly unrelated means unless there is evidence, within that same execution, that such a thing may happen. A programmer should be required to ensure that an object doesn't get used multiple ways unless there is evidence within that context where that will occur. Compiler writers that claim they have no obligation to notice such evidence, however, should be recognized as morons or jerks, and not be welcome in polite company. – Kahlil 15/6, 2018 at 19:56

@curiousguy: Unfortunately, for whatever reason, the authors of the Standard are focused on individual discrete operations, without looking at the context in which they occur. If a function like void clearShort(uint16_t *p) { *p=0; } is invoked from void test(uint32_t *p1, uint32_t *p2) { *p1 += 0x1234; clearShort((uint16_t*)p2); *p1 += 0x1234; }, a compiler generating code for clearShort might or might not know about the surrounding operations involving *p1, but it shouldn't be able to know about the operations on p1 without also knowing that between those operations, a uint16_t* was... – Kahlil 15/6, 2018 at 20:6

...derived from a uint32_t of unknown provenance. The authors of C89 and later standards may have thought such a thing should be common sense, but it obviously isn't. – Kahlil 15/6, 2018 at 20:10

"The original array comes as bytes" It "comes" from outside, so with no type. – Kakemono 17/6, 2018 at 22:12

@Kahlil I am not talking about having pointers of different types in one scope. I mean it program-wide. Your sequence is UB unless useT1 sets data of type T2 for the next use. But imagine having binary data read from network or HDD in a different translation unit, and then the only access you ever do in your whole program is through pointers of a specific (and correctly aligned) POD type. Then there is never even a possibility of aliasing. I am not sure about lifetime issues. cppref says: For [...our case...], lifetime begins when the properly-aligned storage for the object is allocated [...]. – Reinaldoreinaldos 18/6, 2018 at 7:41

@phön: Neither C89 nor any standard derived from it has ever made a real attempt to define all of the usage patterns necessary to make an implementation useful. The notion of "effective type" is bad because it makes the optimizations that are available in a piece of code dependent upon the calling context which the compiler generally can't see and the programmer generally can't control. This severely limits a compiler's ability to benefit from such rules, while making it hard for the programmer to 100% reliably work around the restrictions. Despite that, the rules fail to define... – Kahlil 18/6, 2018 at 13:40

...even the most basic usage patterns (like using an aggregate member expression to access storage of an aggregate's type). Actually, all the rule would need to make it decent would be recognition that it's only applicable in places where seemingly-unrelated references are used to access the same storage within each others' active lifetimes [i.e. where there is aliasing in the code as written]. Given T1 *p1 = &uptr->v1; T2 *p2 = &uptr->v2; *p1=1; *p2=2;, the lvalues *p1 and *p2 alias. Change that to T1 *p1 = &uptr->v1; *p1=1; T2 *p2 = &uptr->v2; *p2=2;, however, and they don't. – Kahlil 18/6, 2018 at 13:45

Type-punned pointers are safe if one uses a construct which is recognized by the particular compiler one is using [i.e. any compiler that is configured to support quality semantics, if one is using straightforward constructs; neither gcc nor clang qualifies when optimizations are enabled, however, unless one uses -fno-strict-aliasing]. The authors of C89 were certainly aware that many applications required the use of various type-punning constructs beyond those mandated by the Standard, but thought the question of which constructs to recognize was best left as a quality-of-implementation issue. Given something like:

struct s1 { int objectClass; };
struct s2 { int objectClass; double x,y; };
struct s3 { int objectClass; char someData[32]; };

int getObjectClass(void *p) { return ((struct s1*)p)->objectClass; }

I think the authors of the Standard would have intended that the function be usable to read field objectClass of any of those structures [that is pretty much the whole purpose of the Common Initial Sequence rule] but there would be many ways by which compilers might achieve that. Some might recognize function calls as barriers to type-based aliasing analysis, while others might treat pointer casts in such a fashion. Most programs that use type punning would do several things that compilers might interpret as indications to be cautious with optimizations, so there was no particular need for a compiler to recognize any particular one of them. Further, since the authors of the Standard made no effort to forbid implementations that are "conforming" but of such low quality as to be useless, there was no need to forbid compilers that somehow managed not to see any of the indications that storage might be used in interesting ways.
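For comparison, the form the Common Initial Sequence rule explicitly covers goes through a union member expression rather than a cast from void*. A minimal C++ sketch, assuming a hypothetical wrapper union AnyObject and a pointer-to-union parameter (neither appears in the original code):

```cpp
#include <cassert>

// The three structs from above. All are standard-layout and share the
// common initial sequence { int objectClass; }.
struct s1 { int objectClass; };
struct s2 { int objectClass; double x, y; };
struct s3 { int objectClass; char someData[32]; };

// Hypothetical union that makes the relationship between the three types
// visible at the point where the member is read.
union AnyObject {
    s1 asS1;
    s2 asS2;
    s3 asS3;
};

// Reads the shared first member through the union. Whichever of the three
// members is active, the read of objectClass stays within the common
// initial sequence.
int getObjectClass(const AnyObject* obj) {
    assert(obj != nullptr);
    return obj->asS1.objectClass;
}
```

This only helps when the caller can actually hand over a pointer to the union; it does nothing for code that receives a bare struct s2* or a void*, which is the situation the paragraph above is concerned with.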

Unfortunately, for whatever reason, there hasn't been any effort by compiler vendors to find easy ways of recognizing common type-punning situations without needlessly impairing optimizations. While handling most cases would be fairly easy if compiler writers hadn't adopted designs that filter out the clearest and most useful evidence before applying optimization logic, both the designs of gcc and clang--and the mentalities of their maintainers--have evolved to oppose such a concept.

As far as I'm concerned, there is no reason why any "quality" implementation should have any trouble recognizing type punning in situations where all operations upon a byte of storage using a pointer converted to a pointer-to-PODS, or anything derived from that pointer, occur before the first time any of the following occurs:

That byte is accessed in conflicting fashion via means not derived from that pointer.
A pointer or reference is formed which will be used sometime in future to access that byte in conflicting fashion, or derive another that will.
Execution enters a function which will do one of the above before it exits.
Execution reaches the start of a bona fide loop [not, e.g. a do{...}while(0);] which will do one of the above before it exits.

A decently-designed compiler should have no problem recognizing those cases while still performing the vast majority of useful optimizations. Further, recognizing aliasing in such cases would be simpler and easier than trying to recognize it only in the cases mandated by the Standard. For those reasons, compilers that can't handle at least the above cases should be viewed as falling in the category of implementations that are of such low quality that the authors of the Standard didn't particularly want to allow, but saw no reason to forbid. Unfortunately, neither gcc nor clang offers any option to behave reasonably short of disabling type-based aliasing analysis altogether, and their authors would rather deride as "broken" any code needing features beyond what the Standard requires than attempt a useful blend of optimization and semantics.

Incidentally, neither gcc nor clang should be relied upon to properly handle any situation in which storage that has been used as one type is later used as another, even when the Standard would require them to do so. Given something like:

union { struct s1 v1; struct s2 v2; } unionArr[100];
void test(int i)
{
    int test = unionArr[i].v2.objectClass;
    unionArr[i].v1.objectClass = test;
}

Both clang and gcc will treat it as a no-op even if it is executed between code which writes unionArr[i].v2.objectClass and code which happens to read member v1.objectClass of the same union object, thus causing them to ignore the possibility that the write to unionArr[i].v2.objectClass might affect v1.objectClass.

Kahlil answered 16/6, 2018 at 19:54 Comment(10)

Feel free to write your own compiler that specifically documents that it defines type-punned interactions. I suspect you'll find that it's harder to maintain optimisations than you think. Type-punning via union is perfectly defined in C; if you'd prefer to write C code (as your type declarations suggest), feel free. – Paperback 17/6, 2018 at 9:14

@JamesPicone If unions are well defined in the perfect semantics of C, why are there so many open questions? – Kakemono 17/6, 2018 at 12:59

@JamesPicone: Neither gcc nor clang processes all the corner cases required by the Standard except in -fno-strict-aliasing mode, and I don't know if there are any compilers that can be configured to process all the cases that are unambiguously required by the Standard without also handling the cases I described. Efficiently and correctly processing a sequence like int temp = unionPtr->m1.x; unionPtr->m2.intMember = x; where x is a member of a Common Initial Sequence, should generate release/acquire barriers on lvalues of the proper types without actually generating loads or stores. – Kahlil 17/6, 2018 at 19:6

@JamesPicone: If the rules were workable, gcc and clang should be able to optimize code while still handling all the corner cases required. Further, under the C rules as written (I'm not positive about C++), the lvalue aggregate->member is an lvalue of the member type; unless the member happens to be of character type, the only way for accesses using lvalue to have defined behavior is for the storage to have heap duration whose dynamic/effective type is either not set or matches that of the member type. – Kahlil 17/6, 2018 at 19:34

I saw a discussion where GCC maintainers (on a bugtrack site) could not even agree on the basic idea that a C++ program that writes its own free store handling, in such way that raw memory is reused for a different type, is strictly conforming as is, without special keywords, magic, "compiler fences", even if the user defined allocation functions aren't called operator new and operator delete. (They sounded like "respect my authoritah" people.) – Kakemono 18/6, 2018 at 5:55

@Kahlil As I said many times, the C++ rules aren't even clear for committee members. From the core issues list, it's obvious that these people think that a lvalue must refer to an (existing) object. Changing the active members is always done with an lvalue that refers to a pre-object, a range of bytes of a type that the C++ std doesn't recognize as an "existing" "object". Can we even talk about the qualities of non existing things? (Do we need to refer to Jean-Paul Sartre for this?) – Kakemono 18/6, 2018 at 6:1

It's pretty obvious that nobody in the committee worked it through. The most basic issues of the definition of "an object", "lifetime", and "lvalue" are still unsolved. Note the modern description "An object is created by a definition ([basic.def]), by a new-expression, when implicitly changing the active member of a union ([class.union]), or when a temporary object is created ([conv.rval], [class.temporary])." and compare the old description of C++: there wasn't even a mention of unions for years. – Kakemono 18/6, 2018 at 6:4

So either: 1) the committee wasn't aware unions existed 2) nobody reads that inane verbiage 3) the committee didn't realize until recently that changing active union member implied the creation of a live object. Either way, this is BAD. It's an indication that the committee was more interested in making complicated features and defining funny special cases than in thinking about the semantics of most basic code involving a union. – Kakemono 18/6, 2018 at 6:8

@curiousguy: Judging from what I've read of C committee discussions leading up to C99, the Committee had thought that support for access patterns beyond those explicitly listed should be a Quality of Implementation issue, but were effectively badgered into "clarifying" the rules. The original rules would have been fine as a baseline if they were not presumed to be exhaustive, but making a useful exhaustive set of rules would require more than "clarification". Further, it's possible for a C compiler to be useful for some purposes without allowing memory to be recycled as different types. – Kahlil 18/6, 2018 at 6:50

@curiousguy: There would thus be no problem with having the Standard recognize the legitimacy of such compilers provided that they were recognized as only being suitable for limited purposes, and the Standard recognized the legitimacy of code for purposes such compilers can't serve. – Kahlil 18/6, 2018 at 6:51
