Is it legal and well defined behavior to use a union for conversion between two structs with a common initial sequence (see example)?
I have an API with a publicly facing struct A and an internal struct B and need to be able to convert a struct B into a struct A. Is the following code legal and well defined behavior in C99 (and VS 2010/C89) and C++03/C++11? If it is, please explain what makes it well-defined. If it's not, what is the most efficient and cross-platform means for converting between the two structs?

#include <stdint.h>

struct A {
  uint32_t x;
  uint32_t y;
  uint32_t z;
};

struct B {
  uint32_t x;
  uint32_t y;
  uint32_t z;
  uint64_t c;
};

union U {
  struct A a;
  struct B b;
};

int main(int argc, char* argv[]) {
  union U u;  /* "union U" so that this also compiles as C */
  u.b.x = 1;
  u.b.y = 2;
  u.b.z = 3;
  u.b.c = 64;

  /* Is it legal and well-defined to read the union member that was not the one last written? */
  DoSomething(u.a.x, u.a.y, u.a.z);

  return 0;
}


UPDATE

I simplified the example and wrote two different applications: one based on memcpy and the other using a union.


Union:

struct A {
  int x;
  int y;
  int z;
};

struct B {
  int x;
  int y;
  int z;
  long c;
};

union U {
  struct A a;
  struct B b;
};

int main(int argc, char* argv[]) {
  union U u;
  u.b.x = 1;
  u.b.y = 2;
  u.b.z = 3;
  u.b.c = 64;
  const struct A* a = &u.a;
  return 0;
}


memcpy:

#include <string.h>

struct A {
  int x;
  int y;
  int z;
};

struct B {
  int x;
  int y;
  int z;
  long c;
};

int main(int argc, char* argv[]) {
  struct B b;
  b.x = 1;
  b.y = 2;
  b.z = 3;
  b.c = 64;
  struct A a;
  memcpy(&a, &b, sizeof(a));
  return 0;
}



Profiled Assembly [DEBUG] (Xcode 6.4, default C++ compiler):

Here is the relevant difference in the assembly for debug mode. When I profiled the release builds there was no difference in the assembly.


Union:

movq     %rcx, -48(%rbp)


memcpy:

movq    -40(%rbp), %rsi
movq    %rsi, -56(%rbp)
movl    -32(%rbp), %edi
movl    %edi, -48(%rbp)



Caveat:

The example code based on union produces a warning regarding variable 'a' being unused. As the profiled assembly is from debug, I don't know if there is any impact.

Manville answered 22/7, 2015 at 3:36 Comment(27)
See Unions and type-punningMcclary
@ShafikYaghmour I previously reviewed that question and it has lots of conflicting information. In particular, it seems to hint that my example is legal and well-defined. Can you clarify the rules for the specific example provided?Manville
I don't think that will even compile. struct C is an incomplete type. Which makes struct B an incomplete type. Which makes union U an incomplete type. There is no way (AFAIK) that you can expose U/B and hide C like that. You can only have pointers to incomplete types.Saturate
In my answer to that question, I cite both defect report 283 and a key discussion from std-discussion; both of those should be considered authoritative. The std-discussion thread makes it clear there is a lot of unspecified behavior, but what is clear is that the alternative, memcpy, is well defined and equally efficient.Mcclary
@AlanAu struct C is forward declared and done so specifically to indicate that its type should not matter for this question. It's not meant to be actual working code.Manville
@ShafikYaghmour I was just about to ask about memcpy. Given that, it seems there is little need for my use of union. Would you agree?Manville
As for C variants, memcpy should work well as mentioned. If you do it in C++ you could create appropriate constructors to copy the data and your assignment operator would be good as well.Andromede
@QuinnRoundy The code I use will need to work equally well whether compiled with a C or C++ compiler.Manville
@QuinnRoundy That includes Visual Studio 2010, whose C compiler is C89 with elements of C99.Manville
@ShafikYaghmour Given that the conversion is from a larger type to a smaller type with matching types to the x, y, and z of struct B, does the example still suffer from undefined/implementation defined issues?Manville
I guess you'll receive quite a lot of theoretical answers like "undefined behavior in C++", "valid in C99", etc. However, I'm pretty sure that your code will work correctly in practice; I cannot see any reason for a compiler to screw this up.Childish
You might want to see #12165355 and #21500466Childish
@Childish Is there any performance advantage to using a union in this manner compared to memcpy? With union, I should be able to return a (const struct A*)&u.a from a function and avoid the copy that would be imposed by memcpy. Correct?Manville
@Codorilla I think calling memcpy may be slower, unless the compiler manages to remove the call (which seems unlikely). Returning the other side of the union by pointer is absolutely free, and by value is likely to be free too. You might want to try and look at the generated assembly. Besides, are you really sure this performance difference matters?... BTW, you can use inheritance here in C++ =)Childish
@Childish I wonder if the guarantee regarding the initial member also holds for subsequent members whose types also match. Since it's not explicitly stated in the standard, that seems like implementation defined behavior at best.Manville
@Childish Inheritance is great, but the code needs to work for C and C++ compilers. Utilizing a C++ only feature in this context would have a ripple effect that complicates supporting both languages.Manville
A similar sub-question: Is it UB if you cast a pointer to B to pointer to A and then use it to access first three members of B? My guess is that the C standard is going in the direction where this is (or will be) not only working but also well defined.Triatomic
@Triatomic Casting to a different type breaks strict aliasing rules. You'd have to disable that compiler optimization.Manville
@AlanAu I've reworked the example to eliminate your concern. Having C not explicitly defined was supposed to simplify the example, but it was obviously more of a distraction.Manville
@stgatilov: I can imagine optimizations breaking this unless you declare the union of volatile struct... - changes to one structure cached in registers, not flushed to RAM, then the (old) RAM values read through the "other side".Joiner
@Joiner Based on how the standard defines "union", that isn't a concern. Also, declaring a type "volatile" does not make it thread safe or atomic.Manville
@Joiner Proper handling of the common initial sequence is guaranteed by the standard and any optimization that breaks that rule is a defect.Manville
@Joiner Conversion from uint64_t to uint8_t isn't going to "work" no matter what the compiler does and it's out of the scope of this question.Manville
@Joiner You've gone way off topic. This question isn't about anything you mentioned.Manville
@Joiner I was pointing out that your understanding of how the volatile modifier works is incorrect because I don't want other readers to be confused by your comments.Manville
@Joiner I'm simply saying that your suggestion to use "volatile" to ensure atomic behavior is incorrect. see: #6628096. That is a better place for future comments.Manville
@Codorilla: That's STILL about a multi-threaded system. A write to a variable in a single thread program is guaranteed to be atomic: int a = 5; a = 10; printf("%d",a) - you will never get anything other than "10" as a result because the system guarantees the write will end (for all practical purposes) before the next command. This won't work with writing to memory underlying the variable. const int a=5; int* b = (int*) &a; *b = 10; printf("%d",a);. You're most likely getting a "5". But make volatile const int a and the atomicity within a single thread is enforced.Joiner
S
11

This is fine, because the members you are accessing are elements of a common initial sequence.

C11 (6.5.2.3 Structure and union members; Semantics):

[...] if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

C++03 ([class.mem]/16):

If a POD-union contains two or more POD-structs that share a common initial sequence, and if the POD-union object currently contains one of these POD-structs, it is permitted to inspect the common initial part of any of them. Two POD-structs share a common initial sequence if corresponding members have layout-compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

Other versions of the two standards have similar language; since C++11 the terminology used is standard-layout rather than POD.
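
As a minimal sketch of what the quoted paragraphs allow, the snippet below stores a struct B in the union and then reads only x, y and z through member a. The struct and union definitions are taken from the question; the function name sum_common_part and the offsetof sanity checks are illustrative additions, not something either standard asks you to write. It is intended to compile as both C and C++.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct A { uint32_t x; uint32_t y; uint32_t z; };
struct B { uint32_t x; uint32_t y; uint32_t z; uint64_t c; };

/* The complete union type is declared before the code that relies on it. */
union U {
  struct A a;
  struct B b;
};

/* Stores a struct B in the union, then reads x, y and z through member a.
   sum_common_part is a hypothetical name used only for this sketch. */
uint32_t sum_common_part(void) {
  union U u;

  /* Sanity check: the shared leading members sit at the same offsets. */
  assert(offsetof(struct A, y) == offsetof(struct B, y));
  assert(offsetof(struct A, z) == offsetof(struct B, z));

  u.b.x = 1;
  u.b.y = 2;
  u.b.z = 3;
  u.b.c = 64;

  /* b is the member last written; only the common initial part is read via a. */
  return u.a.x + u.a.y + u.a.z;
}
```

Reading u.b.c through anything in a would fall outside the common initial sequence, so the sketch deliberately never touches it.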


I think the confusion may have arisen because C permits type-punning (aliasing a member of a different type) via a union where C++ does not; this is the main case where to ensure C/C++ compatibility you would have to use memcpy. But in your case the elements you are accessing have the same type and are preceded by members of compatible types, so the type-punning rule is not relevant.
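
For contrast, here is a small sketch of the case where memcpy is the portable choice: reinterpreting an object as a different type. The function name float_bits is hypothetical, and the code assumes float and uint32_t have the same size, which the assert checks at run time.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns the object representation of a float as a 32-bit integer.
   Assumption: float is exactly 32 bits wide on the target platform. */
uint32_t float_bits(float f) {
  uint32_t bits;
  assert(sizeof bits == sizeof f);
  /* Copying the bytes avoids reading a union member of a different type,
     so the same source is valid in both C and C++. */
  memcpy(&bits, &f, sizeof bits);
  return bits;
}
```

On a platform that uses IEEE 754 single precision, float_bits(1.0f) would be expected to yield 0x3F800000. None of this is needed for the question's structs, because there the members read have the same types as the members written.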

Sexivalent answered 22/7, 2015 at 8:29 Comment(41)
Can you provide links for your sources?Manville
@Codorilla see #82156Sexivalent
As a follow up, do you see any advantage of using a union in this case versus memcpy? When I profiled the code in release mode, the assembly was the same whether I used union or memcpy. But, this is a simple case. I wonder if the result would be the same in a real world use case.Manville
@Codorilla personally, I'd reserve memcpy for the case where you're converting between members of different types; I'd be surprised to see memcpy used in code where the union method is allowed. As you say an optimising compiler can "see through" the memcpy, but you aren't always running optimised code - it would get in the way when stepping through a debug build.Sexivalent
There's also the issue of the behavior of union being well defined while it's implementation defined whether the use of memcpy is as efficient. The profiled assembly in debug mode makes that clear. Using union in this case carries stronger performance guarantees across platforms.Manville
It's interesting - and perhaps notable, though I don't know whether I want it to be... - that while C has the clause about a requirement for some ill-defined 'visible declaration' of the union, C++ (including as of 14) does not make any such stipulation.Phlegm
@Phlegm I'd imagine it permits more aggressive optimization; C can assume that function arguments S* s and T* t do not alias even if they share a common initial sequence as long as no union { S; T; } is in view, while C++ can make that assumption only at link time. Might be worth asking a separate question about that difference.Sexivalent
Interesting idea! (Assuming the difference is intentional.) I think I will. I'm sure it's OK just to 'quote your quotes' and link back here?Phlegm
@Phlegm sure, no problemSexivalent
@ecatmur: I would agree that the clear intention of the Standard's mention of the complete union declaration is to permit programmers to write functions that can accept pointers to many structure types and operate upon members of their CIS, so long as they ensure that a union declaration is visible to let compilers know about the possible aliasing; in practice, gcc assumes that no aliasing will occur between pointers of different structure types, even with members of the CIS of structures that appear within the same complete union declaration, making such code impossible...Nickienicklaus
...except via -fno-strict-aliasing or compiler-specific directives.Nickienicklaus
@Nickienicklaus well, quite; the fact that compilers don't respect the visible-declaration rule puts a bit of a damper on things. underscore_d's excellent Q-and-A https://mcmap.net/q/246365/-union-39-punning-39-structs-w-quot-common-initial-sequence-quot-why-does-c-99-but-not-c-stipulate-a-39-visible-declaration-of-the-union-type-39 says I think all that needs to be said (for now) about the visible-declaration rule.Sexivalent
@ecatmur: That answer misses a major point: in the absence of the aliasing rules, the CIS rule would imply that a pointer to one structure type could be used to access members of another structure type's CIS; the usefulness of the CIS rule comes primarily from that. The language gcc processes is thus a semantically-gutted version of the language Dennis Ritchie invented.Nickienicklaus
@Nickienicklaus I don't think I agree; if Ritchie had meant types to be ignorable he'd have stuck with B. We can still write plenty of useful and interesting programs staying within and respecting the type system, even with strict aliasing rules.Sexivalent
@ecatmur: The CIS rule doesn't "ignore" types. It makes it possible to perform some tasks more cleanly than would be possible in its absence. If one wants to perform a task which would greatly benefit from the CIS rule, should one write nasty-looking code which computes field offsets manually rather than using structure types, decide to forego the desired task in favor of some other "useful and interesting" program, or find a compiler that supports the CIS rule?Nickienicklaus
@Nickienicklaus or alternatively, store and pass pointers to the complete union object rather than to its subobjects.Sexivalent
@ecatmur: That only works if all structure types have the same alignment requirements. Changing a function that should take a pointer to the type with the least-restrictive alignment so it uses the "union" type would make it unusable with any instances that don't satisfy the coarsest alignment type within the union.Nickienicklaus
@ecatmur: The CIS guarantees predate aliasing rules, and their usefulness stemmed from what they implied about pointers to structure types that share a CIS. Since the only plausible way a function that received pointers to two structures that shared a CIS would be able to honor the CIS rule if the structures happened to be part of a common union would be to honor the CIS rule for structure pointers, there was no perceived need to redundantly state the rule as applying to pointers.Nickienicklaus
@ecatmur: A principle of concise spec writing is that if there would be no plausible reason why anyone would expect that something might conceivably do X, there's no need to waste ink promising that it won't do X. Further, I don't think I've seen any evidence that the authors of C89 intended that a decision not to mandate that a feature or guarantee be supported on even those platforms where it would be most expensive and least useful, should be interpreted as an invitation to drop support on platforms where it had shown itself to be practical and useful.Nickienicklaus
@Nickienicklaus re. alignment: if you pass the union pointer, then every instance of the union satisfies its alignment. re. CIS/aliasing priority: if you say so; I don't have a copy of 1st edition K&R to hand to check, and in any case that ship sailed nearly 30 years ago. re. UB exploitation: compiler writing is a competitive business so it could have been expected; I don't think the ANSI committee were naifs. You always have the option of dialing down optimization to get a compiler that conforms more closely to your own mental model.Sexivalent
@ecatmur: C existed as a language prior to C89. To ensure acceptance, it was necessary to write the Standard so that existing implementations that applied aliasing optimizations could continue to do so. The idea that future optimizations should try to exploit allowances for antiquated implementations, rather than using new directives, is one of the worst ideas in programming-language history. C became popular because it let programmers targeting a particular family of targets exploit features common to that family. How much of a performance hit should programmers be willing to tolerate...Nickienicklaus
...in the name of "optimization"? I don't think the C89 authors were naifs or dishonest--unlike those who trash what should be a useful language for systems programming.Nickienicklaus
@Nickienicklaus really, compiler vendors are dishonest because their products aren't suitable for your style of systems programming? I think we've got far enough off topic here.Sexivalent
@ecatmur: Compiler writers who claim to be C89 compatible while interpreting the rules in a fashion that would have been universally denounced as broken in 1989 are being dishonest. Dennis Ritchie designed a language that allows user code to implement an efficient malloc/free with two prerequisites: (1) it must know what the coarsest alignment is; (2) it needs a pointer to one or more blocks of memory that satisfy that alignment, and are large enough to satisfy malloc() requests. His book doesn't mention a third requirement: It must have a means of erasing the effective type...Nickienicklaus
...from a block of memory that is faster than physically erasing it. Do you think the authors of C89 intended that malloc/free should be possible only on systems meeting the third prerequisite (which no compilers met), or do you think they intended the Standard to be interpreted in a fashion that would make Dennis Ritchie's malloc implementation work correctly with nothing more than his stated prerequisites?Nickienicklaus
@Nickienicklaus do any compilers actually prevent using a static array of character type as backing storage for user-supplied allocation functions? I hadn't heard of anything like that occurring.Sexivalent
@ecatmur: The code struct foo {unsigned char x;} *p,s; p=malloc(sizeof *p); s=*p; is well-defined (assuming malloc returns non-null) no matter how the returned storage was previously used, since structure types have no trap representations. The only way for a malloc/free pair to make such code well-defined, however, is to ensure that all storage returned by "malloc" gets physically erased before making it available to user code.Nickienicklaus
@ecatmur: It's not likely that compilers would find an "optimization" that would break such code, but the maintainers of gcc take the viewpoint that programmers who rely upon compilers not finding optimizations deserve what they get if future compilers do find them. Actually, an even worse issue is that in C99, even on systems where long and long long have matching representations, using memcpy to copy from a long to allocated storage and later reading it with a long long* was clearly allowable under C89 but is just as clearly UB under C99.Nickienicklaus
@ecatmur: In any case, returning to the original subject, the C99 rule regarding union-type visibility had a clear purpose which the authors of gcc have chosen to ignore. There is a simple remedy for all of the aliasing-related problems: define directives to select aliasing modes. That would make it possible for compilers to support optimizations beyond those presently allowed (e.g. disallow all aliasing, including char*, except in places where code uses specific directives to indicate it) while supporting concepts which are needed in production code but have been...Nickienicklaus
...neglected by the Standard for years. Unless the authors of the Standard are willing to recognize that different programs have different needs, however, it will be impossible to write code with any confidence that future versions of the Standard won't break it.Nickienicklaus
@Nickienicklaus so, standardizing -fno-strict-aliasing? Makes sense to me. However, the fact that things have come to this pass indicates one of two things: either the Standard committees are unrepresentative of the language's users, or they are representative and your use cases are too much of a minority niche to warrant representation.Sexivalent
@ecatmur: From what I've seen, -fno-strict-aliasing is pretty standard in project build files, but there should be no need to disable the type based optimizations that are easiest to find and--in most cases--offer the most benefit. Requiring that the easy and harmless optimizations be disabled to ensure correct operation of code where aliasing should be considered obvious (e.g. a function that receives a uint16_t* and casts it to uint64_t* to operate on groups of four values should be expected to modify things of type uint16_t*) is not a recipe for good performance.Nickienicklaus
@ecatmur: As for the Committee, I think the real problem is a lack of clarity as to what it is supposed to be defining--is it supposed to be a complete spec for a reduced language that can write efficient code on commonplace platforms, or for a language which can be used on more obscure platforms but whose efficiency is impaired by the inability to use commonplace features? Either could be useful, but what's happened is that the Committee defined the latter but compiler writers interpreted it as the former.Nickienicklaus
@ecatmur: Further, I would suggest that a proper standard should define useful classes of things such that any combination of things from the proper classes may be used together (e.g. for any combination of a machine screw from the set of all standard #4-40 machine screws, and a nut from the set of all standard #4-40 nuts, the threads on the nut and screw will match). Defining categories of compliance such that any collection of inputs that was at least selectively-conforming to any compiler that was at least a minimal-conforming implementation would be required to either...Nickienicklaus
...process the inputs successfully without UB, in a fashion consistent with the Standard, or else refuse to do so in implementation-defined fashion (also without UB), would turn the C specifications into a real standard. Incidentally, I wonder if there's any way to do a survey of what fraction of gcc projects don't use -fno-strict-aliasing?Nickienicklaus
@ecatmur: On the other hand, some committee members do seem to be out to lunch. In open-std.org/jtc1/sc22/wg14/www/docs/dr_236.htm is there any basis in the Standard for regarding Example 2 as UB without also regarding union { int i; float f;} u; scanf("%d", &u.i); printf("%d\n", u.i); scanf("%f\n", &u.f); printf("%f", u.f); as UB, since the scanf uses u.f while the current member of the union is u.i? Is the latter interpretation even remotely reasonable?Nickienicklaus
@Nickienicklaus wow, interesting. I think the committee position would have to be that your example is UB and you would have to write scanf("%d", &(u.i = {})); or { int i; scanf("%d", &i); u.i = i; } and similarly when changing the current member to u.f. I agree that there's probably a lot of idiomatic code made UB by that rule.Sexivalent
@ecatmur: When the C Standard was written, some compilers existed that applied various aliasing optimizations. The authors of the Standard didn't want to tell users of those implementations that they couldn't migrate to C89 without making their existing code run slower on their existing platforms. I see no evidence that they ever intended that the rules should be interpreted in such a fashion as to force programmers to write less efficient code. The language hyper-modernists are pushing may be useful for some purposes, but people should not pretend it is the language which became popular...Nickienicklaus
...in the 1990s, nor that it is a low-level language. A proper low-level language allows programmers to specify the machine operations they want, without having to request a bunch of operations they don't want and hope an optimizer filters out the redundancy. I would have no argument that achieving top efficiency on newer hardware requires that compilers be able to reorder things more aggressively than was necessary in years past, but the language was never designed for that; code would be faster and more reliable if compilers focused on adding directives to tell the compiler...Nickienicklaus
...that things won't alias, rather than trying to apply broadly rules which were intended for much more limited purposes (legitimizing the behavior of existing compilers).Nickienicklaus
@ecatmur: The dominant optimization the C89 rules were trying to enable was the addition of register caching to in-order execution. Having casts from T* to U* triggered a flush of any register-cached U* values at the point of the cast, except when incompatible arrangements of padding bits would make reinterpretation useless, would have been sufficient to allow most existing code that uses aliasing to continue working as-is, and allowed the majority of code that wouldn't work as-is to be readily adapted by adding some typecasts, without precluding many useful optimizations.Nickienicklaus
M
5

It is legal in both C and C++.

For example, in C99 (6.5.2.3/5) and C11 (6.5.2.3/6):

One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the complete type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

Similar provisions exist in C++11 and C++14 (different wording, same meaning).
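
One conservative way to stay inside the quoted wording is to pass a pointer to the whole union around, rather than a pointer to one of its members, so that every access is a union member access made where the complete union type is visible. The sketch below assumes the struct and union definitions from the question; wrap_b and common_sum are hypothetical helper names, and the code is meant to compile as both C and C++.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct A { uint32_t x; uint32_t y; uint32_t z; };
struct B { uint32_t x; uint32_t y; uint32_t z; uint64_t c; };

union U {
  struct A a;
  struct B b;
};

/* Wraps an internal struct B in the union (hypothetical helper). */
union U wrap_b(struct B b) {
  union U u;
  u.b = b;
  return u;
}

/* Reads only the common initial part, through the union object itself. */
uint32_t common_sum(const union U *u) {
  assert(u != NULL);
  return u->a.x + u->a.y + u->a.z;
}
```

A caller would fill in a struct B, call wrap_b on it, and hand the address of the resulting union to common_sum. The cost is that the public API then traffics in union U rather than struct A, which may or may not be acceptable for a given interface.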

Malapropos answered 22/7, 2015 at 8:31 Comment(2)
- albeit that C++ - perhaps significantly - does not make the stipulation that "a declaration of the complete type of the union [must be] visible"Phlegm
@underscore_d: And gcc doesn't honor that guarantee with pointers to the unions' member types even when the complete union declaration is visible.Nickienicklaus