Active member of a union after assignment

Suppose sizeof( int ) == sizeof( float ), and I have the following code snippet:

union U{
    int i;
    float f;
};

U u1, u2;
u1.i = 1;    //i is the active member of u1
u2.f = 1.0f; //f is the active member of u2

u1 = u2;

My questions:

  1. Does it have defined behaviour? If not, why?
  2. What is the active member of u1 after the assignment, and why?
  3. Which member of u1 can be read after the assignment without causing UB, and why?
Majestic answered 31/8, 2022 at 11:39 Comment(3)
The float part, certainly. When you perform the assignment, the active member is copied along with the rest. But I'll let someone versed in the standard answer. - Schneider
I believe this should not be UB, even when the sizes are not equal. Assigning u1 = u2; should give u1 the same active member as u2. - Strophanthus
The title sounds like a Workplace SE question! - Gerson
  1. Does it have defined behaviour? If not, why?

It has defined behaviour. The assignment copies the value of u2. As I see it, the value of a union consists of a designation of the active member, plus the value of that member if there is one. The designation has no representation of its own, so it cannot be examined, but it determines what is UB and what is not.

  2. What is the active member of u1 after the assignment, and why?

f, see above.
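
A minimal sketch of the case from the question, wrapped in a function so it can be compiled on its own. The function name `read_after_assignment` is made up for illustration, and the sketch assumes a C++17 or later compiler.

```cpp
union U {
    int i;
    float f;
};

// Assigns a union whose active member is f to one whose active member is i,
// then reads the member that is active after the copy.
float read_after_assignment() {
    U u1, u2;
    u1.i = 1;     // i is the active member of u1
    u2.f = 1.0f;  // f is the active member of u2

    u1 = u2;      // after the copy, f is the active member of u1

    // Reading u1.i at this point would be undefined behaviour in ISO C++.
    return u1.f;
}
```

Calling `read_after_assignment()` from `main` returns the float that was stored in u2.f.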

  3. Which member of u1 can be read after the assignment without causing UB, and why?

f. In general, only the active member of a union can be read without UB in C++. There is a special rule for a union of structs that share a common initial sequence: that sequence may be read through any of them. Note: C is more relaxed and makes some cases that are undefined in C++ implementation-defined (and perhaps fully defined). I may also have missed changes that bring C++ closer to C here.
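
A hedged sketch of the two points above. The types `A`, `B`, `V` and both function names are hypothetical. The first function reads through the common initial sequence of two standard-layout structs. The second is not from the answer: it shows `std::memcpy`, a commonly used way to inspect a float's bytes without reading an inactive union member, and it assumes a 32-bit `float`.

```cpp
#include <cstdint>
#include <cstring>

struct A { int tag; float x; };
struct B { int tag; char c; };

union V {
    A a;
    B b;
};

// A and B are standard-layout and both start with an int, so `tag` is part of
// their common initial sequence and may be read through either member.
int tag_via_inactive_member() {
    V v{A{7, 2.5f}};  // aggregate initialization makes a the active member
    return v.b.tag;   // permitted by the common-initial-sequence rule
}

// Copies the object representation of a float into an integer of the same
// size instead of reading an inactive union member.
std::uint32_t float_bits(float value) {
    static_assert(sizeof(std::uint32_t) == sizeof(float),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Reading `v.b.c` in the first function would not be covered by the rule, because `c` lies outside the common initial sequence.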


If someone wants to look this up in the standard, I suggest starting with [class.copy.assign]/13.

Provoke answered 31/8, 2022 at 12:15 Comment(6)
From your first point, u1 = u2 is the same as assigning a float value to an int type. Is it UB in all cases (irrespective of the value of u2.f)? - Tarragon
My understanding (I have no time to back this up with standard citations; I'm rusty at that game and would have to rediscover things I used to know) is that u1 = u2 ends the lifetime of u1.i, an object of type int, starts the lifetime of u1.f, an object of type float at the same address, and assigns u2.f to that object. As long as u2.f is valid, the assignment is valid. - Provoke
Note that many C++ implementations do define the behaviour of reading an inactive union member. For example, all compilers that implement GNU extensions to C and C++, such as GCC, clang, and ICC, define it like C99 does. ISO C++ leaving a behaviour undefined doesn't prevent implementations from defining it. The way GCC describes it, with a citation of the C89/C90 standard, does seem to indicate that C89 required implementations to pick some behaviour, but that's ancient; modern C defines it. - Theodore
("Implementation-defined behaviour" has a specific meaning, i.e. behaviour that the ISO standard requires an implementation to define somehow. I hadn't remembered C89 making union type-punning implementation-defined, and that's a pretty obscure point, so I just wanted to make that distinction for future readers. C99 and later fully define the behaviour. Many but not all modern C++ compilers also define the behaviour, but ISO C++ doesn't. So +1) - Theodore
@PeterCordes: Looking at C89 Defect Report #028, I don't think there's any clarity as to what "implementation-defined" was supposed to mean. In considering something like: int test(void) { union U { float f; int i; } u; float *fp = &u.f; *fp = 1.0; int *ip = &u.i; return *ip; }, Defect Report #028 says that because the behavior of writing one member and reading another is implementation-defined, such an action is not permitted, and therefore performing it via pointers would yield Undefined Behavior. Such a statement would only make sense if IDB and UB were near synonyms. - Deadbeat
@PeterCordes: I think the actual example given in DR#028 should be UB, but the reason given in #028 is completely bogus. If one were to read the aliasing rule as requiring that an lvalue used for access have a fresh visible association with one of the indicated types, the example given in DR#028 would be UB, but most situations clang and gcc can't handle without -fno-strict-aliasing involve lvalues that are freshly associated with suitable types in ways that would be visible to any compiler that bothered to look. - Deadbeat
