C++ Undefined behaviour with unions
c++

3

8

I was just reading about anonymous structs, how they aren't standard, and how a common use case for them is undefined behaviour...

This is the basic case:

struct Point {
    union {
       struct {
           float x, y;
       };
       float v[2];
    };
};

So writing to x and then reading from v[0] would be undefined, in that you would expect them to be the same but they may not be.

Not sure if this is in the standard, but what about unions whose members have the same type...

union{ float a; float b; };

Is it undefined to write to a and then read from b?

That is to say, does the standard say anything about the binary representation of arrays and sequential variables of the same type?

Erastian answered 24/6, 2013 at 10:31 Comment(4)
You can just highlight the code snippet and hit Ctrl+K. Only use backticks for short one line code.Mariande
Unnamed structs in unions are not part of ISO-C++ (although they are supported by many compilers as an extension).Anticipation
@Anticipation Good point, but... Give the struct a name, and his question is still just as valid.Silber
@JamesKanze Of course, I did not mean to criticize the question, I just wanted to point it out. Though now that I think of it, the particular construct used by the OP becomes a lot less useful when forced to use named structs instead.Anticipation
7

The standard says that reading from any element in a union other than the last one written is undefined behavior. In theory, the compiler could generate code which somehow kept track of the reads and writes, and triggered a signal if you violated the rule (even if the two are the same type). A compiler could also use the fact for some sort of optimization: if you write to a (or x), it can assume that you do not read b (or v[0]) when optimizing.

In practice, every compiler I know supports this if the union is clearly visible, and there are cases in many (most? all?) compilers where even legal use will fail if the union is not visible (e.g.:

union  U { int i; float f; };

int f( int* pi, int* pf ) { int r = *pi; *pf = 3.14159; return r; }

//  ...
U u;
u.i = 1;
std::cout << f( &u.i, &u.f );

I've actually seen this fail with g++, although according to the standard, it is perfectly legal.)
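As written, the second parameter of f() is int*, which does not match the &u.f passed at the call site. Assuming float* was intended (an assumption, the answer does not confirm it), the function would look roughly like the sketch below. The name f_assumed is made up for illustration.

```cpp
// Sketch only: f() from above with the second parameter assumed to be float*.
// The name f_assumed is hypothetical and not part of the original answer.
int f_assumed(int* pi, float* pf) {
    int r = *pi;      // load through the int pointer
    *pf = 3.14159f;   // store through the float pointer
    return r;         // value that was read before the store
}
```

Called with two separate objects this is unremarkable. The interesting case is f_assumed(&u.i, &u.f), where both pointers refer to the same storage: since an int* and a float* are not normally expected to point at the same object, an optimizer may treat the load and the store as independent, which is one plausible way for such a call to misbehave.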

Also, even if the compiler supports writing to Point::x and reading from Point::v[0], there's no guarantee that Point::y and Point::v[1] even have the same physical address.
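If you rely on the compiler's behavior anyway, the layout assumption can at least be checked at compile time. The sketch below uses a named struct so that it stays within standard C++. The names XY and PointStorage are made up for illustration.

```cpp
#include <cstddef>  // offsetof

// Hypothetical named stand-in for the anonymous struct in the question.
struct XY { float x, y; };

// Fails to compile if the implementation pads XY or places y
// somewhere other than where v[1] would sit.
static_assert(sizeof(XY) == 2 * sizeof(float), "padding inside XY");
static_assert(offsetof(XY, y) == sizeof(float), "y is not where v[1] would be");

// Hypothetical union holding both views.
union PointStorage {
    XY xy;
    float v[2];
};
static_assert(sizeof(PointStorage) == sizeof(float[2]), "unexpected union size");
```

Passing these checks only shows that the addresses line up on this particular implementation. It does not make reading the inactive member defined.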

Silber answered 24/6, 2013 at 10:41 Comment(4)
Thanks for the reply! Your last point about y and v[1] having no guarantee reminded me of some OpenGL code, specifically passing vertex data to the GPU. Not sure if you are familiar with glVertexAttribPointer(), but I assume the code will look like this: struct Vertex { float x, y, z; float u, v; } vertices[10]; glVertexAttribPointer( ..., 3, GL_FLOAT, ... , &verticies.x); I assume OpenGL would treat it as an array, but if there is no guarantee that the layout is the same, would this code be considered undefined?Erastian
@skin If they actually do treat &verticies.x as a pointer to the first element of an array, the behavior is undefined (and there are, or at least have been compilers which would crash on execution in such cases).Silber
(I hope this answer isn't too old for question!) Did you intend for the second argument in f() to be float* instead of int*? And how did it fail with g++?Quartile
Why is that legal? You are passing a float * to int *, which shouldn't compile. If you correct it to float *, wouldn't it be disallowed by strict pointer aliasing?Pecuniary
0

The standard requires that in a union "[e]ach data member is allocated as if it were the sole member of a struct." (9.5)

It also requires that struct { float x, y; } and float v[2] must have the same internal representation (9.2), and thus you could safely reinterpret cast one as the other.

Taken together these two rules guarantee that the union you describe will function provided that it is genuinely written to memory. However, because the standard only requires that the last data member written be valid it's theoretically possible to have an implementation that fails if the union is only used as a local variable. I'd be amazed if that ever actually happens, however.
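If the goal is just to get the array form from the struct form, one option that does not depend on which union member was written last is to copy the bytes. This is only a sketch: XY and to_array are made-up names, and the static_assert states the layout assumption instead of taking it for granted.

```cpp
#include <cstring>  // std::memcpy

// Hypothetical named stand-in for the anonymous struct in the question.
struct XY { float x, y; };

// Copies the two coordinates into an array via the object representation.
void to_array(const XY& p, float (&out)[2]) {
    // Assumption made explicit: XY holds exactly two floats with no padding.
    static_assert(sizeof(XY) == sizeof(float[2]), "sketch assumes no padding in XY");
    std::memcpy(out, &p, sizeof out);
}
```

Usage would be along the lines of: XY p{1.5f, -2.0f}; float v[2]; to_array(p, v);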

Refine answered 24/6, 2013 at 10:56 Comment(1)
...the only "guarantee" here is that there is no guarantee if you try to read a member other than the one last written. I don't think talk about requirements for internal representation enters into that. Sure, GCC and other major compilers will happily provide the 'expected' type-punning behaviour, but I wouldn't imply to anyone that they can rely on this.Congresswoman
-7

I did not get why you have used float v[2];

The simple union for a point structure can be defined as:

union {
    struct {
        float a;
        float b;
    };
} Point;

You can access the values in the union as:

Point.a = 10.5;
Point.b = 12.2; // example
Geralyngeraniaceous answered 24/6, 2013 at 10:54 Comment(2)
The whole point of the union is that you can access x as both Point.x and Point.v[0]. This is useful for example when dealing with different kinds of APIs that prefer either the array or the coordinate-based form.Anticipation
That's not what the question is about, and your union doesn't need to be a union at all; a struct would have done (but again, this is entirely irrelevant to the actual question anyway).Lodovico
