Reinterpret struct with members of the same type as an array in a standard compliant way [duplicate]

5

9

In various 3d math codebases I sometimes encounter something like this:

struct vec {
    float x, y, z;

    float& operator[](std::size_t i)
    {
        assert(i < 3);
        return (&x)[i];
    }
};

AFAIK this is illegal, because implementations are allowed to add padding between members even when they are of the same type, though none will do so in practice.

Can this be made legal by imposing constraints via static_asserts?

static_assert(sizeof(vec) == sizeof(float) * 3);

I.e. does the static_assert not being triggered imply that operator[] does what is expected and doesn't invoke UB at runtime?

Methane answered 1/1, 2017 at 21:24 Comment(4)
Personally, I'd have a float [3] member, possibly with accessors like x(), y() and z()Crystlecs
I'd make something like tuple does: a get<i> function that select at compile time the member.Dziggetai
@skypjack tuple being introduced only recently does not prevent you from writing a get<i>(vec) function.Dziggetai
@GuillaumeRacicot Sorry, misunderstanding. I thought you were suggesting to use a tuple. My fault.Pearlinepearlman
C
6

No, it is not legal because when adding an integer to a pointer, the following applies ([expr.add]/5):

If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

y occupies the memory location one past the end of x (considered as an array with one element) so adding 1 to &x is defined, but adding 2 to &x is undefined.
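
As a rough illustration of the rule quoted above, the sketch below marks which pointer expressions on the question's vec are defined. The helper name one_past_end_of_x is made up for this example and is not part of the original code.

```cpp
struct vec {
    float x, y, z;
};

// Hypothetical helper: forms the one-past-the-end pointer of x.
// For pointer arithmetic, a non-array object behaves like an array of one element.
const float* one_past_end_of_x(const vec& v)
{
    return &v.x + 1;   // defined: one past the end of x
    // &v.x + 2 would be undefined: it goes beyond the one-past-the-end position
}
```

The returned pointer may be compared with &v.x or decremented back to it, but it should not be advanced further.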

Candlestick answered 1/1, 2017 at 21:32 Comment(9)
But what if it so happens that (&x + 1 == &y) && (&y + 1 == &z)? Is it really possible that &x + 2 != &z or that such addition overflows?Methane
@yurikilochek: You asked what the standard says happens. That's what it says happens.Trembly
@NicolBolas so &x + 1 + 1 == &z would be ok then?Methane
@yurikilochek: No, it would not. You cannot advance a one-past-the-end pointer. The quote above makes that quite clear. "the behavior is undefined".Trembly
@NicolBolas but &x + 1 is the same pointer as &y, which I can advance.Methane
@yurikilochek: You can advance &y because the standard says so. You cannot advance &x + 1 because the standard says so. The fact that &y may indeed have the same value as &x + 1 is irrelevant to what the standard says you can do with those expressions.Trembly
Let us continue this discussion in chat.Methane
@yurikilochek: Pointers aren't numbers, and they aren't addresses. They are pointers. They point to things. A pointer to one thing cannot be safely "hacked" to now point to something else, even if you think the "numbers" line up.Assignable
@yurikilochek: &x+1 is defined by the standard as one past the end of the array. That means that you can neither dereference it nor increase it again. The only things you can do with it are to compare it with an address inside the array, or decrement it provided the result points inside the array (or is still one past the end). Every other action explicitly invokes undefined behaviour.Scorn
A
2

You can never be sure that this will work

There is no guarantee of contiguity of subsequent members, even if this will frequently work perfectly in practice thanks to usual float alignment properties and permissive pointer arithmetic.

This is laid down in the following clause of the C++ standard:

[class.mem]/18: Non-static data-members (...) with the same access control are allocated so that later members have higher addresses within the class object. Implementation alignment requirements might cause two adjacent members not to be allocated after each other.

There is no way to make this legal using static_assert or alignas constraints. All you can do is prevent compilation when the elements are not contiguous, using the property that the address of each object is unique:

    static_assert (&y==&x+1 && &z==&y+1, "PADDING in vector"); 
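
The assertion above takes the addresses of non-static members, which is unlikely to compile as written inside the class. A minimal sketch of the same compile-time check, assuming the standard offsetof macro from <cstddef> is acceptable here (vec is a standard-layout type, so offsetof is supported for it):

```cpp
#include <cstddef>   // offsetof

struct vec {
    float x, y, z;
};

// Refuse to compile if the implementation inserts padding anywhere in vec.
static_assert(offsetof(vec, x) == 0, "unexpected offset of x");
static_assert(offsetof(vec, y) == sizeof(float), "padding between x and y");
static_assert(offsetof(vec, z) == 2 * sizeof(float), "padding between y and z");
static_assert(sizeof(vec) == 3 * sizeof(float), "trailing padding in vec");
```

These checks only describe the layout; they do not make (&x)[i] well-defined.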

But you can reimplement the operator to make it standard compliant

A safe alternative would be to reimplement operator[] so that it no longer requires the three members to be contiguous:

struct vec {
    float x,y,z; 

    float& operator[](std::size_t i)
    {
        assert(i<3); 
        if (i==0)     // optimizing compiler will make this as efficient as your original code
            return x; 
        else if (i==1) 
            return y; 
        else return z;
    }
};

Note that an optimizing compiler will generate very similar code for both the reimplementation and your original version (see an example here). So prefer the compliant version.
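
Another way to write the same compliant operator, shown here only as a sketch, is a small table of pointers to members. This is a different technique from the if/else chain above; the table name members is an assumption of this example.

```cpp
#include <cassert>
#include <cstddef>

struct vec {
    float x, y, z;

    float& operator[](std::size_t i)
    {
        // Table of pointers to members: no contiguity assumption is needed.
        static constexpr float vec::* members[] = { &vec::x, &vec::y, &vec::z };
        assert(i < 3);
        return this->*members[i];
    }
};
```

Usage stays the same as before: v[1] = 5.0f; writes to v.y without any arithmetic on &x.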

Afar answered 1/1, 2017 at 21:40 Comment(2)
Re There is no guarantee of contiguity of subsequent members: Actually there is, because vec is a standard-layout type. The pointer arithmetic is still illegal though.Intolerant
@Intolerant Even if it may work with all mainstream compilers, I think that [class.mem]/18 says that there is no such guarantee: “non-static data-members (...) with the same access control are allocated so that later members have higher addresses within the class object. Implementation alignment requirements might cause two adjacent members not to be allocated after each other”. Do you have any quote in the standard that would support your statement ? Standard layout only gives assurance that all data of object o is comprised in the sizeof(o) bytes following its start address, nothing about gaps.Afar
D
2

Type aliasing (use of more than one type for essentially the same data) is a huge problem in C++. If you keep member functions out of structs and maintain them as PODs, things ought to work. But

  static_assert(sizeof(vec) == sizeof(float) * 3);

can't make accessing one type as another technically legal. In practice there will of course be no padding, but C++ isn't clever enough to realise that a vec is an array of floats, or that an array of vecs is an array of floats whose length is constrained to be a multiple of three. Casting &vecasarray[0] to a vec * is legal, but casting &vecasarray[1] is illegal.

Deutoplasm answered 1/1, 2017 at 22:24 Comment(0)
S
2

According to the standard, it is clearly Undefined Behaviour, because you either do pointer arithmetic outside of an array or alias the content of a struct and an array.

The problem is that 3D math code can be used intensively, so low-level optimization makes sense. The C++ conformant way would be to store the array directly, and use accessors or references to individual members of the array. Neither of those two options is entirely satisfactory:

  • accessors:

    struct vec {
    private:
        float arr[3];
    public:
        float& operator[](std::size_t i)
        {
            assert(i < 3);
            return arr[i];
        }
        float& x() & { return arr[0];}
        float& y() & { return arr[1];}
        float& z() & { return arr[2];}
    };
    

    The problem is that using a function as an lvalue is not natural for old C programmers: v.x() = 1.0; is indeed correct, but I'd rather avoid a library that would force me to write that. Of course we could use setters, but if possible I prefer to write v.x = 1.0; rather than v.setx(1.0);, because of the common idiom v.x = v.z = 1.0; v.y = 2.0;. It is only my opinion, but I find it neater than v.x() = v.z() = 1.0; v.y() = 2.0; or v.setx(v.setz(1.0)); v.sety(2.0);.

  • references

    struct vec {
    private:
        float arr[3];
    public:
        float& operator[](std::size_t i)
        {
            assert(i < 3);
            return arr[i];
        }
        float& x;
        float& y;
        float& z;
        vec(): x(arr[0]), y(arr[1]), z(arr[2]) {}
    };
    

    Nice! We can write v.x and v[0], both representing the same memory... Unfortunately, compilers are still not smart enough to see that the references are just aliases for an in-struct array, so each reference typically occupies the space of a pointer and the struct ends up at least twice the size of the array!
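
The size overhead of the reference variant can be checked with a short sketch. The type names vec_array and vec_refs and the helper refs_cost_storage are made up for this comparison, and the result depends on the implementation, since the standard does not say how references are stored.

```cpp
#include <cassert>
#include <cstddef>

// Plain array storage, as in the accessor variant.
struct vec_array {
    float arr[3];
};

// Array plus reference members, as in the reference variant.
struct vec_refs {
private:
    float arr[3];
public:
    float& x;
    float& y;
    float& z;
    vec_refs() : arr{}, x(arr[0]), y(arr[1]), z(arr[2]) {}
    float& operator[](std::size_t i) { assert(i < 3); return arr[i]; }
};

// Hypothetical helper: true when the reference members cost extra storage.
bool refs_cost_storage() { return sizeof(vec_refs) > sizeof(vec_array); }
```

Reference members have a further cost: the implicit copy assignment operator is deleted, and the implicit copy constructor binds the new object's references to the source object's array, so a real implementation would need user-written copy operations.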

For those reasons, the incorrect aliasing is still commonly used...

Scorn answered 2/1, 2017 at 12:41 Comment(1)
Regarding the accessors, they should be float& y() { return arr[1];} and float& z() { return arr[2];}, right?Pinworm
A
-3

How about storing the data member as array and access them by names?

struct vec {
    float p[3];

    float& x() { return p[0]; }
    float& y() { return p[1]; }
    float& z() { return p[2]; }

    float& operator[](std::size_t i)
    {
        assert(i < 3);
        return p[i];
    }
};

EDIT: For the original approach, if x, y and z are the only member variables, the struct will in practice always be the size of 3 floats, so static_assert can be used to check that operator[] stays within the bounds of the object.

See also: C++ struct member memory allocation

EDIT 2: As Brian said in another answer, (&x)[i] itself is undefined behavior according to the standard. However, given that the 3 floats are the only data members, the code should work in practice in this context.

To be pedantic on syntax correctness:

struct vec {
  float x, y, z;
  float* const p = &x;

  float& operator[](std::size_t i) {
    assert(i < 3);
    return p[i];
  }
};

Although this will increase each vec by the size of a pointer.

Aluminum answered 1/1, 2017 at 21:29 Comment(3)
This doesn't answer the question - Can this be made legal by imposing constraints via static_asserts?Pearlinepearlman
Thanks for letting me know. Edited my answer.Aluminum
Although it does not answer the question, it is still probably the best way to do it in term of defined behavior with good performance.Utimer
