Strict aliasing rule

I'm reading the notes about reinterpret_cast and its aliasing rules (http://en.cppreference.com/w/cpp/language/reinterpret_cast).

I wrote this code:

struct A
{
  int t;
};

char *buf = new char[sizeof(A)];

A *ptr = reinterpret_cast<A*>(buf);
ptr->t = 1;

A *ptr2 = reinterpret_cast<A*>(buf);
cout << ptr2->t;

I think none of these rules apply here:

  • T2 is the (possibly cv-qualified) dynamic type of the object
  • T2 and T1 are both (possibly multi-level, possibly cv-qualified at each level) pointers to the same type T3 (since C++11)
  • T2 is an aggregate type or a union type which holds one of the aforementioned types as an element or non-static member (including, recursively, elements of subaggregates and non-static data members of the contained unions): this makes it safe to cast from the first member of a struct and from an element of a union to the struct/union that contains it.
  • T2 is the (possibly cv-qualified) signed or unsigned variant of the dynamic type of the object
  • T2 is a (possibly cv-qualified) base class of the dynamic type of the object
  • T2 is char or unsigned char

In my opinion this code is incorrect. Am I right?

On the other hand, what about the connect function (man 2 connect) and struct sockaddr?

   int connect(int sockfd, const struct sockaddr *addr,
               socklen_t addrlen);

E.g. we have a struct sockaddr_in and we have to cast a pointer to it to struct sockaddr*. The above rules don't apply here either, so is this cast incorrect?

Diamagnetism answered 24/7, 2015 at 16:5 Comment(8)
Make that char buf[sizeof(A)] and gcc will detect both violations at -Wstrict-aliasing=2 - Mathison
Isn't connect a C function? - Cutie
Yes, but I'm focusing on struct sockaddr, not on the function. - Diamagnetism
Type aliasing rules are about access to stored objects, not about casts. Casting some pointer type to another pointer type can't break strict aliasing rules; you need to dereference a pointer to break the rules. - Marna
Yes, I agree. But I'm dereferencing it and changing its object: ptr->t = 1; - Diamagnetism
@Adam, yes, your code breaks the rules. But you ask about the connect function, and specifically - "Above rules also doesn't apply, so is this cast incorrect?" - as I say, the aliasing rules are not about casts. - Marna
Ahhh, in that sense. OK, thank you. - Diamagnetism
Please note I have edited my answer to take into account alignment considerations (with which I wasn't overly familiar in 2015) - Eal

Yeah, it's invalid, but not because you're converting a char* to an A*: it's because you are not obtaining an A* that actually points to an A and, as you've identified, none of the type aliasing options fit.

You'd need something like this:

#include <new>
#include <iostream>

struct A
{
  int t;
};

char *buf = new char[sizeof(A)];

A* ptr = new (buf) A;
ptr->t = 1;

// Also valid, because it points to an actual constructed A!
A *ptr2 = reinterpret_cast<A*>(buf);
std::cout << ptr2->t;

Now type aliasing doesn't come into it at all (though keep reading because there's more to do!).

In reality, this is not enough. We must also consider alignment. Though the above code may appear to work, to be fully safe you need to placement-new into a properly aligned region of storage, rather than just an arbitrary block of chars.

The standard library (since C++11, though deprecated as of C++23) gives us std::aligned_storage to do this:

using Storage = std::aligned_storage<sizeof(A), alignof(A)>::type;
auto* buf = new Storage;

Or, if you don't need to dynamically allocate it, just:

Storage data;

Then, do your placement-new:

new (buf) A();
// or: new(&data) A();

And to use it:

auto ptr = reinterpret_cast<A*>(buf);
// or: auto ptr = reinterpret_cast<A*>(&data);

All in all, it looks like this:

#include <iostream>
#include <new>
#include <type_traits>

struct A
{
  int t;
};

int main()
{
    using Storage = std::aligned_storage<sizeof(A), alignof(A)>::type;

    auto* buf = new Storage;
    A* ptr = new(buf) A();

    ptr->t = 1;

    // Also valid, because it points to an actual constructed A!
    A* ptr2 = reinterpret_cast<A*>(buf);
    std::cout << ptr2->t << '\n';

    // Clean up: end the A's lifetime, then release the storage.
    ptr2->~A();
    delete buf;
}
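
If std::aligned_storage isn't available to you (or you'd rather avoid it now that C++23 deprecates it), a byte array declared with alignas should give the same guarantee. This is only a sketch, assuming the same struct A as above; write_and_read is a name made up for illustration:

```cpp
#include <cassert>
#include <cstdint>
#include <new>

struct A
{
  int t;
};

// Hypothetical helper (not from the original answer): constructs an A
// inside an aligned byte buffer, writes to it, reads it back, destroys it.
int write_and_read(int value)
{
    // Raw storage with the size and alignment of A.
    alignas(A) unsigned char data[sizeof(A)];

    // Sanity check: the buffer address is a multiple of alignof(A).
    assert(reinterpret_cast<std::uintptr_t>(data) % alignof(A) == 0);

    // Placement-new starts the lifetime of an A in that storage.
    A* ptr = new (data) A();
    ptr->t = value;

    int result = ptr->t;

    // A is trivially destructible, so this does nothing at runtime,
    // but it marks the end of the object's lifetime explicitly.
    ptr->~A();
    return result;
}
```

Calling write_and_read(1) goes through the same placement-new and access steps as the listing above, just with automatic storage instead of a heap allocation.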


Even then, since C++17 this is somewhat more complicated; see the relevant cppreference pages for more information and pay attention to std::launder.
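
For what it's worth, here is roughly where std::launder would go. Treat it as a sketch rather than the definitive answer: it assumes C++17, the same struct A, and an alignas byte buffer standing in for the storage; read_through_storage is a made-up name. The pointer returned by placement-new can be used directly; std::launder comes in when you re-derive a pointer from the storage address afterwards.

```cpp
#include <cassert>
#include <new>   // placement new and std::launder (C++17)

struct A
{
  int t;
};

// Hypothetical helper: writes through the placement-new pointer,
// then reads through a pointer re-derived from the raw storage.
int read_through_storage(int value)
{
    alignas(A) unsigned char storage[sizeof(A)];

    // This pointer refers to the new A object and needs no laundering.
    A* ptr = new (storage) A();
    ptr->t = value;

    // Going back via the storage address: std::launder yields a pointer
    // to the A object that now lives at that address.
    A* ptr2 = std::launder(reinterpret_cast<A*>(storage));
    assert(ptr2 == ptr);  // same address, same object

    int result = ptr2->t;
    ptr2->~A();
    return result;
}
```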

Of course, this whole thing appears contrived because you only want one A and therefore don't need array form; in fact, you'd just create a bog-standard A in the first place. But, assuming buf is actually larger in reality and you're creating an allocator or something similar, this makes some sense.
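
To make the "allocator or something similar" case a bit more concrete, here is a minimal sketch of a fixed-capacity arena that placement-news several A objects into one aligned buffer. The Arena class and its members are hypothetical names for illustration, not anything from the standard library, and the sketch relies on A being trivially destructible so nothing needs destroying afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

struct A
{
  int t;
};

// Hypothetical arena: hands out storage for up to N objects of type A.
template <std::size_t N>
class Arena
{
public:
    // Constructs an A in the next free slot; returns nullptr when full.
    A* create(int value)
    {
        if (used_ == N) return nullptr;

        // Slots are sizeof(A) apart, and sizeof(A) is a multiple of
        // alignof(A), so every slot is suitably aligned.
        void* slot = buf_ + used_ * sizeof(A);
        assert(reinterpret_cast<std::uintptr_t>(slot) % alignof(A) == 0);
        ++used_;

        A* p = new (slot) A();
        p->t = value;
        return p;
    }

    std::size_t size() const { return used_; }

private:
    alignas(A) unsigned char buf_[N * sizeof(A)];
    std::size_t used_ = 0;
};
```

With an Arena<2>, the first two create calls return distinct pointers to live A objects and the third returns nullptr. A real allocator would also have to handle destruction and types with non-trivial destructors.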

Eal answered 24/7, 2015 at 16:19 Comment(9)
Meh, switching to char buf[sizeof(A)] still results in a strict-aliasing warning. Maybe I'm missing something then. Alignment? - Eal
"none of the type aliasing options don't fit." - double negative, unintentional I presume. - Marna
There were some discussions on SO about whether the warning after placement-new is a false positive - Mathison
@T.C. Vaguely recall a debate over this. I think I decided to stick with the notion that unless you explicitly created an A, trivial or otherwise, you were in UB territory. Y'know, just to be on the safe side. :) Anyway, at worst it'll be a no-op beyond suppressing useless warnings/broken optimisations. - Eal
@LightnessRacesinOrbit here is a question about the false-positive warning after new: #27004227 - Diamagnetism
@LightnessRacesinOrbit I think this needs a minor tweak. Technically, the compiler doesn't need to give the allocation of the char array the correct alignment for A, so I think that technically there is UB here. Instead of char *buf = new char[sizeof(A)];, you need something like: char *buf = static_cast<char*>(std::aligned_alloc(alignof(A), lcm(alignof(A), sizeof(A)))); where lcm is some least common multiple function. In practice new always aligns to something larger than 1, but technically when allocating char, new doesn't have to align to larger than 1. - Gothicize
@JamesMatta There we go. Thanks for the input - I didn't know any of this in 2015 - Eal
@LightnessRacesinOrbit I didn't really either. I knew about alignment but not placement new or that putting an object at something other than its proper alignment was UB. I just stumbled across this question when it was referenced in the Cpplang slack - Gothicize
Isn't char *buf = new char[sizeof(A)]; aligned for any type by the standard, as explained here: https://mcmap.net/q/319700/-does-new-char-actually-guarantee-aligned-memory-for-a-class-type? If so, do we really need the std::aligned_storage approach? - Barite

The C aliasing rules from which the rules of C++ were derived included a footnote specifying that the purpose of the rules was to say when things may alias. The authors of the Standard didn't think it necessary to forbid implementations from applying the rules in needlessly restrictive fashion in cases where things don't alias, because they thought compiler writers would honor the proverb "Don't prevent the programmer from doing what needs to be done", which the authors of the Standard viewed as part of the Spirit of C.

Situations where it would be necessary to use an lvalue of an aggregate's member type to actually alias a value of the aggregate type are rare, so it's entirely reasonable that the Standard doesn't require compilers to recognize such aliasing. Applying the rules restrictively in cases that don't involve aliasing, however, would cause something like:

union foo {int x; float y;} foo;
int *p = &foo.x;
*p = 1;

or even, for that matter,

union foo {int x; float y;} foo;
foo.x = 1;

to invoke UB since the assignment is used to access the stored values of a union foo and a float using an int, which is not one of the allowed types. Any quality compiler, however, should be able to recognize that an operation done on an lvalue which is visibly freshly derived from a union foo is an access to a union foo, and an access to a union foo is allowed to affect the stored values of its members (like the float member in this case).

The authors of the Standard probably declined to make the footnote normative because doing so would require a formal definition of when an access via freshly-derived lvalue is an access to the parent, and what kinds of access patterns constitute aliasing. While most cases would be pretty clear cut, there are some corner cases which implementations intended for low-level programming should probably interpret more pessimistically than those intended for e.g. high-end number crunching, and the authors of the Standard figured that anyone who could figure out how to handle the harder cases should be able to handle the easy ones.

Fatigued answered 24/12, 2018 at 18:12 Comment(0)
