Aliasing accesses through a std::bit_cast()ed pointer

Violating the strict-aliasing rules yields undefined behavior, e.g. when a struct is received from the network into a char buffer and that char pointer is then converted to a struct pointer with a C-style cast or reinterpret_cast.

The C++ std::bit_cast() function looks like it could be used to cast such pointers in an (implementation?) defined way, i.e. without violating strict-aliasing rules.

Example:

#include <sys/types.h>
#include <netinet/in.h>

#include <bit>

int get_sock_addr(const struct sockaddr *a)
{
    struct sockaddr_in *x = std::bit_cast<struct sockaddr_in*>(a);
    return x->sin_addr.s_addr;
}

So the caller of get_sock_addr() somehow obtained a sockaddr pointer and has determined that it actually points to a sockaddr_in struct.

So, is such pointer casting via std::bit_cast() a valid use-case?

Or does it somehow yield undefined behavior, as well?

If it's defined behavior, does the standard classify such pointer-casting as implementation-defined behavior?


The std::bit_cast() proposal mentions:

If no value representation corresponds to To's object representation then the returned value is unspecified.

So is a standard-conforming compiler possible where different pointer representations are incompatible such that they can't correspond to each other?

Acosta answered 30/5, 2021 at 21:17 Comment(6)
Your example, with its assumptions, is the standard, valid use case for reinterpret_cast. - Blessed
@DavisHerring hm, it really depends on what is done with the sockaddr pointer before/after calling the example get_sock_addr function, right? For example, with something like sockaddr *a = getfromsomewhere(); if (a->sa_family == AF_INET) addr = get_sock_addr(a); ... } ... the sockaddr_in object is accessed via 2 aliasing pointers of types which aren't covered by the strict-aliasing rules, correct? - Acosta
In the question you simply said “it actually points to a sockaddr_in struct”. The code you just gave tries to validate that assumption but is actually incompatible with it (unfortunately, since this is how traditional C interfaces are designed). The common-initial-subsequence rules are meant to allow this sort of tagging, but they require an actual union. - Blessed
@DavisHerring well, in the question I said that the caller 'has determined that it actually points to a sockaddr_in struct'. So the code I gave in my last comment is one possible implementation of this determination step. Sure, including that step in the original example code would have made a better example for the purpose of the question, arguably. - Acosta
We are in violent agreement. That code is the obvious means of determining the actual type hidden behind the pointer, but C++ doesn’t allow you to do that, since it involves using the object as a different type to make that very determination. - Blessed
@maxschlepzig: Is there any indication of if/when the authors of Standards decided that programmers should jump through hoops to accomplish things that could easily be done via pointer casts in pre-standard C, versus merely intending that implementations not be required to support such constructs in cases where their customers wouldn't need them? - Simasimah

Converting the pointer value is irrelevant. What matters is the object. You have a pointer to an object of type X, but the pointer's type is Y. Trying to access the object of type X through a pointer/reference to unrelated type Y is where the UB comes from.

How you obtained those pointers is mostly irrelevant. So bit_cast is no better than reinterpret_cast in this regard.

If there is no sockaddr_in there, then you can't pretend that there is one. However, it's possible that implicit object creation in C++20 already solves this matter, depending on your code. If it does, then it still doesn't matter how you get the pointer.
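
Since the access has to go through an object of the right type, one conventional way to sidestep the aliasing question is to copy the bytes into a real sockaddr_in instead of converting the pointer. The following is only a minimal sketch under assumptions: get_sock_addr_copy is a hypothetical helper name that does not appear in the question, and the caller is assumed to have already established that the buffer holds at least sizeof(sockaddr_in) bytes of a sockaddr_in representation.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

#include <sys/socket.h>
#include <netinet/in.h>

// Hypothetical helper (not part of the question's code): reads the IPv4
// address from raw bytes that are assumed to hold a sockaddr_in representation.
std::uint32_t get_sock_addr_copy(const void *bytes)
{
    assert(bytes != nullptr);

    sockaddr_in x;                      // a genuine sockaddr_in object
    std::memcpy(&x, bytes, sizeof x);   // copy the object representation into it
    return x.sin_addr.s_addr;           // access goes through the correct type
}
```

Here no sockaddr_in is ever accessed through a pointer of another type, so it no longer matters how the source pointer was obtained. Compilers commonly optimize such a fixed-size memcpy away, though that is an implementation detail rather than a guarantee.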

Surfacetoair answered 30/5, 2021 at 22:19 Comment(2)
Good point about object creation. I wondered some time ago why I couldn't find C's effective-type concept in a C++11 standards document. - Acosta
@maxschlepzig: Soundly performing alias analysis based upon an object-based or "effective type"-based model requires one of the following: having some kind of action to indicate when storage of one type is being repurposed as another, forgoing optimizations that would "probably" be correct but cannot be proven so, tolerating optimizations that are "probably" correct but might not be, or simply saying that any storage which has ever been used as one type may never be used as any other. The C++ implicit object creation model is only workable if one is willing to accept one of the above limitations. - Simasimah
