When is char* safe for strict pointer aliasing?

I've been trying to understand the strict aliasing rules as they apply to the char pointer.

Here this is stated:

It is always presumed that a char* may refer to an alias of any object.

Ok so in the context of socket code, I can do this:

struct SocketMsg
{
   int a;
   int b;
};

int main(int argc, char** argv)
{
   // Some code...
   SocketMsg msgToSend;
   msgToSend.a = 0;
   msgToSend.b = 1;
   send(socket, (char*)(&msgToSend), sizeof(msgToSend), 0);
}

But then there's this statement:

The converse is not true. Casting a char* to a pointer of any type other than a char* and dereferencing it is usually in violation of the strict aliasing rule.

Does this mean that when I recv a char array, I can't reinterpret cast to a struct when I know the structure of the message:

struct SocketMsgToRecv
{
    int a;
    int b;
};

int main()
{
    SocketMsgToRecv* pointerToMsg;
    char msgBuff[100];
    ...
    recv(socket, msgBuff, sizeof(msgBuff), 0);
    // Omitting the check that we have a complete message from the stream,
    // but let's assume a complete msg starts at msgBuff[0], and let's interpret the msg

    // SAFE!?!?!?
    pointerToMsg = &msgBuff[0];

    printf("Got Msg: a: %i, b: %i", pointerToMsg->a, pointerToMsg->b);
}

Will this second example not work because the base type is a char array and I'm casting it to a struct? How do you handle this situation in a strictly aliased world?

Jumna answered 4/11, 2008 at 16:31 Comment(1)
Doesn't the second piece of code require you to cast explicitly? Have you enabled all warnings? Mauchi

Re @Adam Rosenfield: The union will achieve alignment so long as the supplier of the char* started out doing something similar.

It may be useful to stand back and figure out what this is all about.

The basis for the aliasing rule is the fact that compilers may place values of different simple types on different memory boundaries to improve access, and that some hardware requires such alignment to be able to use the pointer at all. This can also show up in structs that contain elements of different sizes. The struct itself may start on a good boundary, but the compiler may still insert padding bytes in its interior so that the elements that require alignment get it.

Considering that compilers often have options for controlling how all of this is handled, or not, you can see that there are many ways that surprises can occur. This is particularly important to be aware of when passing pointers to structs (cast as char* or not) into libraries that were compiled to expect different alignment conventions.

What about char*?

The presumption about char* is that sizeof(char) == 1 (relative to the sizes of all other sizable data) and that char* pointers don't have any alignment requirement. So a genuine char* can always be safely passed around and used successfully without concern for alignment, and that goes for any element of a char[] array, performing ++ and -- on the pointers, and so on. (Oddly, void* is not quite the same.)

Now you should be able to see that if you transfer structure data into a char[] array that is not itself suitably aligned, casting back to a pointer type that does require alignment can be a serious problem.

If you make a union of a char[] array and a struct, the most-demanding alignment (i.e., that of the struct) will be honored by the compiler. This will work if the supplier and the consumer are effectively using matching unions so that casting of the struct* to char* and back works just fine.

In that case, I would hope that the data was created in a similar union before the pointer to it was cast to char* or it was transferred any other way as an array of sizeof(char) bytes. It is also important to make sure any compiler options are compatible between the libraries relied upon and your own code.
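
To make the alignment point concrete, here is a minimal sketch, not a definitive implementation. The names MsgBuffer, isAlignedForMsg and readMsg are hypothetical and do not come from the question. It checks that a union of a char[] array and a struct takes on the struct's alignment, and it shows std::memcpy as one way to read a message out of bytes that may not be suitably aligned.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct SocketMsg {
    int a;
    int b;
};

// Hypothetical buffer type: the union is at least as strictly aligned
// as its most demanding member, so `bytes` starts on a SocketMsg boundary.
union MsgBuffer {
    SocketMsg msg;
    char bytes[100];
};

static_assert(sizeof(char) == 1, "char is the unit of sizeof");
static_assert(alignof(char) == 1, "char has no alignment requirement");
static_assert(alignof(MsgBuffer) >= alignof(SocketMsg),
              "the union honors the struct's alignment");

// Returns true if p sits on a boundary suitable for SocketMsg.
bool isAlignedForMsg(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(SocketMsg) == 0;
}

// Copies the bytes into a real SocketMsg object. This works whether or
// not `bytes` is aligned, because memcpy has no alignment requirement.
SocketMsg readMsg(const char* bytes) {
    assert(bytes != nullptr);
    SocketMsg out;
    std::memcpy(&out, bytes, sizeof out);
    return out;
}
```

With this, isAlignedForMsg(buf.bytes) holds for any MsgBuffer buf, while readMsg(raw + 1) on a plain char array still yields the right field values even though raw + 1 may be misaligned for an int.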

Muniz answered 4/11, 2008 at 20:19 Comment(4)
Are all three of char, signed char, and unsigned char OK for aliasing? And with any cv-qualification combination as well? Simdars
The aliasing rules have nothing to do with alignment. Per the C89 rationale, given global declarations like int i; float *fp;, the purpose is to allow compilers to keep i in a register across accesses to *fp. The idea was that a compiler shouldn't have to pessimistically assume that a write to *fp might alter i when it had no reason to expect that *fp would point at something that wasn't a float*. I don't think the rule was ever intended to let compilers ignore cases where aliasing is obvious (taking the address of an object should give a compiler a strong clue... Endres
...that the object in question is about to be accessed via pointer, and casting an int* to a float* should give the compiler a strong clue that an int is likely to be modified via write to a float*, but gcc no longer feels any obligation to notice such things. Endres
Strict aliasing is not because of alignment requirement. How can this answer get 9 upvotes? Mauchi
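
The register-caching rationale described in the comments above can be sketched roughly as follows. The function names storeBoth and storeThroughBytes are hypothetical, and what an optimizer actually does depends on the compiler and its flags, so treat the comments as what the rule permits and not as guaranteed behavior.

```cpp
#include <cassert>

// An int* and a float* are presumed not to refer to the same object,
// so the compiler is allowed to return the constant 1 here without
// reloading *ip after the store through fp.
int storeBoth(int* ip, float* fp) {
    assert(ip != nullptr && fp != nullptr);
    *ip = 1;
    *fp = 2.0f;
    return *ip;
}

// An unsigned char* may alias any object, so the compiler has to assume
// the store through `bytes` might have changed *ip and reload it.
int storeThroughBytes(int* ip, unsigned char* bytes) {
    assert(ip != nullptr && bytes != nullptr);
    *ip = 1;
    bytes[0] = 2;
    return *ip;
}
```

Calling storeBoth with pointers to two distinct objects is well defined and returns 1. Calling storeThroughBytes(&j, reinterpret_cast<unsigned char*>(&j)) is also well defined, and the result reflects the byte that was overwritten.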

Correct, the second example is in violation of the strict aliasing rules, so if you compile with the -fstrict-aliasing flag, there's a chance you may get incorrect object code. The fully correct solution would be to use a union here:

union
{
  SocketMsgToRecv msg;
  char msgBuff[100];
};

recv(socket, msgBuff, 100);

printf("Got Msg: a: %i, b: %i", msg.a, msg.b);
Dominy answered 4/11, 2008 at 16:37 Comment(5)
Is this in compliance with the standard or just compiler letting you get away with writing to one member and reading from another? Tabb
The union is completely unnecessary. Simply pass a pointer to the structure (cast to char *) to recv. Clemence
Note that -fstrict-aliasing is on by default at -O2 and higher in gcc. Trellis
@R..GitHubSTOPHELPINGICE could you elaborate on that further please? Tyburn
There are multiple things wrong with this answer. (1) recv has four arguments. (2) There's no need for type punning via a union here. (3) But we also don't need to cast, contrary to what a previous comment said. (4) In fact, the conventional usage is as simple as SocketMsgToRecv msg; ssize_t ret = recv(socket, &msg, sizeof msg, flags); and of course we always need to handle errors. (I realise this is an old answer but it's one of the top hits on Google.) Viscount
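
The conventional usage mentioned in the last comment might look like the sketch below, assuming a POSIX system and a connected stream socket. The helpers recvMsg and sendMsg are hypothetical names, MSG_WAITALL is only one way to ask for the full message, and real code would also have to deal with partial reads, errno, and byte order between hosts.

```cpp
#include <cassert>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

struct SocketMsgToRecv {
    int a;
    int b;
};

// Receives directly into a real SocketMsgToRecv object. recv takes a
// void*, so no cast and no char buffer are needed, and the later reads
// of out.a and out.b go through the object's own type.
bool recvMsg(int fd, SocketMsgToRecv& out) {
    assert(fd >= 0);
    ssize_t n = recv(fd, &out, sizeof out, MSG_WAITALL);
    return n == static_cast<ssize_t>(sizeof out);
}

// Sends the object's bytes. send takes a const void*, so no cast here either.
bool sendMsg(int fd, const SocketMsgToRecv& msg) {
    assert(fd >= 0);
    ssize_t n = send(fd, &msg, sizeof msg, 0);
    return n == static_cast<ssize_t>(sizeof msg);
}
```

A quick local check is to create a pair of connected sockets with socketpair(AF_UNIX, SOCK_STREAM, 0, fds), send a message on fds[0] with sendMsg, and read it back on fds[1] with recvMsg.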
