will casting around sockaddr_storage and sockaddr_in break strict aliasing

Asked 11/2, 2017 at 16:19 Answered 12/2, 2017 at 17:36

Solved c++c linux sockets strict-aliasing

Following my previous question, I'm really curious about this code -

case AF_INET: 
    {
        struct sockaddr_in * tmp =
            reinterpret_cast<struct sockaddr_in *> (&addrStruct);
        tmp->sin_family = AF_INET;
        tmp->sin_port = htons(port);
        inet_pton(AF_INET, addr, tmp->sin_addr);
    }
    break;

Before asking this question, I've searched across SO about same topic and have got mix responses about this topic. For example, see this, this and this post which say that it is somehow safe to use this kind of code. Also there's another post that says to use unions for such task but again the comments on accepted answer beg to differ.

Microsoft's documentation on same structure says -

Application developers normally use only the ss_family member of the SOCKADDR_STORAGE. The remaining members ensure that the SOCKADDR_STORAGE can contain either an IPv6 or IPv4 address and the structure is padded appropriately to achieve 64-bit alignment. Such alignment enables protocol-specific socket address data structures to access fields within a SOCKADDR_STORAGE structure without alignment problems. With its padding, the SOCKADDR_STORAGE structure is 128 bytes in length.

Opengroup's documentation states -

The header shall define the sockaddr_storage structure. This structure shall be:

Large enough to accommodate all supported protocol-specific address structures

Aligned at an appropriate boundary so that pointers to it can be cast as pointers to protocol-specific address structures and used to access the fields of those structures without alignment problems

Man page of socket also says same -

In addition, the sockets API provides the data type struct sockaddr_storage. This type is suitable to accommodate all supported domain-specific socket address structures; it is large enough and is aligned properly. (In particular, it is large enough to hold IPv6 socket addresses.)

I've seen multiple implementation using such casts in both C and C++ languages in the wild and now I'm uncertain of the fact which one is right since there are some posts that contradict with above claims - this and this.

So which one is the safe and right way to fill up a sockaddr_storage structure? Are these pointer casts safe? or the union method? I'm also aware of the getaddrinfo() call but that seems a little complicated for the above task of just filling the structs. There is one other recommended way with memcpy, is this safe?

Gorgerin answered 11/2, 2017 at 16:19 Comment(2)

You should never call reinterpret_cast if you are not sure and in particular you should never do it on structures you have not defined (except if it clearly documented as valid). – Gluttonize 11/2, 2017 at 18:3

@Gluttonize documented by whom? posix? microsoft? both document that it will align without problems on a cast operation but reading different ques/ans/comments I'm not certain if compilers allow this.. – Gorgerin 11/2, 2017 at 18:5

C and C++ compilers have become much more sophisticated in the past decade than they were when the sockaddr interfaces were designed, or even when C99 was written. As part of that, the understood purpose of "undefined behavior" has changed. Back in the day, undefined behavior was usually intended to cover disagreement among hardware implementations as to what the semantics of an operation was. But nowadays, thanks ultimately to a number of organizations who wanted to stop having to write FORTRAN and could afford to pay compiler engineers to make that happen, undefined behavior is a thing that compilers use to make inferences about the code. Left shift is a good example: C99 6.5.7p3,4 (rearranged a little for clarity) reads

The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If the value of [E2] is negative or is greater than or equal to the width of the promoted [E1], the behavior is undefined.

So, for instance, 1u << 33 is UB on a platform where unsigned int is 32 bits wide. The committee made this undefined because different CPU architectures' left-shift instructions do different things in this case: some produce zero consistently, some reduce the shift count modulo the width of the type (x86), some reduce the shift count modulo some larger number (ARM), and at least one historically-common architecture would trap (I don't know which one, but that's why it's undefined and not unspecified). But nowadays, if you write

unsigned int left_shift(unsigned int x, unsigned int y)
{ return x << y; }

on a platform with 32-bit unsigned int, the compiler, knowing the above UB rule, will infer that y must have a value in the range 0 through 31 when the function is called. It will feed that range into interprocedural analysis, and use it to do things like remove unnecessary range checks in the callers. If the programmer has reason to think they aren't unnecessary, well, now you begin to see why this topic is such a can of worms. (Modern compilers can optimize x << (y&31) into a single shift instruction for ISAs like x86 where the shift instruction implements that masking.)

For more on this change in the purpose of undefined behavior, please see the LLVM people's three-part essay on the subject (1 2 3).

Now that you understand that, I can actually answer your question.

These are the definitions of struct sockaddr, struct sockaddr_in, and struct sockaddr_storage, after eliding some irrelevant complications:

struct sockaddr {
    uint16_t sa_family;
};
struct sockaddr_in { 
    uint16_t sin_family;
    uint16_t sin_port;
    uint32_t sin_addr;
};
struct sockaddr_storage {
    uint16_t ss_family;
    char __ss_storage[128 - (sizeof(uint16_t) + sizeof(unsigned long))];
    unsigned long int __ss_force_alignment;
};

This is poor man's subclassing. It is a ubiquitous idiom in C. You define a set of structures that all have the same initial field, which is a code number that tells you which structure you've actually been passed. Back in the day, everyone expected that if you allocated and filled in a struct sockaddr_in, upcast it to struct sockaddr, and passed it to e.g. connect, the implementation of connect could dereference the struct sockaddr pointer safely to retrieve the sa_family field, learn that it was looking at a sockaddr_in, cast it back, and proceed. The C standard has always said that dereferencing the struct sockaddr pointer triggers undefined behavior—those rules are unchanged since C89—but everyone expected that it would be safe in this case because it would be the same "load 16 bits" instruction no matter which structure you were really working with. That's why POSIX and the Windows documentation talk about alignment; the people who wrote those specs, back in the 1990s, thought that the primary way this could actually be trouble was if you wound up issuing a misaligned memory access.

But the text of the standard doesn't say anything about load instructions, nor alignment. This is what it says (C99 §6.5p7 + footnote):

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:⁷³⁾

a type compatible with the effective type of the object,

a qualified version of a type compatible with the effective type of the object,

a type that is the signed or unsigned type corresponding to the effective type of the object,

a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,

an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or

a character type.

⁷³⁾ The intent of this list is to specify those circumstances in which an object may or may not be aliased.

struct types are "compatible" only with themselves, and the "effective type" of a declared variable is its declared type. So the code you showed...

struct sockaddr_storage addrStruct;
/* ... */
case AF_INET: 
{
    struct sockaddr_in * tmp = (struct sockaddr_in *)&addrStruct;
    tmp->sin_family = AF_INET;
    tmp->sin_port = htons(port);
    inet_pton(AF_INET, addr, tmp->sin_addr);
}
break;

... has undefined behavior, and compilers can make inferences from that, even though naive code generation would behave as expected. What a modern compiler is likely to infer from this is that the case AF_INET can never be executed. It will delete the entire block as dead code, and hilarity will ensue.

So how do you work with sockaddr safely? The shortest answer is "just use getaddrinfo and getnameinfo." They deal with this problem for you.

But maybe you need to work with an address family, such as AF_UNIX, that getaddrinfo doesn't handle. In most cases you can just declare a variable of the correct type for the address family, and cast it only when calling functions that take a struct sockaddr *

int connect_to_unix_socket(const char *path, int type)
{
    struct sockaddr_un sun;
    size_t plen = strlen(path);
    if (plen >= sizeof(sun.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    sun.sun_family = AF_UNIX;
    memcpy(sun.sun_path, path, plen+1);

    int sock = socket(AF_UNIX, type, 0);
    if (sock == -1) return -1;

    if (connect(sock, (struct sockaddr *)&sun,
                offsetof(struct sockaddr_un, sun_path) + plen)) {
        int save_errno = errno;
        close(sock);
        errno = save_errno;
        return -1;
    }
    return sock;
}

The implementation of connect has to jump through some hoops to make this safe, but that is Not Your Problem.

[EDIT Jan 2023: What this answer used to say about sockaddr_storage was wrong and I'm embarrassed to admit I didn't notice the problem for six years.] It is tempting to use struct sockaddr_storage as a convenient way to know how big to make the buffer for a call to getpeername, in a server that needs to handle both IPv4 and IPv6 addresses. However, it is less error-prone and has fewer strict-aliasing problems if you use a union with each of the concrete address families you care about, plus plain struct sockaddr:

#ifndef NI_IDN
#define NI_IDN 0
#endif

union sockaddr_ipvX {
    struct sockaddr sa;
    struct sockaddr_in sin;
    struct sockaddr_in6 sin6;
};

char *get_peer_hostname(int sock)
{
    union sockaddr_ipvX addrbuf;
    socklen_t addrlen = sizeof addrbuf;

    if (getpeername(sock, &addrbuf.sa, &addrlen))
        return 0;

    char *peer_hostname = malloc(MAX_HOSTNAME_LEN+1);
    if (!peer_hostname) return 0;

    if (getnameinfo(&addrbuf.sa, addrlen,
                    peer_hostname, MAX_HOSTNAME_LEN+1,
                    0, 0, NI_IDN) {
        free(peer_hostname);
        return 0;
    }
    return peer_hostname;
}

With this formulation, not only do you not need to write any casts to call getpeername or getnameinfo, you can safely access addrbuf.sa.sa_family and then, when sa_family == AF_INET, addrbuf.sin.sin_*.

A final note: if the BSD folks had defined the sockaddr structures just a little bit differently ...

struct sockaddr {
    uint16_t sa_family;
};
struct sockaddr_in { 
    struct sockaddr sin_base;
    uint16_t sin_port;
    uint32_t sin_addr;
};
struct sockaddr_storage {
    struct sockaddr ss_base;
    char __ss_storage[128 - (sizeof(uint16_t) + sizeof(unsigned long))];
    unsigned long int __ss_force_alignment;
};

... upcasts and downcasts would have been perfectly well-defined, thanks to the "aggregate or union that includes one of the aforementioned types" rule. If you're wondering how you should deal with this problem in new C code, here you go.

Mendy answered 12/2, 2017 at 17:36 Comment(32)

PERFECT! I still have 22 hours remaining to open a bounty, rest assured I'll award this answer one. "But maybe you need to work with an address family, such as AF_UNIX, that getaddrinfo doesn't handle." Exactly, I'm currently building a lib and since AF_INET, AF_INET6 and AF_UNIX have lots of common functions.. eg send, recv and their friends which take a sockaddr * as parameter, I thought making a single sockaddr_storage variable and fill it conditionally and then passing it to underlying sockaddr * accepting calls. Now I can solve that by function overloading by creating .. – Gorgerin 12/2, 2017 at 18:51

.. different functions that take sockaddr_in, sockaddr_in6, sockaddr_un structures respectively but they all will have exactly same code but different structures since I've abstracted away the difference with templates. I completely agree with you about creating exact structure for need eg sockaddr_in for AF_INET instead of general sockaddr_storage but that'll result in MxN problem where I've to maintain M functions for N different types with exactly same implementation. But I'll see if I can get past with runtime if(AF_INET or other) conditions or function overloading too.... – Gorgerin 12/2, 2017 at 18:55

But let's say, forgive me for asking :p, if I need to follow down the sockaddr_storage general approach, will this method work - https://mcmap.net/q/536326/-fill-sockaddr_storage-struct-with-values-of-sockaddr_in. Because I do have a case of datagram recvfrom call where I'll have to use sockaddr_storage since I'm not sure about the IPv4 mapped IPv6 addresses. Anyways I'm really grateful to you for this excellent answer and I'll provide a reference to it wherever possible. – Gorgerin 12/2, 2017 at 19:0

I would need to see more of your code to understand what you are trying to accomplish and whether it is UB-safe. You should ask a new question about that. But I also want to reiterate that you should be using getaddrinfo and getnameinfo. To the extent I understand what you are trying to do, it really sounds to me like the majority of it is already handled by those functions. – Mendy 12/2, 2017 at 19:18

Declaration and Definition and creation of sockaddr_storage – Gorgerin 12/2, 2017 at 19:19

The "first member" carveout lets you access sin_family safely even with strict aliasing and a cast to sockaddr *. There is no reason to make an initial struct sockaddr member. – Flutter 13/2, 2017 at 6:35

@Flutter I used to think so too, but neither gcc nor clang agrees. Counterexample at godbolt.org/g/WpQGQT ; see blog.regehr.org/archives/1466 and the paper it links to for extensive further analysis. – Mendy 13/2, 2017 at 12:35

OK, that's nasty! But I'm not sure I believe the standard permits that "optimisation." You're accessing a->sa_family, which is an object of effective type unsigned short, via the lvalue b->sin_family, which is an lvalue expression of type unsigned short. This is explicitly allowed by C99 6.5.7. – Flutter 13/2, 2017 at 15:27

@Flutter Are you, though? Are you accessing an object of type unsigned short through an lvalue of type unsigned short? Or are you accessing an object of type struct sockaddr through an lvalue of type struct sockaddr_in? This gets into the "what exactly is an object" argument which, AFAIK, has never been resolved to anyone's satisfaction, and the language in C2011 can still be read either way. – Mendy 13/2, 2017 at 15:31

@Flutter Anyway, either the standard allows this, or there is a widespread and long-standing bug in multiple compilers, which the devs will probably claim is not a bug, and which must be worked around; the implications for application code are the same. I will think about how I can improve my answer on this point but it's already awful long. – Mendy 13/2, 2017 at 15:33

6.5.2 suggests that the following would be ill-formed if the relevant object was struct sockaddr_in: (sin->sin_family = AF_INET) & (sin->sin_port = htons(12345)). I have no reason to write code like that, but it seems insane to declare it ill-formed because I'm modifying the stored value of an object more than once between sequence points. – Flutter 13/2, 2017 at 15:35

@AbhinavGauniyal I reiterate that you need to ask a new question. I don't understand your library, even after having looked at the code, and the comment thread on this question is not the place for us to work it out. Also, please do not bother with a bounty. This is not a difficult question and you need the reputation more than I do. – Mendy 13/2, 2017 at 15:36

Clang is good at having longstanding serious bugs that break tons of real code that the developers claim aren't bugs and which spawn papers about tools using clang to "find bugs in real code!" See people.csail.mit.edu/nickolai/papers/wang-undef-2012-08-21.pdf for example. – Flutter 13/2, 2017 at 15:38

Anyway, I buy that strict aliasing rules, as implemented, are respected by the "little bit differently" example you gave. I learned something---thanks! – Flutter 13/2, 2017 at 15:48

Note that while people in the 1990s may not have disputed that saying struct sockaddr z = *p; would have been UB if p was a pointer to something other than a sockaddr, I don't think people would have said the same about using p->sa_family to access that portion of the Common Initial Sequence. To the contrary, the ability to access members of the CIS through a structure pointer is a big part of what made the CIS useful. The notion that code should use memcpy for such purposes would have been widely recognized as dumb, especially since... – Perquisite 13/2, 2017 at 16:41

...a compiler could assume that p->sa_family would be accessing a 16-bit integer value, but if code used memcpy to copy stuff to a sockaddr before accessing sa_family the compiler would have to allow for the possibility that it could alias anything anywhere. – Perquisite 13/2, 2017 at 16:43

PS--the desire of people to turn C into FORTRAN is particularly absurd when one considers the high-performance computing features that FORTRAN has that C lacks, and that adding directives to enable optimizations would allow better performance without affecting C's usability for embedded or systems code. An "optimization" which improves performance by 0.5% may be worthwhile if it imposes zero costs, but if it's necessary for programmers to add extra code to guard against the optimization it can easily become worse than useless. – Perquisite 13/2, 2017 at 16:52

@tmyklebu: The authors of gcc do not interpret the CIS rule as allowing pointers of one structure type to read members of the CIS of another, even when a complete union declaration containing both types is visible everywhere that either structure is accessed. For some reason, the notion that the visibility of such a union declaration should allow common-initial-sequence access was derisively regarded as a a silly and useless impediment to optimization, despite the fact that structure types are most likely to appear together in unions in circumstances where... – Perquisite 13/2, 2017 at 17:30

...applying the CIS rule to pointers would be useful, and despite the fact that honoring such rule by default would in no way prevent gcc from including a directive to waive such treatment in cases where it isn't needed. I think some people view a default behavior which may generate sub-optimal code in the absence of an optimization directive, as more dangerous than a default behavior which might behave randomly in the absence of an optimization-blocking directive,. – Perquisite 13/2, 2017 at 17:37

@zwol: The authors of gcc claim that the standard is "ambiguous" and that they are therefore entitled to interpret it in a fashion which ignores the "complete union type" language therein. I would not call the willful refusal to abide by the Standard except when using -fno-strict-aliasing a "bug". Since it is normal for compilers to default to a non-conforming mode, the only "defect" is a failure to document include -fno-strict-aliasing among the options that must be specified to put the compiler into standard-conforming mode. – Perquisite 13/2, 2017 at 17:39

Incidentally, the rationale for C89 suggests that the rules were intended to permit compilers to make optimizations in cases where a compiler would have no reason to expect them to matter, e.g. consecutive accesses to a float, an indirected int*, and a float, without any intervening actions that involve a float*. Is there any evidence that the authors of the Standard intended that the rules force programmers jump through hoops to achieve things that could otherwise be done easily in their absence, or did they expect that compiler writers would try to recognize aliasing... – Perquisite 13/2, 2017 at 17:50

...in cases where it was obvious (as was likely the case even on existing compilers that used type-based aliasing assumptions) and figure that there was no need to clutter the Standards with such details? – Perquisite 13/2, 2017 at 17:54

@Perquisite As I have said before, I'm not interested in discussing "how things should be" with you in the comments of this kind of answer, which are about "how things are and how to deal with that." I'm especially not interested in defending or criticizing the behavior of compiler developers, which is a group of people that no longer includes me. – Mendy 13/2, 2017 at 18:1

@Mendy pardon me but I've one last doubt that I very much need to get clear, how about the case where I make a sockaddr_storage ss object and then sockaddr_un *su = (sockaddr_un *)&ss; and then fill it up su->sun_family=AF_UNIX and then call bind() or other system calls on ss. Note that I haven't yet broken the rule of strict aliasing since I have only filled, not dereferenced the pointer of different type. Will the system calls of socket family automatically use correct type according to ss.ss_family data? Is that member even correctly filled while using above code? – Gorgerin 18/2, 2017 at 18:40

Here - books.google.co.in/… – Gorgerin 18/2, 2017 at 18:40

@AbhinavGauniyal You break the rules as soon as you do su->sun_family=AF_UNIX. – Mendy 19/2, 2017 at 13:40

Clang and gcc do not, in general, correctly handle situations where code passes the address of a union member to a function. They will handle such cases correctly in scenarios where the caller and callee are processed separately, and neither can play any role in code generation for the other, but those situations would also be handled correctly without aliasing issues with or without the union. – Perquisite 19/1, 2023 at 18:39

@Perquisite IMNSHO the code I wrote is required to work -- there's no ambiguity about the rules for type punning via a union. If it doesn't, that's bad on multiple levels. Do you have a self-contained testcase for the miscompilation you are describing? – Mendy 19/1, 2023 at 22:47

The code may work by happenstance on today's versions of clang and gcc, but look at godbolt.org/z/rn51cracn for something not too different where gcc behaves, by design, in a manner contrary to what K&R2 would imply. If gcc could tell that the same subscript was being passed to update_fancy1 as read_simple, it would generate sensible code, but I know of no specification that would suggest it should be expected to behave sensibly in that case if it can't be trusted in the general case. – Perquisite 19/1, 2023 at 23:49

@Perquisite Thanks, that's helpful. It seems to me that &addrbuf.sa and (struct sockaddr *)&addrbuf would be equally broken in this case, the issue being that, on the far side of a call to a function taking a struct sockaddr *, the fact that the pointer refers to one member of a union has been masked. Is that your experience? – Mendy 20/1, 2023 at 18:33

@zwol: Yeah. While there are many circumstances where code taking the address of a union member would work, the only situation I know of where taking the address of a union member would work but casting the union's address wouldn't also work is when a union member is an array, the address is taken as a result of pointer decomposition, and the implicitly-formed address is directly dereferenced using the [] operator. GCC recognizes type punning given someUnion.someArray[index] but not *(someUnion.someArray+index) even though the Standard defines the former as meaning the latter. – Perquisite 20/1, 2023 at 18:41

@zwol: IMHO, the Standard should recognize a category of implementations that is unable to handle use of a pointer formed by taking a union member's address to access things of the member's type, and allow such implementations to make the address-of operator return a typeless pointer that can be assigned only to a void*. – Perquisite 20/1, 2023 at 19:14

Yes, it's an aliasing violation to do this. So don't. There's no need to ever use sockaddr_storage; it was a historical mistake. But there are a few safe ways to use it:

malloc(sizeof(struct sockaddr_storage)). In this case, the pointed-to memory does not have an effective type until you store something to it.
As part of a union, explicitly accessing the member you want. But in this case just put the actual sockaddr types you want (in and in6 and maybe un) in the union rather than sockaddr_storage.

Of course in modern programming you should never need to create objects of type struct sockaddr_* at all. Simply use getaddrinfo and getnameinfo to translate addresses between string representations and sockaddr objects, and treat the latter as completely opaque objects.

Gath answered 11/2, 2017 at 16:34 Comment(49)

@jww: I don't follow. FWIW the accepted answer to the question you told me to read is by yours truly. ;-) – Gath 12/2, 2017 at 0:15

Well, in modern programming, you should never need to compile without -fno-strict-aliasing at all. :) – Flutter 13/2, 2017 at 6:35

@tmyklebu: Failure to specify that option can cause gcc to generate bogus code even for programs which never read any storage as any type other than what was used to write it. That may not be deliberate, but it's indicative of a compiler design that may omit during early optimization stages code which wouldn't generate any instructions but is nonetheless required to block other "optimizations". – Perquisite 13/2, 2017 at 18:3

@tmyklebu: You don't need to. You just need to refrain from writing incorrect code, and almost all code using sockaddr_storage is incorrect. My answer covers a few of the non-incorrect ways you could go about using it, but stresses that you shouldn't be using it to begin with. – Gath 13/2, 2017 at 18:14

@R..: It's quite hard to "refrain from writing incorrect code." Especially when the only indication you've written incorrect code happens months later, in production, and apparently in an unrelated piece of code. The "safe idioms" for dealing with this involve writing memcpy when you mean assignment or doing all accesses via union. These generate illegible code that may be terribly slow and have different atomicity properties than intended if the compiler doesn't catch your meaning. Far safer and easier just to switch the land mines off. – Flutter 13/2, 2017 at 18:38

@R..: I am reminded of an Americanism: "Guns don't kill people; people kill people." Compilers don't make code fast; people make code fast. Strict aliasing, as interpreted by compiler vendors today, creates a barrier to optimisation (the kind done by people) by creating additional "gotchas" for the optimiser to avoid. We can already use restrict when we want to tell the compiler that accesses via distinct pointers in some set don't alias. – Flutter 13/2, 2017 at 18:47

@tmyklebu: To that I would add that there is a huge difference between the scenario offered in the C89 rationale as a justification for aliasing rules, versus the scenarios that modern compilers "optimize". A lot of people who despise modern interpretations of the rule have no objection to the notion that compilers should assume things won't alias when there's no evidence that they will (as was the case in the rationale's example). Some simple tweaks to the aliasing rules would fix the vast majority of code that presently requires -fno-strict-aliasing without... – Perquisite 13/2, 2017 at 20:1

@tmyklebu: If you have any casts of pointer types, you most likely have incorrect code. It's pretty damn simple to refrain from doing any. This case (sockaddr_storage) is a classic, textbook example of how much of UB comes from doing gratuitously idiotic things (poking at sockaddr internals) when you could just use the right approach (opaque objects) and have your code be much simpler and avoid the risk of anything going wrong. – Gath 13/2, 2017 at 20:8

@supercat: You're wrong. Locally, evidence of possible aliasing is not too hard to find. The problem is that the possibility of aliasing is not a local property. You can see an example where they tried to do what you suggested in the "union exception". Confusingly it only works when you directly access the union object and the original object is a union. It can't work when you take pointers to members of the union and access them from elsewhere (because the aliasing is not localized) and it can't work (despite many people naively expecting it to) when you cast a pointer to something else.. – Gath 13/2, 2017 at 20:12

...significantly hurting optimization in most cases (the key concept would be that a cast from T* to non-void pointer type, either directly or through void*, should be treated as a potential access to a T). Defining all the corner cases would require some extra verbosity, but the key point is that pointer casts are seldom present in code which would be amenable to optimization, but are usually present in code which relies upon aliasing, and would thus provide an easy way for compilers to distinguish them. – Perquisite 13/2, 2017 at 20:12

...to a pointer-to-union type in order to type-pun. The solution is not making more confusing localized-access-only exceptions to aliasing rules, but rather getting people to stop type-punning. It's wrong and it's useless. – Gath 13/2, 2017 at 20:13

A much more productive course of action for compilers would be to detect what aliasing they can (localized only) and generate the equivalent of __builtin_trap() for it, rather than trying to patch it up and make it work. Once wrong programs start crashing all over the place, people will fix them. And you have to either get pretty unlucky, or work hard at it, to write a program with no localized aliasing but distant non-local aliasing. – Gath 13/2, 2017 at 20:16

@R..: Most code that relies upon aliasing involve casting a pointer to some object and using the resulting pointer as the exclusive means of access to the object during the pointer's lifetime (similar to what restrict would imply). Optimizations that rely upon aliasing generally involve two identifiable accesses to an object and a presumption that no other accesses offer between them. Even if every pointer cast from T* to a non-void pointer or to a void* which leaves the compiler's sight were treated as breaking a connection between the preceding and following accesses to a T, ... – Perquisite 13/2, 2017 at 20:17

...the costs of that would be trivial compared with the cost of -fno-strict-aliasing. Also, I would suggest that even if you don't find any aliasing constructs useful, the widespread usage of a number of aliasing patterns to do things that can't otherwise be done constitutes prima facie evidence that they are useful--just not to you. – Perquisite 13/2, 2017 at 20:20

@R..: Also, w/rt local vs. distant analysis, a compiler is required to make pessimistic assumptions about code it can't see. If a function accesses a T* directly, any caller that can't see what T* was accessed must assume that it may have accessed any T whose address it could acquire. What is difficult about saying that if a function casts a T* to some other type and can't tell what is done with it, callers should have to assume--as with the function that accessed a T* directly, that it might access a T? – Perquisite 13/2, 2017 at 20:42

@supercat: Under your idea, a function accessing data via A* and B* would always have to assume they might alias. You'd be stuck with awful codegen (spurious spills/reloads, no vectorization) unless you used restrict all over the place, i.e. the same thing people get when they foolishly use -fno-strict-aliasing. This is a real pain, as it makes it completely impractical to use "context" structures in-place because every access goes back to memory; you have to manually cache all the members in local variables. And all this to solve a non-problem. – Gath 13/2, 2017 at 22:27

@R..: Compilers generate comically bad code when they're left in charge of vectorising anything nontrivial. You still need people to vectorise code if you want to get good results consistently. Regarding the "spurious" spills and reloads, you've got a choice between them and silent generation of wrong code when you aren't expecting it. – Flutter 13/2, 2017 at 22:46

@R..: You're misinderstanding my proposed rule, which would allow a compiler to assume within the function that the arguments didn't alias unless a pointer of one type was cast to the other, in which case it would only have to consider aliasing between the cast and the first usage of the storage in question by some means other than by a pointer derived from the cast. If a compiler sees two accesses to an object of type A, and between those accesses there are no potential accesses to an A* or char*[etc.], and no casts from A* to some other type, then a compiler may assume that... – Perquisite 13/2, 2017 at 22:53

@supercat: That breaks when the cast and the access are non-local with respect to each other, which can easily happen in practice, and which is really going to mess up users' intuitions when it does. – Gath 13/2, 2017 at 22:57

...nothing will alias the object of type A between those accesses. – Perquisite 13/2, 2017 at 22:59

@R..: If you think type punning is dumb and useless, try getting rid of it in Linux. Hell, try getting rid of it in int float_to_int_bitwise(float) and more complex code in the same vein without sacrificing readability even further. – Flutter 13/2, 2017 at 23:0

@R..: I'm not calling for compilers to support situations where the cast and usage of the cast value are separated by an access to the underlying value via means unrelated to the result of the cast. That's far less common than code which casts a pointer and then performs all the accesses that are going to occur using the result of the cast before the object is next accessed via unrelated means. – Perquisite 13/2, 2017 at 23:4

@tmyklebu: Regarding "spurious" spills, an even better choice between accepting them or bogus code would be to use restrict to let the compiler avoid spurious spills. Unlike type-based aliasing, that will even work when using character pointers. Type-based aliasing assumptions may allow optimizations in a few cases where restrict wouldn't be usable, so an option to have a compiler apply TBAA conservatively (rather than disabling it altogether) may be helpful, but in code that uses restrict effectively should generally work decently even without it. – Perquisite 13/2, 2017 at 23:48

@Flutter Re: "never need to compile without -fno-strict-aliasing": if I understand correctly, Linux kernel is built using -fno-strict-aliasing. – Pull 11/5, 2022 at 21:47

@pmor: Both clang and gcc use abstraction models that are incompatible with the Standard's rule that allows recycling of storage to hold different types. Defect Report 236 was proposed to change the Standard so that no storage which ever held any particular type could ever be used to hold any other, but that request was rejected. Despite that, clang and gcc continue to use an abstraction model where the Effective Type of storage that has written as type T and later as type U may later spontaneously revert to T, even if it is never read as any type other than the last written. – Perquisite 30/9, 2022 at 15:32

Further, gcc sometimes treats constructs like if (flag) *(long)ptr = 1; else *(long long)ptr=1; as equivalent to *(long long)ptr=1; and assumes they will never access a long even if flag is 1. Consequently, specifying any optimizations without using -fno-strict-aliasing will yield a language dialect which is should be viewed as only suitable for use with programs that never repurpose storage for use as different types, and I would trust code that specifies that it should only be compiled with -fno-strict-aliasing far more than I would trust code that doesn't specify that. – Perquisite 30/9, 2022 at 15:35

@supercat: Citation needed. That would be a serious bug if it does, and I'm pretty sure it does not do that. It skips the branch and emits a single version of the code for both, but the type-based aliasing knowledge can't assume the type afterwards, and I haven't seen any evidence that it does. If you have such evidence, please post it, as this is material for a critical CVE-worthy codegen bug report. – Gath 30/9, 2022 at 19:50

@R..GitHubSTOPHELPINGICE: See godbolt.org/z/83v4ssrn4 for an example of the described behavior. – Perquisite 30/9, 2022 at 20:1

@R..GitHubSTOPHELPINGICE: Also, see godbolt.org/z/jfv1Ge6v4 for an example of how clang and gcc are unable to abide the response to Defect Report 236. Function test1() writes storage with a long value which should be ignored. Function test2() writes the same storage with a long long and reads it back. Function test3() writes the storage twice with long and reads it back. All strictly conforming, but neither gcc can process it correctly. – Perquisite 30/9, 2022 at 20:37

@supercat: Lovely. Has this been filed on the bug tracker? If not, it's going to be soon... – Gath 30/9, 2022 at 20:45

@R..GitHubSTOPHELPINGICE: You should probably post both bugs on the bug tracker, since I don't have an account there, and am unaware of them having been posted recently. Since bugs I've squawked about that were put on the bug tracker years ago have yet to be fixed, I've given up on worrying about which bugs are on the reporting system or not. Given a choice between correctly supporting corner cases mandated by the Standard or performing optimizations that, while unsound, will "usually" work, I think the gcc maintainers would rather do the latter. – Perquisite 30/9, 2022 at 21:5

@R..GitHubSTOPHELPINGICE: To properly handle these and many other aliasing issues, gcc would need to treat type and data dependencies as a directed acyclic graph, but from what I can tell it instead treats them as an equivalence relation. A DAG-based abstraction would be able to block potential optimizing transforms in the few cases they'd cause trouble while applying them elsewhere, but an equivalence-based one cannot. – Perquisite 30/9, 2022 at 21:15

@supercat: I think another option would be using lower-precision knowledge for TBAA, e.g. just the "mode" (scalar type size) rather than the actual type to infer non-aliasing. – Gath 30/9, 2022 at 21:39

@supercat: Filed as the memorable bug number 107107 :-) – Gath 30/9, 2022 at 21:40

@R..GitHubSTOPHELPINGICE: Or an option to treat pointers, collectively, as one type, and all other types as compatible with each other but not with pointers, which is what CompCertC does, which would allow some optimizations in programs which would otherwise require -fno-strict-aliasing. On the flip side, a compiler which was paranoid in the presence of typecasts or volatile-qualified accesses could perform TBAA much more aggressively in their absence, while still being compatible with most non-portable programs. Add a mode to throw out the "character type" exception and performance... – Perquisite 30/9, 2022 at 22:3

...could often be better than would be allowable in a conforming implementation, since even code which relies upon the character-type exception will almost always have type casts or other indications that a compiler needs to perform actions "literally" unless it can fully understand everything about the pointers involved. – Perquisite 30/9, 2022 at 22:4

@R..GitHubSTOPHELPINGICE: Nice write-up. Have you looked at the other example I posted, which shows another problem with collapsing code, in this case between the second statement of test2() and the second statement of test3(), inclusive? – Perquisite 30/9, 2022 at 22:13

@supercat: I haven't looked at it yet. Do you think it merits a different bug report (i.e. does it likely have a different root cause), or just comments on the same one about other affected constructs? – Gath 30/9, 2022 at 23:9

@supercat: Are you sure just the pointer/non-pointer thing would suffice? I would expect to hit the same bug with branches using void * vs uintptr_t lvalues to modify the object, where the code has perfectly well-defined behavior but breaks due to GCC collapsing the branches to the same code and losing track of the type of the lvalue the store happened through on the abstract machine. – Gath 30/9, 2022 at 23:12

@R..GitHubSTOPHELPINGICE: It's similar in that it reveals type-based aliasing logic that gets broken when operations get consolidated, but the nature of the operations being consolidated is different, and this example breaks in both clang and gcc. – Perquisite 1/10, 2022 at 5:28

@supercat: OK, comments on the first bug seem to agree it might be different so I'll file it separately. Thanks. – Gath 1/10, 2022 at 14:16

@R..GitHubSTOPHELPINGICE: If the compiler merges constructs that affect the bit patterns without accommodating different TBAA implementations, grouping types as equivalent won't fix all such potential problems. On the other hand, if there were an option to e.g. only treated pointers and "everything else" as the only two non-aliasable types, the only programs for which that would not be sufficient would be those that deliberately produced pointers with the same representations as integers or vice versa. – Perquisite 1/10, 2022 at 16:45

@R..GitHubSTOPHELPINGICE: On the other hand, my general perception of gcc is that if there programs which other compilers would process usefully, and which the Standard allows other compilers to process usefully, the maintainers of gcc and clang would rather argue about whether the Standard requires them to behave meaningfully, than acknowledge that a compiler which behaves meaningfully even when not required to do so may for many purposes be more useful than one which would use the Standard as an excuse to behave in meaningless fashion. – Perquisite 1/10, 2022 at 16:49

@supercat: I've now filed the second as gcc.gnu.org/bugzilla/show_bug.cgi?id=107115 – Gath 1/10, 2022 at 20:1

@R..GitHubSTOPHELPINGICE Was the second one filed to LLVM? I cannot find it. – Pull 4/10, 2022 at 19:43

@pmor: If it does get reported to clang, I wonder if the maintainers will make a minimal tweak to fix the test case, or whether they'll actually fix the underlying problem which the test case is designed to illustrate? – Perquisite 5/10, 2022 at 5:50

@Perquisite Created: github.com/llvm/llvm-project/issues/58225. – Pull 7/10, 2022 at 9:48

@pmor: I was being snarky with that last remark, but it will be interesting to see what kind of fix emerges. It seems like people are working on gcc, but I find concerning the quote "Yep, we've been playing whack-a-mole already, so we should just continue and fix places that pop up with this issue" since that suggests a willingness to accept a design that will never be fundamentally sound. – Perquisite 7/10, 2022 at 14:49

@pmor: Ignore my last [deleted] comment. I had was A/B testing the two compiler versions and failed to select the latest. – Perquisite 7/10, 2022 at 15:21

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags