GCC and strict aliasing between arrays of a same type

Asked 15/11, 2016 at 10:4 Answered 15/11, 2016 at 16:10

Context

“Strict aliasing”, named after the GCC optimization, is an assumption by the compiler that a value in memory will not be accessed through an lvalue of a type (the “declared type”) very different from the type the value was written with (the “effective type”). This assumption allows code transformations that would be incorrect if the possibility had to be taken into account that writing to a pointer to float could modify a global variable of type int.

Both GCC and Clang, extracting the most meaning out of a standard description full of dark corners, and having a bias for performance of generated code in practice, assume that a pointer to the int first member of a struct thing does not alias a pointer to the int first member of a struct object:

struct thing { int a; };
struct object { int a; };

int e(struct thing *p, struct object *q) {
  p->a = 1;
  q->a = 2;
  return p->a;
}

Both GCC and Clang infer that the function always return 1, that is, that p and q cannot be aliases for the same memory location:

e:
        movl    $1, (%rdi)
        movl    $1, %eax
        movl    $2, (%rsi)
        ret

As long as one agrees with the reasoning for this optimization, it should be no surprise that p->t[3] and q->t[2] are also assumed to be disjoint lvalues in the following snippet (or rather, that the caller causes UB if they alias):

struct arr { int t[10]; };

int h(struct arr *p, struct arr *q) {
  p->t[3] = 1;
  q->t[2] = 2;
  return p->t[3];
}

GCC optimizes the above function h:

h:
        movl    $1, 12(%rdi)
        movl    $1, %eax
        movl    $2, 8(%rsi)
        ret

So far so good, as long as one sees p->a or p->t[3] as somehow accessing a whole struct thing (resp. struct arr), it is possible to argue that making the locations alias would break the rules laid out in 6.5:6-7. An argument that this is GCC's approach is this message, part of a long thread that also discussed the role of unions in strict aliasing rules.

Question

I have doubts, however, about the following example, in which there is no struct:

int g(int (*p)[10], int (*q)[10]) {
  (*p)[3] = 1;
  (*q)[4] = 2;
  return (*p)[3];
}

GCC versions 4.4.7 through the current version 7 snapshot on Matt Godbolt's useful website optimize function g as if (*p)[3] and (*q)[4] could not alias (or rather, as if the program had invoked UB if they did):

g:
        movl    $1, 12(%rdi)
        movl    $1, %eax
        movl    $2, 16(%rsi)
        ret

Is there any reading of the standard that justifies this very strict approach to strict aliasing? If GCC's optimization here can be justified, would the arguments apply as well to the optimization of functions f and k, which are not optimized by GCC?

int f(int (*p)[10], int (*q)[9]) {
  (*p)[3] = 1;
  (*q)[3] = 2;
  return (*p)[3];
}

int k(int (*p)[10], int (*q)[9]) {
  (*p)[3] = 1;
  (*q)[2] = 2;
  return (*p)[3];
}

I'm willing to take this up with the GCC devs, but I should first decide without I am reporting a correctness bug for function g or a missed optimization for f and k.

Falco answered 15/11, 2016 at 10:4 Comment(9)

@Olaf I am only describing the behavior of Clang and GCC. Please check what they do for yourself, the link to godbolt.org is in the post. – Falco 15/11, 2016 at 10:16

1) I refer explicitly to your second paragraph in the question. Please read the standard carefully. 2) There already are some Q&A here about this subject. 3) If it can prove two pointers do not alias, a compiler is free to exploit that. Please provide more context. Are caller and callee in the same TU? Unless one of p or q is qualified restrict, they can alias. – Hanahanae 15/11, 2016 at 10:19

@Olaf perhaps this example is more convincing that I am describing the actual behavior of GCC and Clang: godbolt.org/g/TcCRwd The expressions &p->a and &q->a definitely have type “pointer to int”, have they not? – Falco 15/11, 2016 at 10:19

@Olaf thanks for the suggestion of reading the standard, this might be useful. I do not think that existing answers cover my question. My question apparently already contains enough context that people will redirect me to the FAQ without reading what my actual question is, so thanks, but I will not add more. Feel free do downvote and not answer if you think it's a bad question. – Falco 15/11, 2016 at 10:25

@Olaf “Are caller and callee in the same TU?” Whether the caller and the caller are in the same translation unit is utterly irrelevant to how the strict aliasing rules are applied. “Unless one of p or q is qualified restrict, they can alias.” As the assembly snippets in the question show, GCC and Clang assume that p and q do not alias. If you report this behavior to the GCC or Clang developers, it will be closed as “NOT A BUG”. I do not know how else I can express this. If you do not believe this behavior exist, just do not answer the question. – Falco 15/11, 2016 at 10:30

Regarding the struct example, the "strict aliasing" rule 6.5/8 says: "a type compatible with the effective type of the object". Then peek at the definition of compatible type in 6.2.7/1. It says that if two structs have the same tag and same members, they are compatible. Do you get the same machine code if you name the two structs the same? I'd expect the compiler to generate more instructions. – Lucero 15/11, 2016 at 13:7

Are you looking for a gcc/clang answer or general C one? If gcc/clang, suggest adding those tags. – Sealskin 15/11, 2016 at 14:41

@chux While Clang optimizes e, it does not optimize the other examples in the question, so there is no indication that Clang developers believe that lv1[2] cannot alias lv2[3] when lv1 and lv2 are lvalues of type int[10]. I will tag it “gcc”. – Falco 15/11, 2016 at 14:48

The pointers in g can alias and point to the same array. There is nothing in the standard that would prohibit that. Of course the assignment (*p)[3] = 1; cannot possibly overlap with the assignment (*q)[4] = 2;, so the compiler doesn't have to reload the value, and can optimize to always return 1. – Pasturage 17/11, 2016 at 8:51

In:

int g(int (*p)[10], int (*q)[10]) {
  (*p)[3] = 1;
  (*q)[4] = 2;
  return (*p)[3];
}

*p and *q are lvalues of array type; If they may overlap, access to them is governed by section 6.5 paragraph 7 (the so called "strict aliasing rule"). However, since their type is the same, that does not present a problem for this code. The standard is however remarkably vague regarding a number of relevant concerns that would be required to give a comprehensive answer to this question, such as:

Do (*p) and (*q) actually necessitate "access" (as the term is used in 6.5p7) to the arrays to which they point? If they do not, it's tempting to take the view that the expressions (*p)[3] and (*q)[4] essentially degrade to pointer arithmetic and dereference of two int *s which can clearly alias. (This isn't an entirely unreasonable standpoint; 6.5.2.1 Array Subscripting says that One of the expressions shall have type ‘‘pointer to complete object type’’, the other expression shall have integer type, and the result has type ‘‘type’’ - so the array lvalue has necessarily degraded to a pointer as per usual conversion rules; the only question is whether the array was accessed before the conversion occurred).
However, to defend the view that (*p)[3] is purely equivalent to *((int *)p + 3), we'd have to show that (*p)[3] doesn't require evaluation of (*p), or that if it does, the access does not have undefined behaviour (or defined but unwanted behaviour). I don't believe there is any justification in the precise wording of the standard to allow that (*p) is not evaluated; this implies that the expression (*p) must not have undefined behaviour if the behaviour of (*p)[3] is defined. So, the question really boils down to whether *p and *q have defined behaviour if they refer to partially overlapping arrays of the same type, and indeed whether it is possible that they can simultaneously do so.

For the definition of the * operator, the standard says:

if it points to an object, the result is an lvalue designating the object

does this mean that the pointer must point to the start of the object? (It seems likely that this is what is meant). Does the object have to have been established somehow before it can be accessed (and does establishing an object disestablish any overlapping object)? If both are the case, *p and *q cannot overlap - since establishing either object would invalidate the other - and so (*p)[3] and (*q)[4] cannot alias.

The problem is that there is no suitable guidance on these questions. In my view, a conservative approach should be taken: do not assume that this kind of aliasing is legal.

In particular, the "effective type" wording in 6.5 suggests a means by which an object of a particular type can be established. It seems like a good bet that this is intended to be definitive; that is, that you cannot establish an object other than by setting its effective type (including by means of it having a declared type), and that access by other types is restricted; further, establishing an object un-establishes any existing overlapping object (to be clear, this is extrapolation, not the actual wording). Thus if (*p)[3] and (*q)[4] could alias, then either p or q does not point to an object, and therefore one of either *p or *q has undefined behaviour.

Roundly answered 15/11, 2016 at 16:10 Comment(3)

I really like the way you reframed the question. Thanks for taking the time to investigate, and write it up. – Falco 17/11, 2016 at 20:2

The Standard defines the term "access" strictly in terms of reads and writes, and defines array accesses strictly in terms of taking the array's address and adding an offset. It would be much better if it defined the use of [] on an array type as a different operation from the use of [] on a pointer type (with identical behaviors in all defined cases, but with some cases where one would be defined and some would not). For example, given int foo[5][4], such a rule would allow a compiler given foo[0][x] to assume it won't alias foo[1][0], but would allow... – Geri 28/11, 2016 at 23:16

...a programmer to access the whole array via *(foo[0]+x). With regard to the case at hand, a compiler could be entitled to assume that expressions which dereference the array types and use subscript notation would not alias, but would need to recognize that those which dereference the array types and use pointer arithmetic might alias. – Geri 28/11, 2016 at 23:18

IMHO, the standard does not allow arrays of definite size to overlap at the same time(*). Draft n1570 says in 6.2.7 Compatible type and composite type (emphasis mine):

§2 All declarations that refer to the same object or function shall have compatible type; otherwise, the behavior is undefined.

§3 A composite type can be constructed from two types that are compatible; it is a type that is compatible with both of the two types and satisfies the following conditions:

If both types are array types, the following rules are applied:

If one type is an array of known constant size, the composite type is an array of that size.
...

As an object shall have its stored value accessed only by an lvalue expression that has a compatible type (simplified reading of 6.5 Expressions §7), you cannot alias arrays of different sizes, nor can you have arrays of the same size that overlap. So, in function g, p and q should either point to the same array or to non-overlapping arrays, which allows the optimization.

For functions f and k, my understanding is that the optimization would be allowed per the standard but has not been implemented by the developers. We must remember that as soon as one of the parameters is a simple pointer, it is allowed to point to any element of the other array and no optimization can occur. So I think that the absence of optimization is just an example of the famous rule for UB: anything can happen, including the expected result.

Integrant answered 15/11, 2016 at 13:8 Comment(2)

I believe that is talking about multiple declarations of the same entity.In the question are two separate entities, p and q, so the quoted paragraphs don't apply. That's not to say I think your answer is necessarily wrong; just that the reasoning doesn't apply to this question. – Roundly 15/11, 2016 at 16:0

You could say that an lvalue of array type doesn't access the array. Such an lvalue decays to pointer when used in any context that would involve an access. For example (*p)[3] = 1; accesses an int using an lvalue of type int , it doesn't access a whole array. – Moderate 15/11, 2016 at 22:23

In:

int g(int (*p)[10], int (*q)[10]) {
  (*p)[3] = 1;
  (*q)[4] = 2;
  return (*p)[3];
}

Do (*p) and (*q) actually necessitate "access" (as the term is used in 6.5p7) to the arrays to which they point? If they do not, it's tempting to take the view that the expressions (*p)[3] and (*q)[4] essentially degrade to pointer arithmetic and dereference of two int *s which can clearly alias. (This isn't an entirely unreasonable standpoint; 6.5.2.1 Array Subscripting says that One of the expressions shall have type ‘‘pointer to complete object type’’, the other expression shall have integer type, and the result has type ‘‘type’’ - so the array lvalue has necessarily degraded to a pointer as per usual conversion rules; the only question is whether the array was accessed before the conversion occurred).
However, to defend the view that (*p)[3] is purely equivalent to *((int *)p + 3), we'd have to show that (*p)[3] doesn't require evaluation of (*p), or that if it does, the access does not have undefined behaviour (or defined but unwanted behaviour). I don't believe there is any justification in the precise wording of the standard to allow that (*p) is not evaluated; this implies that the expression (*p) must not have undefined behaviour if the behaviour of (*p)[3] is defined. So, the question really boils down to whether *p and *q have defined behaviour if they refer to partially overlapping arrays of the same type, and indeed whether it is possible that they can simultaneously do so.