Using restrict to express possible partial overlap
Asked Answered
B

3

0

Given type definitions

struct a { int a; };
struct b { int b; struct a ba;};

and a function taking struct a* a and struct b* b, the type information expresses possible overlap between a->a and b->ba.a but no overlap on b->b.

Therefore in a function such as:

int possibleOverlapOn2ndInt(struct a *a, struct b *b){
    b->b  = 0xbb; //unaliased
    b->ba.a  = 0xba; //may alias a->a
    a->a  = 0xaa; //may alias b->ba.a;
    return b->b /*0xbb*/ + b->ba.a;
}

compilers need not reload b->b in the return line and instead they may substitute the stored constant. Gcc and clang indeed do this.

I am curious if this case of possible partial overlap could be expressed without via restrict without using structs.

I've tried:

int possibleOverlapOn2ndInt_(int a[1], int b[2]){ //<=> (int *a, int *b)
    int *restrict bp = &b[0];
    bp[0] = 0xbb; //unaliased
    b[1] = 0xba; //may alias a[0]
    a[0] = 0xaa; //may alias b[1]
    return bp[0] /*0xbb?*/ + b[1];
}

but got no effect from having the restricted pointer there. https://godbolt.org/z/9Y7zz37rs

Am I using restrict wrong here is it the compilers not optimizing as well as they could?

Bun answered 27/11, 2022 at 14:26 Comment(0)
J
2

Am I using restrict wrong here is it the compilers not optimizing as well as they could?

restrict-qualification puts constraints on program behavior intended to allow optimizations that otherwise might be assumed unsafe on the basis of possible aliasing. In any particular case, however, that does not imply that the compiler has missed an optimization opportunity if it generates the same code both with and without restrict. Therefore, I'll focus mainly on the first part of the question.

These are the constraints associated with use of restrict, from C17 6.7.3.1, somewhat contextualized to your example:

Let D be a declaration of an ordinary identifier that provides a means of designating an object P as a restrict-qualified pointer to type T.

D is int *restrict bp = &b[0];.
P is bp.
T is int.

[...] let B denote the block.

B is the whole body of function possibleOverlapOn2ndInt_().

In what follows, a pointer expression E is said to be based on object P if (at some sequence point in the execution of B prior to the evaluation of E) modifying P to point to a copy of the array object into which it formerly pointed would change the value of E.

No such expressions E appear in possibleOverlapOn2ndInt_(), other than bp itself. In particular, neither a nor b is such an expression.

During each execution of B, let L be any lvalue that has &L based on P.

Neither a[0] nor b[0] nor b[1] satisfies that description. The only such lvalue appearing in possibleOverlapOn2ndInt_() is bp[0].

If L is used to access the value of the object X that it designates,

bp[0] is used to access the object it designates.

and X is also modified (by any means),

The object designated by bp[0] is modified (via bp[0] itself).

then the following requirements apply: [...] Every other lvalue used to access the value of X shall also have its address based on P.

This would be violated if a[0] aliased bp[0] (and therefore also b[0]) in any execution of possibleOverlapOn2ndInt_().

[...] If these requirements are not met, then the behavior is undefined.

If the compiler assumed that the behavior of every execution of possibleOverlapOn2ndInt_() has defined behavior, including with respect to restrict qualification, then it could generate code as if the function were written like so:

int possibleOverlapOn2ndInt_2(int a[1], int b[2]) {
    b[0] = 0xbb;
    b[1] = 0xba;  // cannot alias b[0]
    a[0] = 0xaa;  // must not alias b[0], may alias b[1]
    return 0xbb + b[1];
}

The compiler is by no means required to make use of that freedom, however.

Jadajadd answered 27/11, 2022 at 15:15 Comment(11)
Note that the definition of "based upon" has some rather counter-intuitive corner cases. For example, given int *restrict p; ... if (p == somethingNotBasedUponP) { *p = 1;}; the pointer used in the assignment might not be regarded as equivalent to restrict p, since replacing p with a pointer to a copy of the associated data wouldn't result in the store being performed with a different target address.Gin
I'm not following, @supercat. The pointer used in the assignment in your example is p. p is certainly based on p by the definition in the language spec, regardless of whether *p is actually accessed. In the event that p == somethingNotBasedUponP evaluates to 1, and therefore the assignment is executed, C says that the program's behavior is undefined if expression *somethingNotBasedUponP is used to access the object it designates during the same execution of the appropriate containing block. I don't see how any of that suggests that the two appearances of p are not equivalent.Jadajadd
For a pointer expression within an if to be based upon the object restrict p outside the if, there must be some situation where changing the value of p outside the if would change the value of the expression inside. In an example like godbolt.org/z/57WW43jjn there's no way that could happen. While it would seem absurd to suggest that the pointer expression p inside the if wouldn't be based upon the value of pointer objectg restrict p outside, both clang and gcc assume that a pointer that can't change in response to a change in p as though it isn't "based upon" p.Gin
@supercat, what I think what you're talking about is that in your example, the two compilers you tested generate instructions to read the address of the assignment destination from the storage for x instead of from the storage for p, after having tested that the (address) values stored in those locations are the same. I would agree that that's a bit quirky, but I wouldn't characterize it as the one appearance of expression p not being equivalent to the other. And supposing no UB of the overall program, I don't see how the generated code would produce any unexpected behavior.Jadajadd
Passing the addres of x and a value of 0, the code as written would access *p using the lvalue p[i] outside the loop and *p inside the loop, and set that location to 2 and yet return 1, a behavior I'd regard as unexpected, in a situation that would only constitute UB if the pointer expression evaluated inside the if is regarded as not based upon restrict p. The definition of "based upon" was certainly not intended to be interpreted the way clang and gcc treat it, but unless or until the Standard recognizes a dialect that offers stronger behavioral guarantees than clang and gcc...Gin
...uphold, I don't see any way to prevent clang and gcc from continuously defining deviancy down. If an "ordinary" reading of part of the Standard would say that a construct has defined behavior, but it's possible to concoct a vaguely plausible interpretation under which it could be classified as UB, along with some program where the latter interpretation would allow some "optimization" the former would not, clang and gcc will often place higher priority on performing the optimization than on trying to be compatible with reasonable programmer expectations.Gin
I see, @supercat. I guess the compilers have used the success of the equality test to decide that they may convert the store via p to a store via x, and then used the restrict qualification of p and the fact that there isn't (any longer) a store via p to conclude that p[i] does not need to be re-read. But that's not a counter-intuitive corner case in the spec, it's an incorrect optimization. Nevertheless, users still should watch out either way.Jadajadd
Parts of the Standard are designed to allow compilers to perform incorrect optimizations in cases that might not matter, on the presumption that they'll make a bona fide effort to only apply them only in cases that, in actuality, do not matter. Even without the "One Program Rule" loophole, I wouldn't view the clang/gcc behavior here as being non-conforming, but instead view it as one of the many situations which the Committee expected compilers to handle correctly with or without a mandate, but clang and gcc interpreted the lack of a mandate as an invitation to optimize erroneously.Gin
@supercat, I created a correct, strictly-conforming program around your function. My gcc/glibc-based C implementation accepts it, and produces output consistent with the C language specification ("2") when optimization is not enabled. But my C implementation produces output inconsistent with the C language specification ("1") when when optimization is enabled. Regardless of whether you characterize that as non-conformance, I can and do characterize it as a symptom of a bug in my compiler.Jadajadd
The issue has languished for years in the bug reporting systems of both clang and gcc for many years, and I think the maintainers regard it as the designed behavior. The only way I can see such issues ever being resolved would be for the Standard to abandon efforts at compromising between optimization and semantics, and recognize an "aggressive compiler" dialect in which clang/gcc optimizations are valid, but limit such optimizations in programs that start with #if __DIRECTIVE_NAME_TBD #error Unsuitable compiler configuration #endif. Programmers could then start prefixing,...Gin
...programs that are actually supposed to work with those three lines, while the authors of clang and gcc can howl at the moon about how programs that clang and gcc would process nonsensically if allowed to do so, but which they can now only compile with optimizations disabled, are "broken".Gin
G
1

From what I can tell, while clang and gcc are sometimes extremely aggressive in their treatment of restrict qualifiers on function arguments, treating as UB some situations at least some authors of the Standard probably intended to define, neither seems to do apply even the most straightforward possible optimizations when the qualifier is used elsewhere. For example, given:

int x[10];
int test1(int *restrict p, int i)
{
    if (p+i != x)
        return 0;
    p[0] = 1;
    p[i] = 2;
    return 23 * p[0];
}
int test2(int *p, int *q)
{
    int *restrict pp = p;
    *pp = 1;
    *q = 2;
    return 23 * *pp;
}

both clang and gcc will generate code for test1 that returns 23 in the case where i is zero and p happens to equal x, but both will generate code for test2 that reads *pp and multiplies the result by 23, rather than recognizing that pp and q cannot alias.

Gin answered 28/11, 2022 at 19:6 Comment(1)
That's an interesting example. Thanks! I got 46 from gcc on test1(x,0), and 23 from clang. Both (un)equally wrong.Bun
A
0

Use it in the parameter list. bp is not restricted as it holds reference from the not-restricted pointer.

int possibleOverlapOn2ndInt_(int * restrict a, int *b){ //<=> (int *a, int *b)
    b[0] = 0xbb; //unaliased
    b[1] = 0xba; //may alias a[0]
    a[0] = 0xaa; //may alias b[1]
    return b[0] /*0xbb?*/ + b[1];

https://godbolt.org/z/89jb5hd4c

Maybe I misunderstood the question.

Atherosclerosis answered 27/11, 2022 at 14:43 Comment(2)
Using it there expresses "no possibility of overlap at all". The modified function returns a constant (0xbb+0xba). What I want is the same codegen as with the struct case: not a constant but 0xbb+b[1] where b[1] must get reloaded because of the possibility of overlap with a[0].Bun
@PSkocik code with no arrtibutes does exactly what you want. I do not understand why you need restrict.Atherosclerosis

© 2022 - 2025 — McMap. All rights reserved.