GCC Aliasing Checks w/Restrict pointers

5

35

Consider the following two snippets:

#define ALIGN_BYTES 32
#define ASSUME_ALIGNED(x) x = __builtin_assume_aligned(x, ALIGN_BYTES)

void fn0(const float *restrict a0, const float *restrict a1,
         float *restrict b, int n)
{
    ASSUME_ALIGNED(a0); ASSUME_ALIGNED(a1); ASSUME_ALIGNED(b);

    for (int i = 0; i < n; ++i)
        b[i] = a0[i] + a1[i];
}

void fn1(const float *restrict *restrict a, float *restrict b, int n)
{
    ASSUME_ALIGNED(a[0]); ASSUME_ALIGNED(a[1]); ASSUME_ALIGNED(b);

    for (int i = 0; i < n; ++i)
        b[i] = a[0][i] + a[1][i];
}

When I compile these with gcc-4.7.2 -Ofast -march=native -std=c99 -ftree-vectorizer-verbose=5 -S test.c -Wall, I find that GCC inserts aliasing checks for the second function.

How can I prevent this, so that the resulting assembly for fn1 is the same as that for fn0? (When the number of parameters increases from three to, say, 30, the argument-passing approach (fn0) becomes cumbersome and the number of aliasing checks in the fn1 approach becomes ridiculous.)

Assembly (x86-64, AVX capable chip); aliasing cruft at .LFB10

fn0:
.LFB9:
    .cfi_startproc
    testl   %ecx, %ecx
    jle .L1
    movl    %ecx, %r10d
    shrl    $3, %r10d
    leal    0(,%r10,8), %r9d
    testl   %r9d, %r9d
    je  .L8
    cmpl    $7, %ecx
    jbe .L8
    xorl    %eax, %eax
    xorl    %r8d, %r8d
    .p2align 4,,10
    .p2align 3
.L4:
    vmovaps (%rsi,%rax), %ymm0
    addl    $1, %r8d
    vaddps  (%rdi,%rax), %ymm0, %ymm0
    vmovaps %ymm0, (%rdx,%rax)
    addq    $32, %rax
    cmpl    %r8d, %r10d
    ja  .L4
    cmpl    %r9d, %ecx
    je  .L1
.L3:
    movslq  %r9d, %rax
    salq    $2, %rax
    addq    %rax, %rdi
    addq    %rax, %rsi
    addq    %rax, %rdx
    xorl    %eax, %eax
    .p2align 4,,10
    .p2align 3
.L6:
    vmovss  (%rsi,%rax,4), %xmm0
    vaddss  (%rdi,%rax,4), %xmm0, %xmm0
    vmovss  %xmm0, (%rdx,%rax,4)
    addq    $1, %rax
    leal    (%r9,%rax), %r8d
    cmpl    %r8d, %ecx
    jg  .L6
.L1:
    vzeroupper
    ret
.L8:
    xorl    %r9d, %r9d
    jmp .L3
    .cfi_endproc
.LFE9:
    .size   fn0, .-fn0
    .p2align 4,,15
    .globl  fn1
    .type   fn1, @function
fn1:
.LFB10:
    .cfi_startproc
    testq   %rdx, %rdx
    movq    (%rdi), %r8
    movq    8(%rdi), %r9
    je  .L12
    leaq    32(%rsi), %rdi
    movq    %rdx, %r10
    leaq    32(%r8), %r11
    shrq    $3, %r10
    cmpq    %rdi, %r8
    leaq    0(,%r10,8), %rax
    setae   %cl
    cmpq    %r11, %rsi
    setae   %r11b
    orl %r11d, %ecx
    cmpq    %rdi, %r9
    leaq    32(%r9), %r11
    setae   %dil
    cmpq    %r11, %rsi
    setae   %r11b
    orl %r11d, %edi
    andl    %edi, %ecx
    cmpq    $7, %rdx
    seta    %dil
    testb   %dil, %cl
    je  .L19
    testq   %rax, %rax
    je  .L19
    xorl    %ecx, %ecx
    xorl    %edi, %edi
    .p2align 4,,10
    .p2align 3
.L15:
    vmovaps (%r9,%rcx), %ymm0
    addq    $1, %rdi
    vaddps  (%r8,%rcx), %ymm0, %ymm0
    vmovaps %ymm0, (%rsi,%rcx)
    addq    $32, %rcx
    cmpq    %rdi, %r10
    ja  .L15
    cmpq    %rax, %rdx
    je  .L12
    .p2align 4,,10
    .p2align 3
.L20:
    vmovss  (%r9,%rax,4), %xmm0
    vaddss  (%r8,%rax,4), %xmm0, %xmm0
    vmovss  %xmm0, (%rsi,%rax,4)
    addq    $1, %rax
    cmpq    %rax, %rdx
    ja  .L20
.L12:
    vzeroupper
    ret
.L19:
    xorl    %eax, %eax
    jmp .L20
    .cfi_endproc
Leveloff answered 25/3, 2013 at 11:21 Comment(8)
Does the option --param vect-max-version-for-alias-checks=n help at all? - Trihedron
It helps when a lot of pointers are in play (often GCC will just give up trying to vectorize a function unless n ~ 100). However, I am wondering how I can convince GCC that these checks are pointless. - Leveloff
Could you show the assembly your compiler generates? - Ferricyanide
This paper might be of help: cs.cmu.edu/~dkoes/research/techreport.pdf - Trihedron
@Freddie Witherden So what about -fno-strict-aliasing now? Did it help you? - Maupassant
This does not compile: 'void fn1(const float *restrict *restrict a, float *restrict b, int n)' due to the repeated '*restrict' modifier. - Reconstruct
@teppic: The paper looks interesting, but I think the difficulty with restrict centers mainly around ambiguities regarding the phrase "based upon", and the "lifetime" of guarded values in the more complex cases. If "definitely based upon" is defined transitively, and two pointers may be assumed not to alias if either (1) one is definitely based upon some restrict-qualified pointer p, and the other is definitely based upon some pointer q whose existence predates p, or (2) one is definitely based upon some restrict-qualified pointer p, all pointers based upon p can be enumerated, ... - Viscounty
@teppic: ...and q is not among them, such a definition should be easy for both programmers and compilers to reason about, since "based upon" relationships would be based upon program structure rather than pointer values. - Viscounty
1

There is a way to tell the compiler to stop checking for aliasing. Add the line

#pragma GCC ivdep

immediately before the loop you want to vectorize. For more information, see:

https://gcc.gnu.org/onlinedocs/gcc-4.9.2/gcc/Loop-Specific-Pragmas.html
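
A minimal sketch of where the pragma goes, assuming GCC 4.9 or later (older releases do not know the pragma). The name fn1_ivdep and the local copies a0 and a1 are illustrative and not part of the original code. Note that ivdep is a promise made by the programmer: if the arrays really do overlap, the vectorized loop may compute wrong results.

```c
#include <assert.h>

/* Hypothetical variant of fn1 from the question; the name is illustrative. */
void fn1_ivdep(const float *restrict *restrict a, float *restrict b, int n)
{
    assert(n >= 0);

    /* Load the row pointers once so the loop body only uses plain locals. */
    const float *a0 = a[0];
    const float *a1 = a[1];

    /* Tells GCC to assume the loop has no loop-carried dependences that
       would prevent vectorization. Unknown to compilers before GCC 4.9. */
#pragma GCC ivdep
    for (int i = 0; i < n; ++i)
        b[i] = a0[i] + a1[i];
}
```

Whether this removes every run-time check for a given GCC release is best confirmed by inspecting the vectorizer report or the generated assembly.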

Asyndeton answered 19/1, 2015 at 12:13 Comment(1)
We're at 4.7.2, aren't we? 4.7.4 gives me: warning: ignoring #pragma GCC ivdep [-Wunknown-pragmas] - Jamboree
0

Can this help?

void fn1(const float **restrict a, float *restrict b, int n)
{
    const float * restrict a0 = a[0];
    const float * restrict a1 = a[1];

    ASSUME_ALIGNED(a0); ASSUME_ALIGNED(a1); ASSUME_ALIGNED(b);

    for (int i = 0; i < n; ++i)
        b[i] = a0[i] + a1[i];
}

Edit: second try :). With information from http://locklessinc.com/articles/vectorize/

gcc -ffast-math ...
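
A further variation on the same idea, sketched here without having been tested on GCC 4.7: keep the loop in a separate function whose parameters are restrict-qualified directly (as in fn0) and have the array-taking function forward to it. The names add_kernel and fn1_wrapped are made up for illustration, and __attribute__((noinline)) is a GCC extension used on the assumption that keeping the kernel out of line preserves the fn0-style code generation shown in the question.

```c
#include <assert.h>

/* Same loop as fn0 in the question. Kept out of line so that it is
   compiled as a standalone function with restrict-qualified parameters. */
__attribute__((noinline))
static void add_kernel(const float *restrict a0, const float *restrict a1,
                       float *restrict b, int n)
{
    for (int i = 0; i < n; ++i)
        b[i] = a0[i] + a1[i];
}

/* Hypothetical wrapper: unpacks the pointer table and forwards to the kernel. */
void fn1_wrapped(const float *const *a, float *b, int n)
{
    assert(n >= 0);
    add_kernel(a[0], a[1], b, n);
}
```

This does not remove the long parameter list the question complains about; it only confines it to one internal function.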

Northerner answered 25/3, 2013 at 12:33 Comment(1)
Sadly not; GCC still emits aliasing checks. - Leveloff
0

Well, what about the flag

-fno-strict-aliasing

?

If I understood you correctly, you just want to know how to turn these checks off? If that's all, passing this flag on the gcc command line should help.

EDIT:

Regarding your comment: isn't it forbidden to use restrict pointers to const types?

This is from ISO/IEC 9899 (6.7.3.1 Formal definition of restrict):

1.

Let D be a declaration of an ordinary identifier that provides a means of designating an object P as a restrict-qualified pointer to type T.

4.

During each execution of B, let L be any lvalue that has &L based on P. If L is used to access the value of the object X that it designates, and X is also modified (by any means), then the following requirements apply: T shall not be const-qualified. Every other lvalue used to access the value of X shall also have its address based on P. Every access that modifies X shall be considered also to modify P, for the purposes of this subclause. If P is assigned the value of a pointer expression E that is based on another restricted pointer object P2, associated with block B2, then either the execution of B2 shall begin before the execution of B, or the execution of B2 shall end prior to the assignment. If these requirements are not met, then the behavior is undefined.

A much more interesting point, the same as with register, is this one:

6.

A translator is free to ignore any or all aliasing implications of uses of restrict.

So if you can't find a command-line option that forces gcc to do so, it's probably not possible, because the standard does not require the compiler to offer one.
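
For reference, a small sketch of what -fno-strict-aliasing does and does not cover. The function names are made up for illustration. The flag controls type-based alias analysis, which is a separate mechanism from restrict.

```c
#include <assert.h>

/* Type-based ("strict") aliasing: an int lvalue and a float lvalue are
   assumed not to designate the same object, so the compiler may return 1
   without reloading *i. -fno-strict-aliasing withdraws that assumption. */
int store_both(int *i, float *f)
{
    *i = 1;
    *f = 2.0f;
    return *i;
}

/* Same-type pointers: type-based analysis cannot tell them apart.
   Here it is restrict, not the strict-aliasing rule, that states
   dst and src do not overlap. */
void scale(float *restrict dst, const float *restrict src, int n)
{
    assert(n >= 0);
    for (int k = 0; k < n; ++k)
        dst[k] = 2.0f * src[k];
}
```

In the question all pointers refer to float data, so the situation resembles scale rather than store_both.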

Maupassant answered 9/8, 2013 at 8:2 Comment(10)
@Freddie Witherden I'm not very good at assembler, but I know the lines you mentioned will be gone from the .o with this flag (I have tested that in my cases so far). I also know that this flag tells the compiler not to do any optimization based on aliasing rules, so I assume it won't check any aliasing with this flag if it isn't allowed to act on those checks. So this should solve your problem, yes. - Maupassant
@Freddie Witherden Just to be sure I understand you: what you want is to prevent the compiler from doing any optimization based on strict aliasing, isn't it? Because that is exactly this parameter's purpose. So it should do what you want, if I understood you correctly. - Maupassant
Not exactly. I wish for the compiler to respect the restrict qualifier on the variables as passed. Although aliasing-related, it is different from GCC's strict aliasing. - Leveloff
@Freddie Witherden But isn't it forbidden to use a const type restrict pointer, according to 6.7.3.1 Formal definition of restrict? - Maupassant
s/restrict/__restrict__/; s/std=c99/std=gnu99/; My interest here is purely to get GCC to produce the best possible assembly. - Leveloff
@Freddie Witherden Yeah, and as I said, you probably can't avoid these checks. I mean, the standard says they don't have to give that option, and I know they don't for register, so why should they for restrict? - Maupassant
But GCC evidently does give the option of avoiding the checks. Just compare the assembly for the above two functions: fn0 does not have the checks while fn1 does. I am interested in workarounds/flags that coax GCC into producing the same assembly for both. - Leveloff
@Freddie Witherden So show me the source where you got the information that gcc gives you the option to force it to respect all your restrict qualifiers. My point of view is, as the standard says: A translator is free to ignore any or all aliasing implications of uses of restrict. You can't influence its use unless the compiler explicitly says: "We are offering you the option this way: [...]" - Maupassant
Unfortunately, the definition of "based upon" seems to have been designed to yield sensible results for most cases, without consideration for corner cases. Nothing in the Standard would imply any intention to forbid equality comparisons between restrict-qualified pointers and other pointers that are coincidentally equal, and in fact there are some cases where such comparisons might serve a useful purpose (e.g. having one loop that accesses p and a static object x to handle cases where p doesn't identify x, and another loop that accesses storage only through p, to... - Viscounty
...handle cases where p and x happen to be the same). This would cause no problem if the notion of "based upon" involved applying a transitive relation where expressions like ptr+intVal transitively yield pointers based upon p, but neither clang nor gcc will meaningfully handle constructs that involve pointer equality comparisons between restrict-qualified pointers and other pointers that might happen to match them. - Viscounty
0

I apologize in advance, because I cannot reproduce results with GCC 4.7 on my machine, but there are two possible solutions.

  1. Use typedef to compose a * restrict * restrict properly. This is, according to a former colleague who develops the LLVM compiler, the single exception to typedef behaving like the preprocessor in C, and it exists to allow the anti-aliasing behavior you desire.

    I attempted this below but I'm not sure I succeeded. Please fact-check my attempt carefully.

  2. Use the syntax described in the answers to using restrict qualifier with C99 variable length arrays (VLAs).

    I attempted this below but I'm not sure I succeeded. Please fact-check my attempt carefully.

Here is the code I used to perform my experiments, but I was not able to determine conclusively if either of my suggestions worked as desired.

#define ALIGN_BYTES 32
#define ASSUME_ALIGNED(x) x = __builtin_assume_aligned(x, ALIGN_BYTES)

void fn0(const float *restrict a0, const float *restrict a1,
         float *restrict b, int n)
{
    ASSUME_ALIGNED(a0); ASSUME_ALIGNED(a1); ASSUME_ALIGNED(b);

    for (int i = 0; i < n; ++i)
        b[i] = a0[i] + a1[i];
}

#if defined(ARRAY_RESTRICT)
void fn1(const float *restrict a[restrict], float * restrict b, int n)
#elif defined(TYPEDEF_SOLUTION)
typedef float * restrict frp;
void fn1(const frp *restrict a, float *restrict b, int n)
#else
void fn1(const float *restrict *restrict a, float *restrict b, int n)
#endif
{
    //ASSUME_ALIGNED(a[0]); ASSUME_ALIGNED(a[1]); ASSUME_ALIGNED(b);

    for (int i = 0; i < n; ++i)
        b[i] = a[0][i] + a[1][i];
}
Woodchopper answered 28/4, 2015 at 21:50 Comment(3)
On gcc-4.7.4 both ARRAY_RESTRICT and TYPEDEF_SOLUTION generate the same assembly for fn1 as the default case (aliasing checks). - Leveloff
Yeah, that's what I saw, too, but I don't think GCC 4.7 is the most aggressive compiler for auto-vectorization. - Woodchopper
Given int *restrict *restrict p, what would mark the beginning and end of the interval during which objects that would be accessed via **p could not be accessed via other means? - Viscounty
0

Nearly all of the performance advantages that could be reaped via the use of restrict involve one of two usage patterns:

  1. A restrict qualifier applied directly to a named function argument [as opposed to something pointed to thereby]

  2. A restrict qualifier applied directly to a named automatic object which has an initializer.

In both of those contexts, it would be clear that the qualifier "guards" storage accessed by pointers based upon the initial value of the named object, and that the term of such guarding extends from the time the object is initialized until the end of its lifetime.
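
The two patterns can be sketched as follows. The function names sum_params and sum_locals are hypothetical and chosen only for this illustration; the loop body mirrors the one in the question.

```c
#include <assert.h>

/* Pattern 1: restrict applied directly to named function parameters.
   The guard lasts for the whole execution of the function body. */
void sum_params(const float *restrict x, const float *restrict y,
                float *restrict out, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i)
        out[i] = x[i] + y[i];
}

/* Pattern 2: restrict applied directly to named automatic objects that
   have initializers. The guard runs from each initializer to the end of
   the enclosing block. */
void sum_locals(const float *const *rows, float *out, int n)
{
    assert(n >= 0);
    const float *restrict x = rows[0];
    const float *restrict y = rows[1];
    float *restrict o = out;
    for (int i = 0; i < n; ++i)
        o[i] = x[i] + y[i];
}
```

In both functions the caller must still ensure that the output array does not overlap either input; restrict records that promise, it does not verify it.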

If a restrict qualifier is used in any other circumstance, it's far less clear what the semantics should be. While the Standard attempts to specify how other types should work, I'm unaware of any compilers trying to apply them.

Given, for example:

extern int x,y;

int *xx = &x, *yy = &y;
int *restrict *restrict pp;

pp = &xx;
int *q = *pp;
*q = 1;
pp = &yy;
... other code

If q is never used after the *q=1; shown above, should the "restrict" qualifier on *pp continue to guard x even after pp itself is changed to point to yy? Is there any evidence that the Committee has considered such issues and reached any consensus, or that compiler writers attempt to meaningfully handle such cases?

Meaningful handling of the restrict qualifier requires that the "guarded pointer value" established thereby has a well-defined lifetime. Trying to handle cases beyond the two described above would require substantial effort while offering relatively minimal benefit.

If the example code were changed to use a declaration int *restrict q = *pp;, then it would be clear that the value of x would be protected in "other code" if it was within the scope of q, but that would be true regardless of whether the compiler recognized the outer-level restrict qualifier on pp. So why bother with such complications?
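
A compilable sketch of that modified example, with two assumptions that are mine rather than part of the original snippet: the outer-level restrict on pp is dropped, and q is placed in an inner block so that the start and end of the guarded interval are explicit. The function name guarded_store is hypothetical.

```c
#include <assert.h>

int x, y;

int guarded_store(void)
{
    int *xx = &x, *yy = &y;
    int **pp = &xx;      /* outer-level restrict deliberately omitted */

    assert(*pp == &x);

    {
        /* The guard on x begins at this initializer ... */
        int *restrict q = *pp;
        *q = 1;
    }   /* ... and ends here, where q goes out of scope. */

    pp = &yy;
    **pp = 2;            /* stands in for "other code"; modifies y only */
    return x + y;
}
```

Inside the inner block x is accessed only through q, which is what the restrict qualifier on q requires; once the block ends, x may again be accessed directly.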

Viscounty answered 20/10, 2022 at 15:52 Comment(0)
