using restrict qualifier with C99 variable length arrays (VLAs)
Asked Answered
K

2

4

I am exploring how different implementations of simple loops in C99 auto-vectorize based upon the function signature.

Here is my code:

/* #define PRAGMA_SIMD _Pragma("simd") */
#define PRAGMA_SIMD

#ifdef __INTEL_COMPILER
#define ASSUME_ALIGNED(a) __assume_aligned(a,64)
#else
#define ASSUME_ALIGNED(a)
#endif

#ifndef ARRAY_RESTRICT
#define ARRAY_RESTRICT
#endif

void foo1(double * restrict a, const double * restrict b, const double * restrict c) 
{ 
    ASSUME_ALIGNED(a);
    ASSUME_ALIGNED(b);
    ASSUME_ALIGNED(c);
    PRAGMA_SIMD
    for (int i = 0; i < 2048; ++i) {
        if (c[i] > 0) {
            a[i] = b[i];
        } else {
            a[i] = 0.0;
        } 
    }
}

void foo2(double * restrict a, const double * restrict b, const double * restrict c) 
{ 
    ASSUME_ALIGNED(a);
    ASSUME_ALIGNED(b);
    ASSUME_ALIGNED(c);
    PRAGMA_SIMD
    for (int i = 0; i < 2048; ++i) {
        a[i] = ((c[i] > 0) ? b[i] : 0.0);
    }
}

/* Undetermined size version */

void foo3(int n, double * restrict a, const double * restrict b, const double * restrict c) 
{ 
    ASSUME_ALIGNED(a);
    ASSUME_ALIGNED(b);
    ASSUME_ALIGNED(c);
    PRAGMA_SIMD
    for (int i = 0; i < n; ++i) {
        if (c[i] > 0) {
            a[i] = b[i];
        } else {
            a[i] = 0.0;
        } 
    }
}

void foo4(int n, double * restrict a, const double * restrict b, const double * restrict c) 
{ 
    ASSUME_ALIGNED(a);
    ASSUME_ALIGNED(b);
    ASSUME_ALIGNED(c);
    PRAGMA_SIMD
    for (int i = 0; i < n; ++i) {
        a[i] = ((c[i] > 0) ? b[i] : 0.0);
    }
}

/* Static array versions */

void foo5(double ARRAY_RESTRICT a[2048], const double ARRAY_RESTRICT b[2048], const double ARRAY_RESTRICT c[2048]) 
{ 
    ASSUME_ALIGNED(a);
    ASSUME_ALIGNED(b);
    ASSUME_ALIGNED(c);
    PRAGMA_SIMD
    for (int i = 0; i < 2048; ++i) {
        if (c[i] > 0) {
            a[i] = b[i];
        } else {
            a[i] = 0.0;
        } 
    }
}

void foo6(double ARRAY_RESTRICT a[2048], const double ARRAY_RESTRICT b[2048], const double ARRAY_RESTRICT c[2048]) 
{ 
    ASSUME_ALIGNED(a);
    ASSUME_ALIGNED(b);
    ASSUME_ALIGNED(c);
    PRAGMA_SIMD
    for (int i = 0; i < 2048; ++i) {
        a[i] = ((c[i] > 0) ? b[i] : 0.0);
    }
}

/* VLA versions */

void foo7(int n, double ARRAY_RESTRICT a[n], const double ARRAY_RESTRICT b[n], const double ARRAY_RESTRICT c[n]) 
{ 
    ASSUME_ALIGNED(a);
    ASSUME_ALIGNED(b);
    ASSUME_ALIGNED(c);
    PRAGMA_SIMD
    for (int i = 0; i < n; ++i) {
        if (c[i] > 0) {
            a[i] = b[i];
        } else {
            a[i] = 0.0;
        } 
    }
}

void foo8(int n, double ARRAY_RESTRICT a[n], const double ARRAY_RESTRICT b[n], const double ARRAY_RESTRICT c[n]) 
{ 
    ASSUME_ALIGNED(a);
    ASSUME_ALIGNED(b);
    ASSUME_ALIGNED(c);
    PRAGMA_SIMD
    for (int i = 0; i < n; ++i) {
        a[i] = ((c[i] > 0) ? b[i] : 0.0);
    }
}

When I compile with

$ icc -O3 -std=c99 -opt-report5 -mavx -S foo.c 
icc: remark #10397: optimization reports are generated in *.optrpt files in the output location

I see that the VLA cases are not auto-vectorized, but when I add the flag to assert no aliasing -fno-alias, they are. Thus, I conclude that I should prescribe this in the source, so I attempt to do that by compiling with

$ icc -O3 -std=c99 -opt-report5 -mavx -DARRAY_RESTRICT=restrict -S foo.c 
icc: remark #10397: optimization reports are generated in *.optrpt files in the output location

The compiler error output includes

foo.c(98): error: "restrict" is not allowed
void foo7(int n, double ARRAY_RESTRICT a[n], const double ARRAY_RESTRICT b[n], 
const double ARRAY_RESTRICT c[n]) 

             ^

but as you can see, restrict is not allowed on my VLA arguments.

So my question is: is there no way to assert no aliasing of VLA in ISO C?

Note that I can assert no aliasing in the source code using pragmas - e.g. simd, omp simd, ivdep etc. - and get the auto-vectorization that I want but these aren't ISO C.

In this context, ISO C means the most recent version of C, which of course is C11 as of the writing of this post.

Katmandu answered 17/3, 2015 at 0:7 Comment(5)
The arrays in foo5() are not really VLAs.Sublittoral
The syntax error appears for foo[5678] and foo[78] have VLA arguments. While foo[56] don't use VLAs, the same question applies regarding the use of the restrict qualifier.Katmandu
Does ISO/IEC 9899:2011 Section 6.7.3.1 Formal definition of restrict help: Let D be a declaration of an ordinary identifier that provides a means of designating an object P as a restrict-qualified pointer to type T. Do the arrays a, b and c satisfy that definition. I confess I'm not sure, but I think not.Sublittoral
Also, §6.7.6.3 has Example 5 which says the following function prototype declarators are equivalent: void f(double (* restrict a)[5]);, void f(double a[restrict][5]);, void f(double a[restrict 3][5]);, and void f(double a[restrict static 3][5]);. You may need to do some chasing from there...Sublittoral
This is the only place in the standard where restrict appears associated directly with array types. §6.7.6 is on declarators generally, and §6.7.6.2 on array declarators, and it looks to me as though the restrict has to appear inside the first component of the array dimension. In your context, it should be: void foo7(int n, double a[ARRAY_RESTRICT n], const double b[ARRAY_RESTRICT n], const double c[ARRAY_RESTRICT n]), I believe (but I wouldn't have believed without seeing the examples in the standard and you asking the question!). Also note that this applies to arrays as well as VLAs.Sublittoral
S
5

Your original code fails nicely for me with messages such as:

 void foo7(int n, double ARRAY_RESTRICT a[n], const double ARRAY_RESTRICT b[n], const double ARRAY_RESTRICT c[n])
 ^
restrict.c:126:1: error: invalid use of ‘restrict’
restrict.c:126:1: error: invalid use of ‘restrict’
restrict.c:145:1: error: invalid use of ‘restrict’

Transferring selected parts of the comments

§6.7.6.3 Function declarators (including prototypes) has Example 5 which says the following function prototype declarators are equivalent:

void f(double (* restrict a)[5]);
void f(double a[restrict][5]);
void f(double a[restrict 3][5]);
void f(double a[restrict static 3][5]);

This is the only place in the standard where restrict appears associated directly with array types. §6.7.6 is on declarators generally, and §6.7.6.2 on array declarators, and it looks to me as though the restrict has to appear inside the first component of the array dimension. In your context, it should be:

void foo7(int n, double a[ARRAY_RESTRICT n],
           const double b[ARRAY_RESTRICT n],
           const double c[ARRAY_RESTRICT n])

I wouldn't have believed that notation without seeing the examples in the standard and you asking the question! Note that this applies to arrays as well as VLAs.

This revised code, based on the commentary, compiles cleanly under the same compilation options:

gcc -g -O3 -std=c11 -Wall -Wextra -Wmissing-prototypes -Wstrict-prototypes \
    -Wold-style-definition -Wold-style-declaration -Werror -c new.restrict.c

The compilation options demand prior declarations of non-static functions, hence the declarations at the top of the file. I also forced #define ARRAY_RESTRICT restrict in the source, rather than leaving it as a compilation option.

The compiler is GCC 4.9.2 running on an Ubuntu 14.04 derivative.

File new.restrict.c:

/* #define PRAGMA_SIMD _Pragma("simd") */
#define PRAGMA_SIMD

#ifdef __INTEL_COMPILER
#define ASSUME_ALIGNED(a) __assume_aligned(a, 64)
#else
#define ASSUME_ALIGNED(a)
#endif

#define ARRAY_RESTRICT restrict

#ifndef ARRAY_RESTRICT
#define ARRAY_RESTRICT
#endif

void foo1(double *restrict a, const double *restrict b, const double *restrict c);
void foo2(double *restrict a, const double *restrict b, const double *restrict c);
void foo3(int n, double *restrict a, const double *restrict b, const double *restrict c);
void foo4(int n, double *restrict a, const double *restrict b, const double *restrict c);
void foo5(double a[ARRAY_RESTRICT 2048], const double b[ARRAY_RESTRICT 2048], const double c[ARRAY_RESTRICT 2048]);
void foo6(double a[ARRAY_RESTRICT 2048], const double b[ARRAY_RESTRICT 2048], const double c[ARRAY_RESTRICT 2048]);
void foo7(int n, double a[ARRAY_RESTRICT n], const double b[ARRAY_RESTRICT n], const double c[ARRAY_RESTRICT n]);
void foo8(int n, double a[ARRAY_RESTRICT n], const double b[ARRAY_RESTRICT n], const double c[ARRAY_RESTRICT n]);

void foo1(double *restrict a, const double *restrict b, const double *restrict c)
{
    ASSUME_ALIGNED(a);
    ASSUME_ALIGNED(b);
    ASSUME_ALIGNED(c);
    PRAGMA_SIMD
    for (int i = 0; i < 2048; ++i)
    {
        if (c[i] > 0)
        {
            a[i] = b[i];
        }
        else
        {
            a[i] = 0.0;
        }
    }
}

void foo2(double *restrict a, const double *restrict b, const double *restrict c)
{
    ASSUME_ALIGNED(a);
    ASSUME_ALIGNED(b);
    ASSUME_ALIGNED(c);
    PRAGMA_SIMD
    for (int i = 0; i < 2048; ++i)
    {
        a[i] = ((c[i] > 0) ? b[i] : 0.0);
    }
}

/* Undetermined size version */

void foo3(int n, double *restrict a, const double *restrict b, const double *restrict c)
{
    ASSUME_ALIGNED(a);
    ASSUME_ALIGNED(b);
    ASSUME_ALIGNED(c);
    PRAGMA_SIMD
    for (int i = 0; i < n; ++i)
    {
        if (c[i] > 0)
        {
            a[i] = b[i];
        }
        else
        {
            a[i] = 0.0;
        }
    }
}

void foo4(int n, double *restrict a, const double *restrict b, const double *restrict c)
{
    ASSUME_ALIGNED(a);
    ASSUME_ALIGNED(b);
    ASSUME_ALIGNED(c);
    PRAGMA_SIMD
    for (int i = 0; i < n; ++i)
    {
        a[i] = ((c[i] > 0) ? b[i] : 0.0);
    }
}

/* Static array versions */

void foo5(double a[ARRAY_RESTRICT 2048], const double b[ARRAY_RESTRICT 2048], const double c[ARRAY_RESTRICT 2048])
{
    ASSUME_ALIGNED(a);
    ASSUME_ALIGNED(b);
    ASSUME_ALIGNED(c);
    PRAGMA_SIMD
    for (int i = 0; i < 2048; ++i)
    {
        if (c[i] > 0)
        {
            a[i] = b[i];
        }
        else
        {
            a[i] = 0.0;
        }
    }
}

void foo6(double a[ARRAY_RESTRICT 2048], const double b[ARRAY_RESTRICT 2048], const double c[ARRAY_RESTRICT 2048])
{
    ASSUME_ALIGNED(a);
    ASSUME_ALIGNED(b);
    ASSUME_ALIGNED(c);
    PRAGMA_SIMD
    for (int i = 0; i < 2048; ++i)
    {
        a[i] = ((c[i] > 0) ? b[i] : 0.0);
    }
}

/* VLA versions */

void foo7(int n, double a[ARRAY_RESTRICT n], const double b[ARRAY_RESTRICT n], const double c[ARRAY_RESTRICT n])
{
    ASSUME_ALIGNED(a);
    ASSUME_ALIGNED(b);
    ASSUME_ALIGNED(c);
    PRAGMA_SIMD
    for (int i = 0; i < n; ++i)
    {
        if (c[i] > 0)
        {
            a[i] = b[i];
        }
        else
        {
            a[i] = 0.0;
        }
    }
}

void foo8(int n, double a[ARRAY_RESTRICT n], const double b[ARRAY_RESTRICT n], const double c[ARRAY_RESTRICT n])
{
    ASSUME_ALIGNED(a);
    ASSUME_ALIGNED(b);
    ASSUME_ALIGNED(c);
    PRAGMA_SIMD
    for (int i = 0; i < n; ++i)
    {
        a[i] = ((c[i] > 0) ? b[i] : 0.0);
    }
}
Sublittoral answered 17/3, 2015 at 1:6 Comment(6)
I haven't looked at the rationale or committee meeting notes to find out why the notation needs to be as it is.Sublittoral
do you have another suggestion? double QUALIFIER a[n] would mean that QUALIFIER qualifies double, not a. As in the const examples in this code.Fasano
@MattMcNabb: I'm not sure what you are seeking. Are you suggesting that this suggested solution has not 'restricted' the arrays so that they are not in overlapping memory? Or are you seeking information for other type qualifiers? The definition of direct declarator (in §6.7.6) includes variants such as direct-declarator [ type-qualifier-list opt assignment-expression opt ] and direct-declarator [ static type-qualifier-list opt assignment-expression ] and direct-declarator [ type-qualifier-list static assignment-expression ] ...Sublittoral
I'm referring to your comment "rationale ... to find out why the notation needs to be as it is" plus your other comments about the notation being unbelievable, etc. It implies that you find something unsatisfactory about the notation, so I am asking if you had some alternative notation in mind.Fasano
@MattMcNabb: Oh, I see. A priori, I would have expected restrict to appear outside the square brackets, as the OP did. I'm not completely clear why it can't be handled there, but all the other occurrences of restrict in the library section of the standard are in the form type * restrict — or perhaps type ** restrict — so that the keyword is after the * indicating the pointer, and there isn't a * indicating the pointer in a function's array parameter declaration, so maybe there isn't a sensible alternative.Sublittoral
Thanks for your answer. It makes sense to me now that I see it that restrict belongs inside the brackets, when I consider the multidimensional array case. How else could one say a[restrict n][restrict m] if not to place the restrict inside the brackets?Katmandu
F
4

None of the parameters in this code have variably-modified type. foo6 and foo7 are exactly the same function signature except for int n. See Why do C and C++ compilers allow array lengths in function signatures when they're never enforced?.

These are all exactly the same:

void foo8(int n, T *a);
void foo8(int n, T a[16]);
void foo8(int n, T a[n]);

The version void foo8(int n, T a[]); is almost the same, but it has a corner case that it is not permitted if T is an incomplete type.

foo8 may be called with either a VLA or a non-VLA.

Although the array declarator has variably-modified type, the array-of-T to pointer-to-T adjustment is made before the type of the parameter is reckoned. So T a[n] is adjusted to T *a which is not variably-modified; however void foo9(int n, T a[][n]); does yield the variably-modified type T (*)[n] for a.


The simplest way of combining restrict with an array declarator is to actually use the pointer form, here:

void foo8(int n, T *restrict a ) {

The attempt void foo8(int n, T restrict a[]); does not work because it is equivalent to void foo8(int n, T restrict *a);. The restrict is a qualifier and it behaves syntactically the same as other qualifiers such as const.

As noted by Jonathan Leffler, there is an alternative syntax:

void foo8(int n, T a[restrict]) {   // n is optional , as before

In this case it seems redundant to allow the same thing to be specified in two different ways, however there also exists static which can only be used with an array declarator (not a pointer declarator). If you want to use this form of static and also restrict then there is no choice but to have restrict inside the square brackets:

void foo8(int n, T a[restrict static n]) { 

To be clear, this last case is still not a variably-modified type; the static is a promise that a, which is a pointer, points to the first element of an array of at least n elements.

Also, the static need not be checked at compile-time when the function is called (although of course it would be nice if the compiler did enforce).

A final note: restrict in a prototype probably has no effect, it's only significant in the function definition.

Fasano answered 17/3, 2015 at 1:45 Comment(12)
Interesting x-ref on whether restrict in a prototype has any effect. The compiler cannot possibly enforce the constraint if it doesn't know that it has to, so the presence of restrict in the prototype permits the compiler to deduce that foo7(3, &a[0], &a[1], &a[2]) — using a function from the question — is a call that violates the restrict criterion. Whether any compiler actually does so is a separate discussion, but if the restrict is omitted from the prototype, such errors can only be reported in the TU where the function is defined, and then only in calls after the definition.Sublittoral
@JonathanLeffler Yeah, my other question attempted to have a discussion along those lines. Your conclusion makes sense.Fasano
Are you saying that the compiler is not permitted to utilize the information contained in "a[16]" that is not contained in "a[]" when it comes to auto-vectorization? My goal is to tell the compiler as much as possible, and I believe that I am saying more when I use "a[n]" vs "*a", for example.Katmandu
@Jeff that's what a[static 16] is for. Without the static, the function cannot make any assumption about what a is pointing to (unless the compiler is doing whole-program optimization and so forth, of course). It's legal to pass an array of size 2 to a function with parameter a[16], etc. Although I don't exactly see what optimization you have in mind that this information would make a difference to, other than the non-null bit.Fasano
@MattMcNabb Okay, but does a[static n] work? Because my real goal is code that allows arbitrary dimension arrays. Thanks.Katmandu
@MattMcNabb The array dimension information is particularly valuable in the context of SIMD. For example, one doesn't need unroll-and-jam if the array dimension is a multiple of the SIMD width (and the data is aligned).Katmandu
@Jeff surely the compiler would be able to decide that based on your code though. (For example if your code loops from 0 to 16, it doesn't matter what the declared dimension was, the compiler can assume that the array has 16 elements). Just in case it can't, the a[static n] exists, which says that a points to the first element of an array with at least n elements. I'm not aware of any compiler that actually makes an optimization based on this, other than to assume a is not null.Fasano
@MattMcNabb I don't think the compiler can assume the array is length 16 just because that is the loop range. In my example, the array b can shorter than a or c if the user knows that c is non-positive past a certain point, and that case inhibits auto-vectorization in some compilers. (This example is derived from one written by a former colleague that developers the LLVM auto-vectorizer...)Katmandu
@Jeff I see. Good example :) It seems to me that const double b[restrict static 2048] is what you're looking for in foo1 then (although again IDK if any compiler is advanced enough to use it)Fasano
@MattMcNabb I'm already getting auto-vectorization from Intel 15 without the static qualifier but I will see if it improves things further or helps with more complicated cases.Katmandu
I can't think of how restrict in a prototype would tell the compiler much useful if it can't see inside the function, since it's perfectly legitimate for multiple restrict pointers to alias each other when accessing objects that will not be modified during the lifetime of the function. I suppose that if a function received two [restrict static 6] arrays, the compiler might be able to infer that any elements which would be shared between the two arrays could not be modified within the function, but I don't know that knowing that either the arrays will either be non-aliased or...Bauxite
...their contents will remain unchanged, but not knowing which condition applies, would be terribly useful.Bauxite

© 2022 - 2024 — McMap. All rights reserved.