Why must a comma expression used as an array size be enclosed in parentheses if part of an array declarator?

I just noticed that int arr2[(777, 100)] is a legal array declarator, while int arr1[777, 100] is not.

A more complete, compilable example:

#include <stdio.h>

void f(int i) {
    printf("side effect %d\n", i);
}

int main(void) {
    // int arr1[777, 100]; // illegal, it seems
    int arr2[(777, 100)];
    int arr3[(f(777), 100)];

    arr2[10, 20] = 30;
    arr3[f(40), 50] = 60;

    return 0;
}

which compiles fine with GCC (but for some reason not with MSVC).

Note also that the code above shows that comma expressions are fine in a non-declarator context such as an array subscript.

The reason (for why parentheses are needed in array declarators) appears to be that in the C standard the array size in square brackets is an assignment-expression but not an expression (C17 draft, 6.7.6.2 ¶3 and A.2.1); the latter is the syntactic level for comma operators (C17 draft, A.2.1 (6.5.17)):

expression:
assignment-expression
expression , assignment-expression

If one expands assignment-expression, one ultimately gets to the level of primary-expression (C17 draft, A.2.1 (6.5.1)):

primary-expression:
identifier
constant
string-literal
( expression )
generic-selection
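The grammar distinction above can be sketched in code. This is only an illustration, assuming a compiler with variable-length array support (such as GCC or Clang); the helper names `note`, `size_from_assignment`, and `size_from_parenthesized_comma` are hypothetical and not from the standard or the example above.

```c
#include <assert.h>
#include <stddef.h>

static int calls = 0;                 /* counts evaluations of the left operand */
static int note(int i) { calls += 1; return i; }

/* The size is a genuine assignment, n = 5. That is an assignment-expression,
   so no parentheses are needed. */
size_t size_from_assignment(void) {
    int n = 0;
    int a[n = 5];
    assert(n == 5);
    return sizeof a / sizeof a[0];
}

/* The size is ( expression ), a primary-expression wrapping a comma
   expression; its value is the right operand, 100. */
size_t size_from_parenthesized_comma(void) {
    int before = calls;
    int a[(note(777), 100)];
    assert(calls == before + 1);      /* the left operand was still evaluated */
    return sizeof a / sizeof a[0];
}

/* int bad[note(777), 100];  would be rejected: a bare comma expression is an
   expression, not an assignment-expression. */
```

Calling `size_from_assignment()` should give 5 and `size_from_parenthesized_comma()` should give 100, since both arrays are variable-length and `sizeof` is evaluated at run time.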

If the standard says so, that's the way it is, but: Is there a syntactic necessity? Perhaps there is a reason based in language design considerations. The C standard lists the following 4 forms of array declarators (C17 draft, 6.7.6.2 ¶3):

D [ type-qualifier-list_opt assignment-expression_opt ]
D [ static type-qualifier-list_opt assignment-expression ]
D [ type-qualifier-list static assignment-expression ]
D [ type-qualifier-list_opt * ]

  • One potential reason I can see is to keep this syntax simple, with the forms that contain static in mind.
  • Another potential reason might be to prevent people from writing int arr[m, n] when they really mean int arr[m][n].
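The second bullet can be made concrete with a small sketch. It assumes variable-length array support (GCC or Clang); the function names are hypothetical. It contrasts a real two-dimensional array with what a comma inside the brackets yields, and shows that in a subscript, where the grammar is postfix-expression [ expression ], the comma operator silently selects the last index.

```c
#include <assert.h>
#include <stddef.h>

/* A real two-dimensional array has m * n elements. */
size_t elements_2d(size_t m, size_t n) {
    int grid[m][n];
    return sizeof grid / sizeof grid[0][0];
}

/* With a parenthesized comma expression only the right operand counts,
   so the array has n elements; m is evaluated and discarded. */
size_t elements_comma(size_t m, size_t n) {
    int row[((void)m, n)];
    return sizeof row / sizeof row[0];
}

/* In a subscript the comma operator needs no parentheses:
   arr[i++, 20] evaluates i++ and then indexes element 20. */
int index_written_by_comma_subscript(void) {
    int arr[32] = {0};
    int i = 10;
    arr[i++, 20] = 30;
    assert(i == 11);                  /* the left operand was evaluated */
    for (int k = 0; k < 32; ++k)
        if (arr[k] == 30) return k;
    return -1;
}
```

With m = 3 and n = 4, `elements_2d` gives 12 while `elements_comma` gives 4, which is the kind of silent mismatch the grammar restriction would catch at compile time in a declarator.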

Incidentally, if anyone has comments about why this doesn't compile with MSVC, that would also be appreciated.

Damper answered 6/4, 2024 at 3:54 Comment(16)
MSVC doesn't support variable-length arrays, so the array size has to be an integer constant expression, and a constant expression may not contain an evaluated comma operator (C17 draft, 6.6 ¶3). – Sensitometer
Btw if anyone knows how to get markup with line-initial spacing (for the syntax excerpts) that's different from the `␣␣` (␣: space) which I'm using, please let me know, or feel free to edit my post directly. – Damper
I usually use code blocks if I want that kind of formatting. You can also use a bullet list. – Sensitometer
What's the point of int arr2[(777, 100)];? It's equivalent to int arr2[100]; – Sensitometer
And int arr3[(f(777), 100)]; is equivalent to f(777); int arr3[100];. – Sensitometer
@Sensitometer My cop-out here is that it's a 'language lawyer' question ;-) Though I won't reject practically useful examples. – Damper
Have you looked at the grammar specification of VLA? That will probably explain why you need the parentheses. I suspect the reason is to avoid people thinking that they can use commas to declare and access multi-dimensional arrays. – Sensitometer
Similar to why you need to use parentheses to put a comma expression in the arguments of a function: otherwise there's ambiguity with the comma that separates arguments. – Sensitometer
@Sensitometer I have (looked). // I did mention this (rationale) as a possibility, but is there really a necessity? – Damper
@LoverofStructure I don't think there's a necessity. Language designers sometimes base their decisions on practical or esthetic reasons. – Sensitometer
@LoverofStructure As I recall, the ( after the for (and perhaps after the while too) isn't really necessary. – Grout
@Grout The parentheses are part of the syntax for the for- and while-loops (C17 draft, 6.8.5 ¶1). – Damper
@LoverofStructure Rephrasing: As I recall, technically (from the compiler writer's perspective), the ( after the for (and perhaps after the while too) isn't really necessary. – Grout
@Grout Ah, okay – you are giving an example of an element of C syntax that's strictly speaking redundant. Got it. – Damper
@LoverofStructure Do you know whether in sizeof ( type-name ) (as well as in _Alignof ( type-name )) the () are technically required (i.e. strictly speaking non-redundant)? – Grout
@Grout There is an answer here: https://mcmap.net/q/216742/-why-and-when-do-i-need-to-use-parentheses-after-sizeof (For a moment I thought that one could argue that sizeof int * + 0 should be parsed as (sizeof int) * (+ 0), but actually I think the example works, because * within a type designator isn't actually covered by the standard operator precedence rules. Note that I didn't check whether the grammar given in the standard would literally be ambiguous if sizeof typename were allowed – as you will know, the operator precedence rules are only implicit in the standard, because they follow from its grammar.) – Damper

As mentioned in the question, rejecting multiple comma-separated expressions inside the brackets avoids potential mistakes by people accustomed to other programming languages that use that syntax for multiple array dimensions.

Specifying assignment-expression rather than expression in the grammar excludes only the comma operator. Besides the already conjectured reason above, the only effect I can see is on macro use. int a[3, 4] would be parsed as two arguments to a function-like macro¹, whereas int a[(3, 4)] would be one. But without some example use case for a macro involving that, I do not see it as the reason.

Footnote

¹ For example, Foo(int a[3, 4]) would be parsed as invoking Foo with one argument of int a[3 and another of 4].
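The macro effect described in the footnote can be observed with a small argument-counting sketch. `PICK_4TH` and `COUNT_ARGS` are hypothetical helper macros written for this illustration (they handle one to three arguments only); the preprocessor splits arguments at commas that are not inside parentheses, and square brackets offer no such protection.

```c
#include <assert.h>

/* Hypothetical helpers: COUNT_ARGS expands to the number of macro
   arguments it received (1 to 3). */
#define PICK_4TH(a, b, c, n, ...) n
#define COUNT_ARGS(...) PICK_4TH(__VA_ARGS__, 3, 2, 1, 0)

/* The comma sits inside parentheses, so the macro sees one argument. */
int args_with_parentheses(void) {
    return COUNT_ARGS(int a[(3, 4)]);
}

/* The comma is only inside square brackets, so the macro sees two
   arguments: "int a[3" and "4]". */
int args_without_parentheses(void) {
    return COUNT_ARGS(int a[3, 4]);
}

/* Self-check of both counts. */
void check_macro_argument_counts(void) {
    assert(args_with_parentheses() == 1);
    assert(args_without_parentheses() == 2);
}
```

The declarator text is never compiled here; it is only counted and discarded by the preprocessor, which is why the unparenthesized form is accepted in this context.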

Crossfade answered 23/4, 2024 at 20:43 Comment(0)

In the specification of array declaration grammar, the length is an assignment-expression:

If, in the declaration "T D1", D1 has one of the forms:
D [ type-qualifier-list_opt assignment-expression_opt ] attribute-specifier-sequence_opt
D [ static type-qualifier-list_opt assignment-expression ] attribute-specifier-sequence_opt
D [ type-qualifier-list static assignment-expression ] attribute-specifier-sequence_opt
D [ type-qualifier-list_opt * ] attribute-specifier-sequence_opt
and the type specified for ident in the declaration "T D" is "derived-declarator-type-list T", then the type specified for ident is "derived-declarator-type-list array of T"

In the expression grammar, assignment-expression does not include expression, which is the level where the comma operator can be used. assignment-expression sits just below expression in the hierarchy:

expression:
assignment-expression
expression , assignment-expression

So you have to wrap the comma expression in parentheses to make it a primary-expression, which is at the very base of the expression hierarchy.

Sensitometer answered 6/4, 2024 at 4:20 Comment(2)
The question covers how the grammar requires parentheses to use the comma operator in this context. The question asks why the grammar is this way (“Is there a syntactic necessity? Perhaps there is a reason based in language design considerations.”). This post does not answer that. – Crossfade
Unless there's a rationale document or one of the committee members wishes to speak up, the best we can do is speculate. I did that in a comment, I don't want to commit to it in the answer. – Sensitometer
