Is it valid, according to ISO C (any version), to specify a zero-sized array parameter?
The standard seems ambiguous. While it's clear that zero-sized arrays are invalid, array function parameters are special:
C23::6.7.6.3/6:
A declaration of a parameter as "array of type" shall be adjusted to "qualified pointer to type", where the type qualifiers (if any) are those specified within the [ and ] of the array type derivation. If the keyword static also appears within the [ and ] of the array type derivation, then for each call to the function, the value of the corresponding actual argument shall provide access to the first element of an array with at least as many elements as specified by the size expression.
As long as you don't use static, the size specified between [] is effectively ignored. As I understand the quoted paragraph, the compiler isn't allowed to make any assumptions at all about the pointer.
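A minimal sketch of the adjustment the quoted paragraph describes, assuming a C11 (or later) compiler for _Generic. The function names param_is_pointer and local_is_array are hypothetical and only serve the illustration:

```c
/* Inside the function, buf has type "char *": the [100] took part in
 * the declaration but not in the adjusted parameter type. */
int param_is_pointer(char buf[100])
{
    /* &buf has type "char **" only if buf itself is a "char *". */
    return _Generic(&buf, char **: 1, default: 0);
}

/* For contrast: a real array object keeps its array type. */
int local_is_array(void)
{
    char local[100];

    /* &local has type "char (*)[100]", i.e. pointer to array of 100 char. */
    return _Generic(&local, char (*)[100]: 1, default: 0);
}
```

Both functions return 1, which is consistent with the size between [] not being part of the parameter's type once static is absent.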
So, the following code should be conforming, right?
void h(char *start, char past_end[0]);
#define size 100
void j(void)
{
    char dst[size];
    h(dst, dst+size);
}
I use past_end[0] as a sentinel pointer to one-past-the-end (instead of a size; it's much more comfortable in some cases). The [0] makes it clear that the pointer is one past the end rather than the actual last element, which readers could easily confuse when all they see is a pointer. A pointer to the last element would be declared as end[1], to be clear.
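For comparison, here is a sketch of the same one-past-the-end convention written with a plain pointer parameter, a spelling that does not involve a zero-sized array declarator at all. The function fill_range and its parameter c are hypothetical; only the half-open range [start, past_end) comes from the text above:

```c
#include <stddef.h>

/* Fill the half-open range [start, past_end) with c and return the
 * number of bytes written. past_end is never dereferenced; it is
 * only compared against. */
size_t fill_range(char *start, char *past_end, char c)
{
    size_t n = 0;

    for (char *p = start; p != past_end; p++) {
        *p = c;
        n++;
    }
    return n;
}
```

A call such as fill_range(dst, dst + size, 'x') writes exactly size bytes. What the plain pointer spelling loses is the in-declaration hint that the second argument is one past the end, which is the whole point of writing past_end[0].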
GCC thinks the declaration of h is not conforming:
$ gcc -Wall -Wextra -Wpedantic -pedantic-errors -std=c17 -S ap.c
ap.c:1:26: error: ISO C forbids zero-size array ‘past_end’ [-Wpedantic]
1 | void h(char *start, char past_end[0]);
| ^~~
Clang seems to agree:
$ clang -Wall -Wextra -Wpedantic -pedantic-errors -std=c17 -S ap.c
ap.c:1:30: warning: zero size arrays are an extension [-Wzero-length-array]
void h(char *start, char past_end[0]);
^
1 warning generated.
If I don't ask for strict ISO C, GCC still warns (differently), while Clang relaxes:
$ cc -Wall -Wextra -S ap.c
ap.c: In function ‘j’:
ap.c:7:9: warning: ‘h’ accessing 1 byte in a region of size 0 [-Wstringop-overflow=]
7 | h(dst, dst+size);
| ^~~~~~~~~~~~~~~~
ap.c:7:9: note: referencing argument 2 of type ‘char[0]’
ap.c:1:6: note: in a call to function ‘h’
1 | void h(char *start, char past_end[0]);
| ^
ap.c:7:9: warning: ‘h’ accessing 1 byte in a region of size 0 [-Wstringop-overflow=]
7 | h(dst, dst+size);
| ^~~~~~~~~~~~~~~~
ap.c:7:9: note: referencing argument 2 of type ‘char[0]’
ap.c:1:6: note: in a call to function ‘h’
1 | void h(char *start, char past_end[0]);
| ^
$ clang -Wall -Wextra -S ap.c
I reported this to GCC, and there seems to be disagreement:
https://gcc.gnu.org/pipermail/gcc/2022-December/240277.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108036
Are array parameters required to comply with the same constraints as actual arrays?
Moreover, if that proves to be true, what about the following?
void f(size_t sz, char arr[sz]);
Is the compiler entitled, under strict ISO C mode, to assume that the array will always have at least one element, even if I didn't use static, just because I used array syntax? If so, that would probably be a regression in the language.
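To make the two spellings concrete, here is a small sketch that puts them side by side. The names sum_vla and first_static are hypothetical, and the sketch does not settle whether sz may be zero under strict ISO C; it only shows which form carries the explicit "at least as many elements" requirement from the quoted paragraph:

```c
#include <stddef.h>

/* Without static: arr is adjusted to "const char *". The quoted
 * paragraph attaches no minimum-size promise to sz here. */
int sum_vla(size_t sz, const char arr[sz])
{
    int total = 0;

    for (size_t i = 0; i < sz; i++)
        total += arr[i];
    return total;
}

/* With static: each call must pass a pointer to the first element
 * of an array with at least 1 element, so reading arr[0] is fine. */
int first_static(const char arr[static 1])
{
    return arr[0];
}
```

With const char a[3] = {1, 2, 3}, sum_vla(3, a) returns 6 and first_static(a) returns 1. The loop in sum_vla never touches arr when sz is 0, so the open question is only whether the declaration itself lets the compiler assume otherwise.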
[...] start to end is to have the item named end and then point 1 item beyond the array. Similarly, the endptr of the strtol family points 1 item beyond the valid string, the C++ iterators use end to point 1 item beyond the array, and so on. – Tavia
[...] for(const type* i = start; i != end; i++). Which requires end to point 1 item beyond the array. This is AFAIK the very reason why C allows us to point 1 item beyond an array, as long as we don't dereference that location. – Tavia
"I use past_end[0] as a sentinel pointer to one-past-the-end (instead of a size; it's much more comfortable in some cases)." What? That'd confuse me. It's very idiomatic to pass an array with its size to a function. – Roily
[...] end to refer to the last valid pointer in many cases. However, I also see a lot of code that uses end to refer to the last byte. That inconsistency is too much for my taste, and in fact I found bugs in a code base where a given function was implemented with end meaning past_end, while at the call site it was being passed the actual end; off-by-one, you can guess. I wanted to use unambiguous syntax to fix this source of bugs. – Scincoid
"The [0] clearly tells that it's one past the end" Why?: Because a valid pointer that has no addressable storage necessarily has to be one past the end of the array; otherwise it has at least 1 element (assuming non-null). If you have a pointer[3], it means that there are 3 remaining elements; [1] means one remaining element; [0] is after the last element. And the name helps. Of course if you haven't seen it before, it might be a bit surprising, and will cause a WTF moment, but after that small learning curve, it can be very informative. – Scincoid
"chaining string-copy functions that truncate, while deferring truncation detection to after all chained calls, can only be done with pointers" this presumes that this kind of logic is valid to begin with, which I can't validate. In my personal opinion, I don't think it's clean to pass the end of an array to a function. Your linked post talks about an "improved" string copy function (which, you (conveniently) wrote yourself, by the way) that isn't adding any benefit to existing functions such as strncpy. – Roily
[...] chain function calls in a language that wasn't designed for that use case. You chain function calls in a language like JavaScript or PHP, but not C. My hair stands up by even beginning to think about "chaining" function calls in C - ugh. – Roily
[...] strncpy(3)? That's a function designed to write to fixed-width buffers such as utmpx(5). It's been long misused as if it were strlcpy(3), but it's not, and it's a source of bugs when used that way (I hope it's not necessary to quote anything here). Anyway, I'll quote a discussion in GCC (in which I participate, for full disclosure), just in case: <lore.kernel.org/linux-man/…>. BTW, I just fixed several such bugs today in the shadow package. – Scincoid
[...] stpecpy() function. But @Tavia reviewed it and helped improve it; and so did other programmers I know (in private). You find issues with that function? I invite you to discuss them in that forum. I'm open to improvement. In fact, I'm about to post a minor improvement to accept NULL for allowing chaining with a variant of snprintf(3) (underlying issue there was that snprintf(3) uses int, for the curious). That came up in an NGINX discussion to fix some cases of UB while calling snprintf(3). – Scincoid
[...] strcpy(3) and strcat(3)? It was for this exact line of code: strcat (strcpy (d, s1), s2); This line of code has been literally copied from an ISO C document: <open-std.org/JTC1/SC22/WG14/www/docs/n2349.htm>. And of course, such code goes back to K&R (I don't have that book handy to quote it, though). – Scincoid
"The idiomatic (though far from ideal) way to append two strings is by calling the strcpy and strcat functions as follows" lol. – Roily
strcat (strcpy (d, s1), s2); I don't write code like that, neither should anyone else. – Roily