I always thought that argc was required to mark the end of argv, but I just learned that argv[argc] == NULL by definition. Am I right in thinking that argc is totally redundant? If so, I always thought C did away with redundancy in the name of efficiency. Is my assumption wrong, or is there a historic reason behind this? If the reason is historic, can you elaborate?
History.
Harbison & Steele (5th Edition, 9.9 "The main program") says the following:
Standard C requires that argv[argc] be a null pointer, but it is not so in some older implementations.
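As an illustration of the guarantee quoted above, here is a minimal C sketch of the two ways to find the end of argv: trusting argc, or walking to the null pointer. The helper names count_by_sentinel and check_argv are hypothetical, made up for this example, and the sketch assumes a hosted implementation that conforms to Standard C.

```c
#include <assert.h>
#include <stddef.h>

/* Count arguments by walking argv until the null-pointer sentinel. */
int count_by_sentinel(char *const argv[])
{
    int n = 0;
    while (argv[n] != NULL)
        n++;
    return n;
}

/* Hypothetical helper: verify that argc and the sentinel agree.
   Standard C requires argv[argc] to be a null pointer. */
void check_argv(int argc, char *const argv[])
{
    assert(argv[argc] == NULL);
    assert(count_by_sentinel(argv) == argc);
}
```

Calling check_argv(argc, argv) at the top of main should pass on any conforming implementation; on the older implementations H&S mention, the first assertion is the one that could fail.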
[...] argv[argc] as a null pointer; I suspect H&S don't provide that level of detail, though. They'd have to be pretty old these days. (I was never unlucky enough to come across one, but there are plenty of esoteric platforms that I've not programmed on.) – Measures
[...] argv[argc] and all examples use argc to determine the end of the argv[] array. The 2nd Edition points out the null sentinel, but doesn't use it in any examples. – Exemplification
Here's the history.
In first edition UNIX, which predates C, exec took as arguments a filename and the address of a list of pointers to NUL-terminated argument strings terminated by a NULL pointer. From the man page:
sys exec; name; args / exec = 11.
name: <...\0>
...
args: arg1; arg2; ...; 0
arg1: <...\0>
...
The kernel counted up the arguments and provided the new image with the arg count followed by a list of pointers to copies of the argument strings, at the top of the stack. From the man page:
sp--> nargs
arg1
...
argn
arg1: <arg1\0>
...
argn: <argn\0>
(The kernel source is here; I haven't looked to see if the kernel actually wrote something after the pointer to the last argument.)
At some point, up through the 6th edition, the documentation for exec, execl, and execv began to note that the kernel placed a -1 after the arg pointers. The man page says:
Argv is not directly usable in another execv, since argv[argc] is -1 and not 0.
At this point, you could argue that argc was redundant, but programs had, for some time, been using it rather than looking through the argument list for -1. For example, here's the beginning of cal.c:
main(argc, argv)
char *argv[];
{
if(argc < 2) {
printf("usage: cal [month] year\n");
exit();
}
In 7th edition, exec was changed to add a NULL pointer after the argument strings, and this was followed by a list of pointers to the environment strings, and another NULL. The man page says:
Argv is directly usable in another execv because argv[argc] is 0.
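To make the "directly usable" remark concrete, here is a hedged sketch of a wrapper that re-executes the command named by its own arguments, passing a tail of its argv straight through. It assumes a POSIX system and uses execvp (which searches PATH) rather than the execv named in the man page; the function name exec_rest and the exit codes 2 and 127 are choices made for this example, not anything taken from the UNIX sources.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical wrapper: run argv[1] with argv[1..argc-1] as its arguments.
   No copying or re-termination is needed, because argv[argc] is already
   a null pointer, so &argv[1] is itself a valid null-terminated vector. */
int exec_rest(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s command [args...]\n",
                argc > 0 ? argv[0] : "wrapper");
        return 2;
    }
    execvp(argv[1], &argv[1]);
    /* execvp returns only on failure. */
    perror("execvp");
    return 127;
}
```

A main that simply returns exec_rest(argc, argv) would behave like a bare-bones command launcher. Under the 6th edition convention, where the slot after the last argument held -1, the same pass-through would not have been safe.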
[...] NULL be an element of argv? That is, before the actual end of the array. – Invar
[...] argv could be an empty string, that is an array of just one element, a 0-byte. – Syllabic
[...] argc instead of iterating over argv. – Dawna
[...] if (argc < 3) { printf("error message"); return 1; } without looping the argv list first. Not to mention various other choices that might be made based on the number of arguments (read files from command-line args vs. reading stdin, etc.) – Kulak
[...] execname 'arg1' 'arg2' '' 'arg4', in which case argv[3] is an empty string. And, as @Jens said, that's not NULL. – Invar
[...] execv() and give it an array with null pointers in the middle of it. This would be a bad idea; the program's behavior would be undefined. The C standard specifically requires the pointers argv[0] through argv[argc-1] to be pointers to strings, which means they can't be null pointers. – Mctyre
[...] main() return in C and C++, which quotes what the standard says. The reason for the redundancy is primarily historical (that's how it was done in C in the mid-70s, so that's how it has been done ever since). And now, of course, there's a quarter century of it being standardized behaviour, and changing it would break a lot of code. – Measures
[...] execve() only knows the length of the argument list by coming across the first null pointer. The extras 'in the middle' simply don't count. It's a little more debatable what happens if the zeroth argument is a null pointer. The standard permits argc == 0, and still requires argv[argc] == 0. – Measures
[...] exec*() functions, but the end of the list is defined by a null pointer (either as an argument for the variadic functions execl, execlp, and execle, or as the last element of the array for execv and execvp). And the argc value passed to the invoked program is computed from that. (You could pass invalid pointers, which could make the invoked program unhappy, but that's a different thing.) – Mctyre