Handle argc equal to 0

I recently saw something curious. In the HHVM source code, the very first 3 lines of the main() function read as follows:

if (!argc) {
  return 0;
}

It's a bit silly, but still, I just can't help wondering... why return 0!? It's not that I think there's some correct way to handle this, but returning 0, usually associated with success, seems particularly inappropriate.

Besides not crashing, is there ever a case where there's an appropriate response to argc being 0? (Or even less than 0?) Does it ever matter?

The only way I know of to end up in a case with argc of 0 is with exec() and friends. If for some reason that does happen, it's almost certainly a bug in the caller and the callee can't do much about it.
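To make the exec() case concrete, here is a minimal sketch, assuming a POSIX system, of a caller that passes an argument vector whose first element is already the terminating null pointer. The helper name run_with_empty_argv is hypothetical. What the child actually sees is platform dependent: as far as I know, Linux kernels from 5.18 onward substitute a single empty string, so the child gets argc == 1 with argv[0] equal to "", and other systems may reject the call.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `path` with an empty argument vector and
 * return the child's exit status, or -1 if the child could not be
 * started or did not exit normally. */
int run_with_empty_argv(const char *path)
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;                      /* fork failed */
    }
    if (pid == 0) {
        char *empty_argv[] = { NULL };  /* argv[0] is already the terminator */
        execv(path, empty_argv);
        _exit(127);                     /* only reached if execv itself failed */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Pointing this at a program that starts with the if (!argc) check shows whether that branch is reachable on a given system.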

(tagged as C and C++ because I expect that the answer is the same for the two languages)

Edit: To try and make the question less vague and philosophical, I'll offer an alternative.

if (!argc) {
  puts("Error: argc == 0");
  return 1;
}

The key points are that there's an indication of the error and a non-zero value is returned. It's extremely unlikely this would be needed, but if it were, you might as well try to indicate the error. On the other hand, if the detected error is as serious as argc equal to 0, maybe there's a reason it would be bad to try to access stdout or the C standard library.
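A variant of the same idea, as a sketch only: report on stderr instead of stdout so the diagnostic does not mix with normal output, and return EXIT_FAILURE from <stdlib.h>. The helper name check_argc is made up for illustration and is not part of HHVM.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns EXIT_SUCCESS when argc is usable and
 * EXIT_FAILURE (after reporting on stderr) when it is not. */
int check_argc(int argc)
{
    if (argc <= 0) {
        /* No program name is available, so the message cannot include one. */
        fputs("error: argc is not positive\n", stderr);
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}
```

A main() could then begin with: if (check_argc(argc) != EXIT_SUCCESS) return EXIT_FAILURE;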

Neusatz answered 18/1, 2015 at 20:32 Comment(10)
argc is required to be nonnegative by the standard. - Hedges
Well, why not. It was asked to do nothing, it successfully did nothing. - Roden
@HansPassant What makes you say it was asked to do nothing? It strikes me that it was asked a nonsensical question. - Neusatz
So, could we say then that this check doesn't make any sense, unless you are debugging the OS? - Shackleford
I think the check itself does make sense. If you do get argc equal to 0 something went wrong but the program shouldn't crash so the only thing to do is exit. What gets me is the return 0;. - Neusatz
For completeness' sake: Which platforms does HHVM target? As a real suggestion: Are there any exec* calls? - Sully
@Sully HHVM primarily targets Linux and also targets OS X. There are plenty of occurrences of exec*( in the source being called from C++, PHP, and shell script. I haven't yet found one that would cause argc == 0. - Neusatz
There is an interesting difference between C and C++: In C, you are allowed to call main from your program. So maybe they call it somewhere and pass it 0 as the first parameter as some mechanism to break the recursion. In that case, silently returning success seems appropriate. I'm not familiar with the software you are talking about so I don't know whether it actually does this but you might want to look for it. - Rasp
@Hedges 0 is non-negative. - Heavy
@MattMcNabb ... have you read the question? I'm referring to "(Or even less than 0?)". - Hedges
K
5

I think it's just a case of defensive programming, given the following snippet in HHVM's source code (file hphp/hhvm/main.cpp):

int main(int argc, char** argv) {
  if (!argc) {
    return 0;
  }
  HPHP::checkBuild();
  int len = strlen(argv[0]);

In the line:

int len = strlen(argv[0]);

If argc == 0, then argv[0] == NULL, so strlen(argv[0]) dereferences a null pointer. That is undefined behavior, which in practice usually shows up as a segmentation fault.

I'm not familiar with HHVM, but they may simply be allowing for the possibility that some caller starts the program with no arguments at all (not even the program name).
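As an illustration of the same defensive idea applied directly to the strlen call, here is a small sketch. The helper name program_name_length is hypothetical and does not appear in HHVM; it treats a missing program name as length 0 instead of exiting.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the program name, or 0 when the
 * host environment supplied none (argc == 0 or argv[0] == NULL). */
size_t program_name_length(int argc, char **argv)
{
    if (argc < 1 || argv == NULL || argv[0] == NULL) {
        return 0;
    }
    return strlen(argv[0]);
}
```

Whether falling back to 0 is better than exiting early depends on what the rest of main() does with the name.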

King answered 19/1, 2015 at 0:3 Comment(5)
Which makes me wonder: Why should if(!argc) return; be any better than the segmentation fault? In either case the process is stopped immediately, and the caller has to handle the failure. In the case of a segmentation fault, we get a non-zero exit status returned to the caller, which is better than pretending everything's fine by return 0;. Is this just this religious thing about avoiding segfaults? - Thumbprint
@cmaster strlen(argv[0]) causes undefined behaviour. This has many worse potential consequences than a segmentation fault. - Heavy
@Heavy Well, if argv[0] is NULL (which it must be since argv is null-terminated), the only way strlen(argv[0]) can do something other than segfaulting is if you are on a braindead system where someone has actually mapped something to address zero. And even if there were a page and some data to read, any strlen() implementation would still either segfault or just return the number of nonzero bytes at address zero. That a language lawyer says that something is undefined behavior does not mean that the behavior is unpredictable. - Thumbprint
@cmaster anything can happen when it is undefined behaviour. Relying on a particular compiler's happenstance treatment of UB is just setting yourself up for trouble for no reason. One example: there are well known instances of bugs where programmers expected a segfault but the optimizer removed the line entirely. - Heavy
@Heavy The optimizer can't remove anything in strlen(argv[0]) because the call has to work in the non-null case. The UB enters the program via input data. The segfault happens within the implementation of strlen(). - Thumbprint
F
10

Note that the C11 standard explicitly allows for argc == 0:

5.1.2.2.1 Program startup

¶1 The function called at program startup is named main. The implementation declares no prototype for this function. It shall be defined with a return type of int and with no parameters:

int main(void) { /* ... */ }

or with two parameters (referred to here as argc and argv, though any names may be used, as they are local to the function in which they are declared):

int main(int argc, char *argv[]) { /* ... */ }

or equivalent;10) or in some other implementation-defined manner.

¶2 If they are declared, the parameters to the main function shall obey the following constraints:

  • The value of argc shall be nonnegative.
  • argv[argc] shall be a null pointer.
  • If the value of argc is greater than zero, the array members argv[0] through argv[argc-1] inclusive shall contain pointers to strings, which are given implementation-defined values by the host environment prior to program startup. The intent is to supply to the program information determined prior to program startup from elsewhere in the hosted environment. If the host environment is not capable of supplying strings with letters in both uppercase and lowercase, the implementation shall ensure that the strings are received in lowercase.
  • If the value of argc is greater than zero, the string pointed to by argv[0] represents the program name; argv[0][0] shall be the null character if the program name is not available from the host environment. If the value of argc is greater than one, the strings pointed to by argv[1] through argv[argc-1] represent the program parameters.
  • The parameters argc and argv and the strings pointed to by the argv array shall be modifiable by the program, and retain their last-stored values between program startup and program termination.

10) Thus, int can be replaced by a typedef name defined as int, or the type of argv can be written as char ** argv, and so on.

The two bullet points saying 'if the value of argc is greater than zero' clearly allow argc == 0, though it would be unusual for that to be the case.

Theoretically, therefore, a program could take precautions against it. Even when argc == 0, argv[0] is still a valid element to read (it is the null pointer), so as long as the code doesn't dereference that null pointer, it should be fine. Many programs, perhaps even most, do not take such precautions; they assume that argv[0] will not be a null pointer.
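One way to take such precautions, sketched here under the guarantee quoted above that argv[argc] is a null pointer, is to walk argv by pointer rather than assume argc is at least 1. The helper name count_parameters is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: count the arguments after the program name by
 * walking argv up to its terminating null pointer. Returns 0 when
 * argv[0] is already NULL (the argc == 0 case). */
int count_parameters(char **argv)
{
    int count = 0;
    if (argv == NULL || argv[0] == NULL) {
        return 0;
    }
    for (char **p = argv + 1; *p != NULL; ++p) {
        ++count;
    }
    return count;
}
```

The explicit check on argv[0] matters: starting the loop at argv + 1 without it would step past the terminator when argc == 0.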

Freddie answered 19/1, 2015 at 7:32 Comment(0)