C argv what is the maximum size of data [duplicate]

4

12

Possible Duplicate:
About command line arguments of main function

How would I determine the maximum size of data I can pass into a C main(int argc, char *argv[])? Is there a macro somewhere in the standard that defines this? Is the data "owned" by the main process (i.e. does my program store this data), or is it "owned" by the operating system, so that I just get a pointer to it?

Powerhouse answered 19/1, 2013 at 22:17 Comment(2)
It can differ greatly depending on the system on which you run the program. Check out my response for reference and for how to find the limit on your system. Fontenot
About the duplicate: the duplicate link title doesn't contain specific enough info to be found. However, the accepted answer there does also answer this question. Powerhouse
B
17

In a POSIX system, there is a value, ARG_MAX, defined in <limits.h> with a minimum acceptable value of _POSIX_ARG_MAX (which is 4096). You can discover the value at run time via the sysconf() function with the _SC_ARG_MAX parameter.

It is often 256 KiB.

The data in argv (both the array of pointers and the strings that they point at) are 'owned' by the program. They can be modified; whether that is sensible depends on your viewpoint. You certainly can't step outside the bounds of what was passed to the main() function without invoking undefined behaviour. Functions such as GNU getopt() do reorganize the arguments when the POSIXLY_CORRECT environment variable is not set. You already have a pointer to the data in the argv that is provided to main().

Empirically, you will often find that the data immediately after the end of the string argv[argc-1] is actually the start of the environment. The main program can be written as int main(int argc, char **argv, char **envp) in some systems (recognized as an extension in the C standard Annex J, §J.5.1), where envp is the same value as is stored in the global variable environ, and is the start of a null-terminated array of pointers to the environment strings.

Blackpoll answered 19/1, 2013 at 22:21 Comment(0)
F
6

ARG_MAX is the maximum length of the arguments for a new process

You will see the error message "Argument list too long" if you try to call a program with too many arguments, most likely in connection with pattern matching:

$ command * 

Only the exec() system call and its direct variants yield this error. They fail with the corresponding error condition E2BIG (defined in <errno.h>).

The shell is not to blame; it just passes this error on to you. In fact, shell expansion itself is not a problem, because exec() is not needed at that point. Expansion is limited only by the virtual memory system resources.

Thus the following commands work smoothly, because instead of handing over too many arguments to a new process, they only make use of a shell built-in (echo) or iterate over the arguments with a control structure (for loop):

/dir-with-many-files$ echo * | wc -c
/dir-with-many-files$ for i in * ; do grep ARG_MAX "$i"; done

There are different ways to learn the upper limit

command: getconf ARG_MAX

system call: sysconf(_SC_ARG_MAX)

system header: ARG_MAX in e.g. <[sys/]limits.h>

In contrast to the headers, sysconf and getconf report the limit that is actually in effect. This matters on systems that allow it to be changed at run time, by reconfiguration, by recompiling (e.g. Linux) or by applying patches (HP-UX 10).

Example usage of sysconf():

#include <stdio.h>
#include <unistd.h>

int main(void) {
    /* Print the limit currently in effect; sysconf() returns a long. */
    printf("ARG_MAX: %ld\n", sysconf(_SC_ARG_MAX));
    return 0;
}

A handy way to find the limits in your headers, if you have cpp installed:

cpp <<EOF
#include <limits.h>
#include <param.h>
#include <params.h>
#include <sys/limits.h>
#include <sys/param.h>
#include <sys/params.h>
arg_max: ARG_MAX
ncargs: NCARGS
EOF

When looking at ARG_MAX/NCARGS, you have to consider the space consumption of both argv[] and envp[] (arguments and environment). For a good estimate of the currently available space, you therefore have to decrease ARG_MAX by at least the results of env|wc -c and env|wc -l * 4 (the factor 4 being the size of a pointer on a 32-bit system; it is 8 on a typical 64-bit system).

POSIX suggests subtracting a further 2048 so that the process can safely modify its environment. A quick estimate with the getconf command:

 expr `getconf ARG_MAX` - `env|wc -c` - `env|wc -l` \* 4 - 2048

The most reliable way to get the currently available space is to test the success of an exec() with increasing length of arguments until it fails. This might be expensive, but at least you need to check only once, the length of envp[] is considered automatically, and the result is reliable.

Alternatively, the GNU autoconf check "Checking for maximum length of command line arguments..." can be used. It works quite similarly.

However, it results in a much lower value (possibly only a quarter of the actual value), both by intention and for reasons of simplicity:

In a loop with increasing n, the check tries an exec() with an argument length of 2^n (but won't check for n higher than 16, that is 512kB). The maximum is ARG_MAX/2 if ARG_MAX is a power of 2. Finally, the value found is divided by 2 (for safety), with the reason given as "C++ compilers can tack on massive amounts of additional arguments".

The actual value

On Linux 2.6.23 and later, it is one quarter of the stack size limit. Kernel code for reference.

Fontenot answered 19/1, 2013 at 22:42 Comment(0)
P
1

main() is not special with regard to what it accepts. What is special is the magic that happens before main() gets called the first time.

You can call main() with whatever you want ...

#include <stdio.h>

char longstring[1024000] = "foo";

int main(int argc, char **argv) {
  char *p = longstring;
  printf("main called with argc == %d", argc);
  if (argv) printf(" and a relevant argv");
  puts("");
  switch (argc) {
    case 1: main(2, NULL); break;
    case 2: main(3, &p); break;
    default: puts("Uff!"); break;
  }
  return 0;
}
Pavid answered 19/1, 2013 at 22:31 Comment(2)
You can do that in C; you can't do that in C++ (and the question is tagged C so your answer is OK). When the system calls main(), there are guarantees such as argc >= 1 and argv[argc] == 0; when you call it, you can impose any rules you like, so your case 1 call is OK because you did it, but would not be OK if the system tried it. Blackpoll
That's what I meant with the "special magic that happens before main() is called the first time". Pavid
M
0

I might be wrong, but I think argc and argv belong to __libc_start_main in libc.so.6.
See "Who calls main?"

Might be helpful :)

Mayramays answered 19/1, 2013 at 22:30 Comment(0)
