What is the type of command-line argument `argv` in C?
M

7

18

I'm reading a section from C Primer Plus about command-line argument argv and I'm having difficulty understanding this sentence.

It says that,

The program stores the command line strings in memory and stores the address of each string in an array of pointers. The address of this array is stored in the second argument. By convention, this pointer to pointers is called argv, for argument values.

Does this mean that the command line strings are stored in memory as an array of pointers to array of char?

Meadows answered 23/8, 2016 at 8:16 Comment(8)
Does this mean that the command line strings are stored in memory as an array of pointers to array of char? Yes. IMHO the whole confusion is caused by The program stores the command line strings in memory ... ; the point is that all this happens before main() is called. Main() is just a function, which is called with two arguments: an int and a pointer to an array of string pointers.Sherer
@Sherer argv isn't "a pointer to an array of string pointers", if we're being pedantic. This whole question is about the difference between "pointer to an array" and "pointer to the first element of an array", really.Sclerotic
The OP's confusion is IMHO about the external side ("crt0"), which sets up the args, and the internal side (main()), which receives them. That is also the cause of the difference between the (perceived: decayed) types. Really.Sherer
@Sherer this is a "language lawyer" question which means it is about Standard C, in which there is no "crt0" and the setup of the arguments doesn't matter, so long as argv behaves as specified in the C StandardSclerotic
The "language-lawyer" tag was added later (by someone who did not understand the nature of the question, IMHO) And I quoted "crt0" for a reason. Really.Sherer
@joop: On Linux (and other OSes that use the SysV ABI), the argv array is in memory at process startup, in a format suitable for passing by reference to main. So the crt0 libc startup code doesn't have to do anything with argv except pass a pointer to it to main(). In Linux, the kernel puts argv and the environment block at the top of the user-space stack. The x86 flavours of the System V ABI are online here.Zeiler
Possible duplicate of Command line arguments: argvOctavo
Another candidate is What does int argc, char *argv[] mean?.Octavo
H
17

Directly quoting from C11, chapter §5.1.2.2.1/p2, program startup, (emphasis mine)

int main(int argc, char *argv[]) { /* ... */ }

[...] If the value of argc is greater than zero, the array members argv[0] through argv[argc-1] inclusive shall contain pointers to strings, [...]

and

[...] and the strings pointed to by the argv array [...]

So, basically, argv is a pointer to the first element of an array of strings (see the NOTE below). This can be made clearer from the alternative form,

int main(int argc, char **argv) { /* ... */ }

You can rephrase that as "pointer to the first element of an array of pointers to the first element of null-terminated char arrays", but I'd prefer to stick to strings.


NOTE:

To clarify the usage of "pointer to the first element of an array" in the answer above, §6.3.2.1/p3 says:

Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type “array of type” is converted to an expression with type “pointer to type” that points to the initial element of the array object and is not an lvalue. [...]

Hildebrandt answered 23/8, 2016 at 8:22 Comment(17)
Then is the address of the whole string(&"string") stored in the array? Or is the address of the initial element of the string stored in the array?Meadows
@Meadows Address of the initial element of the string.Hildebrandt
...and since the array is passed to a function (main) it collapses to a pointer. So argv is a pointer that points to an array. Try sizeof argv and argv++ and you will see that argv is a pointer.Homeopathic
Then why do you say that "...as array of pointers to null-terminated char arrays", not "...as array of pointers to null-terminated char"? I think there is a difference between those two sentences.. or am I wrong?Meadows
@Sourac Ghosh Sorry my bad! I thought "pointer to a null-terminated char arrays" means pointer to the whole string(&"String")Meadows
@KlasLindbäck OK, appended the answer to clear out any confusions. :)Hildebrandt
@KlasLindbäck; So argv is a pointer that points to an array: No. argv is a pointer that points to a pointer to a char.Splurge
@SouravGhosh a string is something that can be stored in an array. A string is not an array (or vice versa)Sclerotic
@Splurge If an array is passed to main then argv points to an array. The type of argv is pointer to pointer to char, though.Homeopathic
@KlasLindbäck; In C you can't pass an array to a function but a pointer to the first element of the array.Splurge
@Sclerotic Yes, in general you're very right, but isn't it like that a string and a null terminated char array are same things? §7.1.1/1 A string is a contiguous sequence of characters terminated by and including the first null character.Hildebrandt
@Splurge That's what I've been saying.Homeopathic
@SouravGhosh "array" is not mentioned in your quote . You wouldn't say that 10 was the same thing as an int variable would you?Sclerotic
@Sclerotic Nopes, obviously not. :) Which statement in my answer you're referring to exactly, please? (or if you want, please feel free to edit it).Hildebrandt
I'm referring to your comment "a null-terminated char array is a string". should be "a null-terminated char array contains a string" . "string" is to "array" as "10" is to "int variable"Sclerotic
@Sclerotic Agreed. Poor choice of words on my side. Cannot edit that now, will delete and add a new one.Hildebrandt
@Meadows For your [comment], the proper reply should be, there is no null-terminated char. A null-terminated char array contains (or, called as) a string.Hildebrandt
S
27

argv is of type char **. It is not an array. It is a pointer to pointer to char. The command-line arguments are stored in memory, and the address of the first character of each one is stored in an array. This array is an array of pointers to char. argv points to the first element of this array.

                  Some
                  array

                 +-------+        +------+------+-------------+------+
argv ----------> |       |        |      |      |             |      |
                 | 0x100 +------> |      |      | . . . . . . |      |  Program Name (1)
         0x900   |       |        |      |      |             |      |
                 |       |        +------+------+-------------+------+
                 +-------+         0x100  0x101
                 |       |        +------+------+-------------+------+
                 | 0x205 |        |      |      |             |      |
         0x904   |       +------> |      |      | . . . . . . |      |  Arg1
                 |       |  .     |      |      |             |      |
                 +-------+        +------+------+-------------+------+
                 |  .    |  .      0x205  0x206
                 |  .    |
                 |  .    |  .
                 |  .    |
                 +-------+  .     +------+------+-------------+------+
                 |       |        |      |      |             |      |
                 | 0x501 +------> |      |      | . . . . . . |      |  Arg(argc-1)
                 |       |        |      |      |             |      |
                 +-------+        +------+------+-------------+------+
                 |       |         0x501  0x502
                 | NULL  |
                 |       |
                 +-------+


0xXXX Represents memory address


1. In most cases argv[0] represents the program name, but if the program name is not available from the host environment, argv[0][0] is the null character.

Splurge answered 23/8, 2016 at 8:25 Comment(14)
Is the address of the whole string (&"string") stored in the array?Meadows
@Jin; No. It is the address of the first element of the string.Splurge
@Splurge argv is of type char **. It is not array..... I'm confused... Can you please refer my answer once?Hildebrandt
@SouravGhosh; I am more confused than you. I don't know why standard says it array. Let me go through it in detail.Splurge
@Splurge Then I guess my argument that "command line strings are stored in memory as an array of pointers to array of char? " is wrong! It should be corrected as "...as an array of pointers to char"Meadows
@Jin; Yes. It should be.Splurge
@Meadows To be nitpicky, it should read ....pointers to null-terminated char array...see the last line of my answer. :)Hildebrandt
@SouravGhosh But as haccks says, shouldn't it be array of pointers to char, not array of pointers to char array?Meadows
@SouravGhosh; OK. Ultimately argv is pointing to the first element of an array of char * and that can be the reason standard referred it as array in this particular case. But the type of argv is char **.Splurge
Really helpful ASCII schema !Pistoleer
As a nitpick, "some array" should have one more element that stores a null pointer.Continuum
@jamesdlin; Good catch.Splurge
Also notable: argv[0] isn't guaranteed to hold the program name (but does, most of the time) but only something that represents the program name: https://mcmap.net/q/173039/-is-quot-argv-0-name-of-executable-quot-an-accepted-standard-or-just-a-common-convention/1116364Nonparticipating
@DanielJour: Isn't "represents" just there because filenames can be represented in other ways than as "null-terminated multibyte strings" (for example, NTFS uses a UTF-16 encoding), and they needed to specify which representation they're using here? There are bigger issues here than the word "represents", like the fact that "the name used to invoke the program" isn't very specific -- it needn't be a filename (e.g. Unix login shells), and even if it was, nobody said what directory or filename extension might have been used to resolve it. Or it could just be "".Cruzcruzado
S
11

This thread is such a train wreck. Here is the situation:

  • There is an array with argc+1 elements of type char *.
  • argv points to the first element of that array.
  • There are argc other arrays of type char and various lengths, containing null terminated strings representing the commandline arguments.
  • The elements of the array of pointers each point to the first character of one of the arrays of char; except for the last element of the array of pointers, which is a null pointer.

Sometimes people write "pointer to array of X" to mean "pointer to the first element of an array of X". You have to use the contexts and types to work out whether or not they actually did mean that.

Sclerotic answered 23/8, 2016 at 9:17 Comment(1)
There must be an exact duplicate somewhere to this question.Octavo
N
1

Yes, exactly.

argv is a char** or char*[], or simply an array of char* pointers.

So argv[0] is a char* (a string) and argv[0][0] is a char.

Naylor answered 23/8, 2016 at 8:19 Comment(12)
If it were an array of zero-terminated strings (char[], not char*), it'd just be char[].Shivers
@Rhymoid Not zero terminated chars, zero terminated stringsNaylor
can you give an example of non-zero-terminated _string_s?Hildebrandt
@Naylor It's confusing to describe it as such. There's also the practice of storing "a list of\0strings like\0so\0", which is a (zero-terminated) array of zero-terminated strings. This is not how argv works, but it is used (e.g. in the cmdline procfile of the Linux kernel). char* is not a string.Shivers
@SouravGhosh {_size:5, data: {'a','b','c','d','e'}} Here's one.Naylor
@Naylor ok, so data is a string ?Hildebrandt
@Splurge it points to the first element of an array of char pointers.Sclerotic
@M.M; That doesn't make argv an array of char *.Splurge
argv is not "simply an array of char* pointers", that's misleading, please consider removing that part of your answer.Groff
@Rhymoid the standard allows the commandline arguments to be laid out in memory like that, with each pointer pointing to the start of the next string etc.Sclerotic
@Sclerotic I'm sure it does, but it's not what argv itself is.Shivers
@Rhymoid in this context you'd say that argv works by pointing to the first of the pointers which point into that series of stringsSclerotic
T
0

Yes.

The type of argv is char**, i.e. a pointer to pointer to char. Basically, if you consider a char* to be a string, then argv is a pointer to an array of strings.

Thither answered 23/8, 2016 at 8:19 Comment(5)
The strings don't have to be in an array, but the pointers to them are. Also, the end of the array of pointers is a null.Incommunicable
the type of argv is char**, i.e. a pointer to an array of pointers to arrays of char: No. No way.Splurge
@Splurge Care to elaborate?Thither
char ** is read as pointer to pointer to char and not pointer to an array of pointers to arrays of char. The latter would make argv of type char (*(*)[])[].Splurge
@Splurge My bad. I was hoping to make it more intuitive, but it was just plain incorrect.Thither
T
0

Strictly speaking, there are a number of properties that must be present for argv to be an array. Let us consider some of those:

¹/ A null pointer cannot point to an array, because a null pointer is guaranteed to compare unequal to a pointer to any object. Therefore, argv in the following code can't be an array:

#include <assert.h>
int main(int argc, char *argv[]) {
    if (argv) return main(0, 0);
    assert(argv == 0); // argv is a null pointer, not to be dereferenced
}

²/ It's invalid to assign to an array. For example, char *argv[] = { 0 }; argv++; is a constraint violation, but int main(int argc, char *argv[]) { argv++; } compiles and runs fine. Thus we must conclude from this point that argv is not an array when declared as an argument, and is instead a pointer that (might) point into an array. (This is actually the same as point 1, but coming from a different angle, as calling main with a null pointer as argv is actually reassigning argv, something we can't do to arrays).

³/ ... As the C standard says:

Another use of the sizeof operator is to compute the number of elements in an array: sizeof array / sizeof array[0]

For example:

#include <assert.h>
#include <stddef.h> // for size_t
int main(int argc, char *argv[]) {
    size_t size = argc+1; // including the NULL
    char *control[size];
    assert(sizeof control / sizeof *control == size); // this test passes, because control is actually an array
    assert(sizeof argv   / sizeof *argv    == size); // this test fails for all values of size != 1, indicating that argv isn't an array
}

⁴/ The unary & (address-of) operator is defined such that, when applied to an array, it yields the same address value with a different type, so, for example:

#include <assert.h>
int main(int argc, char *argv[]) {
    char *control[42];
    assert((void *) control == (void *) &control); // this test passes, because control is actually an array
    assert((void *) argv    == (void *) &argv); // this test fails, indicating that argv isn't an array
}
Turku answered 15/8, 2022 at 22:54 Comment(0)
A
-1

argv is a pointer to the first element of an array of pointers to characters.

The following code displays the value of argv, the contents of argv and performs a memory dump on the memory pointed at by the contents of argv. Hopefully this illuminates the meaning of the indirection.

#include <stdio.h>

// Dump every character of a string, with its address, up to and including the '\0'
void print_memory(const char * print_me)
{
    const char * p;
    for (p = print_me; *p != '\0'; ++p)
    {
        printf ("%p: %c\n", (void *) p, *p);
    }

    // Print the '\0' for good measure
    printf ("%p: %c\n", (void *) p, *p);

}

int main (int argc, char ** argv) {
    int i;

    // Print argv (%p requires a void * argument)
    printf ("argv: %p\n", (void *) argv);
    printf ("\n");

    // Print the values of argv
    for (i = 0; i < argc; ++i)
    {
        printf ("argv[%d]: %p\n", i, (void *) argv[i]);
    }
    // Print the NULL for good measure
    printf ("argv[%d]: %p\n", i, (void *) argv[i]);
    printf ("\n");

    // Print the values of the memory pointed at by argv
    for (i = 0; i < argc; ++i)
    {
        print_memory(argv[i]);
    }

    return 0;
}

Sample Run:

$ ./a.out Hello World!
argv: ffbfefd4

argv[0]: ffbff12c
argv[1]: ffbff134
argv[2]: ffbff13a
argv[3]: 0

ffbff12c: .
ffbff12d: /
ffbff12e: a
ffbff12f: .
ffbff130: o
ffbff131: u
ffbff132: t
ffbff133:
ffbff134: H
ffbff135: e
ffbff136: l
ffbff137: l
ffbff138: o
ffbff139:
ffbff13a: W
ffbff13b: o
ffbff13c: r
ffbff13d: l
ffbff13e: d
ffbff13f: !
ffbff140:

$

You have this big contiguous array ranging from ffbff12c to ffbff140 which contains the command line arguments (the standard does not guarantee that it is contiguous, but this is how it's generally done). argv just contains pointers into that array so you know where to look for the words.

argv is a pointer... to pointers... to characters

Aviatrix answered 23/8, 2016 at 15:59 Comment(2)
It's not in the C or POSIX standard, but it may be guaranteed to be contiguous by the System V ABI standard.Humility
Page 34 of software.intel.com/sites/default/files/article/402129/… describes the details of what is and is not required for that array: "Argument strings, environment strings, and the auxiliary information appear in no specific order within the information block and they need not be compactly allocated." [but the "information block" itself is a well-defined area at the top of memory that is defined to contain all the strings]. Obviously this is only relevant to systems that this standard applies to.Humility

© 2022 - 2024 — McMap. All rights reserved.