Whenever we write a string, enclosed in double quotes, C automatically creates an array of characters for us, containing that string, terminated by the \0 character.
Those notes are mildly misleading in this case. I shall have to update them.
When you write something like
char *p = "Hello";
or
printf("world!\n");
C automatically creates an array of characters for you, of just the right size, containing the string, terminated by the \0 character.
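A minimal sketch of what "just the right size" means for the literal "Hello". The function names literal_array_size and literal_length are hypothetical, chosen only for this illustration:

```c
#include <stddef.h>
#include <string.h>

/* Size in bytes of the array the compiler creates for the literal "Hello":
   five visible characters plus the terminating '\0'. */
size_t literal_array_size(void)
{
    return sizeof "Hello";
}

/* Number of characters before the terminator, found by scanning for '\0'. */
size_t literal_length(void)
{
    const char *p = "Hello";
    return strlen(p);
}
```

Printing the two results with %zu from main should show 6 and 5: the array is always one byte larger than the visible text.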
In the case of array initializers, however, things are slightly different. When you write
char b[2] = "hi";
the string is merely the initializer for an array which you are creating. So you have complete control over the size. There are several possibilities:
char b0[] = "hi"; // compiler infers size
char b1[1] = "hi"; // error
char b2[2] = "hi"; // No terminating 0 in the array. (Illegal in C++, BTW)
char b3[3] = "hi"; // explicit size matches string literal
char b4[10] = "hi"; // space past end of initializer is always zero-initialized
For b0, you don't specify a size, so the compiler uses the string initializer to pick the right size, which will be 3.
For b1, you specify a size, but it's too small, so the compiler should give you an error.
For b2, which is the case you asked about, you specify a size which is just barely big enough for the explicit characters in the string initializer, but not the terminating \0. This is a special case. It's legal, but what you end up with in b2 is not a proper null-terminated string. Since it's unusual at best, the compiler might give you a warning. See this question for more information on this case.
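To make the b2 case concrete, here is a small sketch that inspects such an array without ever treating it as a string. The helper names (b2_size, b2_has_terminator, b2_equals_hi) are hypothetical, and the block is C only, since this initializer is rejected in C++:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Exactly two bytes: 'h' and 'i'. There is no room for a terminating '\0'. */
static const char b2[2] = "hi";

/* Size of the array in bytes. */
size_t b2_size(void)
{
    return sizeof b2;
}

/* True if some element of the array is '\0', i.e. if it holds a C string. */
bool b2_has_terminator(void)
{
    return memchr(b2, '\0', sizeof b2) != NULL;
}

/* Compare by explicit length; strlen or strcmp would read past the end. */
bool b2_equals_hi(void)
{
    return memcmp(b2, "hi", sizeof b2) == 0;
}
```

The point of the design is that every access is bounded by sizeof b2. Functions that look for a terminator, such as strlen, strcpy, or printf with a plain %s, should not be used on this array.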
For b3, you specify a size which is just right, so you get a proper string in an exactly-sized array, just like b0.
For b4, you specify a size which is larger than necessary, which is no problem. There ends up being extra space in the array, beyond the terminating \0. (As a matter of fact, this extra space will also be filled with \0.) This extra space would let you safely do something like strcat(b4, ", wrld!").
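A sketch of both claims about the oversized array: that the tail is zero-filled, and that the strcat call fits. The function names tail_is_zero_filled and append_in_place are hypothetical, introduced only for this example:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if every byte after the initializer "hi" in a char[10] is '\0'. */
bool tail_is_zero_filled(void)
{
    char b4[10] = "hi";
    for (size_t i = 2; i < sizeof b4; i++) {
        if (b4[i] != '\0')
            return false;
    }
    return true;
}

/* Appends ", wrld!" to "hi" inside a 10-byte array, then copies the whole
   array to out. "hi" (2) + ", wrld!" (7) + '\0' (1) = 10 bytes, so the
   result exactly fills b4 with nothing to spare. */
void append_in_place(char out[10])
{
    char b4[10] = "hi";
    strcat(b4, ", wrld!");
    memcpy(out, b4, sizeof b4);
}
```

Note how tight the fit is: appending one more character would overflow the array, because strcat does no bounds checking of its own.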
Needless to say, most of the time you want to use the b0 form. Counting characters is tedious and error-prone. As Brian Kernighan (one of the creators of C) has written in this context, "Let the computer do the dirty work."
One more thing. You wrote:
and yet the compiler is reorganizing the memory store instructions so that a and c are stored before b in memory to make room for a \0 at the end of the array.
I don't know what's going on there, but it's safe to say that the compiler is not trying to "make room for a \0". Compilers can and often do store variables in their own inscrutable internal order, matching neither the order you declared them, nor alphabetical order, nor anything else you might think of. If under your compiler array b ended up with extra space after it which did contain a \0 as if to terminate the string, that was probably just chance, not because the compiler was trying to be nice to you by making something like printf("%s\n", b) well defined. (Under the two compilers where I tried it, printf("%s\n", b) printed hi^E and hi ??, clearly showing the presence of trailing random garbage, as expected.)
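If you do need to print or format an array like b that has no terminator, one option is to give %s an explicit precision, which caps how many bytes are read. A minimal sketch, where format_bounded is a hypothetical helper name and not something from the question:

```c
#include <stddef.h>
#include <stdio.h>

/* Formats at most n bytes of src into out. With a precision on %s, the
   source array does not need to contain a '\0' as long as it has at
   least n bytes. Returns the number of characters snprintf would write. */
int format_bounded(char *out, size_t out_size, const char *src, int n)
{
    return snprintf(out, out_size, "%.*s", n, src);
}
```

For direct output, the equivalent call would be printf("%.*s\n", (int)sizeof b, b), which prints only the two characters stored in b.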
b is initialized with the first 2 characters from the string literal, but does not contain the null terminator. (b is not a string). – Elmaelmajian
char *array_of_strings[] = {"hi", "mom"};. You can call it a string (if it has a 0 terminator, aka ASCII nul (not NULL, @Baard)), or you can call it a char array. – Megrim
c char array initialized string literal zero terminated is what I'd do if I was wondering... which found some SO questions that might be duplicates). I still like this title better. (Although not as much as before you made me think about it more carefully :/) "string array" seemed like a wrong title, though, like argv[] arrays terminated with NULL pointers. Maybe there's a 3rd option we'd both like. – Megrim
0 or 0LL. It's not wrong to write "null-terminated char array", but I prefer to write "0-terminated" when I'm talking about char or other integer types, only using the word null at all to talk about pointers. A special term makes more sense for pointers since the object-representation may not be all-0. – Megrim