Why is this non-null-terminated string printed correctly?
Yesterday, I had my unit test. One of the tasks was to copy a string and find its length without using the string functions. This was the code I wrote:

#include <stdio.h>

int main(){
    char str1[100], str2[100] = {'a'};

    printf("Enter a string\n");
    fgets(str1, sizeof(str1), stdin);

    int i;
    for(i = 0; str1[i] != '\0'; i++){
        str2[i] = str1[i];
    }   

    str2[i] = '\0';

    printf("Copied string = %s", str2);

    printf("Length of string = %d", i-1);
}

I made a rather surprising observation! Even if I commented out str2[i] = '\0', the string would be printed correctly, i.e., without the extra 'a's from the initialization, which, as far as I knew, should not have been overwritten.

After commenting out str2[i] = '\0', I expected to see this output:

test
Copied string = testaaaaaaaaaaaaaaaaaaaaaaaaaaa....
Length of string = 4

This is the output:

test
Copied string = test
Length of string = 4

How is str2 printed correctly? Did the compiler recognize that the string was being copied and silently add the null terminator? I am using gcc, but clang produces similar output.

Spandex answered 3/4, 2019 at 5:36 Comment(0)

str2[100] = {'a'}; does not fill str2 with 100 repeated 'a's. It just sets str2[0] to 'a' and the rest to zero.

As far back as C89:

3.5.7 Initialization

...

Semantics

...

If an object that has static storage duration is not initialized explicitly, it is initialized implicitly as if every member that has arithmetic type were assigned 0 and every member that has pointer type were assigned a null pointer constant. If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate./65/

...

If there are fewer initializers in a list than there are members of an aggregate, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.
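The rule quoted above can be checked with a small sketch. The helper names count_nonzero and nonzero_after_partial_init are hypothetical, made up for this illustration; the array is declared exactly like str2 in the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the nonzero bytes in buf[0..n-1]. */
size_t count_nonzero(const char *buf, size_t n)
{
    assert(buf != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (buf[i] != '\0') {
            count++;
        }
    }
    return count;
}

/* Declares an array the same way the question does and reports how many
   of its 100 elements are nonzero. Only str2[0] has an explicit
   initializer, so the other 99 elements are implicitly set to 0. */
size_t nonzero_after_partial_init(void)
{
    char str2[100] = {'a'};
    return count_nonzero(str2, sizeof str2);
}
```

Calling nonzero_after_partial_init() from main should return 1 on any conforming compiler, since only the first element holds 'a'.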

Airspeed answered 3/4, 2019 at 5:40 Comment(9)
Oh so others are set to 0 which is the same as '\0'? And hence it works? - Spandex
@Spandex Yes, exactly. - Airspeed
Depending on the implementation (and configuration), the remaining 99 chars might be random numbers, null or non-null. - Vallation
@Vallation No. The array is initialized. - Murder
@Vallation nope, they will be 0, for sure. See my answer for the reason. - Misguided
@Vallation The standard says otherwise. - Airspeed
hmm... I was not aware of that specification. Thank you all. - Vallation
AFAIK, C89 is now obsolete, the correct one to quote would be the C11. - Misguided
@SouravGhosh Just pointing out it's been part of the standard all along, since virolino mentioned "depending on the implementation". Even old implementations do it. - Airspeed

First, the rule of initialization for aggregate types[1], quoting C11, chapter 6.7.9 (emphasis mine)

The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject; all subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration.

and,

If an object that has static or thread storage duration is not initialized explicitly, then:

  • if it has pointer type, it is initialized to a null pointer;

  • if it has arithmetic type, it is initialized to (positive or unsigned) zero;

  • if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;

  • if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;

Now, an initialization statement like

char str2[100] = {'a'};

will initialize str2[0] to 'a', and str2[1] through str2[99] with 0, according to the above rule. That 0 value is the null-terminator for strings.

Thus, any string you copy into the array that is shorter than the array (at most 99 characters here) is already followed by a null terminator, provided the remaining elements have not been overwritten since initialization.

So, you're okay to use the array as string and get the expected behavior of that of a string.
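As a rough illustration of why this relies on the zero-filled tail, here is a minimal sketch. All function names are hypothetical and exist only for this example. The copy loop mirrors the one in the question and deliberately never writes '\0'. On a freshly initialized buffer the result is still a valid string, but if the same buffer held a longer string earlier, the old characters remain after the new ones.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring the question's loop: copies the characters
   of src up to, but not including, its terminator and deliberately does
   NOT write '\0' to dst. The caller must make sure dst is large enough. */
size_t copy_without_terminator(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t i;
    for (i = 0; src[i] != '\0'; i++) {
        dst[i] = src[i];
    }
    return i;
}

/* Hypothetical helper: string length without using <string.h>. */
size_t my_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* Fresh buffer: elements 1..99 are zero, so the copy ends up terminated. */
size_t length_after_fresh_copy(void)
{
    char buf[100] = {'a'};
    copy_without_terminator(buf, "test");
    return my_strlen(buf);
}

/* Reused buffer: the 15-character string written first leaves characters
   behind, so the shorter copy is followed by stale data, not by '\0'. */
size_t length_after_reused_copy(void)
{
    char buf[100] = {'a'};
    copy_without_terminator(buf, "a longer string");
    copy_without_terminator(buf, "test");
    return my_strlen(buf);
}
```

Here length_after_fresh_copy() returns 4, while length_after_reused_copy() returns 15 because the buffer then reads "testnger string". Writing the terminator explicitly, as the original program does with str2[i] = '\0', avoids depending on the buffer's earlier contents.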


[1]: Aggregate types:

According to chapter 6.2.5/P21

[...] Array and structure types are collectively called aggregate types.

Misguided answered 3/4, 2019 at 5:49 Comment(3)
You might want to point out that arrays are "aggregates" since the emphasis doesn't make sense by itself. An array is an aggregate and so every member is initialized (recursively), according to the rules above. If the element is a pointer, then to null, if it is an arithmetic type, then to zero. - Rathbone
@Rathbone Right sir, added a note. - Misguided
I am sorry. I could only mark one answer as correct :( - Spandex
