What happens if I define a 0-size array in C/C++?
W

8

153

Just curious, what actually happens if I define a zero-length array int array[0]; in code? GCC doesn't complain at all.

Sample Program

#include <stdio.h>

int main() {
    int arr[0];
    return 0;
}

Clarification

I'm actually trying to figure out if zero-length arrays initialised this way, instead of being pointed at like the variable length in Darhazer's comments, are optimised out or not.

This is because I have to release some code out into the wild, so I'm trying to figure out if I have to handle cases where the SIZE is defined as 0, which happens in some code with a statically defined int array[SIZE];

I was actually surprised that GCC does not complain, which led to my question. From the answers I've received, I believe the lack of a warning is largely due to supporting old code which has not been updated with the new [] syntax.

Because I was mainly wondering about the error, I am tagging Lundin's answer as correct (Nawaz's was first, but it wasn't as complete) -- the others pointed out its actual use for tail-padded structures, which, while relevant, isn't exactly what I was looking for.

Wheen answered 15/3, 2012 at 15:12 Comment(14)
Sounds like GCC doesn't complain at all, and you get to initialize it. What more, exactly, do you need to know, and can you figure it out by doing it yourself?Talamantes
You may find this topic interesting: https://mcmap.net/q/24855/-array-of-zero-lengthMarylandmarylee
@AlexanderCorwin: Unfortunately in C++, with undefined behavior, non-standard extensions, and other anomalies, trying something out yourself is often not a path to knowledge.Bilateral
https://mcmap.net/q/24856/-zero-length-c-array-binds-to-pointer-type - this as wellMarylandmarylee
what would be the purpose ? no offence but this is a bad question ... there's no possible productive use of creating an array of 0, furthermore if you open a project and copy paste you can press the compile button and see for yourself without asking this as a question.Electioneer
@JustinKirk I just got trapped by that too by testing and seeing it worked. And due to the criticism I received in my post, I learnt that testing and seeing it working does not mean it is valid and legal. So a self test is not valid sometimes.Baba
@JustinKirk, see Matthieu's answer for an example of where you would use it. It also might come in handy in a template where the array size is a template parameter. The example in the question is obviously out of context.Cumbrance
Every time a question like this pops up, I'm stumped as to why -pedantic isn't the default with g++. So many non-portable extensions...Dharna
@JustinKirk: What's the purpose of [] in Python or even "" in C? Sometimes, you've got a function or a macro that requires an array, but you don't have any data to put in it.Haven
@dan04: Bad example. "" is a totally valid string constant that's one byte long (\0). In C, there is no situation where you want an actual "zero-sized array" where you could not just pass NULL without all the same caveats. Lundin's answer invoking the standard and Matthieu's answer explaining why it's used sometimes anyway cover the issue in its entirety.Fsh
What is "C/C++"? These are two separate languagesHuberty
possible duplicate of Array with size 0Devisee
The REASON this is a warning rather than an error in many compilers is this is the old way of declaring a variant-sized array at the end of a struct. This always had to be allocated on the heap though.Patroon
@Haven The POINT of [] in Python would be to create an empty list. And “” in C would denote an empty string, which has plenty of uses - give it so nothing is printed in a certain location, or initialize a larger array of char without writing {‘\0’} as the initializer-list. (Sorry about lack of nice formatting, on mobile right now.)Filmer
D
111

An array cannot have zero size.

ISO 9899:2011 6.7.6.2:

If the expression is a constant expression, it shall have a value greater than zero.

The above text applies to a plain array (paragraph 1). For a VLA (variable length array), the behavior is undefined if the expression's value is less than or equal to zero (paragraph 5). This is normative text in the C standard. A compiler is not allowed to implement it differently.

gcc -std=c99 -pedantic gives a warning for the non-VLA case.
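
To make the two cases above concrete, here is a minimal sketch rather than a definitive recipe. It assumes a C11 compiler (for _Static_assert) that also supports VLAs. SIZE, table, table_length and sum_of_indices are hypothetical names chosen for illustration; they do not come from the question.

```c
#include <stddef.h>

#ifndef SIZE
#define SIZE 8 /* hypothetical build-time configuration value */
#endif

/* Constant case: refuse to compile when SIZE is zero or negative. */
_Static_assert(SIZE > 0, "SIZE must be greater than zero");

int table[SIZE];

/* Number of elements in the fixed-size table. */
size_t table_length(void) {
    return sizeof table / sizeof table[0];
}

/* VLA case: the size is only known at run time, so it is checked before the
 * declaration, because a VLA whose size is <= 0 is undefined behavior.
 * Returns 0 + 1 + ... + (n - 1), or -1 if n is not positive. */
long sum_of_indices(int n) {
    if (n <= 0) {
        return -1;
    }
    int values[n];
    long total = 0;
    for (int i = 0; i < n; ++i) {
        values[i] = i;
    }
    for (int i = 0; i < n; ++i) {
        total += values[i];
    }
    return total;
}
```

With this in place, building with SIZE defined as 0 should fail at compile time regardless of whether the compiler accepts zero-length arrays as an extension, and the run-time check keeps the VLA path out of undefined behavior.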

Dogy answered 15/3, 2012 at 15:38 Comment(21)
where did you get that? C99 6.7.5.2 §5: If the size is an expression that is not an integer constant expression [...] each time it is evaluated it shall have a value greater than zero.Mustang
"it must actually give an error" - the distinction between "warnings" and "errors" isn't recognized in the standard (it only mentions "diagnostics"), and the only situation where compilation must stop [i.e. the real-world difference between warning and error] is on encountering an #error directive.Mulberry
FYI, as a general rule, the (C or C++) standards only state what compilers must allow, but not what they must disallow. In some cases, they will state that the compiler should issue a "diagnostic" but that's about as specific as they get. The rest is left to the compiler vendor. EDIT: What Random832 said too.Constance
Needs a C++ quote too, in order to answer the question.Huberty
@Mulberry In practice, all compilers build the binary executable even if warnings are present, but they don't build it if errors are present. A compiler isn't allowed to build a binary containing zero-length arrays. Whether it likes to inform the user about a failed build through a warning or error, is perhaps implementation-defined.Dogy
@Dogy "A compiler isn't allowed to build a binary containing zero-length arrays." The standard says absolutely nothing of the sort. It only says that it must generate at least one diagnostic message when given source code containing an array with a zero-length constant expression for its size. The only circumstance under which the standard forbids a compiler to build a binary is if it encounters an #error preprocessor directive.Mulberry
@Mulberry And what's your point? The standard doesn't state that the compiler must build anything at all, either. But for the compiler to be a useful tool, in the real world, it must 1) generate a binary, 2) ensure that the binary behaves according to the source code, and 3) ensure that the source code is correct for the relevant programming language. Whether the C standard allows it to skip any of these steps is irrelevant in the real world, because if you remove any of these 3, the compiler is no longer a useful tool, but merely a waste of space on the HD.Dogy
@Dogy Generating a binary for all correct cases satisfies #1, and generating or not generating one for incorrect cases would not affect it. Printing a warning is sufficient for #3. This behavior has no relevance to #2, since the standard does not define the behavior of this source code.Mulberry
Just a note, GCC's -pedantic flag never produces errors, only warnings (assuming -Werror isn't on), this is what -pedantic-errors is for.Karafuto
@Lundin: The point is that your statement is mistaken; conforming compilers are allowed to build a binary containing zero-length arrays, as long as a diagnostic is issued.Sheave
@Random832: The standard requires that an #error directive supply something more specific than "a diagnostic", but it doesn't require that it delete pre-existing output files in case of a build failure, it can't really mandate anything about the consequences after any required messages have been produced.Unsegregated
@KeithThompson: If a compiler explicitly documents that zero-length arrays are allowed as an extension, could that eliminate the requirement for a diagnostic? Looking at the list of common extensions in C89, it would appear that the Standard regards a keyword asm as an extension, but would not regard a keyword like __asm as an extension. Would the distinction be relevant if compilers were required to regard as ill-formed code which uses asm as a keyword in ways that would be syntactically invalid if it were an identifier?Unsegregated
@Unsegregated What does the language "shall not successfully translate", which is used for no other condition but the presence of #error, actually mean then?Mulberry
@supercat: No, int arr[0]; still requires a diagnostic for any conforming implementation. (Very commonly C compilers are not conforming in their default mode.) N1570 5.1.1.3.Sheave
@Random832: "Shall not successfully translate" means that the system must provide a means of distinguishing well-formed programs from programs that have a #error directive. If the build system leaves behind an old a.out after a failed compilation attempt, and someone decides to run it despite the compilation failure, the Standard would impose no requirements on what the file might do (since it may very well have been generated from a completely unrelated set of source texts).Unsegregated
@KeithThompson: The only requirement the Standard imposes when an implementation is given any source text which is ill-formed but does not contain #error is that the implementation must issue at least one diagnostic. If documenting an extension does not waive the requirement for a diagnostic, then in what case would the presence or absence of documentation regarding an extension affect whether or not a compiler is conforming?Unsegregated
@supercat: As far as I know, nothing. For example, an extension might define the behavior of something whose behavior is not defined by the standard. In the absence of any documentation of an extension, the behavior is simply undefined, even if it happens to behave consistently.Sheave
@KeithThompson: If defined behaviors beyond what is otherwise mandated were considered "extensions" using the terminology of the standard, it would seem odd that the authors of the Standard would have omitted from J2 one that their rationale claimed was common to the majority of current implementations, while including others that are far less common.Unsegregated
@KeithThompson: A major intention of the Standard was that if program X worked usefully for some implementation on platform Y, then a C89 implementation on platform Y should be able to process the same program just as well with minimal changes. Many programs that were written before C89 were laced with implementation-specific keywords, and the authors of the Standard didn't want to claim such programs should be considered "ill-formed", rather than merely being obviously non-portable.Unsegregated
@KeithThompson: What do you make of e.g. N1570, Annex J.5.12? That would imply to me that predefined macros which do start with underscores are not considered "extensions", but that an implementation may define such macros without leading underscores if they are documented. If pre-defining macros without underscores would make a compiler non-conforming even if all such macros are documented, that would imply the Standard was saying that a behavior it forbids is a "common extension".Unsegregated
Annex J is non-normative. Furthermore, J.5p1 clearly says, "The inclusion of any extension that may cause a strictly conforming program to become invalid renders an implementation nonconforming". An implementation that predefined FOOBAR as a macro, for example, is non-conforming because it breaks any strictly conforming program that uses FOOBAR as an identifier. N1570 4p6 describes which extensions a conforming implementation may have; it doesn't say that a non-conforming extension is not an extension.Sheave
E
108

As per the standard, it is not allowed.

However it's been current practice in C compilers to treat those declarations as a flexible array member (FAM) declaration:

C99 6.7.2.1, §16: As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member.

The standard syntax of a FAM is:

struct Array {
  size_t size;
  int content[];
};

The idea is that you would then allocate it so:

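#include <stdlib.h>

void foo(size_t x) {
  /* sizeof(struct Array) covers the header and any padding before content */
  struct Array* array = malloc(sizeof(struct Array) + x * sizeof(int));
  if (array == NULL) return;

  array->size = x;
  for (size_t i = 0; i != x; ++i) {
    array->content[i] = 0;
  }

  free(array);  /* release the block once it is no longer needed */
}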

You might also use it statically (gcc extension):

struct Array a = { 3, { 1, 2, 3 } };

This is also known as tail-padded structures (this term predates the publication of the C99 Standard) or struct hack (thanks to Joe Wreschnig for pointing it out).

However, this syntax was standardized (and its effects guaranteed) only in C99. Before that, a constant size was necessary.

  • 1 was the portable way to go, though it was rather strange.
  • 0 was better at indicating intent, but not legal as far as the Standard was concerned and supported as an extension by some compilers (including gcc).

The tail padding practice, however, relies on the fact that storage is available (careful malloc) so is not suited to stack usage in general.

Extort answered 15/3, 2012 at 15:26 Comment(17)
@Lundin: I have not seen any VLA here, all the sizes are known at compile time. The flexible array term comes from gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Zero-Length.html and does qualify int content[]; here as far as I understand. Since I am not too savvy on C terms of the art... could you confirm whether my reasoning seems correct?Extort
@MatthieuM.: C99 6.7.2.1, §16: As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member.Mustang
This idiom is also known by the name "struct hack", and I've met more people familiar with that name than "tail-padded structure" (never heard it before except maybe as a generic reference to padding a struct for future ABI compatibility) or "flexible array member" which I first heard in C99.Fsh
@MatthieuM.: what you called static usage is a non-standard GNU extensionMustang
I can't edit this myself because edits must be at least 6 characters, but if the OP could make (struct Array).size (instead of _size) spelled the same in both struct definition and in foo(), that would make the code example clearer.Latecomer
@wkschwartz: Oops! Sorry for the slip up, and thanks for the comment :)Extort
@MatthieuM. perhaps you could explain why this all works this way? See https://mcmap.net/q/24858/-allocating-a-dynamic-array-in-a-dynamically-allocated-struct-struct-of-arrays/1286628 (in particular, the questions in bold) (yes this is a shameless plug for my own question post)Latecomer
Using an array size of 1 for the struct hack would avoid having compilers squawk, but was only "portable" because compiler writers were nice enough to acknowledge such usage as a de-facto standard. Were it not for the prohibition on zero-sized arrays, programmers' consequent usage of single-element arrays as a crummy substitute, and compiler writers' historical attitude that they should serve programmers' needs even when not required to by the Standard, compiler writers could have easily and usefully optimized foo[x] to foo[0] whenever foo was a single-element array.Unsegregated
@supercat: Good remark, I had never thought about the potential optimizations that could have been derived from a strict reading of this part of the Standard. Even more interesting, compilers could have thus assumed that x was necessarily 0, or if x was known not to be 0 that this path would not be executed (since it would be Undefined Behavior).Extort
@MatthieuM.: The hyper-modern philosophy of UB didn't take off until well after the struct hack was deprecated, so I don't think those optimizations would have been an issue. On the other hand, replacing foo[x] with foo[0] would have been both easy and useful in many practical situations (e.g. a program which stores data item n in foo[n>>8][n & 255] and has a macro-defined size limit which might for many builds be less than 255). Implementations on some processors might also optimize for the case where arrays have upper size limits of two elements, 128 bytes, or 256 bytes.Unsegregated
"Normally, it is not allowed." - That is incorrect. It is always not allowed. Note, that OP asked for a 0-size array, not an array with unspecified size. -> a[0] vs. a[]. You answer is useful and I did upvoted it but it loses the focus to OP´s problem.Fissile
@RobertSsupportsMonicaCellio: My understanding is that many compilers have allowed a[0] as an extension to specify a flexible array member for... eternity. So, disallowed by the standard, but allowed in practice.Extort
@MatthieuM. IMHO You should 1. show the relation of a zero-size array to a flexible array member inside of the answer explicitly (that is an extension to fam's) and 2. mention explicitly that is non-standard. - The use of the quote from C99 suggests the impression to me that the zero-size array would be implemented as part of flexible array members per the C standard instead of it is in fact relied on a non-standard compiler extension.Fissile
@RobertSsupportsMonicaCellio: It was shown explicitly in the answer, but at the end. I've front-loaded the explanation as well, to make it clearer from the get go.Extort
@MatthieuM. Now, it is perfect. :-)Fissile
The reason why I was so nitpicking is because I shared a link to your answer and thought at that moment of time, a zero-sized array would be allowed by the standard (as it gave me the impression in the first place) as incomplete struct member. Later I got hinted, it is not.Fissile
@RobertSsupportsMonicaCellio: No worries, your criticism was constructive and we all benefit from the (better) revised answer :)Extort
I
60

In Standard C and C++, a zero-size array is not allowed.

If you're using GCC, compile with the -pedantic option. It will give a warning saying:

zero.c:3:6: warning: ISO C forbids zero-size array 'a' [-pedantic]

For C++, it gives a similar warning.

Indescribable answered 15/3, 2012 at 15:15 Comment(7)
In Visual C++ 2010: error C2466: cannot allocate an array of constant size 0Cumbrance
-Werror simply turns all warnings into errors, that doesn't fix the incorrect behavior of the GCC compiler.Dogy
C++ Builder 2009 also correctly gives an error: [BCC32 Error] test.c(3): E2021 Array must have at least one elementDogy
Apparently zero-sized arrays will be allowed in C++11.Summitry
@Summitry Zero-sized arrays, or VLAs?Dogy
Instead of -pedantic -Werror, you could also just do -pedantic-errorsLubbock
A zero-sized array is not quite the same thing as a zero-sized std::array. (Aside: I recall but can't find the source that VLAs were considered and explicitly rejected from being in C++.)Fsh
A
29

It's totally illegal, and always has been, but a lot of compilers neglect to signal the error. I'm not sure why you want to do this. The one use I know of is to trigger a compile time error from a boolean:

char someCondition[ condition ];

If condition is false, the array size is 0 and I should get a compile time error. Because many compilers do allow this, however, I've taken to using:

char someCondition[ 2 * condition - 1 ];

This gives a size of either 1 or -1, and I've never found a compiler which would accept a size of -1.
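
The trick above can be wrapped in a macro. This is only a sketch: COMPILE_TIME_ASSERT and the typedef names are hypothetical, and the !! is an addition that normalizes any nonzero condition to 1 so the size is always 1 or -1.

```c
/* Pre-C11 compile-time assertion: the typedef names an array of size 1 when
 * the condition holds and of size -1 (a constraint violation) when it does
 * not. The !! normalizes any nonzero value to 1. */
#define COMPILE_TIME_ASSERT(name, condition) \
    typedef char name[2 * !!(condition) - 1]

COMPILE_TIME_ASSERT(int_has_at_least_16_bits, sizeof(int) >= 2);
COMPILE_TIME_ASSERT(char_is_one_byte, sizeof(char) == 1);

/* Uncommenting the next line is expected to stop the build, since the
 * array size would be -1:
 * COMPILE_TIME_ASSERT(int_is_huge, sizeof(int) >= 1024);
 */
```

A typedef is used instead of a variable so that a passing assertion allocates no storage. On a C11 compiler, _Static_assert does the same job with a readable message.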

Anchovy answered 15/3, 2012 at 15:31 Comment(4)
This is an interesting hack to use it for.Wheen
It's a common trick in metaprogramming, I think. I wouldn't be surprised if the implementations of STATIC_ASSERT used it.Anchovy
Why not just: #if condition \n #error whatever \n #endifBenedick
@Benedick because the condition may not be known at preprocessing time, only compile timeUstulation
C
11

Another use of zero-length arrays is for making variable-length objects (pre-C99). Zero-length arrays are different from flexible arrays, which have [] without the 0.

Quoted from gcc doc:

Zero-length arrays are allowed in GNU C. They are very useful as the last element of a structure that is really a header for a variable-length object:

 struct line {
   int length;
   char contents[0];
 };
 
 struct line *thisline = (struct line *)
   malloc (sizeof (struct line) + this_length);
 thisline->length = this_length;

In ISO C99, you would use a flexible array member, which is slightly different in syntax and semantics:

  • Flexible array members are written as contents[] without the 0.
  • Flexible array members have incomplete type, and so the sizeof operator may not be applied.
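
For comparison, a sketch of the same struct line written with a C99 flexible array member. The helper line_create is hypothetical, and length is declared as size_t here rather than the int used in the GCC example.

```c
#include <stdlib.h>
#include <string.h>

struct line {
    size_t length;
    char contents[]; /* C99 flexible array member: no 0 between the brackets */
};

/* Allocates a header plus room for a copy of text (without the terminator).
 * Returns NULL if allocation fails; the caller releases it with free(). */
struct line *line_create(const char *text) {
    size_t length = strlen(text);
    struct line *result = malloc(sizeof(struct line) + length);
    if (result == NULL) {
        return NULL;
    }
    result->length = length;
    memcpy(result->contents, text, length);
    return result;
}
```

A call such as line_create("hello") yields an object whose length is 5 and whose contents holds the five characters without a terminating '\0'. This version compiles without a diagnostic under -std=c99 -pedantic, which the [0] form does not.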

A real-world example is zero-length arrays of struct kdbus_item in kdbus.h (a Linux kernel module).

Chloroprene answered 30/4, 2015 at 0:36 Comment(5)
IMHO, there was no good reason for the Standard to have forbidden zero-length arrays; it could have zero-sized objects just fine as members of a structure and regarded them as void* for purposes of arithmetic (so adding or subtracting pointers to zero-size objects would be forbidden). While Flexible Array Members are mostly better than zero-sized arrays, the latter can also act as a sort of "union" to alias things without adding an extra level of "syntactic" indirection to what follows (e.g. given struct foo {unsigned char as_bytes[0]; int x,y; float z;} one can access members x..z...Unsegregated
...directly without having to say e.g. myStruct.asFoo.x, etc. Further, IIRC, C squawks at any effort to include a flexible array member within a struct, thus making it impossible to have a structure which includes multiple other flexible-array members of known-length content.Unsegregated
@Unsegregated a good reason is to maintain the integrity of the rule about accessing outside array bounds. As the last member of a struct, the C99 flexible array member achieves exactly the same effect as GCC zero-sized array, but without needing to add special cases to other rules. IMHO it's an improvement that sizeof x->contents is an error in ISO C as opposed to returning 0 in gcc. Zero-sized arrays that are not struct members introduce a bunch of other problems.Chromium
@M.M: What problems would they cause if subtracting two equal pointers to a zero-sized object were defined as yielding zero (as would subtracting equal pointers to any size of object), and subtracting unequal pointers to zero-sized objects was defined as yielding Unspecified Value? If the Standard had specified that an implementation may allow a struct containing an FAM to be embedded within another struct provided that the next element in the latter struct is either an array with the same element type as the FAM or a struct starting with such an array, and provided that...Unsegregated
...it recognizes the FAM as aliasing the array (if alignment rules would cause the arrays to land at different offsets, a diagnostic would be required), that would have been very useful. As it is, there's no good way to have a method which accept pointers to structures of the general format struct {int n; THING dat[];} and can work with things of static or automatic duration.Unsegregated
E
9

I'll add that there is a whole page of the online GCC documentation on this topic.

Some quotes:

Zero-length arrays are allowed in GNU C.

In ISO C90, you would have to give contents a length of 1

and

GCC versions before 3.0 allowed zero-length arrays to be statically initialized, as if they were flexible arrays. In addition to those cases that were useful, it also allowed initializations in situations that would corrupt later data

so you could

int arr[0] = { 1 };

and boom :-)

Eleph answered 15/3, 2012 at 15:17 Comment(9)
Can i do like int a[0] , then a[0] = 1 a[1] = 2 ??Gauntlet
@SurajJain If you want to overwrite your stack :-) C doesn't check the index vs the size of the array you are writing, so you can a[100000] = 5 but if you are lucky you'll simply crash your app, if you are lucky :-)Eleph
Int a[0] ; means a variable array (zero sized array) , How Can I Now assign itGauntlet
@SurajJain What part of "C doesn't check the index vs the size of the array you are writing" isn't clear? There is no index checking in C, you can write after the end of the array and crash the computer or overwrite precious bits of your memory. So if you have an array of 0 elements, you can write after the end of the 0 elements.Eleph
See This quora.com/…Gauntlet
@SurajJain Yep, I do know perfectly well the use cases of a zero-length array. They are used in the Windows API in the same way. If you want to know how they work: if you have a struct Test { int a; int b[0]; } in a 32 bit environment, simply put &(b[0]) == &a + 4. Now, unless you allocated some extra memory for the struct Test, using b is illegal, but if you, for example, malloc(sizeof(struct Test) + sizeof(int) * 16) then you can use up to 16 elements in the b array, like ideone.com/3LORwDEleph
So , why did you do "&(b[0]) == &a + 4" also how to initialise this array outside structure if they are not used in structure ??Gauntlet
@SurajJain I haven't written a program, I'm explaining "on paper" how the address of &b[0] would be calculated by the compiler. You can't use in a meaningful way a zero-length array outside of a struct.Eleph
Oh, OK, thanks a lot. Could you help a little more? I asked a question many months ago and it was hugely downvoted. I then properly indented and corrected it, and now it is getting upvotes but has no views. If you could write an answer to that question, and also tell me if something is missing from it, that would be a great help: #34826536Gauntlet
U
6

Zero-size array declarations within structs would be useful if they were allowed, and if the semantics were such that (1) they would force alignment but otherwise not allocate any space, and (2) indexing the array would be considered defined behavior in the case where the resulting pointer would be within the same block of memory as the struct. Such behavior was never permitted by any C standard, but some older compilers allowed it before it became standard for compilers to allow incomplete array declarations with empty brackets.

The struct hack, as commonly implemented using an array of size 1, is dodgy and I don't think there's any requirement that compilers refrain from breaking it. For example, I would expect that if a compiler sees int a[1], it would be within its rights to regard a[i] as a[0]. If someone tries to work around the alignment issues of the struct hack via something like

typedef struct {
  uint32_t size;
  uint8_t data[4];  // Use four, to avoid having padding throw off the size of the struct
} SizedData;  // type name is illustrative

a compiler might get clever and assume the array size really is four:

; As written
  foo = myStruct->data[i];
; As interpreted (assuming little-endian hardware)
  foo = ((*(uint32_t*)myStruct->data) >> (i << 3)) & 0xFF;

Such an optimization might be reasonable, especially if myStruct->data could be loaded into a register in the same operation as myStruct->size. I know nothing in the standard that would forbid such optimization, though of course it would break any code which might expect to access stuff beyond the fourth element.
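
The two forms can be compared side by side in plain C. This is a sketch under assumptions: struct packet and the three function names are hypothetical, and memcpy stands in for the single 32-bit load a compiler might emit. The two readers agree only for i < 4 on a little-endian host.

```c
#include <stdint.h>
#include <string.h>

struct packet {
    uint32_t size;
    uint8_t data[4];
};

/* Returns 1 on little-endian hosts, 0 otherwise. */
int host_is_little_endian(void) {
    const uint16_t probe = 1;
    uint8_t first_byte;
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* The access as written in the source. */
uint8_t read_indexed(const struct packet *p, unsigned i) {
    return p->data[i];
}

/* The access as a compiler might rewrite it: one 32-bit load, then shift
 * and mask. Only meaningful for i < 4 on a little-endian host. */
uint8_t read_shifted(const struct packet *p, unsigned i) {
    uint32_t word;
    memcpy(&word, p->data, sizeof word);
    return (uint8_t)((word >> (i << 3)) & 0xFFu);
}
```

For any i of 4 or more, read_shifted no longer corresponds to memory past the array (the shift count itself becomes invalid), which is exactly how struct-hack code that indexes beyond the declared size would break.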

Unsegregated answered 15/3, 2012 at 18:2 Comment(5)
The flexible array member was added to C99 as a legitimate version of the struct hackChromium
The Standard does say that accesses to different array members do not conflict, which would tend to make that optimization impossible.Slotnick
@BenVoigt: The C language standard doesn't specify the effect of writing a byte and reading the containing word simultaneously, but 99.9% of processors do specify that the write will succeed and the word will contain either the new or old version of the byte along with the unaltered contents of the other bytes. If a compiler targets such processors, what would be the conflict?Unsegregated
@supercat: The C language standard guarantees that simultaneous writes to two different array elements won't conflict. So your argument that (read while write) works ok, is not sufficient.Slotnick
@BenVoigt: If a piece of code were to e.g. write to array elements 0, 1, and 2 in some sequence, it would not be allowed to read all four elements into a long, modify three, and write back all four, but I think it would be allowed to read all four into a long, modify three, write back the lower 16 bits as a short, and bits 16-23 as a byte. Would you disagree with that? And code which only needed to read elements of the array would be allowed to simply read them into a long and use that.Unsegregated
B
2

The standard definitely does not allow zero-sized arrays, but in practice most popular compilers let you declare them. So I will try to explain why that can be bad:

#include <cstdio>

int main() {
    struct A {
        A() {
            printf("A()\n");
        }
        ~A() {
            printf("~A()\n");
        }
        int empty[0];
    };
    A vals[3];
}

As a human, I would expect this output:

A()
A()
A()
~A()
~A()
~A()

Clang prints this:

A()
~A()

GCC prints this:

A()
A()
A()

This is totally strange, so it is a good reason not to use empty arrays in C++ if you can avoid them.

There is also an extension in GNU C that lets you create a zero-length array in C, but as I understand it, the structure should have at least one other member before it, or you will get very strange results like the ones above if you use C++.

Burge answered 23/12, 2020 at 9:44 Comment(0)
