Are flexible array members valid in C++?
P

10

64

In C99, you can declare a flexible array member of a struct as such:

struct blah
{
    int foo[];
};

However, when someone here at work tried to compile some code using clang in C++, that syntax did not work. (It had been working with MSVC.) We had to convert it to:

struct blah
{
    int foo[0];
};

Looking through the C++ standard, I found no reference to flexible member arrays at all; I always thought [0] was an invalid declaration, but apparently for a flexible member array it is valid. Are flexible member arrays actually valid in C++? If so, is the correct declaration [] or [0]?

Percuss answered 10/12, 2010 at 19:55 Comment(8)
Can't you just use a std::vector<int> member and worry about more interesting stuff? Or is this a layout issue?Redfield
That flexible-array-member tag seems a bit... lonely. But maybe it's just me.Betti
@FredOverflow: there is sometimes a need to have structures that can be used in both C and C++ (system APIs being one very common example).Hoodlum
@FredOverflow, normally I would, but in this case, it's necessary to have a contiguous allocation for blah with a variable sized foo. It's certainly a good design question as to why we need it in the first place, which I can't get in to here.Percuss
BTW: An array of size 0 is illegal in both C and C++.Euchology
@Redfield - If you want to represent a growing shared memory area and all you know of its contents at design-time is that it's a blob of bytes, vector is a poor choice. And that's just one example.Enshrine
It is always nice and efficient to avoid another level of indirection. Much like with bitfields, in this case I also prefer to calculate the layouts manually and be done with it. Yes, there is no compiler guaranteed type safety, but as long as you know what you are doing it is 100% fine and safe, and gives you functionality that neither the standard nor the compiler otherwise provide.Bohlin
Actually, that specific construction is illegal according to the C99 standard, as it states: "as a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member." So blah needs an extra member, one before foo, to be valid in C99.Nessie
R
37

C++ was first standardized in 1998, so it predates the addition of flexible array members to C (which was new in C99). There was a corrigendum to C++ in 2003, but that didn't add any relevant new features. The next revision of C++ (C++2b) is still under development, and it seems flexible array members still haven't been added to it.

Rives answered 10/12, 2010 at 20:4 Comment(6)
Can you please update this answer? It seems that C++11 did not add flexible array members (§9.2/9), and it's looking like C++14 will be the same.Walkon
And C++17 also doesn’t have them, but I think they’re still being looked into, so maybe C++2a?Threesquare
Any way to make GCC g++ give a warning? -std=c++xx -Wall -Wextra is not enough. Scary :-(Jaffna
@CiroSantilli新疆改造中心六四事件法轮功 If I remember correctly -Wpedantic or -pedantic will trigger the warning.Thompson
Still nothing in C++20 although a paper did exist: thephd.github.io/vendor/future_cxx/papers/d1039.htmlEiland
Still getting a warning in GNU++23. @CiroSantilliOurBigBook.com I used -Wall -Wextra -pedantic -pg -g3 -Os -flto -save-temps -std=gnu++23 -fdiagnostics-color=alwaysLapwing
H
36

C++ doesn't support C99 flexible array members at the end of structures, either using an empty index notation or a 0 index notation (barring vendor-specific extensions):

struct blah
{
    int count;
    int foo[];  // not valid C++
};

struct blah
{
    int count;
    int foo[0]; // also not valid C++
};

As far as I know, C++0x will not add this, either.

However, if you size the array to 1 element:

struct blah
{
    int count;
    int foo[1];
};

the code will compile, and work quite well, but it is technically undefined behavior. You can allocate the appropriate memory with an expression that is unlikely to have off-by-one errors:

struct blah* p = (struct blah*) malloc( offsetof(struct blah, foo[desired_number_of_elements]));
if (p) {
    p->count = desired_number_of_elements;

    // initialize your p->foo[] array however appropriate - it has `count`
    // elements (indexable from 0 to count-1)
}

So it's portable between C90, C99 and C++ and works just as well as C99's flexible array members.
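As a hedged sketch of the size computation used above: the helper below (the name blah_alloc_size is hypothetical, not something from this answer) takes the offset of the member itself and adds an explicit element count, and never returns less than sizeof(blah). It only computes a byte count; it does not make indexing past foo[0] any better defined.

```cpp
#include <cstddef>   // offsetof, std::size_t

struct blah {
    int count;
    int foo[1];   // placeholder element; the real storage is over-allocated
};

// Hypothetical helper: bytes to request for a blah followed by n ints.
constexpr std::size_t blah_alloc_size(std::size_t n) {
    // Never request less than the full struct, even when n is 0.
    return (offsetof(blah, foo) + n * sizeof(int)) < sizeof(blah)
               ? sizeof(blah)
               : offsetof(blah, foo) + n * sizeof(int);
}
```

The result would be passed to malloc in place of the offsetof(struct blah, foo[n]) expression.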

Raymond Chen did a nice writeup about this: Why do some structures end with an array of size 1?

Note: In Raymond Chen's article, there's a typo/bug in an example initializing the 'flexible' array. It should read:

for (DWORD Index = 0; Index < NumberOfGroups; Index++) { // note: used '<' , not '='
  TokenGroups->Groups[Index] = ...;
}
Hoodlum answered 10/12, 2010 at 20:31 Comment(10)
However, even if you allocate excess memory you still can't validly access members outside of the array bounds of one element. The behaviour is undefined; a C++ implementation would be within its rights to add bounds checking according to the actual type of the object constructed.Ancohuma
@Charles - I don't think you're right about that (even pedantically), otherwise the following would be undefined behavior: int* p = malloc(sizeof(int)*4); p[3] = 0;.Hoodlum
@Michael: I think that the reason Charles said one element is not because he thinks it's impossible to allocate an array, but rather because 1 is the length of the array in your particular struct blah. The claim is that since p->foo is of type int[1], then p->foo[1] is UB. However, although p->foo[1] is outside the object foo, it isn't outside the array of char that was allocated with malloc, so it is inside an object. With suitable casts via char* read access at least would be fine. I can't remember how the standards legalese falls out, though.Neonatal
Also, I can't remember whether it's legal for the structure blah to contain some padding after foo, that the implementation uses e.g. to detect buffer overruns using a magic number that should still be there later (and which perhaps is a trap representation for int on some exotic architecture). The implementation can't do that in an array, but possibly can in a class containing an array. Anyway, I think you need an implementation-specific guarantee to pull the trick.Neonatal
There's no need for suitable casts to char*. malloc() "allocates space for an object whose size is specified by size". The index operator just performs pointer arithmetic based on (p->foo)'s type and offset. If the object being pointed to is large enough for the resulting pointer arithmetic (and malloc() took care of that job) then I think there is no UB.Hoodlum
Also, just to be clear, I don't claim this 'struct-hack' technique to be valid for non-POD types.Hoodlum
Well, I guess I'll have to eat my words. None other than WG14 has stated that this is UB (Defect Report 51: open-std.org/Jtc1/sc22/wg14/www/docs/dr_051.html). However, I contend that 1) the 'safer idiom' suggested by WG14 in DR51 is flat-out ridiculous, 2) the UB behaves as expected on all platforms that are important to me, and 3) alternatives (that also aren't UB) are less convenient and/or more error-prone to use (and therefore more likely to cause observable bugs) - so I'll likely continue to use it. But now at least I'll know I'm breaking a rule...Hoodlum
Yes, I wasn't claiming that it isn't a useful technique, just that it's not strictly conforming. Unfortunately the safer idiom is also UB for a different reason. You can only perform pointer arithmetic on an object that actually exists and a POD object only starts to exist once memory of sufficient alignment and size is allocated. If something is declared with a very large array and you allocate not enough space for it, it can't start to exist.Ancohuma
Also, (again only talking language-lawyer curiosities) I don't believe that your offsetof invocation is strictly conforming either, because foo[desired_number_of_elements] doesn't designate a member of a hypothetical static blah object. I think you would have to do offsetof(blah, foo) + desired_number_of_elements * sizeof(int).Ancohuma
Anyone else notice that the DR is discussing a nonexistent ->> operator?Bernabernadene
H
5

If you can restrict your application to only require a few known sizes, then you can effectively achieve a flexible array with a template.

template <typename BASE, typename T, unsigned SZ>
struct Flex : public BASE {
    T flex_[SZ];
};
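To illustrate how this might be used, here is a minimal sketch. Header, Small, Large and element_count are hypothetical names introduced only for the example. Each SZ yields a distinct type whose size is fixed at compile time, so code that has to handle several sizes can only rely on the shared BASE part.

```cpp
// Hypothetical fixed-size header shared by every instantiation.
struct Header {
    int count;
};

template <typename BASE, typename T, unsigned SZ>
struct Flex : public BASE {
    T flex_[SZ];
};

// Two of the "few known sizes"; these are unrelated types to the compiler.
using Small = Flex<Header, int, 4>;
using Large = Flex<Header, int, 16>;

static_assert(sizeof(Large) > sizeof(Small), "a larger SZ gives a larger type");

// Size-independent code can only see the common base.
inline int element_count(const Header& h) {
    return h.count;
}
```

A caller would write something like Small s{}; s.count = 4; s.flex_[3] = 9; and pass s to element_count.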
Hemialgia answered 5/11, 2018 at 20:20 Comment(3)
I don't understand your comment about "restrict your application to only require a few known sizes", I am not clear why that would be a consideration when using this approach. Can you explain any further?Knawel
@MarkCh: A template has to be instantiated at compile time, so the size of flex_ will be fixed at compile time. But, you can use as many sizes as you want, each size will denote a different type.Hemialgia
Fixed and flexible sound quite contradictory to me.Bohlin
A
4

The second one will not contain elements but rather will point right after blah. So if you have a structure like this:

struct something
{
  int a, b;
  int c[0];
};

you can do things like this:

struct something *val = (struct something *)malloc(sizeof(struct something) + 5 * sizeof(int));
val->a = 1;
val->b = 2;
val->c[0] = 3;

In this case c will behave as an array with 5 ints but the data in the array will be after the something structure.

The product I'm working on uses this as a sized string:

struct String
{
  unsigned int allocated;
  unsigned int size;
  char data[0];
};

Because of the supported architectures this will consume 8 bytes plus allocated.

Of course all this is C but g++ for example accepts it without a hitch.

Alexander answered 10/12, 2010 at 20:1 Comment(6)
That's pretty interesting. I imagine that you wouldn't be able to ever pass this struct to a function as value, right? as that would probably just pass the sizeof(String) which wouldn't take into account the size you allocated for data. But it should work as long as you only pass it as reference or pointer, is that right?Ponceau
Well, you can pass it but it will only pass the data in the string. Also, depending on the compiler it will generate some warnings.Alexander
This is not true. T[0] is neither a valid type specifier in C nor in C++. You have to use T[].Zygosis
@Johannes it is valid in C99, see open-std.org/jtc1/sc22/wg14/www/newinc9x.htmAlexander
Of course all this is C but g++ for example accepts it without a hitch. That's nice for g++. Per the Standard, it's still UB and so should be killed with fire. @Alexander That link only mentions the existence of flexible array members in C99 (but never C++); it does not support your contention that [0] is a valid syntax for declaring them, which it is not.Mudstone
Zero-length arrays were never valid in any version of C. You can't have zero-length VLAs. It is a gcc non-standard extension. This code will not compile in standard C nor standard C++.Relique
G
4

If you only want

struct blah { int foo[]; };

then you don't need the struct at all and you can simply deal with a malloc'ed/new'ed int array.

If you have some members at the beginning:

struct blah { char a,b; /*int foo[]; //not valid in C++*/ };

then in C++, I suppose you could replace foo with a foo member function:

struct blah { alignas(int) char a; char b; 
    int *foo(void) { return reinterpret_cast<int*>(&this[1]); } };

Example use:

#include <stdlib.h>
struct blah { 
    alignas(int) char a;
    char b;
    ////////
    int *foo(void) { return reinterpret_cast<int*>(&this[1]); }
};
int main()
{
    blah *b = (blah*)malloc(sizeof(blah)+10*sizeof(int));
    if(!b) return 1;
    b->foo()[1]=1;
}

There's no strict aliasing issues with this type of casting of the memory past the initial struct here because the memory is dynamic (has no declared type).
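The layout assumptions behind this approach can be checked at compile time. The following is a sketch under assumptions rather than part of the answer: make_blah is a hypothetical helper, and whether pointer arithmetic over the trailing storage is strictly conforming in every C++ revision is not something this sketch settles.

```cpp
#include <cstddef>   // std::size_t
#include <cstdlib>   // std::malloc, std::free
#include <new>       // placement new

struct blah {
    alignas(int) char a;
    char b;
    // Accessor standing in for the flexible array member.
    int* foo() { return reinterpret_cast<int*>(this + 1); }
};

// The trailing ints start at (this + 1), so that address must suit int.
static_assert(alignof(blah) >= alignof(int), "header must be at least int-aligned");
static_assert(sizeof(blah) % alignof(int) == 0, "trailing array must start aligned");

// Hypothetical helper: one malloc'ed block holding a blah followed by n ints.
inline blah* make_blah(std::size_t n) {
    void* raw = std::malloc(sizeof(blah) + n * sizeof(int));
    if (raw == nullptr) {
        return nullptr;
    }
    blah* p = new (raw) blah{};        // construct the header in place
    for (std::size_t i = 0; i < n; ++i) {
        new (p->foo() + i) int(0);     // create each trailing int explicitly
    }
    return p;                          // release with std::free(p)
}
```

Because both blah and int are trivially destructible, the block can simply be handed back to std::free.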

Gant answered 30/11, 2017 at 13:22 Comment(9)
It seems to me this method might have some holes as currently written. What would alignas(int) do if the struct were to have a double value, for example? (double being a type that commonly has stricter alignment requirements than int)Woodsum
@AndrewHenle Substitute double for int then (both in alignas and in the cast/return-value). In general, I don't think there's anything in C or C++ that prevents you from having an array just after a malloced/aligned_alloced struct just as long as (1) there's space for it and (2) the struct has equal or larger alignment so that the "just after" (&this[1]) is sufficiently aligned.Gant
@AndrewHenle C's flexible array members basically just (1) automatically give you an accessor (the name) instead of you having to access it via (target*)(&this[1]) or a member method that does the same (2) allow the flexible array to possibly start already in the initial struct's end padding if the alignment requirement for the array is smaller than for the initial struct.Gant
@AndrewHenle I've heard some (IMO competent) people argue that flexible array members aren't even needed and that there should be no issues indexing past a final 1-sized array provided there's space. I wouldn't bet on it and in the absence of FAMs I'd probably just do what I've described here without being stingy about every last padding byte.Gant
I was merely concerned that using alignas() for a less-strictly-aligned type could force a more-strictly-aligned type of struct element to an invalid alignment. Substituting the alignas() to use the most-restrictive element type isn't really a good solution as a complex struct of other structs is subject to having one of its sub-structs change by adding a more restrictively aligned type.Woodsum
I've heard some (IMO competent) people argue that flexible array members aren't even needed and that there should be no issues indexing past a final 1-sized array provided there's space. Yeah, no. It's UB and it can bite youWoodsum
@AndrewHenle Yeah, that's why I said I wouldn't bet on it. :) As for alignas inappropriately reducing alignment that can never happen (def. with _Alignas, I'm presuming C++'s behaves the same). It's a constraint violation (=compiler error) to attempt to do so.Gant
With that, I'd say your method here is a significant improvement on the final 1-sized array "struct hack". I don't see any way UB is invoked, based on your description of alignas(), which was my original concern.Woodsum
@AndrewHenle Thanks. To be generic, you'd still want to avoid the compiler error for when the alignas to the flex-member type would wanna reduce the alignment requirement for the first struct member. So something like alignas(alignof(flexttype)>alignof(firstmembtype) ? alignof(flexttype) : alignof(firstmembtype) ) firstmemtype firstmemb; Then it could be packaged into a macro (or perhaps a C++ template). :)Gant
E
4

A proposal is underway, and might make it into some future C++ version. See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1039r0.html for details (the proposal is fairly new, so it's subject to change).

Eadith answered 23/8, 2019 at 13:17 Comment(4)
yes! and the "Design" section nicely touches on why C++ up to now doesn't officially allow for a flexible array member: the code would silently break if the FAM struct was not physically located at the end of the surrounding class (must be a last member, must not be a base class, ...) open-std.org/jtc1/sc22/wg21/docs/papers/2018/…Monodrama
@SusanneOberhauser I guess the reason for it not being available in C++ originally is that it has inheritance, which does need to make assumption about the layout of the base class to work. But in C++11 there are final classes that can't be inherited, so a good solution would be to allow them only in final classes?Eadith
In C, a struct with a flexible array member must not be embedded in another struct, not even at the end. So being a final class addresses one of the challenges but not the other (the struct being a member).Monodrama
@SusanneOberhauser I think it would be fine to have the same restrictions that are specified in C, i.e. not embed in other structs and only being last member. The most disruptive changes would be in allocating such objects (and how we keep track of their size during lifetime), both in dynamic and static allocation contexts. This would pose challenges to implementors that are not relevant in C where allocation (and tracking size) is manual.Eadith
W
3

Flexible arrays are not part of the C++ standard yet. That is why int foo[] or int foo[0] may not compile. While there is a proposal being discussed, it has not been accepted into the newest revision of C++ (C++2b) yet.

However, almost all modern compilers do support it via compiler extensions.

The catch is that if you use this extension with the highest warning level (-Wall --pedantic), it may result in a warning.

A workaround for this is to use an array with one element and access it out of bounds. While this solution is UB by the spec (dcl.array and expr.add), most compilers will produce valid code, and even clang -fsanitize=undefined is happy with it:

#include <new>
#include <type_traits>

struct A {
    int a[1];
};

int main()
{
    using storage_type = std::aligned_storage_t<1024, alignof(A)>;
    static storage_type memory;
    
    A *ptr_a = new (&memory) A;

    ptr_a->a[2] = 42;
    
    return ptr_a->a[2];
}

demo


All that said, if you want your code to be standard compliant and not depend on any compiler extension, you will have to avoid using this feature.
Wrist answered 10/12, 2010 at 19:55 Comment(0)
B
1

I faced the same problem of declaring a flexible array member which can be used from C++ code. By looking through glibc headers I found that there are some usages of flexible array members, e.g. in struct inotify_event, which is declared as follows (comments and some unrelated members omitted):

struct inotify_event
{
  //Some members
  char name __flexarr;
};

The __flexarr macro, in turn is defined as

/* Support for flexible arrays.
   Headers that should use flexible arrays only if they're "real"
   (e.g. only if they won't affect sizeof()) should test
   #if __glibc_c99_flexarr_available.  */
#if defined __STDC_VERSION__ && __STDC_VERSION__ >= 199901L
# define __flexarr  []
# define __glibc_c99_flexarr_available 1
#elif __GNUC_PREREQ (2,97)
/* GCC 2.97 supports C99 flexible array members as an extension,
   even when in C89 mode or compiling C++ (any version).  */
# define __flexarr  []
# define __glibc_c99_flexarr_available 1
#elif defined __GNUC__
/* Pre-2.97 GCC did not support C99 flexible arrays but did have
   an equivalent extension with slightly different notation.  */
# define __flexarr  [0]
# define __glibc_c99_flexarr_available 1
#else
/* Some other non-C99 compiler.  Approximate with [1].  */
# define __flexarr  [1]
# define __glibc_c99_flexarr_available 0
#endif

I'm not familiar with the MSVC compiler, but you'd probably have to add one more conditional macro depending on the MSVC version.

Bazil answered 13/10, 2019 at 6:24 Comment(0)
O
0

Flexible array members are not supported in standard C++; however, the clang documentation says:

"In addition to the language extensions listed here, Clang aims to support a broad range of GCC extensions."

The gcc documentation for C++ says:

"The GNU compiler provides these extensions to the C++ language (and you can also use most of the C language extensions in your C++ programs)."

And the gcc documentation for C documents support for arrays of zero length.

https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html

Objectify answered 13/3, 2021 at 1:4 Comment(0)
E
-3

The better solution is to declare it as a pointer:

struct blah
{
    int* foo;
};

Or better yet, to declare it as a std::vector:

struct blah
{
    std::vector<int> foo;
};
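As a rough sketch of what serializing the std::vector version could involve: the elements live in a separate allocation, not inside the struct, so they have to be copied out explicitly. The serialize function and its wire format (a std::size_t count followed by the raw ints, in host byte order) are assumptions made up for this example.

```cpp
#include <cstddef>   // std::size_t
#include <cstring>   // std::memcpy
#include <vector>

struct blah {
    std::vector<int> foo;
};

// Hypothetical helper: flatten a blah into one contiguous byte buffer.
inline std::vector<unsigned char> serialize(const blah& b) {
    const std::size_t count = b.foo.size();
    std::vector<unsigned char> out(sizeof(count) + count * sizeof(int));

    // Element count first, then the elements themselves.
    std::memcpy(out.data(), &count, sizeof(count));
    if (count != 0) {
        std::memcpy(out.data() + sizeof(count), b.foo.data(), count * sizeof(int));
    }
    return out;
}
```

Unlike a struct with a trailing array, the blah object itself cannot be written out with a single copy of its own bytes, since it only holds the vector's bookkeeping.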
Ehtelehud answered 10/12, 2010 at 20:4 Comment(6)
Neither of these are easily serializable which is the whole point of flexible array members.Floorage
no, int[0] does not create a pointer. See answer by terminus.Comanche
@doron: The vector solution is serializeable as vectors are guaranteed to be contiguous. Even the pointer version is fairly easy to serialize.Ehtelehud
@kriss: Edited - I submitted before I meant to. I was trying to say it allows you to create a data member that behaves like a pointer, but removed it as it could be confusing. In C++, there is really no need to even bother with this syntax "hack" that was used in C. Sorry for the confusion.Ehtelehud
@Zac Howland: No problem. I would call that an address, and its interest is that it also allows you to define a memory-aligned zero-length member in a structure. Even with C++ there are cases when it's useful, when dealing with hardware-aware low-level programs. And I do agree with doron about serialization.Comanche
@ZacHowland: Elements in the vector are guaranteed to be contiguous with each other, but are guaranteed not to be contiguous with other members of struct blah. The flexible-array elements on the other hand, are contiguous with other members of struct blah.Bernabernadene
