Array of zero length

Asked 17/11, 2008 at 6:54 Answered 9/2, 2018 at 20:20

Solved c++ arrays visual-c++ flexible-array-member

I am refactoring some old code and have found a few structs containing zero-length arrays (below). The warnings are suppressed by a pragma, of course, but I have failed to create, with "new", structures containing such structures (error C2233). The array 'byData' is used as a pointer, so why not use a pointer instead? Or an array of length 1? And of course, no comments were added to make me enjoy the process... Is there any reason to use such a thing? Any advice on refactoring those?

struct someData
{
   int nData;
   BYTE byData[0];
};

NB: It's C++, Windows XP, VS 2003.

Alexandrina answered 17/11, 2008 at 6:54 Comment(1)

This is the "struct hack", described in question 2.6 of the comp.lang.c FAQ. Dennis Ritchie called it "unwarranted chumminess with the C implementation". C99 introduced a new language feature, the "flexible array member", to replace the struct hack. Even Microsoft's compiler, which is noted for its lack of C99 support, supports flexible array members. – Electroscope 11/9, 2012 at 18:37

Yes, this is a C hack.
To create an array of any length:

struct someData* mallocSomeData(int size)
{
    /* One allocation holds the fixed header plus `size` trailing bytes. */
    struct someData* result = (struct someData*)malloc(sizeof(struct someData) + size * sizeof(BYTE));
    if (result)
    {
        result->nData = size;
    }
    return result;
}

Now you have an object of someData with an array of a specified length.

Ishmaelite answered 17/11, 2008 at 16:53 Comment(4)

Shouldn't this at least use new[], this being about C++? – Desireedesiri 24/2, 2014 at 14:21

@unwind: Can't use new for this. The whole point is that this is a C-Hack and not required in C++ (because we have better ways of doing it). Also I am pretty sure that zero length arrays are illegal in C++ (well at least C++03, not sure if that was updated in C++11). – Ishmaelite 28/12, 2014 at 17:17

The popular jargon for this is "Struct Hack". – Tilda 16/1, 2015 at 4:27

Except, your calculation is off (in the general case). Depending on the type of objects in the array, a compiler needs to impose certain alignment rules, and summing up the sizes of the members may not produce the correct size. Instead, use the offsetof macro to have the compiler calculate the correct result. (Note: Not an issue for BYTEs, assuming those are defined to be some char variant.) – Thaumaturgy 14/5, 2016 at 23:24

There are, unfortunately, several reasons why you would declare a zero length array at the end of a structure. It essentially gives you the ability to have a variable length structure returned from an API.

Raymond Chen did an excellent blog post on the subject. I suggest you take a look at this post because it likely contains the answer you want.

Note in his post, it deals with arrays of size 1 instead of 0. ~~This is the case because zero length arrays are a more recent entry into the standards.~~ His post should still apply to your problem.

http://blogs.msdn.com/oldnewthing/archive/2004/08/26/220873.aspx

EDIT

Note: Even though Raymond's post says 0-length arrays are legal in C99, they are in fact still not legal in C99. Instead of a 0-length array, you should be using a length-1 array here.

Harpole answered 17/11, 2008 at 6:59 Comment(6)

"This is the case because zero length arrays are a more recent entry into the standards." Which standards? C++11 still disallows 0-length arrays (§8.3.4/1), as well as C99 (§6.7.5.2/1). – Aludel 13/4, 2012 at 16:42

@Aludel I was essentially parroting what Raymond said at the end of his blog post. I wasn't aware that 0-length arrays were still illegal in C99 until a recent comment discussion with you on another question. I'll update the answer. – Harpole 13/4, 2012 at 16:47

Sorry to nitpick such an old answer. :-P I only ask because another question linked here as "proof" that 0-length arrays were legal C++. :-] – Aludel 13/4, 2012 at 16:48

@Aludel NP on nitpicking old answers. Definitely don't want to be spouting bad data :) – Harpole 13/4, 2012 at 16:49

Can you find ANY reference on the impact of using 0-length or [] arrays on memory alignment? In a project, a colleague found that the safest choice would be int arr[] (since int protects from any alignment issues), but since we are returning the array, the very best in our case is void *arr[], which is pretty cryptic. – Seow 14/7, 2012 at 22:24

Here's the wayback archive of the blog in case anyone needs it (the original page is dead). – Magdalenemagdalenian 20/7, 2020 at 9:20

This is an old C hack to allow flexible-sized arrays.

In the C99 standard this is not necessary, as it supports the arr[] syntax.
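For illustration, here is a minimal sketch of the C99 arr[] form (a flexible array member). It assumes BYTE is unsigned char; the names some_data and some_data_create are made up for this example and are not from the question's code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* C99 flexible array member: the array has no declared length and
   must be the last member of the struct. */
struct some_data {
    int nData;
    unsigned char byData[];   /* assumption: BYTE is unsigned char */
};

/* Hypothetical helper: allocates the header plus n payload bytes as one
   block and copies src into it. Returns NULL if n is negative or the
   allocation fails. */
struct some_data *some_data_create(const unsigned char *src, int n)
{
    if (n < 0)
        return NULL;
    assert(src != NULL || n == 0);

    /* sizeof *p covers the fixed part only; the array adds n bytes. */
    struct some_data *p = malloc(sizeof *p + (size_t)n);
    if (p != NULL) {
        p->nData = n;
        if (n > 0)
            memcpy(p->byData, src, (size_t)n);
    }
    return p;
}
```

The whole object is released with a single free(p), since header and payload live in the same allocation. Note that this is a C feature; standard C++ does not have flexible array members, although some compilers accept them as an extension.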

Insouciance answered 17/11, 2008 at 6:58 Comment(2)

Sadly, Visual Studio is very poor when it comes to C99 support. :( – Blondy 17/11, 2008 at 7:4

Without addressing the general truth of your comment, ...the MS VC v9 compiler supports the arr[] syntax. – Rambert 14/12, 2009 at 17:18

Your intuition about "why not use an array of size 1" is spot on.

The code is doing the "C struct hack" wrong, because declarations of zero length arrays are a constraint violation. This means that a compiler can reject your hack right off the bat at compile time with a diagnostic message that stops the translation.

If we want to perpetrate a hack, we must sneak it past the compiler.

The right way to do the "C struct hack" (which is compatible with C dialects going back to 1989 ANSI C, and probably much earlier) is to use a perfectly valid array of size 1:

struct someData
{
   int nData;
   unsigned char byData[1];
};

Moreover, instead of sizeof struct someData, the size of the part before byData is calculated using:

offsetof(struct someData, byData);

To allocate a struct someData with space for 42 bytes in byData, we would then use:

struct someData *psd = (struct someData *) malloc(offsetof(struct someData, byData) + 42);

Note that this offsetof calculation is in fact the correct calculation even in the case of the array size being zero. You see, sizeof the whole structure can include padding. For instance, if we have something like this:

struct hack {
  unsigned long ul;
  char c;
  char foo[0]; /* assuming our compiler accepts this nonsense */
};

The size of struct hack is quite possibly padded for alignment because of the ul member. If unsigned long is four bytes wide, then quite possibly sizeof (struct hack) is 8, whereas offsetof(struct hack, foo) is almost certainly 5. The offsetof method is the way to get the accurate size of the preceding part of the struct just before the array.
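A small sketch of that calculation, using a size-1 array so that the declaration is valid standard C. The helper name hack_alloc_size is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

struct hack {
    unsigned long ul;
    char c;
    char foo[1];   /* size 1 so the declaration itself is valid C */
};

/* Hypothetical helper: bytes needed for the fixed part of struct hack
   followed by n bytes stored in foo. */
size_t hack_alloc_size(size_t n)
{
    /* The array starts before the (possibly padded) end of the struct. */
    assert(offsetof(struct hack, foo) < sizeof(struct hack));
    return offsetof(struct hack, foo) + n;
}
```

Because char has no alignment requirement beyond 1, foo sits immediately after c, so offsetof(struct hack, foo) is sizeof(unsigned long) + 1, while sizeof(struct hack) is rounded up to the alignment of unsigned long. For small n the result can be smaller than sizeof(struct hack); some code allocates the larger of the two values to stay on the conservative side.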

So that would be the way to refactor the code: make it conform to the classic, highly portable struct hack.

Why not use a pointer? Because a pointer occupies extra space and has to be initialized.

There are other good reasons not to use a pointer, namely that a pointer requires an address space in order to be meaningful. The struct hack is externalizable: that is to say, there are situations in which such a layout conforms to external storage such as areas of files, packets or shared memory, in which you do not want pointers because they are not meaningful.

Several years ago, I used the struct hack in a shared memory message passing interface between kernel and user space. I didn't want pointers there, because they would have been meaningful only to the original address space of the process generating a message. The kernel part of the software had a view to the memory using its own mapping at a different address, and so everything was based on offset calculations.
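To make the "offsets, not pointers" idea concrete, here is a rough sketch. It is not the interface described above; the struct message layout and the message_* helper names are invented for illustration. The payload is located purely by its offset inside the buffer, so the bytes stay valid after being copied or mapped at a different address.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: a length header followed inline by the payload. */
struct message {
    unsigned int  length;      /* number of payload bytes */
    unsigned char payload[1];  /* struct hack: really `length` bytes */
};

/* Writes a message into buf. Returns the bytes used, or 0 if it does
   not fit in cap bytes. */
size_t message_write(unsigned char *buf, size_t cap,
                     const unsigned char *data, unsigned int len)
{
    assert(buf != NULL && data != NULL);
    size_t total = offsetof(struct message, payload) + len;
    if (total > cap)
        return 0;
    /* memcpy avoids assuming buf is suitably aligned for the struct. */
    memcpy(buf + offsetof(struct message, length), &len, sizeof len);
    memcpy(buf + offsetof(struct message, payload), data, len);
    return total;
}

/* Reads the length field from a message stored at buf. */
unsigned int message_length(const unsigned char *buf)
{
    unsigned int len;
    memcpy(&len, buf + offsetof(struct message, length), sizeof len);
    return len;
}

/* Locates the payload by offset only; no stored pointer is involved. */
const unsigned char *message_payload(const unsigned char *buf)
{
    return buf + offsetof(struct message, payload);
}
```

A reader that receives the same bytes at another address (a second mapping, a file read, a copied packet) can call message_length and message_payload on its own buffer without any fix-ups. A stored pointer to the payload would have had to be rewritten for each address space.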

Languish answered 21/5, 2014 at 6:39 Comment(1)

"which is compatible with C dialects going back to 1989 " - accessing past the first element of the array causes undefined behaviour even in C89. The struct hack relies on the compiler "defining" this behaviour for itself. – Karakoram 3/9, 2015 at 4:8

It's worth pointing out IMO the best way to do the size calculation, which is used in the Raymond Chen article linked above.

struct foo
{
    size_t count;
    int data[1];
};

size_t foo_size_from_count(size_t count)
{
    /* Offset of the first element past the end == size of the allocation. */
    return offsetof(struct foo, data[count]);
}

The offset of the first entry off the end of desired allocation, is also the size of the desired allocation. IMO it's an extremely elegant way of doing the size calculation. It does not matter what the element type of the variable size array is. The offsetof (or FIELD_OFFSET or UFIELD_OFFSET in Windows) is always written the same way. No sizeof() expressions to accidentally mess up.
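One caveat worth hedging: using a run-time index inside offsetof is accepted by the common compilers, but as far as I can tell it goes beyond what the standard strictly promises. A sketch of an equivalent calculation that avoids it and also guards against overflow could look like this (foo_size_checked is a made-up name):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct foo {
    size_t count;
    int data[1];
};

/* Hypothetical helper: bytes needed for a struct foo holding `count`
   elements, or 0 if the result would not fit in size_t. */
size_t foo_size_checked(size_t count)
{
    const size_t header = offsetof(struct foo, data);
    /* Element size taken from the member itself, so it follows the
       declaration if the element type ever changes. sizeof does not
       evaluate its operand, so the null pointer is never dereferenced. */
    const size_t elem = sizeof(((struct foo *)0)->data[0]);
    assert(elem > 0);

    if (count > (SIZE_MAX - header) / elem)
        return 0;
    return header + count * elem;
}
```

For any count that does not overflow, this gives the same value as the offsetof(foo, data[count]) form, since array elements are laid out contiguously starting at the offset of data.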

Roussel answered 9/2, 2018 at 20:20 Comment(0)
