To answer your numbered points.
- Yes.
- All the bytes. Malloc/free doesn't know or care about the type of the object, just the size.
- It is strictly speaking undefined behaviour, but a common trick supported by many implementations. See below for other alternatives.
The latest C standard, ISO/IEC 9899:1999 (informally C99), allows flexible array members.
An example of this would be:
int main(void)
{
struct { size_t x; char a[]; } *p;
p = malloc(sizeof *p + 100);
if (p)
{
/* You can now access up to p->a[99] safely */
}
}
This now standardized feature allowed you to avoid using the common, but non-standard, implementation extension that you describe in your question. Strictly speaking, using a non-flexible array member and accessing beyond its bounds is undefined behaviour, but many implementations document and encourage it.
Furthermore, gcc allows zero-length arrays as an extension. Zero-length arrays are illegal in standard C, but gcc introduced this feature before C99 gave us flexible array members.
In a response to a comment, I will explain why the snippet below is technically undefined behaviour. Section numbers I quote refer to C99 (ISO/IEC 9899:1999)
struct {
char arr[1];
} *x;
x = malloc(sizeof *x + 1024);
x->arr[23] = 42;
Firstly, 6.5.2.1#2 shows a[i] is identical to (*((a)+(i))), so x->arr[23] is equivalent to (*((x->arr)+(23))). Now, 6.5.6#8 (on the addition of a pointer and an integer) says:
"If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined."
For this reason, because x->arr[23] is not within the array, the behaviour is undefined. You might still think that it's okay because the malloc() implies the array has now been extended, but this is not strictly the case. Informative Annex J.2 (which lists examples of undefined behaviour) provides further clarification with an example:
An array subscript is out of range, even if an object is apparently accessible with the
given subscript (as in the lvalue expression a[1][7] given the declaration int
a[4][5]) (6.5.6).
+
in the calculation, which is obviously absurd. – Rockrose