There is a data structure used in the implementation of the J language (a dialect of APL) which accommodates dynamically-allocated arbitrary-dimensional arrays. It uses a hybrid of a struct
and a dynamic array for its data structure, a trick commonly known as the struct hack. (More info on the J implementation here and here.)
To see the idea in a simple context, consider the 1D case: we want a dynamic 1D array which carries its size along with it. So:
struct vec { int n; int p[]; };
Since the p
member is last, and C has no built-in bounds checking, it can be used to access additional memory off the end of the struct
. Of course, when allocating, we'll need to provide this extra memory and not simply allocate the size of the struct
. The struct
is just the header of the array. C90 requires a number (say, 1) for the length of the p[] array, but C99 allows the number to be omitted so the size of the header is more straightforward to calculate.
So, an array with more dimensions will need more values to hold the sizes of each dimension. And for our structure to accommodate arrays of different dimensionality, this dimension vector will need to be variable-length as well.
What we can do to achieve all of this is to apply the struct hack twice, recursively upon itself. This gives us a memory layout like this, where R
is the number of dimensions which we'll call the rank of the array, the D
values are the lengths of each dimension, and the V
values are the actual array data:
1 R Product(D)
--- -------------------- -----------------------------
R D[0] D[1] ... D[R-1] V[0] V[1] ... V[Product(D)-1]
And to describe this in C,
typedef struct arr { int r; int d[]; } *arr;
The elements of an array a
immediately follow the R
elements of the dims vector D
. So the V
elements can be accessed at a->d[r+0], a->d[r+1], ... a->d[r+i]
(after reducing the index vector down to a single index on the flattened representation). The elements are easiest to treat in row-major order. The number of actual elements is the product of all the dimensions, all the dimensions multiplied together. Edit: The expression here might be better written: (a->d+a->r)[0], (a->d+a->r)[1], ... (a->d+a->r)[i]
.
In order to allocate one of these things, we'll need a function to compute this product as part of the size calculation.
int productdims(int rank, int *dims){
int z=1;
for(int i=0; i<rank; i++)
z *= dims[i];
return z;
}
And to initialize, we just need to fill-in the members.
arr makearr(int rank, int *dims){
arr z = calloc( (sizeof(struct arr)/sizeof(int)) +
rank + productdims(rank,dims), sizeof(int));
z->r = rank;
memmove(z->d,dims,rank*sizeof(int));
return z;
}
Remembering the formula for accessing 2D data (say an array of [m][n] elements) with a single index (it's a typical dynamic array like in the question). Element [i][j] is at i×n+j. With a 3D array [m][n][o], element [i][j][k] is at i×(n×o)+j×o+k.
So we can calculate a single index for our linearly-laid-out data from an array of indices and the array of dimensions.
int *elem(arr a, ...){
va_list ap;
int idx = 0;
va_start(ap,a);
if (a->r){
idx = va_arg(ap,int);
for(int i=1; i<a->r; i++){
idx *= a->d[i];
idx += va_arg(ap,int);
}
}
va_end(ap);
return &a->d[a->r + idx];
}
Don't forget the headers:
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
And voilà:
int main() {
{
int i,n=6;
arr m = makearr(1, (int[]){n});
for (i=0;i<n;i++)
*elem(m,i) = i;
for (i=0;i<n;i++,printf(" "))
printf("%d",*elem(m,i));
}
puts("\n");
{
int i,j,n=4;
arr m = makearr(2, (int[]){n,n});
for (i=0;i<n;i++)
for (j=0;j<n;j++)
*elem(m,i,j) = i*n+j;
for (i=0;i<n;i++,printf("\n"))
for (j=0;j<n;j++,printf(" "))
printf("%d",*elem(m,i,j));
}
puts("\n");
{
int i,j,k,n=3;
arr m = makearr(3, (int[]){n,n,n});
for (i=0;i<n;i++)
for (j=0;j<n;j++)
for (k=0;k<n;k++)
*elem(m,i,j,k) = (i*n+j)*n+k;
for (i=0;i<n;i++,printf("\n"))
for (j=0;j<n;j++,printf("\n"))
for (k=0;k<n;k++,printf(" "))
printf("%d",*elem(m,i,j,k));
}
return 0;
}
Output:
0 1 2 3 4 5
0 1 2 3
4 5 6 7
8 9 10 11
12 13 14 15
0 1 2
3 4 5
6 7 8
9 10 11
12 13 14
15 16 17
18 19 20
21 22 23
24 25 26
Further elaboration of this code to support 2D matrix multiplication and column slices has been posted to comp.lang.c in this thread.
Better-motivated question/more powerful data structure.
alloca
, this belongs in the history books. This allocates on the "stack", so there is no advantage at all to prefer that over variable length arrays (VLA) that are the standardized approach. – Melodiemelodionalloca
but not VLAs. (That doesn't necessarily mean I disagree with your point.) Note that both VLAs andalloca
share the flaw that they don't report allocation failures. – Patrizius