Is there any overhead for using variable-length arrays?
Asked Answered
M

4

51

Is there some overhead of using variable-length arrays? Could the size of array be passed via command line argument at run time? Why is it introduced, compared to automatic and dynamically allocating an array?

Manned answered 9/1, 2010 at 19:55 Comment(0)
H
52

VLA does have some overhead (compared to "ordinary" named compile-time-sized array).

Firstly, it has run-time length and yet the language provides you with means to obtain the actual size of the array at run-time (using sizeof). This immediately means that the actual size of the array has to be stored somewhere. This results in some insignificant per-array memory overhead. However, since VLAs can only be declared as automatic objects, this memory overhead is not something anyone would ever notice. It is just like declaring an extra local variable of integral type.

Secondly, VLA is normally allocated on stack, but because of its variable size, in general case its exact location in memory is not known at compile time. For this reason the underlying implementation usually has to implement it as a pointer to a memory block. This introduces some additional memory overhead (for the pointer), which is again completely insignificant for the reasons described above. This also introduces slight performance overhead, since we have to read the pointer value in order to find the actual array. This is the same overhead you get when accessing malloc-ed arrays (and don't get with the named compile-time-sized arrays).

Since the size of the VLA is a run-time integer value, it can, of course, be passed as a command-line argument. VLA doesn't care where its size comes from.

VLA were introduced as run-time-sized arrays with low allocation/deallocation cost. They fit between "ordinary" named compile-time-sized arrays (which have virtually zero allocation-deallocation cost, but fixed size) and malloc-ed arrays (which have run-time size, but relatively high allocation-deallocation cost).

VLA obey [almost] the same scope-dependent lifetime rules as automatic (i.e local) objects, which means that in general case they can't replace malloc-ed arrays. They applicability is limited to situations when you need a quick run-time sized array with a typical automatic lifetime.

Hilton answered 9/1, 2010 at 22:56 Comment(8)
VLAs actually obey almost the same lifetime rules as other automatic objects ("from the declaration of the [VLA] until execution of the program leaves the scope of the declaration" vs. "from entry into the block with which [the object] is associated until execution of that block ends in any way") [from 6.2.4(5) and 6.2.4(6) of the C99 standard].Raindrop
"VLA is normally allocated on stack," -- Normally? Do you mean that it might be allocated on the heap?Singapore
@Cool Guy: I mean that the language spec does not specify where they are allocated and does not even postulate the existence of "stack", for which reason I normally prefer to add various weasel-words every time I talk about something that is formally an implementation detail.Hilton
Once after allocated, is there any difference for malloc( ) allocated variable vs alloca( ) allocated variable? For example, load/write the variablesCoggins
@dragonxlwang: Once allocated, there's no difference. (Aside from such considerations as memory locality: alloca allocates memory "right here in the stack" next to other local variables, while malloc allocates memory "somewhere far away, in the heap".)Hilton
@Spikatrix, yes, dynamic VLA can be constructed with int (*a)[n] = malloc(sizeof(int[n]));Rohrer
@Spikatrix: https://mcmap.net/q/298164/-what-39-s-the-point-of-vla-anywayHilton
Are you sure about your second point (VLAs not necessarily being allocated on the stack)? I've found a simple example where a large VLA allocation silently overwrites stack memory of another local variable: softwareengineering.stackexchange.com/a/144004/366046Magdalenamagdalene
M
28

There is some run-time overhead with variable-length arrays, but you would have to be working fairly hard to measure it. Note that sizeof(vla) is not a compile-time constant if vla is a variable-length array.

The size of the array can be passed to a function at run-time. If you choose to take the size from a command line argument and convert that into an integer and pass that to the function at run-time, so be it -- it will work.

Variable-length arrays are used because the variables are automatically allocated to the correct size and automatically freed on exit from the function. This avoids over-allocating space (allocating enough space for the maximum possible size when you mostly work with minimal sizes), and avoids problems with memory clean up.

Additionally, with multi-dimensional arrays, AFAIK it behaves more like Fortran - you can dynamically configure all the dimensions, rather than being stuck with fixed sizes for all but the leading dimension of the array.


Concrete evidence of some run-time overhead for VLA - at least with GCC 4.4.2 on SPARC (Solaris 10).

Consider the two files below:

vla.c - using a variable-length array

#include <assert.h>
#include <stddef.h>
extern size_t identity_matrix(int n, int m);

size_t identity_matrix(int n, int m)
{
    int vla[n][m];
    int i, j;
    assert(n > 0 && n <= 32);
    assert(m > 0 && m <= 32);
    for (i = 0; i < n; i++)
    {
        for (j = 0; j < m; j++)
        {
            vla[i][j] = 0;
        }
        vla[i][i] = 1;
    }
    return(sizeof(vla));
}

fla.c - using a fixed-length array

#include <assert.h>
#include <stddef.h>
extern size_t identity_matrix(int n, int m);

size_t identity_matrix(int n, int m)
{
    int fla[32][32];
    int i, j;
    assert(n > 0 && n <= 32);
    assert(m > 0 && m <= 32);
    for (i = 0; i < n; i++)
    {
        for (j = 0; j < m; j++)
        {
            fla[i][j] = 0;
        }
        fla[i][i] = 1;
    }
    return(sizeof(fla));
}

Compilation and object file sizes

For comparison purposes, the names of the local array are different (vla vs fla), and the dimensions on the array are different when it is declared - otherwise, the files are the same.

I compiled using:

$ gcc -O2 -c -std=c99 fla.c vla.c

The object file sizes are somewhat different - as measured both by 'ls' and by 'size':

$ ls -l fla.o vla.o
-rw-r--r--   1 jleffler rd          1036 Jan  9 12:13 fla.o
-rw-r--r--   1 jleffler rd          1176 Jan  9 12:13 vla.o
$ size fla.o vla.o
fla.o: 530 + 0 + 0 = 530
vla.o: 670 + 0 + 0 = 670

I've not done extensive testing to see how much of the overhead is fixed and how much is variable, but there is overhead in using a VLA.

Metamorphose answered 9/1, 2010 at 20:2 Comment(1)
The line "vla[i][i] = 1;" needs an extra assert(n == m). Better would be to put "vla[i][j] = ? i==j ? 1: 0; " in the inner loop. YMMV.Compassionate
A
4

I just wonder if there is some overhead of using variable-length arrays?

Nope

Can the size of array could be passed via command line argument at run time?

Yes.

Why is it introduced, compared to automatic and dynamically allocating an array?

Automatic allocated only allows a fixed size known at compile time.

Dynamically allocating (malloc) will store the array on the heap, which has a large memory space, but is slower to access.

VLA works by placing the array in the stack. This makes allocation and access extremely fast, but the stack is usually small (of a few KB), and when the VLA overflowed the stack, it's indistinguishable from an infinite recursion.

Augustaaugustan answered 9/1, 2010 at 20:2 Comment(10)
Wow - a dead heat for the timing on our answers!Metamorphose
And, see my (amended) answer for illustration that there is some run-time overhead for using VLAs, at least in some compiler implementations (using GCC 4.4.2 on Sun SPARC and Solaris 10 as a specific example).Metamorphose
There's no reason to think that the heap is slower to access. Allocation and deallocation are slower than stack allocation and deallocation (which just require adjusting the stack pointer), but once an object is allocated, it's just another object in memory.Assumption
@KeithThompson: Hm, memory caching?Augustaaugustan
(How) can you figure out the maximum permissible size for a VLA, and what happens if you exceed it? (Standard references welcome.)Cortney
@KerrekSB: There are two cases: on the stack and on the 'heap'. On the stack, you'll be limited by the size of the stack, which is typically between 1 and 8 MiB by default (but you might be able set the limit bigger). With heap allocation, the data storage is limited like other dynamically allocated memory — the limit is usually quite large (but less than 4 GiB on a 32-bit system).Metamorphose
@JonathanLeffler: Well, yes, no doubt, but I mean how can you figure it out in the program so when the user enters n you can decide whether int a[n]; will fit. But that's a fundamental problem of C -- you cannot even tell whether you'll be able to successfully call a function. The implementation limits aren't queryable.Cortney
@JonathanLeffler: Wait what, "onl the heap"? How do you put a VLA "on the heap"?Cortney
@KerrekSB: You can create a VLA on the heap as shown in my answer to calloc() for an array with negative index in C — specifically, the code marked 2dv.c.Metamorphose
@JonathanLeffler: Hah. C is so weird :-) But your code is just a bad cast. If you wanted to do it properly, it'd have to be something like malloc(sizeof(int[x][y])), non?Cortney
K
1

There should be very little overhead for VLAs (At most it should result in an addition to the stack pointer). Dynamic allocation requires manual memory management and is slower than stack-based allocation of a VLA, and "automatic" declaration of an array requires a compile-time expression for the array size. However, keep in mind that if a stack overflow occurs, it will cause undefined behavior, so keep VLAs relatively small.

You could pass the size of an array via a command-line argument, but you would have to write the code to handle that yourself.

Kibler answered 9/1, 2010 at 20:3 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.