How to find the size of an array (from a pointer pointing to the first element of the array)?
C

17

397

First off, here is some code:

#include <stdio.h>

int main(void)
{
    int days[] = {1,2,3,4,5};
    int *ptr = days;
    printf("%zu\n", sizeof(days));
    printf("%zu\n", sizeof(ptr));

    return 0;
}

Is there a way to find out the size of the array that ptr is pointing to (instead of just giving its size, which is four bytes on a 32-bit system)?

Clactonian answered 29/1, 2009 at 16:33 Comment(8)
I've always used parens with sizeof - sure it makes it look like a function call, but I think it's clearer.Herakleion
Why not? Do you have something against superfluous parentheses? I think it reads a little more easily with them, myself.Crackling
Especially if you're doing something like malloc(sizeof(int) * 4).Herakleion
Heh. I find them cluttering, and ... pointless, since it makes it look like a function call, which it really isn't. I don't consider that clear.Holytide
@Paul: well .. assuming the left hand side of that call is a pointer to int, I'd write it as int *ptr = malloc(4 * sizeof *ptr); which to me is far clearer. Fewer parens to read, and bringing the literal constant to the front, like in maths.Holytide
@Holytide - don't allocate an array of pointers when you meant an array of ints!Herakleion
There is no "pointer pointing to an array" here. Just a pointer pointing to an int.Science
Some compilers do have built ins for this purpose gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.htmlDistend
H
344

No, you can't. The compiler doesn't know what the pointer is pointing to. There are tricks, like ending the array with a known out-of-band value and then counting the size up until that value, but that's not using sizeof().
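
A minimal sketch of the sentinel trick, assuming the value -1 never occurs as real data. The names SENTINEL and count_until_sentinel are made up for illustration and do not come from any library:

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: -1 is never a valid element, so it can mark the end. */
#define SENTINEL (-1)

/* Count the elements in front of the sentinel, the same way strlen()
 * counts characters in front of '\0'. This is O(n), and it reads past
 * the end of the array if the caller forgot the terminator. */
size_t count_until_sentinel(const int *arr)
{
    size_t n = 0;

    assert(arr != NULL);
    while (arr[n] != SENTINEL)
        n++;
    return n;
}
```

With int days[] = {1, 2, 3, 4, 5, SENTINEL}; and int *ptr = days;, the call count_until_sentinel(ptr) yields 5.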

Another trick is the one mentioned by Zan, which is to stash the size somewhere. For example, if you're dynamically allocating the array, allocate a block one size_t bigger than the one you need, stash the size there, and return ptr+sizeof(size_t) as the pointer to the array. When you need the size, decrement the pointer and peek at the stashed value. Just remember to free the whole block starting from the beginning, and not just the array.
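
One way the stashed-size trick could look for an int array. This is only a sketch: array_header, int_array_alloc, int_array_count and int_array_free are hypothetical names, and the header is padded with max_align_t (C11) as a conservative way to keep the element area aligned:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header placed in front of the elements. The union with
 * max_align_t makes the header size a multiple of the strictest
 * alignment, so the elements that follow stay properly aligned. */
typedef union {
    size_t count;
    max_align_t align;
} array_header;

/* Allocate the header plus `count` ints and return a pointer to the
 * first int, or NULL on failure. */
int *int_array_alloc(size_t count)
{
    array_header *h;

    if (count > (SIZE_MAX - sizeof *h) / sizeof(int))
        return NULL;            /* the size computation would overflow */
    h = malloc(sizeof *h + count * sizeof(int));
    if (h == NULL)
        return NULL;
    h->count = count;
    return (int *)(h + 1);      /* elements start right after the header */
}

/* Step back to the header and read the stashed count. Only valid for
 * pointers returned by int_array_alloc(). */
size_t int_array_count(const int *arr)
{
    assert(arr != NULL);
    return ((const array_header *)arr - 1)->count;
}

/* Free the whole block, starting at the header, not at the elements. */
void int_array_free(int *arr)
{
    if (arr != NULL)
        free((array_header *)arr - 1);
}
```

Passing any other pointer (a static array, or a pointer into the middle of the block) to int_array_count() is undefined behaviour, which is why this remains a trick rather than a general solution.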

Herakleion answered 29/1, 2009 at 16:39 Comment(17)
I'm sorry for posting this comment so late, but if the compiler does not know what the pointer is pointing to, how does free know how much memory to clear? I do know that this information is stored internally for functions like free to use. So my question is why can't the compiler do so too?Hermelindahermeneutic
@viki.omega9, because free discovers the size at runtime. The compiler can't know the size because you could make the array a different size depending on runtime factors (command line arguments, contents of a file, phase of moon,etc).Herakleion
Quick follow up, why isn't there a function that can return the size the way free does?Hermelindahermeneutic
Well, if you could guarantee that the function was only called with malloced memory and the library tracks the malloced memory the way most I've seen do (by using an int before the returned pointer) then you could write one. But if the pointer is to a static array or the like, it would fail. Similarly, there is no guarantee that the size of malloced memory is accessible to your program.Herakleion
Isn't this a problem if the pointer isn't inside any allocated memory at all? We would never be able to find the out-of-band value nor would we be able to find the stashed size of the array. Am I off here?Phalange
@RouteMapper, there are lots of problems with the "tricks". That's why I called them tricks. The language doesn't provide robust support for finding out the size of an allocated array, and it's going to be up to you to find the solution that works for your use-case.Herakleion
@PaulTomblin, do you think there's a better way to do it in an intermediate language? Say, LLVM IR?Phalange
@viki.omega9: Another thing to keep in mind is that the size recorded by the malloc/free system may not be the size you asked for. You malloc 9 bytes and get 16. Malloc 3K bytes and get 4K. Or similar situations.Kroon
@PaulTomblin: Can you explain what the difference between days and ptr is? Both of them are integer pointers which contain the address to the first element of the array.Newsworthy
Is it possible if we pass the pointer to entire array like char (*ptr)[size] ? In this case ptr is defined as char (*ptr)[size] = &days;. But I am not sure how we get the size inside the function.Lea
@JonWheelock no, that won't help. The pointer does not retain the size information.Herakleion
@Hermelindahermeneutic If you are using GNU, you can use malloc_usable_size() man7.org/linux/man-pages/man3/malloc_usable_size.3.html There are similar function in other environments.Bursitis
This is exactly why many standard library functions have a separate explicit parameter for the length of a passed-in array - since you actually pass the pointer.Quietude
@Paul, I think you meant to say "allocate a block one size_t bigger than you need (and account for alignment/padding)", since int isn't generally wide enough for that purpose.Woodworker
@TobySpeight correct, but mixing int and size_t in an array could be a mess.Herakleion
Some implementations of malloc have an associated function msize that returns the size of the allocated block (which may be larger than requested). It's non-standard, however, and doesn't work for things not allocated on the malloc heap.Cavesson
Yes, I agree that mixing int and size_t in an array is asking for trouble. Much better to pass them separately, or to define a structure containing the size and a flexible array member.Woodworker
K
117

The answer is, "No."

What C programmers do is store the size of the array somewhere. It can be part of a structure, or the programmer can cheat a bit and malloc() more memory than requested in order to store a length value before the start of the array.
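
A sketch of the "part of a structure" variant, using a C99 flexible array member so that the length and the elements live in one allocation. The names int_block, int_block_new and int_block_get are illustrative only:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* The length travels with the data in a single block. */
typedef struct {
    size_t length;
    int data[];                 /* flexible array member (C99) */
} int_block;

/* Allocate a block holding `length` ints; returns NULL on failure. */
int_block *int_block_new(size_t length)
{
    int_block *b;

    if (length > (SIZE_MAX - sizeof *b) / sizeof(int))
        return NULL;            /* the size computation would overflow */
    b = malloc(sizeof *b + length * sizeof b->data[0]);
    if (b != NULL)
        b->length = length;
    return b;
}

/* Bounds-checked read, possible only because the length is stored. */
int int_block_get(const int_block *b, size_t i)
{
    assert(b != NULL && i < b->length);
    return b->data[i];
}
```

Functions then take an int_block * instead of a bare int *, and a single free(b) releases both the length and the elements.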

Kroon answered 29/1, 2009 at 16:42 Comment(3)
That's how Pascal strings are implemented.Zomba
and apparently pascal strings are why excel runs so fast!Intercellular
@Adam: It is fast. I use it in a list of strings implementation of mine. It is super-fast to linear search because it is: load size, prefetch pos+size, compare size to search size, if equal strncmp, move to next string, repeat. It's faster than binary search up to about 500 strings.Kroon
S
57

For dynamic arrays (malloc or C++ new) you need to store the size of the array as mentioned by others or perhaps build an array manager structure which handles add, remove, count, etc. Unfortunately C doesn't do this nearly as well as C++ since you basically have to build it for each different array type you are storing which is cumbersome if you have multiple types of arrays that you need to manage.
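
As a rough illustration of such an array manager, here is a minimal growable array for int only. All int_vec names are hypothetical, and a separate copy would be needed for every other element type, which is the cumbersome part described above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int *items;
    size_t count;       /* elements in use */
    size_t capacity;    /* elements allocated */
} int_vec;

void int_vec_init(int_vec *v)
{
    v->items = NULL;
    v->count = 0;
    v->capacity = 0;
}

/* Append a value; returns 0 on success, -1 if memory runs out. */
int int_vec_add(int_vec *v, int value)
{
    if (v->count == v->capacity) {
        size_t new_cap = v->capacity ? v->capacity * 2 : 4;
        int *p;

        if (new_cap > SIZE_MAX / sizeof *p)
            return -1;
        p = realloc(v->items, new_cap * sizeof *p);
        if (p == NULL)
            return -1;          /* the old buffer is still valid */
        v->items = p;
        v->capacity = new_cap;
    }
    v->items[v->count++] = value;
    return 0;
}

/* Remove the element at `index`, shifting later elements down. */
void int_vec_remove(int_vec *v, size_t index)
{
    assert(index < v->count);
    memmove(&v->items[index], &v->items[index + 1],
            (v->count - index - 1) * sizeof v->items[0]);
    v->count--;
}

size_t int_vec_count(const int_vec *v)
{
    return v->count;
}

void int_vec_free(int_vec *v)
{
    free(v->items);
    int_vec_init(v);
}
```

The count is always available through int_vec_count(), so no sizeof tricks are needed on the element pointer.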

For static arrays, such as the one in your example, there is a common macro used to get the size, but it is not recommended as it does not check if the parameter is really a static array. The macro is used in real code though, e.g. in the Linux kernel headers although it may be slightly different than the one below:

#include <stdio.h>

#if !defined(ARRAY_SIZE)
    #define ARRAY_SIZE(x) (sizeof((x)) / sizeof((x)[0]))
#endif

int main()
{
    int days[] = {1,2,3,4,5};
    int *ptr = days;
    printf("%zu\n", ARRAY_SIZE(days));
    printf("%zu\n", sizeof(ptr));
    return 0;
}

You can google for reasons to be wary of macros like this. Be careful.

If possible, use the C++ standard library, such as std::vector, which is much safer and easier to use.

Swatch answered 29/1, 2009 at 17:12 Comment(11)
ARRAY_SIZE is a common paradigm used by practical programmers everywhere.Metallic
Yes it is a common paradigm. You still need to use it cautiously though as it is easy to forget and use it on a dynamic array.Swatch
Yes, good point, but the question being asked was about the pointer one, not the static array one.Herakleion
That ARRAY_SIZE macro always works if its argument is an array (i.e. expression of array type). For your so-called "dynamic array", you never get an actual "array" (expression of array type). (Of course, you can't, since array types include their size at compile-time.) You just get a pointer to the first element. Your objection "does not check if the parameter is really a static array" is not really valid, since they are different as one is an array and the other isn't.Science
There is a template function floating around that does the same thing but will prevent the use of pointers.Dibble
@Science I think you are getting caught up in semantics. My point was simply that macros do not have type checking. If you pass a pointer to the macro, you will not get the intended result. If you can use the C++ template equivalent, it is safer -- but the OP asked about C.Swatch
The ARRAY_SIZE macro, with a different name, is used in K&R C 2nd edition. Not a good practice nowadays, but the language itself does not offer a solution.Carder
Perhaps using a more suggestive name like STATIC_ARRAY_SIZE makes it more clear for users.Sigvard
@Sigvard that is a good suggestion, however from a historical (and even current?) standpoint, the macro is still named ARRAY_SIZE in a lot of production code. For example, Google V8 used a much newer and improved version of the macro with #ifdefs for C++, as shown in this link: https://mcmap.net/q/87799/-macro-definition-array_sizeSwatch
ARRAY_SIZE is a bad name. because you would think it is the total size, similar to sizeof. ARRAY_LENGTH would be better.Bursitis
@Bursitis yes, it is not a good name. However the macro has been around for decades. I did not name it. That is the name that was used when I first saw it in the 1990's. I have no idea how long the macro has been around, but it wouldn't surprise me if it was from 1980's or 1970's or possibly even earlier. Current C++ template versions are recommended, unless you happen to be using plain C.Swatch
K
19

There is a clean solution with C++ templates, without using sizeof. The following getSize() function returns the size of any static array:

#include <cstddef>

template<typename T, std::size_t SIZE>
constexpr std::size_t getSize(T (&)[SIZE]) {
    return SIZE;
}

Here is an example with a foo_t structure:

#include <cstddef>
#include <cstdio>

template<typename T, std::size_t SIZE>
constexpr std::size_t getSize(T (&)[SIZE]) {
    return SIZE;
}

struct foo_t {
    int ball;
};

int main()
{
    foo_t foos3[] = {{1},{2},{3}};
    foo_t foos5[] = {{1},{2},{3},{4},{5}};
    std::printf("%zu\n", getSize(foos3));
    std::printf("%zu\n", getSize(foos5));
}

Output:

3
5
Kurtz answered 19/4, 2012 at 11:53 Comment(5)
I have never seen the notation T (&)[SIZE]. Can you explain what this means? Also you could mention constexpr in this context.Marcin
That's nice if you use c++ and you actually have a variable of an array type. Neither of them is the case in the question: Language is C, and the thing the OP wants to get the array size from is a simple pointer.Needful
would this code lead to code bloat by recreating the same code for every different size/type combination or is that magically optimised out of existence by the compiler?Moreen
@WorldSEnder: That's C++ syntax for a reference of array type (with no variable name, just a size and element-type).Wrac
@user2796283: This function is optimized away entirely at compile time; no magic is needed; it's not combining anything to a single definition, it's simply inlining it away to a compile-time constant. (But in a debug build, yes, you'd have a bunch of separate functions that return different constants. Linker magic might merge ones that use the same constant. The caller doesn't pass SIZE as an arg, it's a template param that has to already be known by the function definition.)Wrac
L
11

As all the correct answers have stated, you cannot get this information from the decayed pointer value of the array alone. If the decayed pointer is the argument received by the function, then the size of the originating array has to be provided in some other way for the function to come to know that size.

Here's a suggestion, different from what has been provided thus far, that will work: pass a pointer to the array instead. This suggestion is similar to the C++ style suggestions, except that C does not support templates or references:

#include <stdio.h>

#define ARRAY_SZ 10

void foo (int (*arr)[ARRAY_SZ]) {
    /* sizeof yields size_t, so print the quotient with %zu */
    printf("%zu\n", sizeof(*arr)/sizeof(**arr));
}

But, this suggestion is kind of silly for your problem, since the function is defined to know exactly the size of the array that is passed in (hence, there is little need to use sizeof at all on the array). What it does do, though, is offer some type safety. It will prohibit you from passing in an array of an unwanted size.

int x[20];
int y[10];
foo(&x); /* error */
foo(&y); /* ok */

If the function is supposed to be able to operate on any size of array, then you will have to provide the size to the function as additional information.
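
A short sketch of that last point: the element count is passed next to the pointer, and the caller computes it with sizeof while the array type is still visible. The function max_of is a made-up example:

```c
#include <assert.h>
#include <stddef.h>

/* Works for an array of any length because the caller supplies the
 * element count as a separate argument. */
int max_of(const int *arr, size_t count)
{
    size_t i;
    int best;

    assert(arr != NULL && count > 0);
    best = arr[0];
    for (i = 1; i < count; i++) {
        if (arr[i] > best)
            best = arr[i];
    }
    return best;
}
```

At a call site where days is still an array (not yet a pointer), this could be written as max_of(days, sizeof days / sizeof days[0]).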

Laughing answered 9/4, 2013 at 17:21 Comment(0)
A
6

For this specific example, yes, there is, IF you use typedefs (see below). Of course, if you do it this way, you're just as well off to use SIZEOF_DAYS, since you know what the pointer is pointing to.

If you have a (void *) pointer, as is returned by malloc() or the like, then, no, there is no way to determine what data structure the pointer is pointing to and thus, no way to determine its size.

#include <stdio.h>

#define NUM_DAYS 5
typedef int days_t[ NUM_DAYS ];
#define SIZEOF_DAYS ( sizeof( days_t ) )

int main() {
    days_t  days;
    days_t *ptr = &days; 

    printf( "SIZEOF_DAYS:  %zu\n", SIZEOF_DAYS  );
    printf( "sizeof(days): %zu\n", sizeof(days) );
    printf( "sizeof(*ptr): %zu\n", sizeof(*ptr) );
    printf( "sizeof(ptr):  %zu\n", sizeof(ptr)  );

    return 0;
} 

Output:

SIZEOF_DAYS:  20
sizeof(days): 20
sizeof(*ptr): 20
sizeof(ptr):  4
Aurel answered 13/4, 2011 at 21:4 Comment(0)
H
6

You can do something like this:

int days[] = { /*length:*/5, /*values:*/ 1,2,3,4,5 };
int *ptr = days + 1;
printf("array length: %d\n", ptr[-1]);
return 0;
Hemorrhoidectomy answered 22/6, 2018 at 11:42 Comment(0)
H
5

There is no magic solution. C is not a reflective language. Objects don't automatically know what they are.

But you have many choices:

  1. Obviously, add a parameter
  2. Wrap the call in a macro and automatically add a parameter
  3. Use a more complex object. Define a structure which contains the dynamic array and also the size of the array. Then, pass the address of the structure.
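
Option 2 might be sketched as follows. The function count_evens and the macro COUNT_EVENS are hypothetical names; the macro only gives the right answer when its argument is a real array, not a pointer:

```c
#include <assert.h>
#include <stddef.h>

/* The real function takes an explicit element count (option 1). */
size_t count_evens(const int *arr, size_t count)
{
    size_t i, evens = 0;

    assert(arr != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (arr[i] % 2 == 0)
            evens++;
    }
    return evens;
}

/* The wrapper macro adds the count automatically (option 2).
 * Caution: if `a` is a pointer, sizeof(a) is the pointer size and
 * the computed count is wrong. */
#define COUNT_EVENS(a) count_evens((a), sizeof(a) / sizeof((a)[0]))
```

With int days[] = {1, 2, 3, 4, 5};, the call COUNT_EVENS(days) expands to count_evens(days, 5).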
Hamachi answered 25/7, 2016 at 20:50 Comment(1)
Objects know what they are. But if you point to a subobject, there's no way of getting information about the complete object or a larger subobject.Judy
C
3

My solution to this problem is to save the length of the array into a struct Array as a meta-information about the array.

#include <stdio.h>
#include <stdlib.h>

struct Array
{
    int length;

    double *array;
};

typedef struct Array Array;

Array* NewArray(int length)
{
    /* Allocate the memory for the struct Array */
    Array *newArray = malloc(sizeof(Array));
    if (newArray == NULL)
        return NULL;

    /* Store only non-negative lengths */
    newArray->length = (length > 0) ? length : 0;

    /* Use the clamped length: a negative value would convert to a huge size_t */
    newArray->array = malloc(newArray->length * sizeof(double));

    return newArray;
}

void SetArray(Array *structure,int length,double* array)
{
    structure->length = length;
    structure->array = array;
}

void PrintArray(Array *structure)
{       
    if(structure->length > 0)
    {
        int i;
        printf("length: %d\n", structure->length);
        for (i = 0; i < structure->length; i++)
            printf("%g\n", structure->array[i]);
    }
    else
        printf("Empty Array. Length 0\n");
}

int main()
{
    int i;
    Array *negativeTest, *days = NewArray(5);

    double moreDays[] = {1,2,3,4,5,6,7,8,9,10};

    for (i = 0; i < days->length; i++)
        days->array[i] = i+1;

    PrintArray(days);

    SetArray(days,10,moreDays);

    PrintArray(days);

    negativeTest = NewArray(-5);

    PrintArray(negativeTest);

    return 0;
}

But you have to take care to set the right length for the array you want to store, because there is no way to check this length, as the other answers have explained at length.

Copolymerize answered 5/10, 2015 at 14:45 Comment(0)
D
2

This is how I personally do it in my code. I like to keep it as simple as possible while still able to get values that I need.

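#include <stdlib.h>

typedef struct intArr {
    int size;
    int* arr; 
} intArr_t;

int main() {
    intArr_t arr;
    arr.size = 6;
    arr.arr = (int*)malloc(sizeof(int) * arr.size);
    if (arr.arr == NULL)
        return 1;

    /* int index matches the int size member, avoiding a signed/unsigned comparison */
    for (int i = 0; i < arr.size; i++) {
        arr.arr[i] = i * 10;
    }

    free(arr.arr);
    return 0;
}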
Doloroso answered 14/3, 2021 at 18:51 Comment(2)
Prefer size_t to store the size.Mcclelland
That's a really good&simple approach! BTW intArr after struct can be omitted. Also a shorter and more readable way of writing the malloc line would be arr.arr = malloc(arr.size * sizeof *arr.arr); which makes it more reusable because you don't need to specify "int".Viki
H
1

No, you can't use sizeof(ptr) to find the size of array ptr is pointing to.

Allocating extra memory (more than the size of the array) does help, though, if you want to store the length in the extra space.

Hematuria answered 15/3, 2016 at 11:22 Comment(0)
V
1
#include <stdio.h>

int main(void)
{
    int days[] = {1,2,3,4,5};
    int *ptr = days;
    printf("%zu\n", sizeof(days));
    printf("%zu\n", sizeof(ptr));

    return 0;
}

The size of days[] is 20, which is the number of elements times the size of its data type (5 * 4 bytes here), while the size of the pointer is 4 on a 32-bit system no matter what it points to, because a pointer stores only the address of the element it points to.

Vadose answered 10/7, 2016 at 10:32 Comment(1)
sizeof(ptr) is the size of the pointer and sizeof(*ptr) is the size of the object it points to.Olivo
M
1

In strings there is a '\0' character at the end so the length of the string can be gotten using functions like strlen. The problem with an integer array, for example, is that you can't use any value as an end value so one possible solution is to address the array and use as an end value the NULL pointer.

#include <stdio.h>
/* the following function will produce the warning:
 * ‘sizeof’ on array function parameter ‘a’ will
 * return size of ‘int *’ [-Wsizeof-array-argument]
 */
void foo( int a[] )
{
    printf( "%zu\n", sizeof a );
}
/* so we have to implement something else one possible
 * idea is to use the NULL pointer as a control value
 * the same way '\0' is used in strings but this way
 * the pointer passed to a function should address pointers
 * so the actual implementation of an array type will
 * be a pointer to pointer
 */
typedef char * type_t; /* line 18 */
typedef type_t ** array_t;
int main( void )
{
    array_t initialize( int, ... );
    /* initialize an array with four values "foo", "bar", "baz", "foobar"
     * if one wants to use integers rather than strings than in the typedef
     * declaration at line 18 the char * type should be changed with int
     * and in the format used for printing the array values 
     * at line 45 and 51 "%s" should be changed with "%i"
     */
    array_t array = initialize( 4, "foo", "bar", "baz", "foobar" );

    int size( array_t );
    /* print array size */
    printf( "size %i:\n", size( array ));

    void aprint( char *, array_t );
    /* print array values */
    aprint( "%s\n", array ); /* line 45 */

    type_t getval( array_t, int );
    /* print an indexed value */
    int i = 2;
    type_t val = getval( array, i );
    printf( "%i: %s\n", i, val ); /* line 51 */

    void delete( array_t );
    /* free some space */
    delete( array );

    return 0;
}
/* the output of the program should be:
 * size 4:
 * foo
 * bar
 * baz
 * foobar
 * 2: baz
 */
#include <stdarg.h>
#include <stdlib.h>
array_t initialize( int n, ... )
{
    /* here we store the array values */
    type_t *v = (type_t *) malloc( sizeof( type_t ) * n );
    va_list ap;
    va_start( ap, n );
    int j;
    for ( j = 0; j < n; j++ )
        v[j] = va_arg( ap, type_t );
    va_end( ap );
    /* the actual array will hold the addresses of those
     * values plus a NULL pointer
     */
    array_t a = (array_t) malloc( sizeof( type_t *) * ( n + 1 ));
    a[n] = NULL;
    for ( j = 0; j < n; j++ )
        a[j] = v + j;
    return a;
}
int size( array_t a )
{
    int n = 0;
    while ( *a++ != NULL )
        n++;
    return n;
}
void aprint( char *fmt, array_t a )
{
    while ( *a != NULL )
        printf( fmt, **a++ );   
}
type_t getval( array_t a, int i )
{
    return *a[i];
}
void delete( array_t a )
{
    free( *a );
    free( a );
}
Manichaeism answered 11/3, 2017 at 11:56 Comment(2)
Your code is full of comments, but I think it would make everything easier if you added some general explanation of how this works outside of code, as normal text. Can you please edit your question and do it? Thank you!Ridinger
Creating an array of pointers to each element so you can linear-search it for NULL is probably the least efficient alternative imaginable to just storing a separate size directly. Especially if you actually use this extra layer of indirection all the time.Wrac
T
1
#include <stdio.h>
#include <string.h>
#include <stddef.h>
#include <stdlib.h>

#define array(type) struct { size_t size; type elem[]; }

void *array_new(int esize, int ecnt)
{
    size_t *a = (size_t *)malloc(esize*ecnt+sizeof(size_t));
    if (a) *a = ecnt;
    return a;
}
#define array_new(type, count) array_new(sizeof(type),count)
#define array_delete free
#define array_foreach(type, e, arr) \
    for (type *e = (arr)->elem; e < (arr)->size + (arr)->elem; ++e)

int main(int argc, char const *argv[])
{
    array(int) *iarr = array_new(int, 10);
    array(float) *farr = array_new(float, 10);
    array(double) *darr = array_new(double, 10);
    array(char) *carr = array_new(char, 11);
    for (int i = 0; i < iarr->size; ++i) {
        iarr->elem[i] = i;
        farr->elem[i] = i*1.0f;
        darr->elem[i] = i*1.0;
        carr->elem[i] = i+'0';
    }
    array_foreach(int, e, iarr) {
        printf("%d ", *e);
    }
    array_foreach(float, e, farr) {
        printf("%.0f ", *e);
    }
    array_foreach(double, e, darr) {
        printf("%.0lf ", *e);
    }
    carr->elem[carr->size-1] = '\0';
    printf("%s\n", carr->elem);

    return 0;
}
Taiwan answered 18/3, 2022 at 15:12 Comment(0)
B
0
 #define array_size 10

 struct {
     int16 size;
     int16 array[array_size];
     int16 property1[(array_size/16)+1];
     int16 property2[(array_size/16)+1];
 } array1 = {array_size, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9};

 #undef array_size

array_size is passed to the size member:

#define array_size 30

struct {
    int16 size;
    int16 array[array_size];
    int16 property1[(array_size/16)+1];
    int16 property2[(array_size/16)+1];
} array2 = {array_size};

#undef array_size

Usage is:

void main() {

    int16 size = array1.size;
    for (int i=0; i!=size; i++) {

        array1.array[i] *= 2;
    }
}
Berkelium answered 11/3, 2014 at 8:27 Comment(0)
B
0

Many implementations have a function that reports the reserved size of objects allocated with malloc() or calloc(); for example, GNU has malloc_usable_size().

However, this will return the size of the reserved block, which can be larger than the value given to malloc()/realloc().
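
A small sketch of that behaviour, assuming a GNU-style C library where malloc_usable_size() is declared in <malloc.h>; other platforms use different headers and names, and usable_size_for is a hypothetical helper:

```c
#include <assert.h>
#include <malloc.h>     /* GNU-specific: declares malloc_usable_size() */
#include <stddef.h>
#include <stdlib.h>

/* Allocate `requested` bytes, ask the allocator how many bytes the
 * block can really hold, then release it. Returns 0 if malloc fails. */
size_t usable_size_for(size_t requested)
{
    void *p = malloc(requested);
    size_t usable;

    if (p == NULL)
        return 0;
    usable = malloc_usable_size(p);
    assert(usable >= requested);    /* never smaller than the request */
    free(p);
    return usable;
}
```

The value returned for, say, usable_size_for(9) depends on the allocator, so it tells you how much room the block has, not how many elements you originally asked for.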


Bursitis answered 3/8, 2021 at 17:46 Comment(0)
P
0

There is a popular macro that you can define to find the number of elements in an array (the Microsoft CRT even provides it out of the box under the name _countof):

#define countof(x) (sizeof(x)/sizeof((x)[0]))

Then you can write:

int my_array[] = { ... some elements ... };
printf("%zu", countof(my_array)); // 'z' is correct type specifier for size_t
Plumage answered 9/12, 2021 at 19:28 Comment(0)
