What is the safe way to fill a multidimensional array using std::fill?
T

6

20

Here is what I am using:

class something
{
   char flags[26][80];
} a;

std::fill(&a.flags[0][0], &a.flags[0][0] + 26 * 80, 0);

(Update: I should have made it clear earlier that I am using this inside a class.)

Tourane answered 16/10, 2010 at 9:14 Comment(0)
C
38

The simplest way to initialize the array to 0 is in its definition:

char flags[26][80] = {};

If you want to use std::fill, or you want to reset the array, I find this a little better:

char flags[26][80];
std::fill( &flags[0][0], &flags[0][0] + sizeof(flags) /* / sizeof(flags[0][0]) */, 0 );

Expressing the fill in terms of the array size lets you change the dimensions without touching the fill. sizeof(flags[0][0]) is 1 in your case (sizeof(char) == 1), but you might want to leave the division in, in case you change the element type at some point.
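To illustrate why the division by sizeof(flags[0][0]) matters once the element type is wider than one byte, here is a minimal sketch. The names grid, grid_count and reset_grid are hypothetical, and the element type is switched to int purely for illustration; it reuses the same flattened-pointer idiom as the snippet above rather than proposing a different technique.

```cpp
#include <algorithm> // std::fill
#include <cstddef>   // std::size_t

// Hypothetical example array: int elements, so sizeof(element) != 1.
int grid[26][80];

// Number of elements (not bytes): total size divided by the size of one element.
constexpr std::size_t grid_count = sizeof(grid) / sizeof(grid[0][0]);

// Assumed helper name; fills every element with the given value.
void reset_grid(int value)
{
    // Adding sizeof(grid) here instead of grid_count would step
    // sizeof(int) times too far past the end of the array.
    std::fill(&grid[0][0], &grid[0][0] + grid_count, value);
}
```

With char elements the two counts coincide, which is why the commented-out division in the snippet above is optional in the original case.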

In this particular case (an array of flags, an integral type) I could even consider using memset, even though it is the least safe alternative (it will break if the array type is changed to a non-POD type):

memset( &flags[0][0], 0, sizeof(flags) );

Note that in all three cases, the array sizes are typed only once, and the compiler deduces the rest. That is a little safer as it leaves less room for programmer errors (change the size in one place, forget it in the others).
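If memset is used anyway, one possible way to guard against the type later changing to something memset cannot handle is a static_assert on the element type. This is only a sketch: clear_flags is a hypothetical helper name, and the choice of std::is_trivially_copyable as the check is an assumption about what "safe for memset" should mean here.

```cpp
#include <cstring>     // std::memset
#include <type_traits> // std::is_trivially_copyable, std::remove_all_extents

char flags[26][80];

// Assumed helper name; zeroes the whole array in one call.
void clear_flags()
{
    // Strip both array dimensions to get at the element type (char here).
    typedef std::remove_all_extents<decltype(flags)>::type element_type;

    // Compilation fails if the element type stops being trivially copyable.
    static_assert(std::is_trivially_copyable<element_type>::value,
                  "memset is only appropriate for trivially copyable elements");

    std::memset(flags, 0, sizeof(flags)); // sizeof(flags) is the size in bytes
}
```

The array sizes are still typed only once, so the property described above is preserved.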

EDIT: You have updated the code, and as it stands it won't compile, because the array is private and you are trying to initialize it externally. Depending on whether your class is actually an aggregate (and you want to keep it as such) or whether you want to add a constructor to the class, you can use different approaches.

#include <algorithm> // std::fill
#include <cstddef>   // std::size_t

const std::size_t rows = 26;
const std::size_t cols = 80;
struct Aggregate {
   char array[rows][cols];
};
class Constructor {
public:
   Constructor() {
      std::fill( &array[0][0], &array[rows][0], 0 ); // [1]
      // memset( array, 0, sizeof(array) );
   }
private:
   char array[rows][cols];
};
int main() {
   Aggregate a = {};
   Constructor b;
}

Even if the array is meant to be public, using a constructor might be a better approach as it will guarantee that the array is properly initialized in all instances of the class, while the external initialization depends on user code not forgetting to set the initial values.
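As a further variation on the constructor approach, C++11 and later allow a default member initializer, which gives the same guarantee without writing a constructor body. This is a minimal sketch; the class name Board and the accessor at are hypothetical and exist only to make the example checkable.

```cpp
#include <cstddef> // std::size_t

const std::size_t rows = 26;
const std::size_t cols = 80;

class Board { // hypothetical class name
public:
    // Read-only accessor, added here only so the contents can be inspected.
    char at(std::size_t r, std::size_t c) const { return array[r][c]; }

private:
    // Default member initializer (C++11): every element starts at zero
    // in every instance, whichever constructor is used.
    char array[rows][cols] = {};
};
```

This only covers the initial state; resetting the array later still needs one of the fill approaches above.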

[1] As @Oli Charlesworth mentioned in a comment, using constants is a different solution to the problem of having to state the sizes (and keep them in sync) in more than one place. I have used that approach here with yet another variation: a pointer to the first byte past the end of the two-dimensional array can be obtained by taking the address of the first column one row beyond the last row of the array. I have used this approach just to show that it can be done, but it is not any better than others such as &array[0][0] + (rows * cols).

Charqui answered 16/10, 2010 at 9:21 Comment(32)
+1. However, an alternative to using sizeof is to #define (or const int) the dimensions. This has the added advantage that it will work if you've passed flags as a function argument, so that it's decayed to a pointer so sizeof will no longer give the correct result.Mucky
I'm not sure I can use the initializing braces because the 2d array is inside a class. I'll give you a chance to update before I accept.Tourane
@DavidRodriguez: Can you explain why a similar syntax i.e. fill(&arr[0], &arr[0] + sizeof(arr), 0) doesn't work for a 1D array?Nehru
@DhruvMullick: The sizeof operator gives you the size in bytes, not the number of elements. If arr holds anything such that sizeof(*arr) != 1, the above has undefined behavior and will probably crash (it tries to access well beyond the end of the array). Otherwise (i.e. if the array holds [unsigned|signed] char), it should work.Worden
Actually I didn't know that in the syntax for fill, we use the number of elements.Nehru
@DhruvMullick: you can use cppreference to check the documentation for the standard library components. fill does not take the number of elements, it takes two iterators, to obtain the second iterator you need to do some arithmetic, but that is always in terms of elements not bytes.Worden
Quite an old answer; however, shouldn't it be &array[rows][cols] as the second argument to fill?Junko
@tobi303: I don't think so. &array[rows][cols] is one whole row beyond the last element in the array, not just one element past.Worden
@DavidRodríguez-dribeas I just realized how stupid I am ;) of course your version is right and I was just a bit confusedJunko
I like this answer, and I agree that it's likely to work on any real system. But is it guaranteed to work? It seems to me that flags[0][0] is an element of the 80-element array flags[0], hence &flags[0][0] is a pointer into that array, whereas &flags[0][0] + sizeof(flags) yields a pointer that is (more than one element) past the end of that array. What in the standard explicitly allows such arithmetic here? And what allows std::fill to take iterators to different containers (flags[0] and flags[25])?Baucom
@JoshuaGreen: The standard guarantees that you can obtain (but not dereference) a pointer one beyond any array. Furthermore, there is a clause that allows using a single object, for the purpose of pointer arithmetic, as if it was an array of a single element for the purpose of obtaining a pointer one beyond the object.Worden
@DavidRodriguez, are you saying that &flags[0][0] returns a pointer to an object that can be considered an element of an array of 26 * 80 elements and that pointer arithmetic can be taken "with respect to" this big, 1-dimensional array? I can believe that, but I'd be very interested in seeing the clause that spells this out.Baucom
@DavidRodrigues, I don't believe what we need here is covered by the cases mentioned in your comment.Baucom
@JoshuaGreen: I misunderstood your question as relating to the access one beyond the end of an array, which is what those clauses guarantee. On the question you now have, there are no bidimensional arrays in C++. The different rules dealing with pointer arithmetic on the outer and inner arrays indirectly guarantee that the above is valid. The size of an array of N elements is N times the size of the element (no external padding). That guarantees that the first element in the second inner array is located at the address of one beyond the last element in the first.Worden
... if you apply the same reasoning to the last element in the outer array you end up with the address of one beyond the last inner array being exactly the same as the address of one beyond the last outer array.Worden
@DavidRodríguez-dribeas, I still don't see why I'm allowed to start with an element of an 80-long array (flags[0][0], an element of flags[0]), take its address (&flags[0][0]), and then add 2080 ((sizeof flags)/(sizeof flags[0][0])) to it.Baucom
@Joshua: &flags[0] + 25 is a pointer to the beginning of the last inner array of 80 characters, right? And it is 2000 chars into the array. Given the last inner array of 80 characters, char (&last)[80] = flags[25], you can obtain a pointer one beyond the end of this one array by offsetting by sizeof last / sizeof *last, that is, 80 characters, for a total of 2080 characters. You can do the math this long way, or you can do it the short way.Worden
It is the "and it is 2000 characters into the array" part that I'm not sure of. Yes, if you're allowed to treat flags as one big char array of length 26 * 80, then everything works. Alas, you haven't defined such an array. Rather, you've defined an array of arrays, and arithmetic is taking you from one small array into another. Such array arithmetic normally yields undefined behavior, e.g., char a[10]; char b[10]; char *p = a + 11; is undefined even if a and b are consecutive in memory. I can believe the arithmetic is OK for flags, but what in the standard guarantees it?Baucom
@JoshuaGreen: The standard guarantees that an array of N elements of type T takes sizeof(T) * N bytes, and it also guarantees that each element is sizeof(T) bytes beyond the previous element. In this case the type T is char[80]. This is fundamental and the basis for pointer arithmetic.Worden
The above arithmetic isn't staying inside one array. flags[0][0] is an element of the 80-element array flags[0] but &flags[0][0] + sizeof(flags) takes us well outside of flags[0]. Yes, it would be natural for this to take us into/through the other arrays in flags (flags[1], flags[2], flags[3], etc.) and yes, the standard guarantees the memory layout makes this sensible, but I still don't see what guarantees that this arithmetic is valid when arithmetic taking you outside your starting array normally yields undefined behavior.Baucom
@JoshuaGreen: "taking you outside your starting array": the key term here is not starting array, but complete object. I am not sure how else to put it... do the math working only on the outer array (which is guaranteed) up until the last inner array, then use pointer arithmetic on that array to jump to one past the end. A pointer so obtained is kosher and it is exactly sizeof flags bytes beyond &flags[0][0]. The operation is allowed because all that pointer arithmetic is done within a single complete object.Worden
@dribeas, I'm not finding any clear quotes. On StackOverflow I found "If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined." &flags[0][0] points to an element of flags[0] while &flags[0][0] + sizeof flags does not point to an element of (or one past the last element of) flags[0]. Maybe flags[0][0] can be considered an element of the array flags, but neither the definition nor the notation clearly support that.Baucom
@dribeas, to gain a better understanding of my concerns, consider what would happen if we instead defined std::array<char, 80> flags[26];. Would the arithmetic be well-defined here? I think the answer is "No" and I don't see why the case char flags[26][80]; must clearly be different, though it might very well be handled specially in the standard.Baucom
FWIW, the discussion here seems to suggest that the arithmetic might be well-defined but only because the data type is char.Baucom
@JoshuaGreen: the discussion would be slightly more involved, since I have omitted sizeof(T) everywhere (knowing that it is 1 for char). But the arithmetic is sound regardless of the typeWorden
@dribeas, the discussion there seems to differ. One argument is that treating the array of arrays of T as a single array of T violates the strict-aliasing rule (for which there's a special exception for T == (unsigned) char). I'm personally not convinced that that's the only problem with the arithmetic -- other parts of the standard seem to be rather strict as well -- but it appears to be at least one issue that must be considered.Baucom
@JoshuaGreen: I find that a lack of understanding. a[n] is not reading the array as an array of one dimension, the [] operator has nothing to do with the type from which the pointer was obtained and is strictly a different representation of *(a+n). There is no aliasing here at all. The user obtained a pointer to an element (exact type) and uses pointer arithmetic to reach other objects of the same type, not a different type that aliases the original.Worden
@dribeas, the standard labels stepping (far enough) outside an array undefined behavior. I worry that an implementation could keep track of the valid range of any pointer (based on the array it came from) and penalize violations of this restriction. Above, the compiler could note that flags[0][0] is an element of flags[0] which contains 80 elements, hence it could penalize you for stepping further than that. I doubt any real implementation does this, and gcc apparently guarantees not to, but why couldn't some other implementation work this way?Baucom
@dribeas, I can believe that you're right, that multidimensional arrays are treated specially in this regard. (I don't think an array of std::arrays could be flattened in this manner.) However, I have yet to find a quote from the standard that justifies this belief.Baucom
what about a 3 dimensional array?Acquittance
@akashchandrakar I know, I am a bit late. Still... I have found that this post only talks about the specific issue in the OP's code rather than giving a generalized answer to the question actually asked (i.e. "What is the safe way to fill multidimensional array using std::fill?"). I have therefore added a new answer, which also covers filling 3-D arrays.Rascality
@Rascality This can't be done with multidimensional arrays, and this can now be shown using core constant evaluation, which is very picky about detecting undefined behavior. Here's the proof. You can only call std::fill with pointers over the innermost array range.Randolf
R
5

What is the safe way to fill a multidimensional array using std::fill?

The easiest default initialization is braced initialization.

char flags[26][80]{};

The above value-initializes every element of flags, i.e. sets each char to '\0'.


2-D Array filling using std::fill or std::fill_n

However, the above is not enough if you want to initialize the array with a different value. The options are std::fill and std::fill_n. (This assumes that the array flags is public in your class.)

std::fill(
   &a.flags[0][0],
   &a.flags[0][0] + sizeof(a.flags) / sizeof(a.flags[0][0]),
   '0');

// or using `std::fill_n`
// std::fill_n(&a.flags[0][0], sizeof(a.flags) / sizeof(a.flags[0][0]), '1');

To generalize this to a 2-D array of any type and any initial value, I would suggest a function template as follows. This also avoids the sizeof calculation of the total number of elements in the array.

#include <algorithm> // std::fill_n, std::fill
#include <cstddef>   // std::size_t

template<typename Type, std::size_t M, std::size_t N>
constexpr void fill_2D_array(Type(&arr2D)[M][N], const Type val = Type{}) noexcept
{
   std::fill_n(&arr2D[0][0], M * N, val); // note: std::fill_n is constexpr only since C++20
   // or using std::fill
   // std::fill(&arr2D[0][0], &arr2D[0][0] + (M * N ), val);
}

Now you can initialize your flags like

fill_2D_array(a.flags, '0'); // flags should be `public` in your class!



3-D Array filling using std::fill or std::fill_n

By adding one more non-type template parameter for the third dimension to the above function template, this can be extended to 3-D arrays as well:

#include <algorithm> // std::fill_n
#include <cstddef>   // std::size_t

template<typename Type, std::size_t M, std::size_t N, std::size_t O>
constexpr void fill_3D_array(Type(&arr3D)[M][N][O], const Type val = Type{}) noexcept
{
   std::fill_n(&arr3D[0][0][0], M * N * O, val);
}

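Instead of writing one function per dimension count, the same idea can be sketched for any number of dimensions with two overloads that recurse down to the innermost arrays. This is an illustrative variation, not part of the templates above: the name fill_nd is hypothetical, and unlike fill_2D_array it calls std::fill once per innermost array instead of once over the whole block. The value type V is a separate template parameter because, in the recursive overload, T may itself be an array type.

```cpp
#include <algorithm> // std::fill
#include <cstddef>   // std::size_t

// Base case: a one-dimensional array is filled directly.
template <typename T, std::size_t N, typename V>
void fill_nd(T (&arr)[N], const V& val)
{
    std::fill(arr, arr + N, val);
}

// Recursive case: an array of arrays. This overload is more specialized than
// the base case, so it is preferred whenever the elements are arrays themselves.
template <typename T, std::size_t N, std::size_t M, typename V>
void fill_nd(T (&arr)[N][M], const V& val)
{
    for (auto& sub : arr) { // sub is a reference to one T[M] sub-array
        fill_nd(sub, val);
    }
}
```

A call such as fill_nd(a.flags, '0') would then work for the 2-D case, and the same function accepts a 3-D or 4-D array unchanged.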

Rascality answered 8/8, 2020 at 12:18 Comment(0)
P
2

It is safe: a two-dimensional array is an array of arrays. Since an array occupies contiguous storage, the whole multidimensional array does too. So yes, it's OK, safe, and portable. This assumes you are NOT asking about style, which is covered by other answers (since you're using flags, I strongly recommend std::vector<std::bitset<80> > myFlags(26)).

Purkey answered 16/10, 2010 at 9:17 Comment(4)
Are you certain the bitset would be suitable? I'm just storing either zero or one inside each space in the 2d array. These flags are used to track which positions on the console have been updated with my floodfill routine.Tourane
@Truncheon: you store 8 values in a char, right? And in order to get, say, 7th value in the 6th char you must do some bit shifting/anding etc. The bitset will do it for you. That is, it will store each flag in a bit, and you can set/unset each flag by its index without worrying about bit patterns. The only downside is that the bitset's size must be a compile-time constant. If you are familiar with boost, they have dynamic_bitset which is pretty much what its name is.Purkey
I believe that he is storing a single bit in each byte. That means that using a bitset of 26*80 positions and the appropriate (row,col)->index algebra the use of a single bitset will be more efficient in memory consumption.Worden
@David: It will definitely be more efficient in terms of memory, but readability will suffer, that is he will have to write btst[numCols*row+col] which is less readable compared to v[row][col]. The decision is up to the OP and is dependent on his priorities.Purkey
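The two bitset layouts discussed in these comments could look roughly like this. It is only a sketch: make_row_flags and FlatFlags are hypothetical names, and the row-major index formula r * cols + c is the one mentioned in the last comment.

```cpp
#include <bitset>  // std::bitset
#include <cstddef> // std::size_t
#include <vector>  // std::vector

const std::size_t rows = 26;
const std::size_t cols = 80;

// Option 1: one bitset per row; indexing stays flags[row][col].
std::vector<std::bitset<cols> > make_row_flags()
{
    return std::vector<std::bitset<cols> >(rows); // every bit starts cleared
}

// Option 2: a single flat bitset plus (row, col) -> index arithmetic.
class FlatFlags { // hypothetical wrapper
public:
    bool test(std::size_t r, std::size_t c) const { return bits_.test(r * cols + c); }
    void set(std::size_t r, std::size_t c, bool on = true) { bits_.set(r * cols + c, on); }
    void reset() { bits_.reset(); } // clears every flag in one call

private:
    std::bitset<rows * cols> bits_;
};
```

Wrapping the index arithmetic in a small class, as in the second option, is one way to keep the memory savings without writing numCols * row + col at every call site.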
D
1
char flags[26][80];
std::fill((char*)flags, (char*)flags + sizeof(flags)/sizeof(char), 0);
Dissonance answered 2/9, 2018 at 4:32 Comment(0)
T
0

Is char[80] supposed to be a substitute for a real string type? In that case, I recommend the following:

std::vector<std::string> flags(26);
flags[0] = "hello";
flags[1] = "beautiful";
flags[2] = "world";
// ...

Or, if you have a C++ compiler that supports initializer lists, for example a recent g++ compiler:

std::vector<std::string> flags { "hello", "beautiful", "world" /* ... */ };
Tact answered 16/10, 2010 at 10:25 Comment(1)
This doesn't try to answer the question, just gives advice that nobody asked for.Lagas
R
0

The problem isn't that you can't fill a 2D array by passing appropriate pointer(s) to a function. It's pretty clear that if you write a function that, for instance, takes a T* and a count of rows*cols and assumes a contiguous set of T's, it will compile and happily fill the 2D array. The function doesn't know anything about the structure of the 2D array. It just gets a pointer and length.

The problem is the compiler knows the structure. And it knows that if you pass a pointer to an element in the 2D array, the only values that can be modified according to the language, are the ones in the sub-array that the pointer addressed. So the compiler can assume all other values outside of that subarray will not be altered. The language guarantees that. And compiler optimizers can take advantage to speed up code.

This is similar to the problem of signed int overflow. While signed overflow is well defined in two's complement arithmetic, compilers are free to assume it never happens because it is prohibited by the language. This opens up optimization opportunities and compilers take advantage to speed up code.

So no, you can't use a single std::fill call to initialize the entire contents of a multi-dimensional array. The C++ language is rather strict about what pointers are allowed to point to, even without dereferencing them! The following code shows std::fill failing when trying to initialize the entire array in one call. It also shows that forming a pointer one past the last element of an inner array is legal, while advancing it one step further is UB.

#include <algorithm>

constexpr bool foo()
{
    char matrix[2][10]{};
    std::fill(&matrix[0][0], &matrix[0][10], 0);    // works
    std::fill(&matrix[1][0], &matrix[1][10], 0);    // works
    // std::fill(&matrix[0][0], &matrix[1][10], 0);    // fails, UB detected
    char* pc = &matrix[0][10];  // Legal. points to one past the last element of the lower, 10 char array.
    // pc = &matrix[0][10]+1;  // UB. points to two past the last element of the lower, 10 char array.
    return true;
}

constexpr bool x = foo();

int main()
{
    return x;
}

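For contrast, here is a minimal sketch of a fill that stays inside each inner array and is therefore accepted in a constant expression. It assumes C++14 or later (for loops in constexpr functions) and uses a plain loop rather than std::fill, so it does not depend on the C++20 constexpr algorithms; the name fill_rows_ok is hypothetical.

```cpp
// Hypothetical helper: fills a 2x10 matrix element by element, so every
// access stays inside the inner array it belongs to.
constexpr bool fill_rows_ok()
{
    char matrix[2][10]{};
    for (auto& row : matrix) {   // row has type char(&)[10]
        for (char& cell : row) { // iterates over exactly one inner array
            cell = 'x';
        }
    }
    return matrix[0][0] == 'x' && matrix[1][9] == 'x';
}

// Forces constant evaluation: core-language undefined behavior inside the
// function would make this line fail to compile.
static_assert(fill_rows_ok(), "row-wise fill is a valid constant expression");
```

The same row-by-row structure applies when std::fill is used per inner array, as in the two calls marked "works" above.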

Randolf answered 12/7 at 4:11 Comment(0)
