Array of non-contiguous objects

#include <iostream> #include <cstring> // This struct is not guaranteed to occupy contiguous storage // in the sense of the C++ Object model (§1.8.5): struct separated { int i; separated(int a, int b){i=a; i2=b;} ~separated(){i=i2=-1;} // nontrivial destructor --> not trivially copyable private: int i2; // different access control --> not standard layout }; int main() { static_assert(not std::is_standard_layout<separated>::value,"sl"); static_assert(not std::is_trivial<separated>::value,"tr"); separated a[2]={{1,2},{3,4}}; std::memset(&a[0],0,sizeof(a[0])); std::cout<<a[1].i; // No guarantee that the previous line outputs 3. } // compiled with Debian clang version 3.5.0-10, C++14-standard // (outputs 3)

1. This is an instance of Occam's razor as adopted by the dragons that actually write compilers: Do not give more guarantees than needed to solve the problem, because otherwise your workload will double without compensation. Sophisticated classes adapted to fancy hardware or to historic hardware were part of the problem. (hinting by BaummitAugen and M.M)

2. (contiguous=sharing a common border, next or together in sequence)

First, it is not that objects of type T either always or never occupy contiguous storage. There may be different memory layouts for the same type within a single binary.

[class.derived] §10 (8): A base class subobject might have a layout different from ...

This would be enough to lean back and be satisfied that what is happening on our computers does not contradict the standard. But let's amend the question. A better question would be:

Does the standard permit arrays of objects that do not occupy contiguous storage individually, while at the same time every two successive subobjects share a common border?

If so, this would influence heavily how char* arithmetic relates to T* arithmetic.

Depending on whether you understand the OP standard quote meaning that only the subobjects share a common border, or that also within each subobject, the bytes share a common border, you may arrive at different conclusions.

Assuming the first, you find that 'contiguously allocated' or 'stored contiguously' may simply mean &a[n]==&a[0] + n (§23.3.2.1), which is a statement about subobject addresses that would not imply that the array resides within a single sequence of contiguous bytes.

If you assume the stronger version, you may arrive at the 'element offset==sizeof(T)' conclusion brought forward in T* versus char* pointer arithmetic That would also imply that one could force otherwise possibly non-contiguous objects into a contiguous layout by declaring them T t[1]; instead of T t;

Now how to resolve this mess? There is a fundamentally ambiguous definition of the sizeof() operator in the standard that seems to be a relict of the time when, at least per architecture, type roughly equaled layout, which is not the case any more. (How does placement new know which layout to create?)

When applied to a class, the result [of sizeof()] is the number of bytes in an object of that class including any padding required for placing objects of that type in an array. [expr.sizeof] §5.3.3 (2)

But wait, the amount of required padding depends on the layout, and a single type may have more than one layout. So we're bound to add a grain of salt and take the minimum over all possible layouts, or do something equally arbitrary.

Finally, the array definition would benefit from a disambiguation in terms of char* arithmetic, in case this is the intended meaning. Otherwise, the answer to question 1 applies accordingly.

A few remarks related to now deleted answers and comments: As is discussed in Can technically objects occupy non-contiguous bytes of storage?, non-contiguous objects actually exist. Furthermore, memseting a subobject naively may invalidate unrelated subobjects of the containing object, even for perfectly contiguous, trivially copyable objects:

#include <iostream>
#include <cstring>
struct A {
  private: int a;
  public: short i;
};
struct B :  A {
  short i;
};
int main()
{
   static_assert(std::is_trivial<A>::value , "A not trivial.");
   static_assert(not std::is_standard_layout<A>::value , "sl.");
   static_assert(std::is_trivial<B>::value , "B not trivial.");
   B object;
   object.i=1;
   std::cout<< object.B::i;
   std::memset((void*)&(A&)object ,0,sizeof(A));
   std::cout<<object.B::i;
}
// outputs 10 with g++/clang++, c++11, Debian 8, amd64

Therefore, it is conceivable that the memset in the question post might zero a[1].i, such that the program would output 0 instead of 3.

There are few occasions where one would use memset-like functions with C++-objects at all. (Normally, destructors of subobjects will fail blatantly if you do that.) But sometimes one wishes to scrub the contents of an 'almost-POD'-class in its destructor, and this might be the exception.

Recommended topics

Hot tags