How to control the size of a class to be a multiple of the size of a member?

I have the following class,

struct employee {
    std::string name;
    short salary;
    std::size_t age;
};

Just as an example, on Linux amd64 the size of the struct is 48 bytes and the size of std::string is 32; that is, not a multiple.

Now, I need, in a cross-platform way, for employee to have a size that is a multiple of the size of std::string (first member).

(Cross-platform could mean, for example, both Linux amd64 and Apple ARM.)

That is, sizeof(employee) % sizeof(std::string) == 0.

I tried controlling the padding using alignas for the whole class or the members, but the requirement to be a power of 2 is too restrictive, it seems.

Then I tried adding a char array at the end. That left me with two problems: first, how to compute the exact size of the array at compile time on different platforms; and second, how to avoid adding another member that can screw up the nice aggregate initialization of the class.

For the first, I do this:

struct employee_dummy {
    std::string name;
    short salary;
    std::size_t age;
};

struct employee {
    std::string name;
    short salary;
    std::size_t age;
    char padding[(sizeof(employee_dummy)/sizeof(std::string)+1)*sizeof(std::string) - sizeof(employee_dummy)];
};

Note the ugly dummy class, and I don't even know if the logic is correct.
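On the question of whether the logic is correct: the formula above always adds between 1 and sizeof(std::string) bytes, so it over-pads by a full element when the size is already a multiple. A minimal sketch comparing it with a round-up variant follows; both function names are hypothetical and the numbers are plain arithmetic, not measurements from any platform.

```cpp
#include <cstddef>

// The formula as posted in the question: never returns 0.
constexpr std::size_t padding_as_posted(std::size_t size, std::size_t unit) {
    return (size / unit + 1) * unit - size;
}

// Hypothetical alternative: bytes needed to reach the next multiple of `unit`,
// or 0 when `size` is already a multiple.
constexpr std::size_t padding_to_next_multiple(std::size_t size, std::size_t unit) {
    return (unit - size % unit) % unit;
}

// 48 is not a multiple of 32: both formulas agree on 16 extra bytes.
static_assert(padding_as_posted(48, 32) == 16, "posted formula, 48 -> 64");
static_assert(padding_to_next_multiple(48, 32) == 16, "round-up, 48 -> 64");

// 64 is already a multiple of 32: the posted formula adds a whole extra unit.
static_assert(padding_as_posted(64, 32) == 32, "posted formula over-pads");
static_assert(padding_to_next_multiple(64, 32) == 0, "round-up adds nothing");
```

Note that a result of 0 cannot be written as char padding[0] in standard C++, so the zero case needs separate handling.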

For the second problem, I don't have a solution. I could do this, but then I would need to add a constructor, the class would not be an aggregate, etc.

struct employee {
    std::string name;
    short salary;
    std::size_t age;
 private:
    char padding[(sizeof(employee_dummy)/sizeof(std::string)+1)*sizeof(std::string) - sizeof(employee_dummy)];
};

How can I control the size of the struct with standard or non-standard mechanisms and keep the class as an aggregate?

Here is a link to play with this problem empirically: https://cppinsights.io/s/f2fb5239


NOTE ADDED:

I realized that, if the technique to add padding is correct, the calculation is even more difficult because the dummy class might be already adding padding, so I have to take into account the offset of the last element instead.

In this example I want the size of data to be a multiple of the size of its first member (std::complex<double>):

struct dummy {
    std::complex<double> a;
    double b;
    std::int64_t b2;
    int c;
};

struct data {
    std::complex<double> a;
    double b;
    std::int64_t b2;
    int c;
    char padding[ ((offsetof(dummy, c) + sizeof(c)) / sizeof(std::complex<double>) + 1)* sizeof(std::complex<double>) - (offsetof(dummy, c) + sizeof(c)) ];
};

Note the formula is even worse now.
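The offset-based calculation can be made a little less painful by moving the arithmetic into a small constexpr helper. This is only a sketch under assumptions: padding_to_multiple and used_bytes are hypothetical names, dummy is assumed to be standard-layout so that offsetof is well-defined, and the case where the padding comes out as 0 is not handled (a zero-length array is not valid standard C++).

```cpp
#include <complex>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: bytes needed to bring `used` up to a multiple of `unit`.
constexpr std::size_t padding_to_multiple(std::size_t used, std::size_t unit) {
    return (unit - used % unit) % unit;
}

// Mirror of the payload, used only to measure where the last member ends.
struct dummy {
    std::complex<double> a;
    double b;
    std::int64_t b2;
    int c;
};

// End of the last member, ignoring any tail padding the compiler adds to dummy.
constexpr std::size_t used_bytes = offsetof(dummy, c) + sizeof(dummy::c);

struct data {
    std::complex<double> a;
    double b;
    std::int64_t b2;
    int c;
    // Assumes the result is non-zero for this particular layout.
    char padding[padding_to_multiple(used_bytes, sizeof(std::complex<double>))];
};

static_assert(sizeof(data) % sizeof(std::complex<double>) == 0,
              "data must be a whole number of complex<double> elements");
```

The final static_assert is the real safety net: alignment can still round the size up after the explicit padding, so the result should be checked rather than trusted.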

Matheny answered 7/11, 2023 at 5:2 Comment(19)
Why would you want to do that? Are you planning on creating a serialization scheme? - Unmake
@PepijnKramer, no, I want to see (in memory) an array of employee as a strided array of std::string (names). This is similar to the goal here #57424650, but the problem is less obfuscated here, and I also realize that in general it was not a problem of alignment but simply of sizes. - Matheny
So you have code that requires the stride to be a multiple of sizeof(std::string)? Is it not possible to specify the stride in bytes? - Effete
Is this for debugging or to have some kind of fast access at runtime? Because at runtime the prefetcher will detect that you are repeatedly accessing the same member (and that will have a predictable stride anyway). If it is for the debugger, you might want to create a view for your datatype and use that instead (and keep your runtime code unaffected). - Unmake
Using an anonymous bitfield seems to preserve aggregateness nicely: char : 8*num_padding_bytes;. But out of the big three, MSVC rejects bitfields larger than the underlying type size. - Effete
@HolyBlackCat, I get this question often; no, it is for a whole framework that assumes that strides are given in multiples of the element's size. The fact is that legacy C libraries, numeric and otherwise, Fortran, etc., use element-size strides. It also simplifies the pointer arithmetic, I think. Here is a more numeric example (also I realized that I should look into the offsets as well): cppinsights.io/s/971006bf - Matheny
@HolyBlackCat, NVCC also warns about this: warning #959-D: declared size for bit field is larger than the size of the bit field type; truncated to 8 bits, clang might warn with -Wbitfield-width. Some flag combinations (not sure) in GCC can give width of ‘employee::padding’ exceeds its type. - Matheny
@Matheny Yeah, I mean only in MSVC it appears to be a hard error. - Effete
@HolyBlackCat, in case you are interested, this is the original example that happened to work on Linux amd64, but lately I realized it didn't work for Apple ARM: gitlab.com/correaa/boost-multi/-/blob/master/test/… - Matheny
Are you aware that the size of std::string depends on the standard library and compiler being used? It's not just "amd64 linux", it's "amd64 linux with libstdc++ in debug mode" - Illfounded
@Botje, I imagined, which makes my need for an automatic solution even stronger. Use the other example (data) for a type that is a bit less dependent on the environment. - Matheny
@DanielLangr I think the function will not give the correct padding if there are gaps between the members. - Matheny
@DanielLangr, here is a counterexample in which the function you propose doesn't work: cppinsights.io/s/ac7a262f . It seems that using offsets is inevitable. - Matheny
For the most recent example, in which the static_assert was failing with 8 != 0, the calculations seem all right if you use a union PaddedData { data d; char buf[padding<std::complex<double>, double, int, int64_t>()]; }; and adjust the padding function to return a total size rather than the remainder, i.e. buf is 48. A union might make aggregate arguments harder, but it means that you can separate your struct with important data from the less important padding. - Sleekit
@Matheny You're right, I haven't considered this case, my bad. I deleted my comment, since it was no longer relevant. - Lonlona
static constexpr std::size_t fixed_size = sizeof(short) + sizeof(std::size_t); static constexpr std::size_t string_alignment = alignof(std::string); static constexpr std::size_t padding_needed = (fixed_size % string_alignment) == 0 ? 0 : string_alignment - (fixed_size % string_alignment); char padding[padding_needed]; Do you need something like that? - Estival
"I want to see (in memory) an array of employee as an strided array of std::string (names)." Doing so would invoke UB. - Autoroute
🤣 short salary; and std::size_t age; tells you something about life, world, society, economy and all that stuff. Yes, that’s just how things are! - Hendrik
@AndrejPodzimek, I overlooked that! I changed the integer types randomly to make the case. But given that people will work forever with meager salaries, perhaps it was the right choice of types. Wrapping around salaries is not a problem, but overflowing age, that is a problem. - Matheny

Here is a standard-compliant, no ifs or buts, version.

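#include <cstddef>

// Finds the smallest n for which sizeof(tmpl<n>) is a multiple of need_multiple_of.
template <template<std::size_t> class tmpl, std::size_t need_multiple_of>
struct adjust_padding
{
    template <std::size_t n>
    static constexpr std::size_t padding_size()
    {
        // Try n bytes of padding; if the size does not fit yet, try n+1.
        if constexpr (sizeof(tmpl<n>) % need_multiple_of == 0) return n;
        else return padding_size<n+1>();
    }

    using type = tmpl<padding_size<0>()>;
};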

Use it like this:

template <std::size_t K>
struct need_strided
{
    double x;
    char pad[K]; // not const, so that default assignment still works
};

template <>
struct need_strided<0>
{
    double x;
};

using strided = adjust_padding<need_strided, 47>::type;

Now strided has a size that is a multiple of 47 (and of course is aligned correctly). On my computer it is 376.
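To make the search concrete, here is a compact variant of the same idea applied to a wrapper around a payload struct. It is a sketch, not the answer's exact code: the search is written as a free function template (find_padding), and record, padded_record and strided_record are hypothetical names. It needs C++17 for if constexpr.

```cpp
#include <complex>
#include <cstddef>

// Smallest N such that sizeof(Tmpl<N>) is a multiple of Multiple.
// Recurses once per candidate N, so large searches can hit compiler limits.
template <template <std::size_t> class Tmpl, std::size_t Multiple, std::size_t N = 0>
constexpr std::size_t find_padding() {
    if constexpr (sizeof(Tmpl<N>) % Multiple == 0) return N;
    else return find_padding<Tmpl, Multiple, N + 1>();
}

// Hypothetical payload whose size should become a multiple of its first member.
struct record {
    std::complex<double> a;
    double b;
};

// Wrapper with K explicit padding bytes after the payload.
template <std::size_t K>
struct padded_record {
    record r;
    char pad[K];
};

// K == 0 needs its own specialization: char pad[0] is not standard C++.
template <>
struct padded_record<0> {
    record r;
};

using strided_record =
    padded_record<find_padding<padded_record, sizeof(std::complex<double>)>()>;

static_assert(sizeof(strided_record) % sizeof(std::complex<double>) == 0,
              "stride must be a whole number of elements");
```

Because the search looks at the real sizeof of each candidate, it automatically accounts for whatever tail padding the compiler adds for alignment.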

You can make employee a template in this fashion:

template <std::size_t K>
struct employee { ...

or make it a member of a template (instead of double x):

template <std::size_t K>
struct employee_wrapper { 
   employee e;
   

and then use employee_wrapper as a vector element. But provide a specialization for 0 either way.

You can try using std::array instead of a C-style array and avoid providing a specialization for 0, but it may or may not get optimized out when the size is 0. [[no_unique_address]] (C++20) may help here.

Note that something like adjust_padding<need_strided, 117>::type may exceed your compiler's default recursion limits (template instantiation or constexpr depth).

Autoroute answered 9/11, 2023 at 9:21 Comment(2)
The idea is almost foolproof. The recursion might indicate bigger problems like too much end-padding needed. A couple of modifications though: 1. make pad_ non-const, otherwise default assignment doesn't work. 2. benefit from non-standard zero-sized arrays and [[no_unique_address]], not sure if it is a suggestion or mandatory. 3. I replaced char[K] by std::array<char, K>. 4. I had to initialize pad_; otherwise, default operator== might not produce the correct result, and I would be forced to implement operator==. Other niceties and a concrete use here: godbolt.org/z/K3Y61eY5b - Matheny
@Matheny You are right, it should not be const. - Autoroute

Try this

   struct employee {
       std::string name;
       union {  
          struct {
              short salary;
              std::size_t age;
          };
          std::string dummy;
       };
   };

This should be 2 * sizeof(std::string) on any platform or compiler. If that is not the case somewhere, please tell me why and on which platform it fails. Thanks!

This can easily be generalized to a collection of data fields of any size, as long as there are enough copies of std::string to cover the total size of the fields in the original struct. That forces the size of the altered struct to be N * sizeof(std::string), where N is an integer.
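One possible variant of this idea, sketched under assumptions: a union with a std::string member has a deleted destructor unless one is written by hand, so the filler below is a raw byte buffer with the size and alignment of std::string instead. The names payload, padded_payload, fields and filler are hypothetical, the union is named rather than anonymous to stay within standard C++, and the approach only works while the payload is no larger than one std::string.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// The fields that follow the name, grouped so they can share storage with the filler.
struct payload {
    short salary;
    std::size_t age;
};

static_assert(sizeof(payload) <= sizeof(std::string),
              "this sketch assumes the payload fits in one std::string slot");

// Occupies at least sizeof(std::string) bytes; trivially destructible.
union padded_payload {
    payload fields;
    alignas(std::string) unsigned char filler[sizeof(std::string)];
};

struct employee {
    std::string name;
    padded_payload rest;
};

// Verify instead of assuming: alignment could still round the size up.
static_assert(sizeof(employee) % sizeof(std::string) == 0,
              "employee must be a whole number of std::string slots");
static_assert(std::is_aggregate<employee>::value, "employee stays an aggregate");
```

With this layout an object can still be brace-initialized, for example employee e{"Ann", {{1000, 30}}}, at the cost of extra braces and of accessing the fields through e.rest.fields.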

Woke answered 13/11, 2023 at 12:16 Comment(2)
Interesting idea. However, 2 points: First, you cannot have an anonymous union because you need to define the destructor; otherwise, you get a compiler error. Second, this only works if the data within the anonymous struct has a size less than or equal to the size of std::string. If it is larger, you'd have to add additional dummy elements (in their own struct). Example on godbolt. - Noted
@Noted Tried it in Visual Studio, no compiler error there. I am aware that the collected data size must be less than or equal to that of std::string for this to work. But this is the example given by the OP. You can always add more std::string members (dummy1, dummy2, dummy3) to expand the size. The bottom line is that the whole structure can be a multiple of sizeof(std::string). - Woke

There is no need to make employee a template or to list its members twice. TL;DR: You can get down to succinct code such as:

#pragma pack(1) // Prevent padding at the end.
struct employee_base {
    std::complex<double> a;
};

using employee = PaddingHelper<employee_base>;

Explanation follows.


The core idea is that (since C++17) you can simply derive from the type that contains the actual payload, and keep the aggregate initialization syntax (live on godbolt):

#pragma pack(1) // Prevent padding at the end of 'dummy'.
struct dummy {
    std::complex<double> a;
    double b;
    std::int64_t b2;
    bool c;
};

struct employee : dummy {
    char padding[
        (sizeof(dummy)/sizeof(decltype(dummy::a))+1)*sizeof(dummy::a) 
        - sizeof(dummy)]
        = {0};
};

static_assert(sizeof(employee) % sizeof(decltype(dummy::a)) == 0);

int main(){
    employee e{{41, 42, 43, false}};
}

The #pragma pack is of course non-standard, but gcc, clang and MSVC support it. We need it to ensure that we have control of the padding ourselves.
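As a side note, a bare #pragma pack(1) stays in effect for every declaration that follows it. A small sketch of what the pragma does to size and alignment, using the push/pop form to limit it to a single struct; unpacked_pair and packed_pair are hypothetical names, and the checks assume a compiler that supports the pragma (gcc, clang, MSVC).

```cpp
#include <cstddef>
#include <cstdint>

// Default layout: the compiler may add tail padding to satisfy alignment.
struct unpacked_pair {
    std::int64_t a;
    bool c;
};

// Packed layout: no padding between or after members, alignment drops to 1.
#pragma pack(push, 1)
struct packed_pair {
    std::int64_t a;
    bool c;
};
#pragma pack(pop)  // restore the previous packing for everything below

static_assert(sizeof(packed_pair) == sizeof(std::int64_t) + sizeof(bool),
              "packed struct is exactly the sum of its members");
static_assert(alignof(packed_pair) == 1, "packing lowers the alignment to 1");
static_assert(sizeof(unpacked_pair) >= sizeof(packed_pair),
              "the default layout is never smaller");
```

The lowered alignment is the trade-off to keep in mind: members of a packed struct may end up at misaligned addresses when the struct is embedded elsewhere.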

Note that if the base class has only a single member (or its size is a multiple of the first member), this adds additional padding. To fix this, we can exploit the empty base class optimization (live on godbolt):

#pragma pack(1)
template <std::size_t size>
struct padding{
    char pad[size] = {};
};

template <>
struct padding<0>{};

template <class BaseClass, class FirstMember>
constexpr auto GetPaddingSize()
{
    constexpr auto Size = 
        (sizeof(BaseClass) % sizeof(FirstMember) == 0
            ? 0 
            : (sizeof(BaseClass)/sizeof(FirstMember)+1)*sizeof(FirstMember) 
                - sizeof(BaseClass));
    return Size;
}

#pragma pack(1) // Prevent padding at the end of 'dummy'.
struct dummy {
    std::complex<double> a;
};

struct employee : dummy, padding<GetPaddingSize<dummy, decltype(dummy::a)>()> {
};

static_assert(sizeof(employee) % sizeof(decltype(dummy::a)) == 0);
static_assert(sizeof(employee) == sizeof(decltype(dummy::a)));

int main(){
    [[maybe_unused]] employee e{{41}};
}

Since you are dealing with aggregate types, you could use boost::pfr to get rid of the explicit specification of the first member via boost::pfr::tuple_element_t<0, BaseClass> (godbolt):

#include <boost/pfr.hpp>

#pragma pack(1)
template <std::size_t size>
struct padding{
    char pad[size] = {};
};

template <>
struct padding<0>{};

template <class BaseClass>
constexpr auto GetPaddingSize()
{
    using FirstMember = boost::pfr::tuple_element_t<0, BaseClass>;
    constexpr auto Size = 
        (sizeof(BaseClass) % sizeof(FirstMember) == 0
            ? 0 
            : (sizeof(BaseClass)/sizeof(FirstMember)+1)*sizeof(FirstMember) 
                - sizeof(BaseClass));
    return Size;
}

//--------------------

#pragma pack(1) // Prevent padding at the end.
struct employee_base {
    std::complex<double> a;
};

struct employee : employee_base, padding<GetPaddingSize<employee_base>()> {
};

static_assert(sizeof(employee) % sizeof(decltype(employee::a)) == 0);
static_assert(sizeof(employee) == sizeof(decltype(employee::a)));

//---------------------

#pragma pack(1) // Prevent padding at the end.
struct foo_base {
    std::size_t i;
    bool b;
};

struct foo : foo_base, padding<GetPaddingSize<foo_base>()> {
};

static_assert(sizeof(foo) % sizeof(decltype(foo::i)) == 0);

//---------------------

int main(){
    [[maybe_unused]] employee e{{41}};
    [[maybe_unused]] foo f{{41, false}};
}

And then you could introduce

template <class BaseClass>
struct PaddingHelper : BaseClass, padding<GetPaddingSize<BaseClass>()> {};

which allows you to write, e.g.:

#pragma pack(1) // Prevent padding at the end.
struct employee_base {
    std::complex<double> a;
};

using employee = PaddingHelper<employee_base>;

See on godbolt. Of course, you can do the analogous thing without boost::pfr, you just have to add a second template argument to PaddingHelper that receives the first member type. I don't think it can become more succinct than this.
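A minimal sketch of that boost-free variant, with the first member type passed as a second template argument. The helper name padding_size is hypothetical, and the padding is computed with a modulo expression instead of the conditional used above; foo_base is the example from earlier in this answer. It needs C++17 for aggregates with base classes.

```cpp
#include <cstddef>

// Explicit padding bytes; the empty specialization lets the base vanish when Size == 0.
template <std::size_t Size>
struct padding {
    char pad[Size] = {};
};

template <>
struct padding<0> {};

// Bytes needed to bring sizeof(Base) up to a multiple of sizeof(First).
template <class Base, class First>
constexpr std::size_t padding_size() {
    return (sizeof(First) - sizeof(Base) % sizeof(First)) % sizeof(First);
}

// Two-argument variant: the caller names the first member type explicitly.
template <class Base, class First>
struct PaddingHelper : Base, padding<padding_size<Base, First>()> {};

#pragma pack(push, 1)  // prevent tail padding in the payload
struct foo_base {
    std::size_t i;
    bool b;
};
#pragma pack(pop)

using foo = PaddingHelper<foo_base, std::size_t>;

static_assert(sizeof(foo) % sizeof(std::size_t) == 0,
              "foo must be a whole number of std::size_t elements");
```

Usage stays the same as before, for example foo f{{41, false}}, with the same caveat about brace-related warnings.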

Noted answered 12/11, 2023 at 14:3 Comment(3)
A different idea would be to use a flexible array as last member, and overload operator new. In other words, to effectively allocate more memory than the size of the class. I.e. to implement the padding requirement during runtime rather than compile time. See e.g. this post (godbolt). However, note that sizeof will of course not know about the additional memory. And it will work only for objects allocated on the heap. So I wouldn't recommend it. Also see gcc.gnu.org/onlinedocs/gcc/Zero-Length.html - Noted
Interesting, a lot to unpack here :). First question: I need double curly brackets now, right? (That is the main reason I discarded the inheritance option, but if that is what it takes, OK.) - Matheny
From my understanding, double braces are not really required for the first version I have shown. See e.g. this post. However, clang prints a warning, hence I used double braces. I tend to think that this is a bug in clang. For the final version, gcc also prints a warning, although a different one, even with double braces. I think this is just a "be aware, you might have a bug" warning, and could be ignored. - Noted
