Is there a clever way of avoiding extra padding with nested classes in C++?
These structs, align1 and align2, contain the same data, but align1 has more padding due to the nested layout. How can I get the memory-saving layout of align2 while still using a nested struct like in align1?

#include <iostream>

int main() {
    struct align1 {
        struct {
            double d;    // 8 bytes
            bool b1;    //+1 byte (+ 7 bytes padding) = 16 bytes
        } subStruct;
        bool b2;        //+1 byte (+ 7 bytes padding) = 24 bytes
    };
    struct align2 {
        double d;        // 8 bytes
        bool b1, b2;    //+2 byte (+ 6 bytes padding) = 16 bytes
    };

    std::cout << "align1: " << sizeof(align1) << " bytes\n";    // 24 bytes
    std::cout << "align2: " << sizeof(align2) << " bytes\n";    // 16 bytes

    return 0;
}
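The sizes in the comments follow from the rule that a struct's size is rounded up to a multiple of its alignment, so the inner struct already carries its own tail padding. A minimal sketch to check this at compile time, assuming a typical 64-bit ABI where double is 8 bytes with 8-byte alignment (Inner, Nested and Flat are placeholder names, not from the question):

```cpp
#include <cstddef>   // offsetof

// Placeholder names; the layouts mirror align1 / align2 from the question.
struct Inner {
    double d;
    bool b1;
};

struct Nested {      // like align1
    Inner subStruct;
    bool b2;
};

struct Flat {        // like align2
    double d;
    bool b1, b2;
};

// The inner struct is padded to a multiple of its own alignment ...
static_assert(sizeof(Inner) % alignof(Inner) == 0, "size is a multiple of alignment");
// ... so b2 cannot reuse that tail padding and starts after the whole inner struct.
static_assert(offsetof(Nested, b2) >= sizeof(Inner), "b2 is placed after the inner struct");
// In the flat layout b2 directly follows b1.
static_assert(offsetof(Flat, b2) == offsetof(Flat, b1) + sizeof(bool), "b2 follows b1");
```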

The nested subStruct struct is needed since it is going to be declared/defined outside. I'm using C++17 and Visual Studio 2017.

The resulting code can be as dirty or bad looking as hell. I just don't want it to throw random errors at me later on or break when changing the configuration.

Dougald answered 2/11, 2020 at 14:19 Comment(5)
Maybe your implementation allows "pragma pack". But this typically results in much slower code and is only an emergency exit from my point of view.Nikaniki
Most likely not clean, but I think I would try (environment specifically of course) whether I could achieve something with a union of both structs. Highly questionable. If you ask explicitly (by editing your question) for even unclean ideas (which you will test and take responsibility of yourself) I could propose some code in an answer. Otherwise I would worry about deserved downvotes....Skipjack
I'm going out on a limb here, that you cannot do so portably within the language, but different compilers have custom constructs for this.Alewife
"How can I get the memory saving alignment of align2 while also using a nested struct like in align1?" There is no portable solution for this, so are you okay with an implementation-specific answer?Turning
IMO the question suffers from the XY problem. If you could explain why you need full control over field alignment, it may turn out that you should solve this issue in a different way.Ferrotype
S
4

I explicitly rely on the permission to propose code which is "dirty or bad looking as" ... anything. To be even more clear, I only provide an idea. You need to test yourself and take responsibility yourself. I consider this question to explicitly allow untested code.

With this code:

typedef union
{
    struct
    {
        double d;   // 8 bytes
        bool b1;    //+1 byte (+ 7 bytes padding) = 16 bytes
    } nested;
    struct
    {
        double d;       // 8 bytes
        bool b1, b2;    //+2 byte (+ 6 bytes padding) = 16 bytes
    } packed;
} t_both;

I would expect the following attributes/features:

  • contains the substruct as potentially typedefed elsewhere (can be used from an included header file)
  • substruct accessible as XXX.nested.d and XXX.nested.b1
  • XXX.nested is at the same address as XXX.packed
  • XXX.packed.b2 gives access to what is considered padding within nested
  • both substructs have the same total size, which I hope means that even making arrays of this is OK

Whatever you do with this, it probably conflicts with the requirement that when writing and reading a union, all read accesses must be to the same member of the union as the most recent write. Writing one and reading the other would hence not be strictly allowed. That is what I consider unclean about this code proposal. That said, I have often used this kind of union in environments for which the respective construct has explicitly been tested.
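Before relying on the union, the layout expectations from the list above can be checked at compile time. A sketch, assuming the same members as in this answer (the names NestedPart and PackedPart are added here only so that offsetof can refer to them):

```cpp
#include <cstddef>   // offsetof

// Names added for illustration; the answer uses unnamed structs.
struct NestedPart {
    double d;
    bool b1;
};

struct PackedPart {
    double d;
    bool b1, b2;
};

union t_both {
    NestedPart nested;
    PackedPart packed;
};

// Both views must have the same size, otherwise the union grows.
static_assert(sizeof(NestedPart) == sizeof(PackedPart), "views differ in size");
static_assert(sizeof(t_both) == sizeof(NestedPart), "union is larger than nested");

// b2 must lie after b1 but still inside nested, i.e. in its tail padding.
static_assert(offsetof(PackedPart, b2) >= offsetof(NestedPart, b1) + sizeof(bool),
              "b2 overlaps b1");
static_assert(offsetof(PackedPart, b2) < sizeof(NestedPart), "b2 is outside nested");
```

These checks only cover the layout. Assigning a whole nested object (for example x.nested = other.nested) may also copy padding bytes and thereby overwrite b2, so that case needs testing on the target compiler as well.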

To illustrate, here is a functionally identical and equally unclean version, which shows more clearly that the substruct can be typedefed elsewhere:


/* Inside an included header "whatever.h" : */
typedef struct
{
    double d;   // 8 bytes
    bool b1;    //+1 byte (+ 7 bytes padding) = 16 bytes
} t_ExternDefedStruct;
/* Content of including file */

#include "whatever.h"

typedef union
{
    t_ExternDefedStruct nested;
    struct
    {
        double d;       // 8 bytes
        bool b1, b2;    //+2 byte (+ 6 bytes padding) = 16 bytes
    } packed;
} t_both;
Skipjack answered 2/11, 2020 at 14:48 Comment(6)
I think I am in fact suffering from the XY problem... LOL, sorry, I'm new. As much as this would help with my example code, the nested struct is templated (and defined outside) in my actual code, so would this still work for different template type parameters?Dougald
I edited to extend with alternative. Does it help? If you really find the (very convincing) XY-problem comment helpful, then you deserve my respect. That takes a lot of abstracting. If you see a path in that direction and can't/won't test my proposal let me know. In that case I will probably better delete this....Skipjack
If you kind of like this, but need it to cover templated substruct, then please extend the code shown in your question accordingly and I will try.Skipjack
Thank you, also for the edit of the question! I will try to use this method while keeping the code somewhat clean and readable. I will probably not use the pragma pack method as it is said to slow down or even crash the program (which is the exact opposite of what I'm trying to achieve with my data structure).Dougald
@Devyy: you could maybe have your template figure out whether there was a free byte of padding at the end or not, and make that accessible somehow so your outer struct could template on that to decide between the simple way or a union hack (with a struct { char padding[n]; bool b2;}; to place the bool at the right byte offset). Or maybe always use the union, but choose n = sizeof(inner struct) if necessary so the bool ends up past the end of it and only the padding (which you never touch) overlaps. Either of those might optimize badly in some cases...Oilskin
@Dougald and Yunnosch: the answer mentions that it's potentially a problem to read one union member after writing another: yes, it's Undefined Behaviour in ISO C++. But fortunately several mainstream compilers do define the behaviour, including MSVC and all compilers that implement GNU extensions / the GNU dialect of C++. GNU C++ (and GNU89) defines the behaviour as working like C99 guarantees, so this covers gcc, clang and ICC at least. gcc.gnu.org/onlinedocs/gcc/…Oilskin
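The template idea from the comments could look roughly like the following. This is only a sketch: Inner, Tail, tail_offset and Outer are hypothetical names, the inner struct is assumed to be standard-layout so that offsetof is valid, and reading one union member after writing the other still relies on the compiler-specific guarantee mentioned in the comment above.

```cpp
#include <cstddef>   // offsetof, std::size_t

// Hypothetical templated inner struct that reports how many bytes its
// members really occupy, not counting tail padding.
template <typename T>
struct Inner {
    T value;
    bool b1;

    static constexpr std::size_t used_bytes() {
        return offsetof(Inner, b1) + sizeof(bool);
    }
};

// View that places b2 at byte offset N.
template <std::size_t N>
struct Tail {
    char padding[N];   // never touched, overlaps the members of Inner
    bool b2;
};

// If Inner<T> has tail padding, b2 goes into it; otherwise b2 goes past the end.
template <typename T>
constexpr std::size_t tail_offset() {
    return Inner<T>::used_bytes() < sizeof(Inner<T>) ? Inner<T>::used_bytes()
                                                     : sizeof(Inner<T>);
}

template <typename T>
union Outer {
    Inner<T> nested;
    Tail<tail_offset<T>()> tail;
};
```

With T = double on a typical 64-bit ABI, tail.b2 lands in the padding byte after b1 and the union stays at 16 bytes; with T = char there is no tail padding, so b2 is placed past the end and the union grows by one byte.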
V
1

With #pragma pack(push, 1) and some manual padding, you can get them to be the same.

#include <iostream>

int main() {
#pragma pack(push, 1)
    struct align1 {
        struct {
            double d;   // 8 bytes
            bool b1;    //+1 byte (+ 0 bytes padding) = 9 bytes
        } subStruct;
        bool b2;        //+1 byte (+ 0 bytes padding) = 10 bytes
        char pad_[6];   //+6 bytes (+ 0 bytes padding) = 16 bytes 
    };
#pragma pack(pop)
    struct align2 {
        double d;       // 8 bytes
        bool b1, b2;    //+2 byte (+ 6 bytes padding) = 16 bytes
    };

    std::cout << "align1: " << sizeof(align1) << " bytes\n";    // 16 bytes
    std::cout << "align2: " << sizeof(align2) << " bytes\n";    // 16 bytes

    return 0;
}

Output:

align1: 16 bytes
align2: 16 bytes
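One side effect is worth checking: with pack(1) the struct's own alignment requirement drops to 1, so an instance (and the double inside it) is no longer guaranteed to sit on an 8-byte boundary. A sketch, assuming a 64-bit target and a compiler that supports #pragma pack; PackedInner, PackedOuter and Flat are placeholder names:

```cpp
#include <cstddef>   // offsetof

#pragma pack(push, 1)
struct PackedInner {
    double d;
    bool b1;
};
struct PackedOuter {           // same layout as align1 in this answer
    PackedInner subStruct;
    bool b2;
    char pad_[6];
};
#pragma pack(pop)

struct Flat {                  // same layout as align2
    double d;
    bool b1, b2;
};

static_assert(sizeof(PackedOuter) == sizeof(Flat), "sizes match");
static_assert(offsetof(PackedOuter, b2) == offsetof(Flat, b2), "b2 sits at the same offset");
// The sizes match, but the alignment requirements do not.
static_assert(alignof(PackedOuter) < alignof(Flat), "packed struct has weaker alignment");
```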
Velvetvelveteen answered 2/11, 2020 at 15:13 Comment(0)
C
0

When your problem is padding, your answer is #pragma pack.

#pragma pack effect

It works on MSVC (where it was invented) and also on GCC (where it was added for compatibility with MSVC codebase).

Note that messing with alignment can end very badly. Putting multi-byte members at addresses that are misaligned for them will result in run-time slowdowns. That is the good case, when your CPU supports unaligned operations at all. In the bad case, the program crashes altogether (AFAIK when such members are fed to SSE instructions, or on non-x86 RISC CPUs).
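A common way to stay out of trouble is to never form a pointer or reference to a misaligned member and to copy its bytes instead. A sketch, assuming a compiler with #pragma pack support; Record and read_value are hypothetical names:

```cpp
#include <cstddef>   // offsetof
#include <cstring>   // std::memcpy

#pragma pack(push, 1)
struct Record {
    char tag;
    double value;    // at offset 1, misaligned for a double
};
#pragma pack(pop)

static_assert(offsetof(Record, value) == 1, "value directly follows tag");

// Copies the bytes of the member into a properly aligned local variable.
// This avoids `const double* p = &r.value; *p`, which would dereference
// a misaligned pointer.
inline double read_value(const Record& r) {
    double out;
    std::memcpy(&out,
                reinterpret_cast<const char*>(&r) + offsetof(Record, value),
                sizeof out);
    return out;
}
```

Direct member access such as r.value is normally fine as well, since the compiler knows the struct is packed; the risk is mainly in passing the member's address to code that expects an aligned double.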

The only legitimate use for #pragma pack(1) I personally know is mapping binary files straight to structs, especially headers of bitmap formats like BMPs (BITMAPINFOHEADER from wingdi.h) or TGAs. The other would be really big data structures, like gigabyte-sized arrays that @Arty mentioned.

In the bigger picture, padding is a time-memory trade-off. The CPU time saved on accessing a nicely aligned variable is well worth the wasted bytes in the overwhelming majority of cases. You need a really good reason to change that, because you're unlikely to come out ahead without serious profiling of both approaches.

Cistaceous answered 2/11, 2020 at 15:10 Comment(6)
Another popular legitimate use of dense packing is saving memory: if you have gigabyte-sized arrays of such structures, every single saved byte counts, provided it is alright for you to spend more CPU time on access instead of more RAM.Bates
@Bates yeah, packing is a time-memory trade-off, sometimes you need to tweak it the other way.Cistaceous
"Putting a multi-byte members on odd bytes" will not just result in slowdowns, but is an unaligned access and thus undefined behaviour. Thus, for example, if the compiler has an update and is able to see through it, it is free to just remove your code altogether.Wingspread
@RasmusDamgaardNielsen Thanks for pointing that out. I forgot that the standard doesn't even give us the means to make unaligned access happen in the first place. #pragma pack is nonstandard. But unaligned access is defined behaviour for compilers which provide tools to make it possible.Cistaceous
That is true; if you know your exact toolchain, then I suppose this is an option. But it is so fraught with problems that you really need to know what you are doing. I hope you are not expecting this to work on phones or Macs in two years, because unaligned memory access can cause a hardware fault on ARM. I am just saying, be VERY careful about non-standard stuff :)Wingspread
@RasmusDamgaardNielsen The MSVC docs explicitly warn about possible crashes: learn.microsoft.com/en-us/cpp/preprocessor/pack?view=msvc-160 AFAIR this section was much more detailed back in the days when MIPS and Alpha were supported by Windows NT 4.0.Cistaceous
B
0

I implemented the following universal macros for packing structures in all popular compilers:

#if defined(_MSC_VER)
    #define ATTR_PACKED
    #define PACKED_BEGIN __pragma(pack(push, 1))
    #define PACKED_END __pragma(pack(pop))
#else
    #define ATTR_PACKED __attribute__((packed))
    #define PACKED_BEGIN
    #define PACKED_END
#endif

Put PACKED_BEGIN on a line before the outer struct and PACKED_END after it, and add ATTR_PACKED after the struct keyword (before the struct name) in all structures, including inner ones. All structures marked this way will be densely packed in the same way by all popular compilers, down to the minimal possible size. See the code below; both of your structures are laid out the same way, each 10 bytes in size.

It was tested on online C++ compilers (click-open the following links to see tests): MSVC, GCC, and Clang.

If it is packed too densely, you may add extra padding fields wherever needed, between or after fields. For example, char pad0[2]; and char pad1[3]; insert 2 and 3 extra padding bytes.

#if defined(_MSC_VER)
    #define ATTR_PACKED
    #define PACKED_BEGIN __pragma(pack(push, 1))
    #define PACKED_END __pragma(pack(pop))
#else
    #define ATTR_PACKED __attribute__((packed))
    #define PACKED_BEGIN
    #define PACKED_END
#endif

#include <iostream>

int main() {
    PACKED_BEGIN
    struct ATTR_PACKED align1 {
        struct ATTR_PACKED {
            double d;
            bool b1;
        } subStruct;
        bool b2;
    };
    PACKED_END

    PACKED_BEGIN
    struct ATTR_PACKED align2 {
        double d;
        bool b1, b2;
    };
    PACKED_END

    std::cout << "align1: " << sizeof(align1) << " bytes\n";
    std::cout << "align2: " << sizeof(align2) << " bytes\n";

    return 0;
}

Output:

align1: 10 bytes
align2: 10 bytes
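The printed sizes can also be turned into compile-time checks, and the padding-field suggestion can be combined with the macros to get back to 16 bytes. A sketch using the same macro definitions, assuming an 8-byte double; the struct names and the pad0 field are illustrative:

```cpp
#include <cstddef>   // offsetof

#if defined(_MSC_VER)
    #define ATTR_PACKED
    #define PACKED_BEGIN __pragma(pack(push, 1))
    #define PACKED_END __pragma(pack(pop))
#else
    #define ATTR_PACKED __attribute__((packed))
    #define PACKED_BEGIN
    #define PACKED_END
#endif

PACKED_BEGIN
struct ATTR_PACKED Inner {
    double d;
    bool b1;
};
PACKED_END

PACKED_BEGIN
struct ATTR_PACKED Outer {
    Inner subStruct;
    bool b2;
};
PACKED_END

PACKED_BEGIN
struct ATTR_PACKED OuterPadded {
    Inner subStruct;
    bool b2;
    char pad0[6];    // manual padding up to 16 bytes
};
PACKED_END

static_assert(sizeof(Inner) == 9, "inner struct has no tail padding");
static_assert(sizeof(Outer) == 10, "outer struct is densely packed");
static_assert(offsetof(Outer, b2) == 9, "b2 directly follows the inner struct");
static_assert(sizeof(OuterPadded) == 16, "padded variant matches align2");
```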
Bates answered 2/11, 2020 at 17:57 Comment(0)
M
0

C++11 introduced the alignas specifier, which can be used here; see https://en.cppreference.com/w/cpp/language/alignas

Monkshood answered 12/11, 2020 at 18:7 Comment(1)
As I understand it, alignas is meant for something different. This question is not really about alignment but about packing. Alignment only says where the structure is located in memory: the address of the structure must be a multiple of the amount given in alignas. It does not reduce the size of the structure. No matter how you align your structure, it will still occupy 24 bytes instead of the 10 bytes (or 16 bytes) that are needed. See example of usage here.Bates
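The point made in this comment can be checked with a few lines. A sketch, assuming a typical 64-bit ABI; Inner, Outer and OverAligned are placeholder names:

```cpp
struct Inner {
    double d;
    bool b1;
};

struct Outer {               // like align1 from the question
    Inner subStruct;
    bool b2;
};

// alignas can only make the alignment stricter ...
struct alignas(32) OverAligned {
    Inner subStruct;
    bool b2;
};

static_assert(alignof(OverAligned) == 32, "alignment was raised");
// ... which rounds the size up, so it never removes padding.
static_assert(sizeof(OverAligned) >= sizeof(Outer), "alignas did not shrink the struct");
// Requesting less than the natural alignment, e.g. `struct alignas(1) Outer`,
// is ill-formed according to the standard.
```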
