Force C++ structure to pack tightly

3

60

I am attempting to read in a binary file. The problem is that the creator of the file took no time to properly align data structures to their natural boundaries and everything is packed tight. This makes it difficult to read the data using C++ structs.

Is there a way to force a struct to be packed tight?

Example:

struct {
    short a;
    int b;
};

On a typical platform the above structure is 8 bytes: 2 for short a, 2 for padding, and 4 for int b. On disk, however, the data is only 6 bytes, because it lacks the 2 bytes of alignment padding.
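To see where the padding goes on a given compiler, a minimal sketch like the following can help. It assumes C++11 for constexpr and alignof; the names Padded and padding_before_b are made up for illustration, since the struct in the question is anonymous.

```cpp
#include <cstddef>  // offsetof, std::size_t

// Hypothetical name for the struct from the question.
struct Padded {
    short a;
    int b;
};

// Number of padding bytes the compiler inserted between a and b.
// On mainstream desktop ABIs this is usually 2, but the value is
// implementation-defined.
constexpr std::size_t padding_before_b() {
    return offsetof(Padded, b) - sizeof(short);
}
```

Printing sizeof(Padded) and padding_before_b() on the target compiler shows whether the in-memory layout differs from the 6-byte layout on disk.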

Please be aware that the actual data structures are thousands of bytes long with many fields, including a couple of arrays, so I would prefer not to read each field individually.

Victualer answered 13/1, 2014 at 13:23 Comment(7)
Look up struct packing. Be aware that some architectures require structs to be aligned for them to be read properly. - Tinney
en.cppreference.com/w/cpp/language/alignas says alignas(0) has no effect? - Tinney
#pragma pack may help. - Maupin
I am not sure if I can count on C++11; however, I will investigate. - Victualer
Just write code to decode the serialized data into your in-memory representation. - Vernalize
If you byte-pack such a structure, beware of segfaults when accessing that integer value. - Icsh
Also be concerned with endianness and field order when specifying bit fields in your struct, especially if you have any fields crossing byte or word boundaries! - Wittol
81

If you're using GCC, you can do struct __attribute__ ((packed)) { short a; int b; }

On VC++ you can do #pragma pack(1). This option is also supported by GCC.

#pragma pack(push, 1)
struct { short a; int b; };
#pragma pack(pop)

Other compilers may have options to do a tight packing of the structure with no padding.
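As a rough sketch of how the packed struct might be used to read a record, assuming C++11 for static_assert and a file written in the host's byte order (the names PackedRecord and read_record are hypothetical, not from the question):

```cpp
#include <cstring>  // std::memcpy

#pragma pack(push, 1)
struct PackedRecord {  // hypothetical name
    short a;
    int b;
};
#pragma pack(pop)

// Fails to compile if the compiler still inserted padding.
static_assert(sizeof(PackedRecord) == sizeof(short) + sizeof(int),
              "PackedRecord must have no padding");

// Copies one record out of a raw buffer that holds at least
// sizeof(PackedRecord) bytes, for example a block read from the file.
inline PackedRecord read_record(const unsigned char* bytes) {
    PackedRecord r;
    std::memcpy(&r, bytes, sizeof r);
    return r;
}
```

Reading r.a and r.b by value is fine, because the compiler knows the members may be misaligned. Taking a pointer or reference to a packed member is where misaligned access can bite on stricter architectures; GCC has a -Waddress-of-packed-member warning for that.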

Firewarden answered 13/1, 2014 at 13:31 Comment(8)
Thank you. I am currently using GCC and that worked. I left a comment about the pragma in the event we change compilers though. - Victualer
What happens to structure inheritance? - Medford
IIRC only this structure will be tightly packed; the inheriting structure has to use it again. If this structure inherits from another one, the base is treated as its own type and its layout is kept as defined in its declaration. All of this is compiler-specific, so you should read the compiler documentation. - Firewarden
Neither of these things worked. I am writing a system that must read COFF binary files that have all of their fields packed together. The Windows compiler can read this data into fields that seem to be packed by default. The g++ compiler cannot do this no matter how you try to force it. - Stepfather
I'm sure it works; live example. It's most probably an environmental issue. Check your compiler version, the flags you pass, etc. Try to dump your settings and see if there are any off values. - Firewarden
struct __attribute__ ((packed)) worked for me. Thanks! - Humblebee
The #pragma seems to work both in Visual Studio and GCC. - Paulus
Yeah, as the answer called out. VC++ is the compiler inside Visual Studio (IDE). - Firewarden
18

You need to use a compiler-specific, non-Standard directive to specify 1-byte packing, such as this one under Windows:

#pragma pack (push, 1)

"The problem is that the creator of the file took no time to properly byte align the data structures and everything is packed tight."

Actually, the designer did the right thing. Padding is something that the Standard says can be applied, but it doesn't say how much padding should be applied in what cases. The Standard doesn't even say how many bits are in a byte. You might assume that, even though these things aren't specified, they will still have the same reasonable values on all modern machines, but that's simply not true. On a 32-bit Windows machine, for example, the padding might be one thing, whereas on the 64-bit version of Windows it might be something else. Maybe it will be the same -- that's not the point. The point is you don't know what the padding will be on different systems.

So by "packing it tight" the developer did the only thing they could: use a packing that they can be reasonably sure every system will be able to understand. In this case that commonly understood packing is to use no padding in structures saved to disk or sent down a wire.
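Because the on-disk layout is fixed while the in-memory layout is not, one option that relies on no compiler extension is to decode the bytes field by field into an ordinary struct. The sketch below is only an illustration under stated assumptions: the file is little-endian, a is 16 bits and b is 32 bits, and the names Record, load_le16, load_le32 and decode_record are invented here.

```cpp
#include <cstdint>

// In-memory representation; the compiler may pad this however it likes.
struct Record {
    std::int16_t a;
    std::int32_t b;
};

// Assemble a 16-bit little-endian value from two bytes.
inline std::uint16_t load_le16(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Assemble a 32-bit little-endian value from four bytes.
inline std::uint32_t load_le32(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Decode the 6-byte on-disk layout: a at offset 0, b at offset 2.
// Converting out-of-range unsigned values to signed is
// implementation-defined before C++20, though it wraps in practice.
inline Record decode_record(const unsigned char* p) {
    Record r;
    r.a = static_cast<std::int16_t>(load_le16(p));
    r.b = static_cast<std::int32_t>(load_le32(p + 2));
    return r;
}
```

This costs more code per field than a packed struct, but the result does not depend on the host's padding rules or byte order.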

Liftoff answered 13/1, 2014 at 14:1 Comment(1)
What I meant by that statement is that the order of the struct's elements could have been arranged better such that alignment would have been good under many different circumstances. Not that they simply packed the bytes tight. - Victualer
-2

A compiler/platform agnostic way to do it is to use byte arrays for the elements, since you need to be explicit about their sizes anyway.

So instead of:

struct __attribute__ ((packed)) {
    short a;
    int b;
};

you can instead use:

struct {
    char a[2];
    char b[4];
};

The alignment requirement of a char is guaranteed to be "the weakest" (in the language of the C11 standard) and is equal to 1. Technically the standard allows extraneous padding to any multiple of this, but I've not seen an implementation that does so. So in practice, this will pack. However, YMMV, so make sure you check (a static_assert on the sizeof the struct is an excellent safety check).

Of course, you must then also be explicit about conversion to types like short and int. Whether this is a net benefit is left as an exercise to the reader.
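A minimal sketch of what the byte-array approach looks like with the safety check and the explicit conversions, assuming C++11, a 2-byte short, a 4-byte int, and a file in the host's byte order (RawRecord, get_a and get_b are hypothetical names):

```cpp
#include <cstring>  // std::memcpy

struct RawRecord {  // hypothetical name
    char a[2];
    char b[4];
};

// Safety check: fails to compile if the implementation added padding.
static_assert(sizeof(RawRecord) == 6, "unexpected padding in RawRecord");

// Assumptions this sketch relies on for the conversions below.
static_assert(sizeof(short) == 2 && sizeof(int) == 4,
              "field widths do not match the on-disk layout");

// memcpy avoids misaligned reads and strict-aliasing problems.
inline short get_a(const RawRecord& r) {
    short v;
    std::memcpy(&v, r.a, sizeof v);
    return v;
}

inline int get_b(const RawRecord& r) {
    int v;
    std::memcpy(&v, r.b, sizeof v);
    return v;
}
```

If the file's byte order differs from the host's, the getters would need to assemble the value byte by byte instead of using memcpy.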

Tellurium answered 7/5, 2024 at 4:0 Comment(7)
Even if you pack the members together this way (and complicate the code for every member access by reinterpreting the bytes as whatever larger type), the struct will still be padded to a word boundary. - Buggery
Whether it complicates it depends on what you're already doing to manage endianness and data type conversion, and is already mentioned in the last two sentences. Structs are also not padded to word boundaries, but to the structure alignment (which is set by the member with the strictest alignment requirement). I'm sure your comment (and downvote?) is applicable to some scenarios, but not here, it would seem. - Tellurium
I thought I already did in the first place, but: 1) no, the alignment is not inherently the strictest alignment requirement of the members, but a multiple of that value. The compiler is perfectly free to word-align structs by default as long as no member is larger than that, and assuming that the member sizes divide evenly (which is essentially automatic on any modern architecture). And it often will for performance reasons. Second, acknowledging the overhead in treating a char[2] like a short doesn't make it go away. - Buggery
Third, using a compiler-specific mechanism allows the compiler to understand the intent, so that has the best chance (read the compiler documentation, of course) of actually instructing it not to add padding to a word boundary at the end of the struct. - Buggery
Yes true, superfluous padding is allowed by the standard and acknowledging this would improve the answer. This is all implementation defined, after all. But I have not found, amongst ~15 compilers I've tried with various optimisation flags, a single one that added padding - I know this can come as a shock (it took some study to convince myself!), but this is what's wrong about your rebuttal. By my count, it's more reliable than __attribute__ ((packed)) which explicitly only concerns internal padding, not trailing, and has caught many out. - Tellurium
I am not at all championing the merits of this approach, only that it is an approach, and suppressing it based on false assumptions does more harm than good to the collective knowledge. - Tellurium
I've incorporated your prudent observation about potentially having multiples of the alignment requirement. I think it's a much better answer now. - Tellurium
