Clarification about bit-field ordering semantics in C

I have trouble understanding the exact meaning of a paragraph of the C99 draft standard (N1256) about bit-fields (6.7.2.1:10):

6.7.2.1 Structure and union specifiers

[...]

Semantics

[...]

An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.

The sentence about the order of allocation stretches my English skills to the limit: I don't understand whether it refers to the ordering of the individual bit-fields inside a unit, to the ordering of the bits inside each individual bit-field, or to something else.

I'll try to make my doubt clearer with an example. Let's assume that unsigned ints are 16 bits, that the implementation chooses an unsigned int as the addressable storage unit, that bytes are 8 bits wide, and that no other alignment or padding issues arise:

struct Foo {
    unsigned int x : 8;
    unsigned int y : 8;
};

thus, assuming the x and y fields are stored inside the same unit, what exactly is implementation-defined according to that sentence? As I understand it, it means that inside that unsigned int unit, x can be stored either in the lower-order bits and y in the higher-order bits, or vice versa, but I'm not sure, since intuitively I'd think that as long as no bit-field overlaps two underlying storage units, the declaration order would impose the same order on the bit-fields.
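
For what it's worth, here is the kind of quick check I could run to see what my particular implementation actually does (it only shows one implementation's choice, of course, not what the standard guarantees; it assumes 8-bit bytes and relies on the fact that an object's representation may be inspected through unsigned char):

#include <stdio.h>
#include <string.h>

struct Foo {
    unsigned int x : 8;
    unsigned int y : 8;
};

int main(void)
{
    struct Foo f = {0};
    unsigned char bytes[sizeof f];
    size_t i;

    f.x = 0xAA;    /* distinctive patterns, so the two fields are easy to tell apart */
    f.y = 0x55;

    memcpy(bytes, &f, sizeof f);
    /* Which byte holds 0xAA and which holds 0x55 (and where any padding
       ends up) is exactly the part I'm asking about. */
    for (i = 0; i < sizeof f; i++)
        printf("byte %zu: 0x%02x\n", i, bytes[i]);

    return 0;
}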

Note: I fear I'm missing some terminology subtlety here (or, worse, some technical one), but I couldn't understand which.

Any pointer appreciated. Thanks!

Budde answered 6/9, 2013 at 7:22 Comment(2)
What you said, and more: there is no guarantee which bit will be modified by unsigned x : 1, the lowest bit or the highest. So if sizeof(unsigned int) == 4, x could end up in bit 1 or in bit 32.Voe
possible duplicate of Representing individual bits in CVoe

I don't really see what is unclear with

The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined.

It talks about the allocation of the bit-fields themselves, not about the bits inside a field. So, unlike for non-bit-field members, you can't be sure in what order bit-fields inside an addressable unit are allocated.

Otherwise the representation of the bit-field itself is guaranteed to be "the same" as the underlying type, with a division into value bits and a sign bit (if applicable).

In essence it says that the anatomy of the storage unit that contains the bit-fields is implementation-defined, and you shouldn't try to access the bits through other means (a union, say), since doing so would make your code non-portable.
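
For illustration, a minimal sketch of that distinction (the struct and field names are invented for the example; the member accesses in the middle are portable, while the raw byte printed at the end is exactly the implementation-defined part):

#include <stdio.h>

struct Flags {
    unsigned int ready : 1;
    unsigned int error : 1;
    unsigned int count : 6;
};

int main(void)
{
    struct Flags f = {0};

    /* Portable: values written and read back through the members are guaranteed. */
    f.ready = 1;
    f.count = 42;
    printf("ready=%u error=%u count=%u\n",
           (unsigned)f.ready, (unsigned)f.error, (unsigned)f.count);

    /* Not portable: where each field sits inside the storage unit is
       implementation-defined, so this raw dump may differ between compilers. */
    printf("raw first byte: 0x%02x\n", (unsigned)*(unsigned char *)&f);

    return 0;
}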

Gruesome answered 6/9, 2013 at 10:30 Comment(4)
As I hinted in the question, I feared I was missing something about bit-fields in general (I hoped it was only some terminological confusion, but from the answers I'm getting I guess I have to understand the technical aspects better as well). I've never had to work with bit-fields, but I'll probably need them in the near future, so I tried to grasp the details directly from the standard. Your explanation is very clear (+1), but I still have to find the time to connect it to the standard's text.Budde
@LorenzoDonati, yes, the semantics of that are not so easy to grasp, I agree. Avoid bit-fields as much as you can, as they don't serve much purpose in general. If you are interested in portably manipulating bits in a compressed form, using bit computations on an unsigned type such as uint64_t is probably preferable (see the mask-and-shift sketch after these comments). For binary data compatibility between different platforms, bit-fields are not of much help either.Gruesome
I realize I need to spend more time on the subject, but it still seems to me that the standard's guarantees for bit-fields are rather scant: almost everything is implementation-defined. I'm not sure yet, but I think I'll need to program access to the hardware registers of some external dedicated I/O interface card. I thought bit-fields could ease that (as yet unknown) task, so I'm acting preemptively, but it seems it cannot be done portably. Anyway, thanks! If no more detailed explanation shows up in the next couple of days I'll accept your answer.Budde
"So, unlike for non-bit-field members, you can't be sure in what order bit-fields inside an addressable unit are allocated": you can't be sure in general. If the implementation is known, then you can be sure, since the order is implementation-defined, i.e. documented by the implementation.Algo
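
Along the lines of the previous comment, a minimal mask-and-shift sketch (the register layout below is invented for the example; the point is that the field positions are chosen by the programmer, not by the compiler):

#include <stdint.h>
#include <stdio.h>

/* Hypothetical layout, fixed by us: bits 0..7 hold X, bits 8..15 hold Y. */
#define X_SHIFT 0
#define X_MASK  ((uint64_t)0xFF << X_SHIFT)
#define Y_SHIFT 8
#define Y_MASK  ((uint64_t)0xFF << Y_SHIFT)

static uint64_t set_x(uint64_t reg, uint64_t value)
{
    return (reg & ~X_MASK) | ((value << X_SHIFT) & X_MASK);
}

static uint64_t get_y(uint64_t reg)
{
    return (reg & Y_MASK) >> Y_SHIFT;
}

int main(void)
{
    uint64_t reg = 0;

    reg = set_x(reg, 0xAB);               /* write the X field */
    reg |= (uint64_t)0x12 << Y_SHIFT;     /* write the Y field directly */

    printf("reg = 0x%016llx, y = 0x%02llx\n",
           (unsigned long long)reg, (unsigned long long)get_y(reg));
    return 0;
}

Unlike bit-fields, this layout is the same on every conforming implementation, which is what matters for hardware registers or on-the-wire formats.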

My take on it is that the C99 spec is talking about the bit order of the bit-fields, that is, how they are laid out within a 'unit' (byte, word, etc.). Essentially you're on your own if you start casting structs.

Example

bit  ex1    ex2   ex3
D7   x3     y0    x0
D6   x2     y1    x1
D5   x1     y2    x2
D4   x0     y3    x3
D3   y3     x0    y0
D2   y2     x1    y1
D1   y1     x2    y2
D0   y0     x3    y3

Above are three different schemes for ordering two 4-bit fields within a byte 'unit'. All of them are legal as far as the C99 standard is concerned.

Dock answered 6/9, 2013 at 7:54 Comment(0)

Gibbon1's answer is correct, but I think example code is helpful for this sort of question.

#include <stdio.h>

int main(void)
{
    union {
        unsigned int x;            /* the whole storage unit, viewed as one value */
        struct {                   /* the same storage, viewed as four bit-fields */
            unsigned int a : 1;
            unsigned int b : 10;
            unsigned int c : 20;
            unsigned int d : 1;    /* 1 + 10 + 20 + 1 = 32 bits, filling the unit */
        } bits;
    } u;
    
    u.x = 0x00000000;
    u.bits.a = 1;
    printf("After changing a: 0x%08x\n", u.x);
    u.x = 0x00000000;
    u.bits.b = 1;
    printf("After changing b: 0x%08x\n", u.x);
    u.x = 0x00000000;
    u.bits.c = 1;
    printf("After changing c: 0x%08x\n", u.x);
    u.x = 0x00000000;
    u.bits.d = 1;
    printf("After changing d: 0x%08x\n", u.x);
    
    return 0;
}

On a little-endian x86-64 CPU using MinGW's GCC, the output is:

After changing a: 0x00000001

After changing b: 0x00000002

After changing c: 0x00000800

After changing d: 0x80000000

Since this is a union, the unsigned int (x) and the bit field structure (a/b/c/d) occupy the same storage unit. The order of allocation of [the] bit fields decides whether u.bits.a refers to the least significant bit of x or the most significant bit of x. Typically, on a little-endian machine:

u.bits.a == (u.x & 0x00000001)
u.bits.b == (u.x & 0x000007fe) >> 1
u.bits.c == (u.x & 0x7ffff800) >> 11
u.bits.d == (u.x & 0x80000000) >> 31

and on a big-endian machine:

u.bits.a == (u.x & 0x80000000) >> 31
u.bits.b == (u.x & 0x7fe00000) >> 21
u.bits.c == (u.x & 0x001ffffe) >> 1
u.bits.d == (u.x & 0x00000001)

What the standard is saying is that the C language does not require any particular allocation order: big-endian and little-endian machines can put the fields in whatever order is most natural for their addressing scheme.
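
These relations are easy to check on a given implementation; here is a small sketch (the masks below are the little-endian set from above, so the asserts only pass on an implementation that uses that layout, e.g. GCC on x86):

#include <assert.h>
#include <stdio.h>

int main(void)
{
    union {
        unsigned int x;
        struct {
            unsigned int a : 1;
            unsigned int b : 10;
            unsigned int c : 20;
            unsigned int d : 1;
        } bits;
    } u;

    u.x = 0x12345678;   /* any test pattern works */

    /* These are the little-endian relations listed above; on the
       big-endian layout the other set of masks would hold instead. */
    assert(u.bits.a == (u.x & 0x00000001));
    assert(u.bits.b == ((u.x & 0x000007fe) >> 1));
    assert(u.bits.c == ((u.x & 0x7ffff800) >> 11));
    assert(u.bits.d == ((u.x & 0x80000000) >> 31));

    puts("the little-endian masks match this implementation's layout");
    return 0;
}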

Thoroughgoing answered 29/8, 2015 at 19:25 Comment(0)
