When to use bit-fields in C
H

17

90

Searching Google for 'why do we need to use bit-fields?', I found that bit-fields are used for flags.

Now I am curious,

  1. Is it the only way bit-fields are used practically?
  2. Do we need to use bit fields to save space?

A way of defining bit field from the book:

struct {
    unsigned int is_keyword : 1;
    unsigned int is_extern :  1;
    unsigned int is_static : 1;
} flags;
  1. Why do we use int?
  2. How much space is occupied?

I am confused why we are using int, but not short or something smaller than an int.

  1. As I understand only 1 bit is occupied in memory, but not the whole unsigned int value. Is it correct?
Humfrid answered 24/7, 2014 at 12:6 Comment(1)
As about everything about bit-field is implementation defined, never?Capillary
B
73

Now I am curious, [are flags] the only way bitfields are used practically?

No, flags are not the only way bitfields are used. They can also be used to store values larger than one bit, although flags are more common. For instance:

typedef enum {
    NORTH = 0,
    EAST = 1,
    SOUTH = 2,
    WEST = 3
} directionValues;

struct {
    unsigned int alice_dir : 2;
    unsigned int bob_dir : 2;
} directions;

Do we need to use bitfields to save space?

Bitfields do save space. They also allow an easier way to set values that aren't byte-aligned. Rather than bit-shifting and using bitwise operations, we can use the same syntax as setting fields in a struct. This improves readability. With a bitfield, you could write

directions.alice_dir = WEST;
directions.bob_dir = SOUTH;

However, to store multiple independent values in the space of one int (or other type) without bitfields, you would need to write something like:

#define ALICE_OFFSET 0
#define BOB_OFFSET 2
unsigned int directions = 0;          // here directions is a plain integer, not the struct above
directions &= ~(3u << ALICE_OFFSET);  // clear Alice's bits
directions |= WEST << ALICE_OFFSET;   // set Alice's bits to WEST
directions &= ~(3u << BOB_OFFSET);    // clear Bob's bits
directions |= SOUTH << BOB_OFFSET;    // set Bob's bits to SOUTH

The improved readability of bitfields is arguably more important than saving a few bytes here and there.
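
To make the comparison concrete, here is a minimal sketch of the shift-and-mask approach wrapped in helper functions. The names set_field, get_field, and DIR_MASK are hypothetical and chosen only for illustration; the layout (Alice in bits 0-1, Bob in bits 2-3) is fixed by the code itself rather than by the compiler.

```c
#include <stdint.h>

/* Hypothetical names; each direction occupies a 2-bit slot. */
#define DIR_MASK     0x3u
#define ALICE_OFFSET 0u
#define BOB_OFFSET   2u

/* Return word with the 2-bit slot at offset replaced by value. */
static inline uint8_t set_field(uint8_t word, unsigned offset, unsigned value)
{
    word = (uint8_t)(word & ~(DIR_MASK << offset));          /* clear the slot */
    word = (uint8_t)(word | ((value & DIR_MASK) << offset)); /* write new bits */
    return word;
}

/* Extract the 2-bit slot at offset. */
static inline unsigned get_field(uint8_t word, unsigned offset)
{
    return (word >> offset) & DIR_MASK;
}
```

This is roughly the work a bit-field assignment such as directions.alice_dir = WEST hides, with the difference that here the bit positions are explicit and the same on every compiler.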

Why do we use int? How much space is occupied?

The space of an entire int is occupied. We use int because in many cases, it doesn't really matter. If, for a single value, you use 4 bytes instead of 1 or 2, your user probably won't notice. For some platforms, size does matter more, and many compilers let you use smaller types (char, short, uint8_t, etc.) for bit-fields, although the standard does not guarantee that they are accepted.

As I understand only 1 bit is occupied in memory, but not the whole unsigned int value. Is it correct?

No, that is not correct. The entire unsigned int will exist, even if you're only using a few of its bits.

Bookerbookie answered 24/7, 2014 at 13:17 Comment(7)
Could you expand more on the doing it manually section? Why would you need to do that?Firmament
@Firmament I'd be happy to add more detail; can you tell me what part of that you're struggling to understand?Bookerbookie
I think I understand now, "doing it manually" would be trying to extract the data without a backing struct, which is why you'd have to do the bit manipulation yourself. Correct?Firmament
Yes, exactly. I can clear up that language, "manually" probably isn't quite specific enough.Bookerbookie
@EricFinn If The space of an entire int is occupied , why is sizeof(directions) 4 bytes (It should be 8 bytes following what you stated)? In my machine, sizeof(int) is 4 bytesNelson
@xuanduc611 Sorry if it wasn't clear. That statement was a direct answer to "As I understand only 1 bit is occupied in memory, but not the whole unsigned int value." "The space of an entire int is occupied" is meant to explain why sizeof(directions) would be 4 (the entire unsigned int is occupied) as opposed to 1 (if the smallest data type large enough to contain the fields were used) or even 0.5 (what the asker thought would happen)Bookerbookie
@EricFinn By the standard, only five types are accepted: unsigned int, signed int, int, bool (C23, _Bool in C99), _BitInt (C23). The ability to use other types ("char, short, uint8_t, etc.") is implementation-dependent.Sodality
M
88

A quite good resource is Bit Fields in C.

The basic reason is to reduce the used size. For example, if you write:

struct {
    unsigned int is_keyword;
    unsigned int is_extern;
    unsigned int is_static;
} flags;

You will use at least 3 * sizeof(unsigned int), or 12 bytes with a typical 4-byte unsigned int, to represent three small flags that should need only three bits.

So if you write:

struct {
    unsigned int is_keyword : 1;
    unsigned int is_extern : 1;
    unsigned int is_static : 1;
} flags;

This uses the same space as one unsigned int, so 4 bytes on a typical platform. You can put 32 one-bit fields into the struct before it needs more space.

This is sort of equivalent to the classical home brew bit field:

#define IS_KEYWORD 0x01
#define IS_EXTERN  0x02
#define IS_STATIC  0x04
unsigned int flags;

But the bit field syntax is cleaner. Compare:

if (flags.is_keyword)

against:

if (flags & IS_KEYWORD)

And it is obviously less error-prone.
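
The size claims above are easy to check on your own toolchain. The following is a small sketch; the struct tags plain_flags and packed_flags and the function print_flag_sizes are names invented for this example, and the printed numbers are implementation-defined.

```c
#include <stdio.h>

/* Three full-width members: one unsigned int each. */
struct plain_flags {
    unsigned int is_keyword;
    unsigned int is_extern;
    unsigned int is_static;
};

/* Three one-bit members, normally sharing one storage unit. */
struct packed_flags {
    unsigned int is_keyword : 1;
    unsigned int is_extern  : 1;
    unsigned int is_static  : 1;
};

/* Print both sizes; neither value is fixed by the standard. */
static void print_flag_sizes(void)
{
    printf("plain:  %zu bytes\n", sizeof(struct plain_flags));
    printf("packed: %zu bytes\n", sizeof(struct packed_flags));
}
```

With a 4-byte unsigned int, a common result is 12 and 4, but treat that as an observation about a particular compiler rather than a guarantee.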

Malvia answered 24/7, 2014 at 12:18 Comment(13)
Nice answer! When talking about bit fields and their size in memory one should keep in mind that c++ compilers will allocate bit-fields in memory as follows: several consecutive bit-field members of the same type will be allocated sequentially. As soon as a new type needs to be allocated, it will be aligned with the beginning of the next logical memory block. The next logical block will depend on your processor. Some processors can align to 8-bit boundaries, while others can only align to 16-bit boundaries.Spriggs
Next question is: when do I need to save space? Almost never. Unless you're in very limited environments, avoid bit fields.Icelander
bitfields are almost never used for flags because if you need to add another one, you will break the ABIBuddleia
As an addition: it behaves more like a boolean: you can write flags.is_keyword == flags.is_extern (compare with ((flags & IS_KEYWORD) == 0) == ((flags & IS_EXTERN) == 0)). On the other hand, with traditional bitfields you can check multiple values with one comparison: (flags & (IS_KEYWORD | IS_EXTERN)) == IS_KEYWORD (it means IS_KEYWORD but not IS_EXTERN)Simmie
@GaborSch "like boolean" not absolutely true, you can have values larger than 1. But that is a rarely used use case.Malvia
@Malvia I was talking about the current example, we used boolean values here.Simmie
@z̫͋ - even if you use an opaque struct?Rabblerousing
@Rabblerousing If the struct is opaque, you can handle it only through a pointer. In C the type of the pointer is irrelevant and in C++ it only influences name mangling. So the short answer is "No", the long answer is "If it is opaque it never was part of the ABI."Malvia
C99 allows _Bool, which is preferred for "is_xyz" like fields.Angiosperm
@Angiosperm true, but it depends on what your goal is. Using the bit field approach will still yield smaller structures in most cases (>4 elements). _Bool is effectively at least char sized, whereas a struct using bit fields with up to 32 entries will still be unsigned int sized.Malvia
@Malvia I meant _Bool as type in the bitfield: _Bool is_xyz : 1;. May affect static analysis (MISRA in my case) or behavior of _Generic.Angiosperm
Interesting, I did not consider that angle. Using _Bool instead of unsigned int may also make the resulting struct smaller (probably negated by padding). Actually a good point.Malvia
It reduces the data size but it increases the code size. Speaking from experience here. Wish I had never &c &c ,,,Exposure
C
31

Another place where bitfields are common is hardware registers. If you have a 32-bit register where each bit has a certain meaning, you can elegantly describe it with a bitfield.

Such a bitfield is inherently platform-specific. Portability does not matter in this case.
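
As a sketch of what such a description could look like, consider the made-up control register below. The struct ctrl_reg, its field names and widths, and the function ctrl_configure are all hypothetical; a real register layout comes from the device datasheet, and the order in which the compiler allocates the bits has to be checked for the toolchain in use.

```c
/* Hypothetical control register: names and widths are invented for illustration. */
struct ctrl_reg {
    unsigned int enable    : 1;   /* 1 = peripheral on */
    unsigned int mode      : 3;   /* operating mode, 0..7 */
    unsigned int prescaler : 4;   /* clock divider select, 0..15 */
    unsigned int reserved  : 24;  /* assumes unsigned int is at least 32 bits wide */
};

/* Program the register through a volatile pointer, one named field at a time. */
static void ctrl_configure(volatile struct ctrl_reg *reg)
{
    reg->mode = 5;
    reg->prescaler = 8;
    reg->enable = 1;   /* switch on last, once the other fields are set */
}
```

On real hardware the pointer would be the register's memory-mapped address cast to volatile struct ctrl_reg *. Each field assignment is typically compiled to a read-modify-write of the whole word, which matters for registers with side effects on read or write.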

Canopy answered 24/7, 2014 at 12:19 Comment(3)
Portability doesn't just apply to the hardware. Different compilers for the same architecture may disagree on the ordering of bit fields.Khanate
While the caveat is true, I've rarely seen embedded projects where multiple compilers were used. Usually you stick with one for a project.Spent
If the hardware register is in an IP block and the IP block driver is used on multiple architectures, you would have multiple compilers. I.e. it's not as uncommon as one would think.Mv
B
12

We use bit fields mostly (though not exclusively) for flag structures - bytes or words (or possibly larger things) in which we try to pack tiny (often 2-state) pieces of (often related) information.

In these scenarios, bit fields are used because they correctly model the problem we're solving: what we're dealing with is not really an 8-bit (or 16-bit or 24-bit or 32-bit) number, but rather a collection of 8 (or 16 or 24 or 32) related, but distinct pieces of information.

The problems we solve using bit fields are problems where "packing" the information tightly has measurable benefits and/or "unpacking" the information doesn't have a penalty. For example, if you're exposing 1 byte through 8 pins and the bits from each pin go through their own bus that's already printed on the board so that it leads exactly where it's supposed to, then a bit field is ideal. The benefit in "packing" the data is that it can be sent in one go (which is useful if the frequency of the bus is limited and our operation relies on frequency of its execution), and the penalty of "unpacking" the data is non-existent (or existent but worth it).

On the other hand, we don't use bit fields for booleans in other cases like normal program flow control, because of the way computer architectures usually work. Most common CPUs don't like fetching one bit from memory - they like to fetch bytes or integers. They also don't like to process bits - their instructions often operate on larger things like integers, words, memory addresses, etc.

So, when you try to operate on bits, it's up to you or the compiler (depending on what language you're writing in) to write out additional operations that perform bit masking and strip the structure of everything but the information you actually want to operate on. If there are no benefits in "packing" the information (and in most cases, there aren't), then using bit fields for booleans would only introduce overhead and noise in your code.

Boucher answered 24/7, 2014 at 15:14 Comment(0)
K
12

To answer the original question »When to use bit-fields in C?« … according to the book "Write Portable Code" by Brian Hook (ISBN 1-59327-056-9, I read the German edition ISBN 3-937514-19-8) and to personal experience:

Never use the bitfield idiom of the C language, but do it by yourself.

Many implementation details are compiler-specific, especially in combination with unions, and nothing is guaranteed across different compilers or different endianness. If there's even a tiny chance your code has to be portable and will be compiled for different architectures and/or with different compilers, don't use it.

We had this case when porting code from a little-endian microcontroller with some proprietary compiler to another big-endian microcontroller with GCC, and it was not fun. :-/

This is how I have used flags (host byte order ;-) ) since then:

# define SOME_FLAG        (1 << 0)
# define SOME_OTHER_FLAG  (1 << 1)
# define AND_ANOTHER_FLAG (1 << 2)

/* test flag */
if ( someint & SOME_FLAG ) {
    /* do this */
}

/* set flag */
someint |= SOME_FLAG;

/* clear flag */
someint &= ~SOME_FLAG;

No need for a union with the int type and some bitfield struct then. If you read lots of embedded code those test, set, and clear patterns will become common, and you spot them easily in your code.

Killian answered 27/2, 2015 at 14:57 Comment(2)
Can you share some actual code that would break with specific compilers or not work on a different architecture? Something like "NEVER" decorated with smiley faces but no counter-example sounds like a strong opinionated myth.Righthanded
IMO, if you are in a context where you are considering using bitfields, you probably should be thinking about endianness at the same time.Charissacharisse
B
8

Why do we need to use bit-fields?

When you want to store data that fits in less than one byte, that kind of data can be grouped in a structure using bit fields.

In the embedded world, when different bits of one 32-bit register word have different meanings, you can also use bit fields to make them more readable.

I found that bit fields are used for flags. Now I am curious, is it the only way bit-fields are used practically?

No, this is not the only way. You can use them in other ways too.

Do we need to use bit fields to save space?

Yes.

As I understand only 1 bit is occupied in memory, but not the whole unsigned int value. Is it correct?

No. Memory can only be occupied in multiples of bytes.

Boardinghouse answered 24/7, 2014 at 12:17 Comment(0)
E
5

A good usage would be to implement a chunk to translate to—and from—Base64 or any unaligned data structure.

struct base64enc {
    unsigned int e1:6;
    unsigned int e2:6;
    unsigned int e3:6;
    unsigned int e4:6;
}; // I don't know if declaring a 4-byte array will have the same effect.

struct base64dec {
    unsigned char d1;
    unsigned char d2;
    unsigned char d3;
};

union base64chunk {
    struct base64enc enc;
    struct base64dec dec;
};

union base64chunk b64c;
// You can assign three characters to b64c.dec, and get four 0-63 codes from b64c.enc instantly
// (provided the compiler packs the bit-fields contiguously and in the expected order, which is implementation-defined).

This example is a bit naive, since Base64 must also handle padding (i.e. a string whose length l is not a multiple of 3, so that l % 3 is not 0). But it works as a sample of accessing unaligned data structures.
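
For comparison, here is a sketch of the same 3-bytes-to-4-codes split done with explicit shifts instead of a union. This swaps in a different technique from the one above: it does not depend on how the compiler lays out bit-fields. The function name split_base64_chunk is hypothetical.

```c
#include <stdint.h>

/* Split three input bytes into four 6-bit values (0..63), most significant first. */
static void split_base64_chunk(const unsigned char in[3], unsigned char out[4])
{
    /* Gather the 24 input bits into one integer, first byte on top. */
    uint32_t v = ((uint32_t)in[0] << 16) | ((uint32_t)in[1] << 8) | (uint32_t)in[2];

    out[0] = (unsigned char)((v >> 18) & 0x3F);
    out[1] = (unsigned char)((v >> 12) & 0x3F);
    out[2] = (unsigned char)((v >> 6) & 0x3F);
    out[3] = (unsigned char)(v & 0x3F);
}
```

For the input bytes "Man" this yields the indices 19, 22, 5, 46, which correspond to "TWFu" in the standard Base64 alphabet.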

Another example: using this feature to break a TCP packet header into its components (or any other network protocol packet header you want to discuss), although it is a more advanced and less end-user example. In general: this is useful for PC internals, operating systems, drivers, and encoding systems.

Another example: analyzing a float number.

struct _FP32 {
    unsigned int sign:1;
    unsigned int exponent:8;
    unsigned int mantissa:23;
};

union FP32_t {
    struct _FP32 parts;
    float number;
};

(Disclaimer: I don't know the file name or type name where this is applied, but in C this is declared in a header. I don't know how this can be done for 64-bit floating-point numbers, since the mantissa must have 52 bits and, on a 32-bit target, ints have 32 bits.)
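
A sketch of a layout-independent alternative for the float case: copy the bits into an integer with memcpy and extract the parts with shifts. It assumes float is a 32-bit IEEE-754 binary32 value, which is common but not required by the C standard; the names fp32_parts and fp32_split are made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Decoded fields of a binary32 value (assumption: float is IEEE-754, 32 bits). */
struct fp32_parts {
    unsigned int sign;      /* 0 or 1 */
    unsigned int exponent;  /* biased exponent, 0..255 */
    uint32_t     mantissa;  /* 23 fraction bits */
};

static struct fp32_parts fp32_split(float f)
{
    uint32_t bits;
    struct fp32_parts p;

    /* memcpy avoids both union type-punning and bit-field ordering questions. */
    memcpy(&bits, &f, sizeof bits);

    p.sign     = (unsigned int)(bits >> 31);
    p.exponent = (unsigned int)((bits >> 23) & 0xFFu);
    p.mantissa = bits & 0x7FFFFFu;
    return p;
}
```

Under that assumption, fp32_split(1.0f) gives sign 0, exponent 127, and mantissa 0. The same approach extends to double by using uint64_t with an 11-bit exponent and a 52-bit mantissa.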

Conclusion: As the concept and these examples show, this is a rarely used feature because it's mostly for internal purposes, and not for day-by-day software.

Epley answered 24/7, 2014 at 15:20 Comment(3)
Problems with union-izing float: endianness. On an opposite-endian machine, the needed structure may be struct _FP32 { unsigned int mantissa:23; unsigned int exponent:8; unsigned int sign:1; }. Bit fields are not well defined when larger than the bit-width of unsigned. Since an unsigned must only be at least 16 bits, any width > 16 runs into portability problems, something alluded to with "how can this be done for 64-bit floats".Lavonna
This answer is not standard C. The compiler is allowed to pack the bit fields in any way it wants, you can't rely on it being least-significant-first & no padding.Tael
"A good usage would be" - that's true. But: does it work? In my case it doesn't, because the compiler does not pack individual bits.Gi
B
5

Bit fields can be used for saving memory space (but using bit fields for this purpose is rare). They are used where there is a memory constraint, e.g., when programming embedded systems.

But they should be used only when really needed, because we cannot take the address of a bit field, so the address operator & cannot be used with them.

Brevity answered 24/7, 2014 at 15:23 Comment(1)
@Jerfov2 they save tons of space. Imagine a server application that uses 48 bit numbers (millions of them). Do you want to pay for 48GB ram or 64GB? which one would your customer like more?Teodoro
A
3

To answer the parts of the question no one else answered:

Ints, not Shorts

The reason to use ints rather than shorts, etc. is that in most cases no space will be saved by doing so.

Modern computers have a 32 or 64 bit architecture and that 32 or 64 bits will be needed even if you use a smaller storage type such as a short.

The smaller types are only useful for saving memory if you can pack them together (for example a short array may use less memory than an int array as the shorts can be packed together tighter in the array). For most cases, when using bitfields, this is not the case.

Other uses

Bitfields are most commonly used for flags, but there are other things they are used for. For example, one way to represent a chess board used in a lot of chess algorithms is to use a 64 bit integer to represent the board (8*8 squares) and set flags in that integer to give the position of all the white pawns. Another integer shows all the black pawns, etc.
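
The chess representation described above can be sketched with a plain 64-bit integer and masks. The square numbering used here (bit index rank * 8 + file, both counted from 0) is an assumption of this example, and the function names are hypothetical; real chess engines pick their own conventions.

```c
#include <stdint.h>

/* Assumed mapping: bit (rank * 8 + file), with file and rank both in 0..7. */
static uint64_t square_bit(int file, int rank)
{
    return (uint64_t)1 << (rank * 8 + file);
}

/* All eight white pawns on their starting rank (rank index 1). */
static uint64_t white_pawns_start(void)
{
    return (uint64_t)0xFF << 8;
}

/* Nonzero if the board has a piece on the given square. */
static int has_piece(uint64_t board, int file, int rank)
{
    return (board & square_bit(file, rank)) != 0;
}
```

One integer per piece type and colour then allows whole-board questions, such as "is any white pawn on this file", to be answered with a single AND.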

Adamant answered 24/7, 2014 at 13:39 Comment(2)
Note: Many (hundreds of millions per year as of 2013) embedded processors use 8 and 16 bit architectures. C is very popular there.Lavonna
@chux-ReinstateMonica Nearly all microcontrollers ever !Roberson
H
2

You can use them to expand the set of unsigned types that wrap. Ordinarily you would only have widths of 8, 16, 32, 64... bits, but with bit-fields you can have any width.

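#include <stdio.h>

struct a
{
    unsigned int b : 3 ;
} ;

int main( void )
{
    struct a w = { 0 } ;

    while( 1 )
    {
        /* the 3-bit field counts 0..7 and then wraps back to 0 */
        printf("%u\n" , (unsigned int)w.b++ ) ;
        getchar() ;   /* wait for Enter before the next value */
    }
}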
Heldentenor answered 24/7, 2014 at 12:27 Comment(0)
M
2

Why do we use int? How much space is occupied?

One answer to this question that I haven't seen mentioned in any of the other answers, is that the C standard guarantees support for int. Specifically:

A bit-field shall have a type that is a qualified or unqualified version of _Bool, signed int, unsigned int, or some other implementation defined type.

It is common for compilers to allow additional bit-field types, but not required. If you're really concerned about portability, int is the best choice.

Moriyama answered 10/12, 2014 at 20:41 Comment(0)
H
1

To use memory efficiently, we can use bit fields.

As far as I know, in real-world programming we can, where needed, use Booleans instead of declaring integers and then making bit fields of them.

Haematothermal answered 24/7, 2014 at 12:14 Comment(1)
"In real world", booleans will normally be more than a bit.Canopy
A
1

If they are also values we use often, not only do we save space, we can also gain performance since we do not need to pollute the caches.

However, caching is also the danger in using bit fields since concurrent reads and writes to different bits will cause a data race and updates to completely separate bits might overwrite new values with old values...
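
One way to sidestep the overwrite problem described above, sketched here as an alternative rather than something this answer proposes, is to keep the flags in a single atomic word and update it with C11 atomic operations. Adjacent bit-fields share one memory location in the C11 memory model, so concurrent writes to neighbouring fields are a data race; atomic read-modify-write on a mask is not. The flag names and helper functions are hypothetical, and the sketch assumes a C11 compiler that provides <stdatomic.h>.

```c
#include <stdatomic.h>

#define FLAG_READY (1u << 0)
#define FLAG_ERROR (1u << 1)

/* Static storage, so it starts out as zero (no flags set). */
static atomic_uint status_flags;

/* Atomically set the bits in mask without disturbing the others. */
static void set_flag(unsigned int mask)
{
    atomic_fetch_or(&status_flags, mask);
}

/* Atomically clear the bits in mask. */
static void clear_flag(unsigned int mask)
{
    atomic_fetch_and(&status_flags, ~mask);
}

/* Nonzero if any bit in mask is currently set. */
static int test_flag(unsigned int mask)
{
    return (atomic_load(&status_flags) & mask) != 0;
}
```

The cost is that the bit-field syntax is lost; the benefit is that two threads setting different flags cannot undo each other's update.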

Amin answered 24/7, 2014 at 12:23 Comment(0)
J
1

Bitfields are much more compact and that is an advantage.

But don't forget packed structures are slower than normal structures. They are also more difficult to construct since the programmer must define the number of bits to use for each field. This is a disadvantage.

Jung answered 24/12, 2015 at 3:16 Comment(0)
C
1

Nowadays, microcontrollers (MCUs) have peripherals, such as I/O ports, ADCs, DACs, onboard the chip along with the processor.

Before MCUs became available with the needed peripherals, we would access some of our hardware by connecting to the buffered address and data buses of the microprocessor. A pointer would be set to the memory address of the device and if the device saw its address along with the R/W signal and maybe a chip select, it would be accessed.

Oftentimes we would want to access individual or small groups of bits on the device.

Cholera answered 23/9, 2021 at 19:46 Comment(0)
S
0

In our project, we used this to extract a page table entry and page directory entry from a given memory address:

union VADDRESS {
    struct {
        ULONG64 BlockOffset : 16;
        ULONG64 PteIndex : 14;
        ULONG64 PdeIndex : 14;
        ULONG64 ReservedMBZ : (64 - (16 + 14 + 14));
    };

    ULONG64 AsULONG64;
};

Now suppose we have an address:

union VADDRESS tempAddress;
tempAddress.AsULONG64 = 0x1234567887654321;

Now we can access PTE and PDE from this address: cout << tempAddress.PteIndex;

Shaduf answered 30/12, 2020 at 10:18 Comment(0)
F
0

Another good use of bit fields is to model the configuration bits of a microcontroller register.

For example, you might see a register that has INT_FLAG as its LSB, then 3 bits of ADC_CHANNEL, and then another bit, ADC_COMPLETE. Imagine you want to change channels for the next ADC reading; you could write:

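if (ADC_Config.ADC_COMPLETE)
{
    Result[ADC_Config.ADC_CHANNEL] = ADC_Result;  /* store the reading for the current channel */
    ADC_Config.ADC_CHANNEL++;                     /* 3-bit field: wraps from 7 back to 0 */
    Start_Conversion();                           /* imaginary routine that starts the next reading */
}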

Only the channel bits are incremented and because there are only 3 bits allocated it will count 0-7 then back to zero.

Note that I have just rattled off an imaginary example not based on any particular processor that exists.

The whole thing becomes much cleaner and less prone to error.

Fourpenny answered 6/5 at 7:43 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.