Is there a bit-equivalent of sizeof() in C?
E

4

35

sizeof() doesn't work when applied to bitfields:

# cat p.c
  #include<stdio.h>
  int main( int argc, char **argv )
  {
    struct { unsigned int bitfield : 3; } s;
    fprintf( stdout, "size=%d\n", sizeof(s.bitfield) );
  }
# gcc p.c -o p
  p.c: In function ‘main’:
  p.c:5: error: ‘sizeof’ applied to a bit-field

...obviously, since it can't return a floating point partial size or something. However, it brought up an interesting question. Is there an equivalent, in C, that will tell you the number of bits in a variable/type? Ideally, it would also work for regular types as well, like char and int, in addition to bitfields.

Update:

If there's no language equivalent of sizeof() for bitfields, what is the most efficient way of calculating it - at runtime! Imagine you have loops that depend on this, and you don't want them to break if you change the size of the bitfield - and no fair cheating and making the bitfield size and the loop length a macro. ;-)

Endurant answered 23/7, 2010 at 15:32 Comment(2)
Pretty sure the layout of the structure is determined at compile time. So while in principle it could be inspected at runtime (though C doesn't provide a way to do this, if I'm reading answers below correctly), it would be invariant once compilation happened (with a particular compiler on a particular platform; it might, of course, vary depending on compiler and platform, based on word-boundary optimizations, etc.).Eritrea
WRT: "...and no fair cheating and making the bitfield size and the loop length a macro. ;-)" Are we language critics, or programmers tasked with producing portable code? The macro processor is an integral part of compiling C code (as it is necessary to process include files). So asking if you can do 'A' in language C without using certain parts of the C language is a specious question.Throaty
F
29

You cannot determine the size of bit-fields in C. You can, however, find out the size in bits of other types by using the value of CHAR_BIT, found in <limits.h>. The size in bits is simply CHAR_BIT * sizeof(type).

Do not assume that a C byte is an octet; it is at least 8 bits wide. There are actual machines with 16- or even 32-bit bytes.
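As a minimal sketch of the CHAR_BIT * sizeof(type) formula: BITS_OF and bits_in_int are hypothetical names introduced only for illustration, and the cast to size_t follows the suggestion made in one of the comments on this answer.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper (not part of the answer): size in bits of a type
 * or of any non-bit-field expression. The cast keeps the arithmetic in
 * size_t rather than mixing int and size_t. */
#define BITS_OF(x) ((size_t)CHAR_BIT * sizeof(x))

/* Example use: number of bits in the object representation of an int. */
static size_t bits_in_int(void)
{
    return BITS_OF(int);
}
```

Note that this counts every bit of the object representation, including any padding bits, and that BITS_OF(s.bitfield) is still rejected by the compiler because sizeof cannot be applied to a bit-field.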

Concerning your edit: I would say a bit-field int a: n; has a size of n bits by definition. The extra padding bits when put in a struct belong to the struct and not to the bit-field.

My advice: Don't use bit-fields but use (arrays of) unsigned char and work with bitmasks. That way a lot of behaviour (overflow, no padding) is well defined.
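A rough sketch of what the bitmask approach could look like for a 3-bit field packed into a single unsigned char. FIELD_SHIFT, FIELD_WIDTH, FIELD_MASK, field_set and field_get are all hypothetical names, and the position of the field inside the byte is an arbitrary assumption.

```c
/* Hypothetical layout: a 3-bit field stored in bits 2..4 of a byte. */
enum { FIELD_SHIFT = 2, FIELD_WIDTH = 3 };
#define FIELD_MASK (((1u << FIELD_WIDTH) - 1u) << FIELD_SHIFT)

/* Return a copy of byte with the field replaced by value.
 * Values too large for the field wrap modulo 2^FIELD_WIDTH. */
static unsigned char field_set(unsigned char byte, unsigned value)
{
    return (unsigned char)((byte & ~FIELD_MASK)
                           | ((value << FIELD_SHIFT) & FIELD_MASK));
}

/* Extract the field from byte. */
static unsigned field_get(unsigned char byte)
{
    return (byte & FIELD_MASK) >> FIELD_SHIFT;
}
```

Because the width is an ordinary named constant here, any loop over the field's bits can use FIELD_WIDTH directly, and the layout does not depend on how a particular compiler allocates bit-fields.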

Frisch answered 23/7, 2010 at 15:57 Comment(12)
+1 cool, didn't know about CHAR_BIT. what if you needed to calculate the bitfield size at runtime?Endurant
That is just not possible (one of the reasons why people avoid bit-fields). A compiler could implement this as an extension, but I have never heard of one.Frisch
@schot: byte != char. In C, char is always 8 bit, thus CHAR_BIT is always 8. Regardless of CPU/etc. Long time ago it might have been different (constant exists for the historical reasons) but not anymore. Check C99, limits.h.Solorzano
@schot: what i meant by run-time is this. imagine you have a bitfield that's 3 bits long. you write a bunch of loops that hardcode 3 in them. then one day you change the bitfield to 4 and the loops break. other than putting that "3" in a location very close to the bitfield definition (would be a good idea anyways), what would be the best way to introspect? maybe shift a 1 til it turns to a 0? who knows... i'm curious the best method.Endurant
@Dummy00001: Sorry, but you are wrong. The C99 standard gives a lower limit of 8 for CHAR_BIT, and Annex J.3.4 explicitly lists "The number of bits in a byte" as implementation-defined behaviour.Frisch
@schot: I think we're arguing wording. Section 3.6 (referred by J.3.4) clearly separates byte and char. char would always have 8 bits (as per limits.h' CHAR_BIT definition). if implementation's byte is larger, it has to truncate. Or as per 3.7.1 character is a single-byte character with bit representation that fits in a byte. Implementation's byte can be larger, but one will not be able to access the extra bits from C with char.Solorzano
@Dummy00001: I agree that a machine byte != C byte. But a char is not always 8 bits. 5.2.4.2.1: "Their implementation-defined values shall be equal *or greater* in magnitude (absolute value) to those shown, with the same sign." And then shows: "- number of bits for smallest object that is not a bit-field (byte) CHAR_BIT 8". And 6.2.6.1: "Values stored in non-bit-field objects of any other object type consist of n×CHAR_BIT bits, where n is the size of an object of that type, in bytes."Frisch
When will this stupid CHAR_BIT argument finally die? On anything except DSPs and 30+ year old legacy mainframes, CHAR_BIT is 8. POSIX requires CHAR_BIT==8, Windows is tied to x86 where CHAR_BIT==8, and the whole Internet and interoperability between networked machines is built on octets. Unless you have a very unusual target (in which case your code will likely not be portable anyway), there is absolutely no point in even thinking about the possibility of CHAR_BIT!=8.Loggerhead
schot... HP Pascal has "bitsizeof" in addition to "sizeof", which gives the size of a variable or field in bits. Extremely useful, and an obvious thing to add to C.Finding
@R..: what about on those DSPs you speak of? Besides, using a numeric constant (even if it is invariant) is an anti-pattern: en.wikipedia.org/wiki/… ... so, while I agree that unless you're in a very specialized environment, it's not worth thinking about CHAR_BIT not being 8, I don't understand why it's a "stupid argument" to point out that it can be used. Or are you arguing against something more specific than that?Eritrea
For sign correctness, use ((size_t)CHAR_BIT) * sizeof(type)Amari
+1 on your advice: Don't use bit-fields but use (arrays of) unsigned char and work with bitmasks. That way a lot of behaviour (overflow, no padding) is well defined. I would only recommend uint8_t instead of unsigned char, for clarity and explicitness.Flied
S
5

It is impossible to find the size of a bit-field using sizeof(). Refer to C99:

  • 6.5.3.4 The sizeof operator: sizeof must not be applied to an expression that designates a bit-field.
  • 6.7.2.1 Structure and union specifiers: clarifies that a bit-field is not a self-standing member.

Otherwise, you can try to assign to the bit-field member -1u (value with all bits set) and then find the index of the most significant bit. E.g. (untested):

s.bitfield = -1u;
num_bits = ffs(s.bitfield+1)-1;

man ffs for more.
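A self-contained version of the snippet above might look like the following sketch. It assumes a POSIX system (ffs() is declared in <strings.h> and is not part of ISO C) and an unsigned bit-field narrower than int; the tag sample and the function bitfield_width are hypothetical names.

```c
#include <strings.h>  /* ffs(): POSIX, not ISO C */

/* The struct from the question, given a hypothetical tag. */
struct sample { unsigned int bitfield : 3; };

/* Width of sample.bitfield, computed at run time.
 * Only valid for unsigned fields narrower than int. */
static int bitfield_width(void)
{
    struct sample s;
    s.bitfield = -1u;                /* wraps to the field's maximum, binary 111 */
    return ffs(s.bitfield + 1) - 1;  /* 7 + 1 = 8, ffs(8) = 4, so the result is 3 */
}
```

The "+ 1" turns the all-ones value into a single set bit one position above the field, and ffs() numbers bits from 1, hence the final "- 1".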

Solorzano answered 23/7, 2010 at 16:27 Comment(6)
+1 this is looking closer to an optimal length finder. what's ffs()?Endurant
@eruciform: ffs = find first set, a function (often mapped directly to a CPU instruction) that finds the first (least significant) set bit in an int. Bits are numbered from 1; if the input is 0, the return value is also 0.Solorzano
nice! that is definitely a function i've never seen. and here i thought i had largely visited the cobweb-encrusted corners of C over the years!Endurant
There's no (portable) way to have ffs computed at compile-time, so this is generally inefficient. However, your loops probably don't depend on the count of bits but just looping over bits, in which case you can initialize a bitfield with -1 and do something like for (counter.bf=-1; counter.bf; counter.bf>>=1). (Tip: this will only work if your bitfield is unsigned.)Loggerhead
Note that ffs is a POSIX function and not available on all platforms, or for datatypes larger than an int. Of course you can roll your own implementation but that would be a bit slow.Cluff
Interesting approach, but it will not work for signed bitfields nor for unsigned bitfields whose bit-length is exactly that of type unsigned int. Note also that it will mishandle bit-fields longer than that.Tanberg
G
4

I implemented this solution[1]

#include <stdio.h>
#define bitoffsetof(t, f) \
    ({ union { unsigned long long raw; t typ; }; \
    raw = 0; ++typ.f; __builtin_ctzll(raw); })

#define bitsizeof(t, f) \
    ({ union { unsigned long long raw; t typ; }; \
    raw = 0; --typ.f; (int)(8*sizeof(raw))-__builtin_clzll(raw)\
        -__builtin_ctzll(raw); })

struct RGB565 { unsigned short r:5, g:6, b:5; };
int main()
{
    printf("offset(width): r=%d(%d) g=%d(%d) b=%d(%d)\n",
        bitoffsetof(RGB565, r), bitsizeof(RGB565, r),
        bitoffsetof(RGB565, g), bitsizeof(RGB565, g),
        bitoffsetof(RGB565, b), bitsizeof(RGB565, b));
}


$ gcc bitfieldtest.cpp && ./a.out
offset(width): r=0(5) g=5(6) b=11(5)
[1] https://twitter.com/suarezvictor/status/1477697986243272706

UPDATE: I confirmed this is solved at compile time:

void fill(int *x)
{
    x[0]=bitoffsetof(RGB565, r);
    x[1]=bitsizeof(RGB565, r);
    x[2]=bitoffsetof(RGB565, g);
    x[3]=bitsizeof(RGB565, g);
    x[4]=bitoffsetof(RGB565, b);
    x[5]=bitsizeof(RGB565, b);
}

Assembler output:

fill:
.LFB12:
    .cfi_startproc
    movl    $0, (%rdi)
    movl    $5, 4(%rdi)
    movl    $5, 8(%rdi)
    movl    $6, 12(%rdi)
    movl    $11, 16(%rdi)
    movl    $5, 20(%rdi)
    ret
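
A variant of the same idea that stays within standard C99 might look like the sketch below. Instead of inspecting raw storage through a union, it initializes only the field to all ones with a compound literal and counts the bits read back through the field itself, so padding bits and union aliasing do not come into play. BITFIELD_WIDTH and count_set_bits are hypothetical names, and the sketch assumes an unsigned bit-field; it yields the width only, not the offset.

```c
/* Count the set bits in v with a plain loop (no compiler builtins). */
static int count_set_bits(unsigned long long v)
{
    int n = 0;
    while (v != 0) {
        n += (int)(v & 1u);
        v >>= 1;
    }
    return n;
}

/* Hypothetical macro: width in bits of the unsigned bit-field member f
 * of struct type T. Converting -1 to the field's type yields its
 * maximum value, which has exactly "width" bits set. */
#define BITFIELD_WIDTH(T, f) count_set_bits((T){ .f = -1 }.f)

struct RGB565 { unsigned short r:5, g:6, b:5; };
```

With these definitions, BITFIELD_WIDTH(struct RGB565, g) evaluates to 6. The result is not a constant expression in the language sense; whether it is folded to a constant depends on the compiler and optimization level.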
Geminate answered 2/1, 2022 at 18:8 Comment(5)
Interesting approach, but you might add that the value of padding bits may change when a bitfield value is changed, hence the above code is not guaranteed to work.Tanberg
I think that problem cannot occur, since the current implementation doesn't access any existing instance of the structure: it creates one just for counting, initializes it to zero as the first step, and then discards it. I guess the compiler can resolve all of that at compile time and just provide the needed constant. The macros just receive a structure type and the name of the field, so nothing from outside can be touched. Does this clarify why it cannot be a problem?Geminate
I confirm that the compiler solves that at compile time, see the updateGeminate
It only proves that a particular compiler, running the code at compile time performs well on this test. Fact is in this particular test there are no padding bits, so the behavior is well defined (except for the aliasing issue, namely reading from a union member that was not last written to, a different yet problematic issue) but the OP's example struct { unsigned int bitfield : 3; } s; has at least 5 padding bits whose value could change when s.bitfield is set.Tanberg
this is fantastic and I think with std::popcount and std::find/find_if being constexpr, even a portable template can be created along the very same lines. thank you for this idea!Brainstorming
T
-1

Use a set of #define statements to specify the bitwidths in the definition of the structure, and then use the same #define when printing, or whatever.

You get the same 'define once, use many times', albeit you do have the bitfield size definitions cluttering up your global name space:

# cat p.c
#include<stdio.h>
int main( int argc, char **argv )
{
   #define bitfield_sz 3
   struct { unsigned int bitfield : bitfield_sz; } s;
   fprintf( stdout, "size=%d\n", bitfield_sz );
}
# gcc p.c -o p
# ./p
size=3
#
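
One possible refinement, sketched under the assumption that the width is only needed as an integer constant: an enum constant can be used as a bit-field width just like a macro, but it obeys normal scoping rules. FLAGS_WIDTH, struct flags and count_bits_set are hypothetical names chosen for this example.

```c
/* Hypothetical names. Unlike a #define, an enum constant is confined
 * to the scope in which it is declared. */
enum { FLAGS_WIDTH = 3 };

struct flags { unsigned int bitfield : FLAGS_WIDTH; };

/* Count how many bits of the field are set, looping over its declared
 * width so the loop stays correct if FLAGS_WIDTH changes. */
static int count_bits_set(struct flags s)
{
    int n = 0;
    for (int i = 0; i < FLAGS_WIDTH; i++)
        n += (s.bitfield >> i) & 1;
    return n;
}
```

The loop bound and the field definition share a single constant, which is the "define once, use many times" property described above.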
Throaty answered 4/2, 2022 at 18:56 Comment(0)
