Why are non-const references to bitfields prohibited?

Asked 12/7, 2013 at 5:29 Answered 12/7, 2013 at 6:46

Section 9.6/3 in C++11 is unusually clear: "A non-const reference shall not be bound to a bit-field." What is the motivation behind this prohibition?

I understand that it's not possible to directly bind a reference to a bitfield. But if I declare something like this,

#include <cstdint>

struct IPv4Header {
  std::uint32_t version:4,         // assumes the IPv4 Wikipedia entry is correct
                IHL:4,
                DSCP:6,
                ECN:2,
                totalLength:16;
};

why can't I say this?

IPv4Header h;

auto& ecn = h.ECN;

I'd expect the underlying code to actually bind to the entire std::uint32_t that contains the bits I'm interested in, and I'd expect read and write operations to generate code to do the appropriate masking. The result might be big and slow, but it seems to me that it should work. This would be consistent with the way the Standard says that references to const bitfields work (again from 9.6/3):

If the initializer for a reference of type const T& is an lvalue that refers to a bit-field, the reference is bound to a temporary initialized to hold the value of the bit-field; the reference is not bound to the bit-field directly.

This suggests that writing to bitfields is the problem, but I don't see what it is. I considered the possibility that the necessary masking could introduce races in multithreaded code, but, per 1.7/3, adjacent bitfields of non-zero width are considered a single object for purposes of multithreading. In the example above, all the bitfields in an IPv4Header object would be considered a single object, so multithreaded code attempting to modify a field while reading other fields would, by definition, already be racy.

I'm clearly missing something. What is it?

Edmond answered 12/7, 2013 at 5:29 Comment(0)

Non-const references can't be bound to bit-fields for the same reason pointers can't point to bit-fields.

While it is not specified whether references occupy storage, it is clear that in non-trivial cases they are implemented as pointers in disguise, and this implementation of references is "intended" by the authors of the language. And just like pointers, references have to point to an addressable storage unit. On typical hardware, the smallest addressable storage unit is the byte (not the bit). It is impossible to bind a non-const reference to a storage unit that is not addressable. Since non-const references require direct binding, a non-const reference cannot be bound to a bit-field. You can take a const reference only because the compiler is allowed to copy the value.
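To make the difference between direct binding and copying concrete, here is a minimal sketch that reuses the IPv4Header struct from the question. The function name ecn_seen_through_const_ref is made up for illustration. It assumes the behaviour described above: the const reference is bound to a temporary that holds a copy, so later writes to the bit-field are not visible through it.

```cpp
#include <cassert>
#include <cstdint>

struct IPv4Header {
  std::uint32_t version : 4, IHL : 4, DSCP : 6, ECN : 2, totalLength : 16;
};

// Hypothetical helper: returns the value seen through a const reference
// after the bit-field it was initialized from has been modified.
inline unsigned ecn_seen_through_const_ref() {
  IPv4Header h{};
  h.ECN = 1;

  // Not a direct binding: the reference refers to a temporary copy of h.ECN.
  const std::uint32_t& ecn = h.ECN;

  // auto& bad = h.ECN;  // ill-formed: non-const reference to a bit-field
  // auto* ptr = &h.ECN; // ill-formed: address of a bit-field

  h.ECN = 2;           // changes the bit-field, not the temporary
  assert(h.ECN == 2);
  return ecn;          // still the value copied at binding time
}
```

If the reference were bound directly, the function would return 2; with the copy it returns 1.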

The only way to produce a pointer/reference that can point to bit-fields would be to implement some sort of "superpointer" that, in addition to the actual address in storage, would also contain bit-offset and bit-width information, in order to tell the writing code which bits to modify. Note that this additional information would have to be present in all data pointer types, since there's no such type in C++ as "pointer/reference to bit-field". This is basically equivalent to implementing a higher-level storage addressing model, quite detached from the addressing model provided by the underlying OS/hardware platform. Purely for efficiency reasons, the C++ language never intended to require that sort of abstraction from the underlying platform.
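As a rough illustration of what such a "superpointer" would have to carry, here is a library-level sketch. BitFieldRef is a hypothetical class, not anything the language or the standard library provides, and it assumes the field lives inside a single std::uint32_t word.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical "superpointer": the address of the containing word plus the
// bit offset and bit width of the field inside that word.
class BitFieldRef {
 public:
  BitFieldRef(std::uint32_t& word, unsigned offset, unsigned width)
      : word_(&word), offset_(offset), width_(width) {
    assert(width >= 1 && width <= 32 && offset <= 32 - width);
  }

  // Read: shift the field down and mask off the neighbouring bits.
  operator std::uint32_t() const { return (*word_ >> offset_) & mask(); }

  // Write: clear the field's bits, then OR in the (truncated) new value.
  BitFieldRef& operator=(std::uint32_t value) {
    *word_ = (*word_ & ~(mask() << offset_)) | ((value & mask()) << offset_);
    return *this;
  }

 private:
  // Low `width_` bits set; width 32 is special-cased to avoid shifting by 32.
  std::uint32_t mask() const {
    return width_ == 32 ? 0xFFFFFFFFu : ((std::uint32_t{1} << width_) - 1u);
  }

  std::uint32_t* word_;
  unsigned offset_;
  unsigned width_;
};
```

For example, BitFieldRef ecn(word, 14, 2); ecn = 1u; rewrites only bits 14 and 15 of word. The object is necessarily larger than a plain std::uint32_t*, and that is the extra information every data pointer would need to carry if the language supported this directly.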

One viable approach would be to introduce a separate category of pointers/references such as "pointer/reference to bit-field", which would have a more complicated inner structure than an ordinary data pointer/reference. Such types would be convertible from ordinary data pointer/reference types, but not the other way around. But it doesn't seem to be worth it.

In practical cases, when I have to deal with data packed into bits and sequences of bits, I often prefer to implement bit-fields manually and avoid language-level bit-fields. The name of a bit-field is a compile-time entity with no possibility of run-time selection of any kind. When run-time selection is necessary, a better approach is to declare an ordinary uint32_t data field and manage the individual bits and groups of bits inside it manually. The run-time selection of such a manual "bit-field" is easily implemented through masks and shifts (both can be run-time values). Basically, this is close to a manual implementation of the aforementioned "superpointers".
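A minimal sketch of the manual approach, with the field chosen at run time by indexing a table. The names Field, kFields, get_field and set_field are made up for illustration, and the bit positions are an assumed layout, not what any particular compiler does for the struct in the question.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical descriptor for one manually managed field.
struct Field {
  unsigned shift;  // position of the lowest bit of the field
  unsigned width;  // number of bits, 1..31 in this sketch
};

// Assumed layout, lowest bits first; purely illustrative.
constexpr Field kFields[] = {
    {0, 4},    // version
    {4, 4},    // IHL
    {8, 6},    // DSCP
    {14, 2},   // ECN
    {16, 16},  // totalLength
};

// Mask with the field's bits set, already shifted into position.
inline std::uint32_t field_mask(Field f) {
  assert(f.width >= 1 && f.width < 32 && f.shift + f.width <= 32);
  return ((std::uint32_t{1} << f.width) - 1u) << f.shift;
}

// Extract the field's value from the word.
inline std::uint32_t get_field(std::uint32_t word, Field f) {
  return (word & field_mask(f)) >> f.shift;
}

// Return a copy of the word with the field replaced by value (truncated to fit).
inline std::uint32_t set_field(std::uint32_t word, Field f, std::uint32_t value) {
  return (word & ~field_mask(f)) | ((value << f.shift) & field_mask(f));
}
```

Because shift and width are ordinary values, something like set_field(w, kFields[i], v) works for an index i that is only known at run time, which a language-level bit-field name cannot express.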

Admiralty answered 12/7, 2013 at 6:46 Comment(1)

I'm marking this as the answer, because it makes explicit what I think is the key argument: that if a reference to the word holding a bitfield were to work the way I sketched, there would be a need for additional information regarding the offset of the bitfield into the word, and that's not practically implementable given the references-are-pointers-under-the-hood model that C++ employs. – Edmond 12/7, 2013 at 16:8

You can’t take a non-const reference to a bitfield for the same reason you can’t take its address with &: its actual address is not necessarily aligned to char, which is definitionally the smallest addressable unit of memory in the C++ abstract machine. You can take a const reference to it because the compiler is free to copy the value, as it won’t be mutated.

Consider the issue of separate compilation. A function taking a const uint32_t& needs to use the same code to operate on any const uint32_t&. If different write behaviour is required for ordinary values and bitfield values, then the type doesn’t encode enough information for the function to work correctly on both.
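A small sketch of the separate-compilation point, under the assumption that store_two stands in for a function compiled in another translation unit (both function names here are hypothetical). The callee only knows how to store a whole std::uint32_t, so the caller has to copy the bit-field out, make the call, and write the result back with the masking done at the call site.

```cpp
#include <cassert>
#include <cstdint>

struct IPv4Header {
  std::uint32_t version : 4, IHL : 4, DSCP : 6, ECN : 2, totalLength : 16;
};

// Stand-in for separately compiled code: it writes a full 32-bit object and
// has no way of knowing whether the caller's value lives in a bit-field.
inline void store_two(std::uint32_t& value) { value = 2; }

// What the caller has to do today: copy out, call, write back.
inline void set_ecn_via_copy(IPv4Header& h) {
  std::uint32_t tmp = h.ECN;  // copy the 2-bit value into a real object
  store_two(tmp);             // callee writes the ordinary object
  assert(tmp <= 3u);          // the result must still fit in 2 bits
  h.ECN = tmp & 0x3u;         // masking happens here, in the caller
}
```

Calling store_two(h.ECN) directly does not compile, and if it did, the plain 32-bit store inside store_two would have no way to preserve the neighbouring fields.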

Lubber answered 12/7, 2013 at 5:36 Comment(8)

This doesn't really answer the question, IMO. Why can't the non-const reference bind to the word containing the bitfield, then, on a write, perform the necessary masking to modify only the bits in the bitfield? This is presumably what happens when the bitfield is directly modified, no? – Edmond 12/7, 2013 at 5:57

The last bit is wrong. A const reference implies nothing about the value being or not being mutated. It only prevents mutation through that reference. – Birthroot 12/7, 2013 at 6:28

@AndreyT That still doesn't make that wording any more correct. The compiler is not free to copy it because it won't be mutated. There's no causation here. The compiler is free to copy it by decree. It's not because of the mutability of the bitfield, it's because it is written so. Actually, it's not really free to copy it, it's required to copy it. – Birthroot 12/7, 2013 at 7:1

@R.MartinhoFernandes: I have updated my answer with better justification. Compiling KnowItAllWannabe’s code with Clang (no optimisations) produces a copy, shift, and bitwise AND when binding to the reference; even if you add a function call, it’s done at the call site. I can’t think of another way to implement it without whole-program compilation. – Lubber 12/7, 2013 at 7:12

To implement it, a reference to "uint32_t : 4" would probably need to store both the address of the uint32_t and the offset of the 4 bits used. That should be possible. However, keeping in mind the "const means thread-safe" discussion, providing non-const references to bitfields would still be problematic, as a write through the reference could affect other bitfields. – Gantrisin 12/7, 2013 at 7:25

@R.MartinhoFernandes: Technically speaking, I don't believe that const references to bitfields are required to copy anything, because the text in 9.6/3 that I quoted is in a non-normative note. – Edmond 12/7, 2013 at 16:4

"definitionally" To be a pedant, I'd appreciate a citation to where this is defined. Not that I doubt it, but still... Anyway, the 2nd paragraph is a very succinct practical argument about this, following the theoretical one. – Lamia 16/1, 2018 at 9:55

Sure. :) It’s implied by §4.4¶1 “The fundamental storage unit in the C++ memory model is the byte. […] Every byte has a unique address” and §6.9.2¶3 “A value of a pointer type that is a pointer to […] an object represents the address of the first byte in memory occupied by the object”. (Source: N4659.) Technically this doesn’t exclude the possibility that things other than bytes also have unique addresses, but every edition of the standard is consistent with the assumption that it was intended to be exclusive, due to the explicit prohibition on taking the address of a bitfield. – Lubber 16/1, 2018 at 23:26
