std::bit_cast padding and undefined behavior

I want to know how to use std::bit_cast in a well-defined way, notably in the presence of indeterminate bits. When is std::bit_cast usage defined behavior, and when is it undefined behavior?

In particular, I need clarification on cppreference's wording for std::bit_cast. I don't understand the following paragraph:

For each bit in the value representation of the result that is indeterminate, the smallest object containing that bit has an indeterminate value; the behavior is undefined unless that object is of unsigned char or std::byte type. The result does not otherwise contain any indeterminate values.

When the input (From) contains indeterminate bits (notably padding bits), which objects does the paragraph refer to, and what is the significance of the distinction between their types (unsigned char or std::byte vs. any other type)?

Let's write an example (not an actual use-case, merely an illustration of a situation with indeterminate bits):

#include <bit>
#include <cstdint>
struct S40 {
    std::uint8_t a = 0x51;
    std::uint32_t b = 0xa353c0f1;
};
struct S3 {
    std::uint8_t a : 3;
};

int main() {
    S40 s40;
    std::uint64_t a64 = std::bit_cast<std::uint64_t>(s40);
    // (3) on tested compilers, a64 is 0xa353c0f1UUUUUU51, where each U is an indeterminate hex digit (3 padding bytes)
    S3 s3;
    s3.a = 0b101;
    std::uint8_t a8 = std::bit_cast<std::uint8_t>(s3);
    // (4) on tested compilers, a8 is uuuuu101, u being indeterminate bits
}

In this example, I rely on alignment to induce padding bits between a and b
(note how clang actually writes garbage into the padding bits).
Is the production of a64 undefined behavior?
With S3, I tried to emulate a type narrower than one byte, which leaves padding bits.
Is the production of a8 defined behavior, while its value is undefined due to the presence of indeterminate bits?
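
As a side note, the presence of padding in a type such as S40 can be detected at compile time. The following is a minimal sketch, not a definitive check: `maybe_padded` is a hypothetical name introduced here, and it relies on std::has_unique_object_representations, which is conservative (it also reports false for types such as floating-point types that have no padding). It assumes a typical platform where std::uint32_t is aligned to more than one byte.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

struct S40 {
    std::uint8_t a = 0x51;
    std::uint32_t b = 0xa353c0f1;
};

// Hypothetical helper: true when T's object representation is not unique,
// which is the case (among others) when T contains padding bits.
template <class T>
inline constexpr bool maybe_padded = !std::has_unique_object_representations_v<T>;
```

A guard such as `static_assert(!maybe_padded<From>)` in front of a std::bit_cast to an integer type would reject S40 before any indeterminate bit can reach the result.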


For the record, I leave here the original example, which tried to stress the possible difference between the 1-byte case and the other situations. But as one answer rightly pointed out, it does not demonstrate anything, because passing a bit-field to std::bit_cast implies a conversion to a fully defined value (of its underlying type).

#include <bit>
#include <cstdint>

struct S9 {
    std::uint16_t a : 9;
};
struct S7 {
    std::uint8_t a : 7;
};

int main() {
    S9 s9;
    s9.a = 42;
    std::uint16_t a16 = std::bit_cast<std::uint16_t>(s9.a);
    // (1) a16 may be uuuuuuu000101010, u being indeterminate bits
    S7 s7;
    s7.a = 42;
    std::uint8_t a8 = std::bit_cast<std::uint8_t>(s7.a);
    // (2) a8 may be u0101010, u being indeterminate bits
}

What are the objects we are speaking of?
Is the production of a8 considered as defined behavior, while its value is undefined?
Is the production of a16 considered as undefined behavior, while its value is undefined?


Brant answered 13/2, 2024 at 14:30 Comment(11)
Not that it will necessarily always help, but when cppreference seems vague, I recommend reading an actual standard draft. For example, eel.is maintains the latest draft. - Isotone
Good idea. If I understand my draft correctly, there is no undefined behavior unless the "cast" results in something that does not correspond to a valid value representation for the returned type. You may still have indeterminate values. - Brant
@Brant Unfortunately, in this case, the standard contains the exact same wording as cppreference. It also says "[...] for each bit in the value representation of the result that is indeterminate, the smallest object containing that bit has an indeterminate value; the behavior is undefined [...]" - Cauca
@FrançoisAndrieux Ah, my draft is N4860 2020/03/31. Too old? - Brant
"// (1) a16 is uuuuuuu000101010" bitset is implementation-defined, so it is one possible implementation. - Hypostatize
@Brant N4861: C++20 final working draft and N4868: C++20 first post-publication draft (contains editorial fixes to C++20 only). For C++20, I'd go for N4868. - Tallowy
@Hypostatize Indeed, but I allowed myself this shortcut to illustrate the problem. - Brant
Bitsets do not do what you think they are doing. They are fully implementation-defined and will not be exactly the number of bits you specify. In fact, the only guarantee you have is that they behave as values with the number of bits specified. Just think of bit_cast as a type punning mechanism (memcpy + start of lifetime of the object cast to). - Acrocarpous
I'm lost: the most recent C++20 wording I can find from cppreference is timsong-cpp.github.io/cppwp/n4868/bit.cast#lib:bit_cast, which does not mention indeterminate bits or undefined behavior, while the latest draft at eel.is/c++draft/bit.cast#lib:bit_cast does. I think the source of (my) confusion was introduced in C++23. - Brant
@PepijnKramer Small nit, but I think you meant to say "bit fields" and not "bitsets". - Isotone
@Isotone No problem, you're absolutely right ;) - Acrocarpous

In your example you are passing bit-fields by const lvalue reference to std::bit_cast.

A bit-field can't be bound to a reference. What happens instead is that a new object of the referenced type is created as a temporary object with the value of the bit-field. That temporary object is then bound to the reference in the bit_cast parameter and bit_cast operates on that as source.

So, nothing happens here. In both cases the source object is already of the type to which you try to bit_cast. You are simply copying the source with the same type. The representation of the bit-field has already become irrelevant with the conversion in the function argument.

The matter would be very different if you passed s9/s7 directly to bit_cast, as the other answer explains.

Introjection answered 13/2, 2024 at 17:26 Comment(2)
I realize that my example with bit fields has only created confusion. My question has nearly nothing to do with that, only with std::bit_cast behavior. Besides, I overlooked the implicit conversion that happens when passing a bit-field value around. I will edit my question with, I hope, a better example. - Brant
@Brant I don't have time to fully update my answer and there are probably also some details where the standard isn't clear (at least to me), but in short: "Is the production of a64 undefined behavior?": Yes, assuming it isn't ill-formed because of size mismatch. "Is the production of a8 defined behavior, while its value is undefined due to the presence of indeterminate bits?": The value is not undefined, but indeterminate. Otherwise yes, assuming uint8_t is unsigned char (in theory could be another type). Using an indeterminate value for anything but copying it is UB. - Introjection
