Can you std::bit_cast to a std::array to obtain the bytes of an object?

In his recent talk “Type punning in modern C++” Timur Doumler said that std::bit_cast cannot be used to bit cast a float into an unsigned char[4] because C-style arrays cannot be returned from a function. We should either use std::memcpy or wait until C++23 (or later) when something like reinterpret_cast<unsigned char*>(&f)[i] will become well defined.
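For reference, the std::memcpy route mentioned in the talk could look roughly like the sketch below. The helper names float_to_bytes and bytes_to_float are made up for illustration; the only assumption is that float is trivially copyable, which it is.

```cpp
#include <array>
#include <cstring>

// Hypothetical helper: copy the object representation of a float
// into a byte array, in memory order.
std::array<unsigned char, sizeof(float)> float_to_bytes(float f) {
    std::array<unsigned char, sizeof(float)> bytes{};
    // Copying into the array's element storage via data() is well defined.
    std::memcpy(bytes.data(), &f, sizeof f);
    return bytes;
}

// Hypothetical inverse: rebuild the float from the same bytes.
float bytes_to_float(const std::array<unsigned char, sizeof(float)>& bytes) {
    float f;
    std::memcpy(&f, bytes.data(), sizeof f);
    return f;
}
```

Because the copy goes through data(), this version does not depend on how std::array itself is laid out, only on its elements being contiguous.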

In C++20, can we use a std::array with std::bit_cast,

float f = /* some value */;
auto bits = std::bit_cast<std::array<unsigned char, sizeof(float)>>(f);

instead of a C-style array to get bytes of a float?

Resonator answered 10/10, 2019 at 9:59 Comment(0)

Yes, this works on all major compilers, and as far as I can tell from looking at the standard, it is portable and guaranteed to work.

First of all, std::array<unsigned char, sizeof(float)> is guaranteed to be an aggregate (https://eel.is/c++draft/array#overview-2). It follows that it holds exactly sizeof(float) elements of type unsigned char (typically as an unsigned char[sizeof(float)] member; as far as I can see, the standard doesn't mandate this particular implementation, but it does say the elements must be contiguous) and cannot have any additional non-static data members.

It is therefore trivially copyable, and its size matches that of float as well.

Those two properties allow you to bit_cast between them.

Kohn answered 10/10, 2019 at 20:27 Comment(5)
Note that struct X { unsigned char elems[5]; }; satisfies the rule you're citing. It can certainly be list-initialized with up to 4 elements. It can also be list-initialized with 5 elements. I don't think any standard library implementer hates people enough to actually do this, but I think it's technically conformant. – Byerly
Thanks! – Barry, I don't think that's quite right. The standard says: "can be list-initialized with up to N elements". My interpretation is that "up to" implies "no more than". Which means you can't do elems[5]. And at that point I can't see how you could end up with an aggregate where sizeof(array<char, sizeof(T)>) != sizeof(T)? – Kohn
I believe the purpose of the rule ("an aggregate that can be list-initialized...") is to allow either struct X { unsigned char c1, c2, c3, c4; }; or struct X { unsigned char elems[4]; }; – so while the chars need to be the elements of that aggregate, this allows them to be either direct aggregate elements or elements of a single sub-aggregate. – Kohn
@Timur "up to" does not imply "no more than". In the same way that the implication P -> Q does not imply anything about the case where !P. – Byerly
Even if the aggregate contains nothing but an array of exactly 4 elements, there's no guarantee that array itself won't have padding. Implementations of it may not have padding (and any implementations that do should be considered dysfunctional), but there's no guarantee that array itself will not. – Lehman

The accepted answer is incorrect because it fails to consider alignment and padding issues.

Per [array]/1-3:

The header <array> defines a class template for storing fixed-size sequences of objects. An array is a contiguous container. An instance of array<T, N> stores N elements of type T, so that size() == N is an invariant.

An array is an aggregate that can be list-initialized with up to N elements whose types are convertible to T.

An array meets all of the requirements of a container and of a reversible container ([container.requirements]), except that a default constructed array object is not empty and that swap does not have constant complexity. An array meets some of the requirements of a sequence container. Descriptions are provided here only for operations on array that are not described in one of these tables and for operations where there is additional semantic information.

The standard does not actually require std::array to have exactly one public data member of type T[N], so in theory it is possible that sizeof(To) != sizeof(From) or !is_trivially_copyable_v<To>.

I will be surprised if this doesn't work in practice, though.

Ploughboy answered 10/10, 2019 at 10:23 Comment(1)
If they are different sizes, the code should fail to compile, right? As long as it doesn't compile to something broken I'd call that solid. – Winniewinnifred

Yes.

According to the paper that describes the behaviour of std::bit_cast and its proposed implementation, the cast should succeed as long as both types have the same size and are trivially copyable.

A simplified implementation of std::bit_cast should be something like:

#include <cstring>      // std::memcpy
#include <type_traits>  // std::is_trivially_copyable

template <class Dest, class Source>
inline Dest bit_cast(Source const &source) {
    static_assert(sizeof(Dest) == sizeof(Source));
    static_assert(std::is_trivially_copyable<Dest>::value);
    static_assert(std::is_trivially_copyable<Source>::value);

    Dest dest;  // needs Dest to be default-constructible, unlike std::bit_cast
    std::memcpy(&dest, &source, sizeof(dest));
    return dest;
}

Since a float (typically 4 bytes) and a std::array of unsigned char with sizeof(float) elements satisfy all of those assertions, the underlying std::memcpy will be carried out. Therefore, each element in the resulting array will be one consecutive byte of the float.

To demonstrate this behaviour, I wrote a small example in Compiler Explorer that you can try here: https://godbolt.org/z/4G21zS. The float 5.0 is properly stored as an array of bytes (0x40a00000) that corresponds to the hexadecimal representation of that float number in big-endian order.

Stride answered 10/10, 2019 at 12:17 Comment(12)
Are you sure that std::array is guaranteed to not have padding bits etc.? – Ploughboy
Unfortunately, the mere fact that some code works doesn't imply no UB in it. For example, we can write auto bits = reinterpret_cast<std::array<unsigned char, sizeof(float)>&>(f) and get exactly the same output. Does it prove anything? – Resonator
@L.F. according to specification: std::array satisfies the requirements of ContiguousContainer (since C++17). – Stride
@ManuelGil: std::vector also satisfies the same criteria and obviously cannot be used here. Is there something requiring that std::array hold the elements inside the class (in a field), preventing it from being a simple pointer to the inner array? (like in vector, which also has a size, which array does not require to have in a field) – Psychology
@Psychology The aggregate requirement of std::array effectively requires it to store the elements inside, but I am worried about layout problems. – Ploughboy
@Psychology [array.overview]/2 – Ploughboy
@L.F.: eel.is/c++draft/dcl.init.aggr#def:aggregate - "no user-declared or inherited constructors" ... well, that sounds like a pretty big constraint, it cannot be anything other than a wrapped array. – Psychology
@Psychology I know, but we can still have layout issues. timsong-cpp.github.io/cppwp/n4659/class#bit-1 – Ploughboy
@L.F.: I still cannot see your point (other than that it could be over-aligned, possibly). If array is an aggregate (no custom ctor) with list-initialisation, then it has to be an array wrapper (or maybe a recursive template like tuple, adding fields one by one). It cannot have any other field, so could alignas mess things up? Or what do you suggest? – Psychology
Let us continue this discussion in chat. – Psychology
@Psychology what was the conclusion from the chat discussion? I'm curious whether the accepted answer can be safely used or not. – Keratoid
@Keratoid Too long to remember. My point of view is that the accepted answer is correct and that there are no padding/alignment issues, because byte/char impose none (the float has to be at least as aligned as char/byte). No padding issues either (we are not reinterpreting a class/struct but a raw float). – Psychology
