Suppose we get some data as a sequence of bytes and want to reinterpret that sequence as a structure (with some guarantee that the data is indeed in the correct format). For example:
#include <fstream>
#include <vector>
#include <cstdint>
#include <cstdlib>
#include <iostream>

struct Data
{
    std::int32_t someDword[629835];
    std::uint16_t someWord[9845];
    std::int8_t someSignedByte;
};

Data* magic_reinterpret(void* raw)
{
    return reinterpret_cast<Data*>(raw); // BAD! Breaks strict aliasing rules!
}

// Reads exactly sizeof(Data) bytes from file.bin; aborts on any I/O failure.
std::vector<char> getDataBytes()
{
    std::ifstream file("file.bin", std::ios_base::binary);
    if (!file) std::abort();
    std::vector<char> rawData(sizeof(Data));
    file.read(rawData.data(), sizeof(Data));
    if (!file) std::abort();
    return rawData;
}

int main()
{
    auto rawData = getDataBytes();
    Data* data = magic_reinterpret(rawData.data());
    std::cout << "someWord[346]=" << data->someWord[346] << "\n";
    data->someDword[390875] = 23235;
    // Print the element just written rather than the address of the array.
    std::cout << "someDword[390875]=" << data->someDword[390875] << "\n";
}
Now the magic_reinterpret here is actually bad, since it breaks strict aliasing rules and thus causes UB.
How should it instead be implemented to not cause the UB and not do any copies of data like with memcpy?
EDIT: the getDataBytes() function above should be treated as unchangeable. A real-world example is ptrace(2), which on Linux, when request==PTRACE_GETREGSET and addr==NT_PRSTATUS, writes (on x86-64) one of two possible structures of different sizes, depending on tracee bitness, and returns the size. Here the code calling ptrace can't predict which type of structure it will get until it actually makes the call. How could it then safely reinterpret the results it gets as the correct pointer type?
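To make that situation concrete, here is a minimal sketch of it. Everything in it is hypothetical: fakeGetRegset stands in for the ptrace call, and Regs32 and Regs64 stand in for the two structures (they are not the kernel's real layouts). The caller dispatches on the returned size and then uses std::memcpy for the typed access. That is well-defined, but it is exactly the copy the question would like to avoid, so it shows the baseline rather than an answer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for the two register layouts (not the real kernel structures).
struct Regs32 { std::uint32_t regs[17]; };
struct Regs64 { std::uint64_t regs[27]; };

// Hypothetical stand-in for the ptrace call: writes one of the two layouts
// into buf and returns the number of bytes written (0 if buf is too small).
std::size_t fakeGetRegset(bool tracee64, void* buf, std::size_t capacity)
{
    if (tracee64) {
        Regs64 r{};
        r.regs[0] = 0x1122334455667788ULL;
        if (capacity < sizeof r) return 0;
        std::memcpy(buf, &r, sizeof r);
        return sizeof r;
    }
    Regs32 r{};
    r.regs[0] = 0xAABBCCDDu;
    if (capacity < sizeof r) return 0;
    std::memcpy(buf, &r, sizeof r);
    return sizeof r;
}

// Returns the first register, whichever layout was written.
std::uint64_t firstRegister(bool tracee64)
{
    unsigned char buf[sizeof(Regs64)]; // large enough for either layout
    const std::size_t written = fakeGetRegset(tracee64, buf, sizeof buf);

    // The structure type is only known now, from the returned size.
    if (written == sizeof(Regs64)) {
        Regs64 r;
        std::memcpy(&r, buf, sizeof r); // typed access through a copy
        return r.regs[0];
    }
    assert(written == sizeof(Regs32)); // only two layouts exist in this sketch
    Regs32 r;
    std::memcpy(&r, buf, sizeof r);
    return r.regs[0];
}
```

The open question is what to write in place of the two std::memcpy calls in firstRegister so that buf can be read directly as the correct type.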
reinterpret_cast). You could do it in C and use a union to do type-punning, maybe write a function in C to do only this? – Aeriela