Let's say I have an array of unsigned chars that represents a bunch of POD objects (e.g. read from a socket or via mmap). Which types they represent and at what positions is determined at runtime, but we assume that each one is already properly aligned.
What is the best way to "cast" those bytes into the respective POD type?
A solution should either conform to the C++ standard (let's say >= C++11) or at least be guaranteed to work with g++ >= 4.9, clang++ >= 3.5 and MSVC >= 2015U3. EDIT: On Linux and Windows, running on x86/x64 or 32/64-bit ARM.
Ideally I'd like to do something like this:
uint8_t buffer[100]; //filled e.g. from network
switch(buffer[0]) {
case 0: process(*reinterpret_cast<Pod1*>(&buffer[4])); break;
case 1: process(*reinterpret_cast<Pod2*>(&buffer[8+buffer[1]*4])); break;
//...
}
or
switch(buffer[0]) {
case 0: {
auto* ptr = new(&buffer[4]) Pod1;
process(*ptr);
}break;
case 1: {
auto* ptr = new(&buffer[8+buffer[1]*4]) Pod2;
process(*ptr);
}break;
//...
}
Both seem to work, but both are AFAIK undefined behavior in C++ 1). And just for completeness: I'm aware of the "usual" solution to just copy the stuff into an appropriate local variable:
Pod1 tmp;
std::copy_n(&buffer[4],sizeof(tmp), reinterpret_cast<uint8_t*>(&tmp));
process(tmp);
In some situations this has no overhead, in others it does, and in some it might even be faster. But performance aside, I can no longer modify the data in place, for example, and to be honest: it just annoys me to know that I have the right bits at an appropriate location in memory but I just can't use them.
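For reference, in-place modification can still be expressed with the copy approach as a load, change, store round trip. The following is only a minimal sketch: `Pod1`'s layout and the helper names `load_pod`, `store_pod` and `bump_value` are hypothetical and chosen for illustration. It also assumes a standard library that provides `std::is_trivially_copyable`, which older libraries (I believe libstdc++ before GCC 5) may lack.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical POD type for illustration; the real layout is not given in the post.
struct Pod1 {
    std::uint32_t id;
    std::uint32_t value;
};

// Copy sizeof(T) bytes out of the buffer into a real T object.
template <class T>
T load_pod(const std::uint8_t* src) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    T tmp;
    std::memcpy(&tmp, src, sizeof(T));
    return tmp;
}

// Copy the object representation of a T back into the buffer.
template <class T>
void store_pod(std::uint8_t* dst, const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    std::memcpy(dst, &value, sizeof(T));
}

// "Modify in place" written as load, change, store.
// The offset 4 mirrors case 0 of the switch above.
void bump_value(std::uint8_t* buffer) {
    Pod1 p = load_pod<Pod1>(buffer + 4);
    p.value += 1;
    store_pod(buffer + 4, p);
}
```

This never creates a `Pod1*` into the byte array, so it avoids the aliasing question entirely, and it also works with a `const uint8_t*` source for read-only access. Whether the two copies are elided is up to the optimizer.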
A somewhat crazy solution I came up with is this:
template<class T>
T* inplace_cast(uint8_t* data) {
//checks omitted for brevity
T tmp;
std::memmove((uint8_t*)&tmp, data, sizeof(tmp));
auto ptr = new(data) T;
std::memmove(ptr, (uint8_t*)&tmp, sizeof(tmp));
return ptr;
}
g++ and clang++ seem to be able to optimize away those copies, but I think this puts a lot of burden on the optimizer and might cause other optimizations to fail. It also doesn't work with const uint8_t* (although I don't actually want to modify the data) and just looks horrible (I don't think you would get that past code review).
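The checks omitted from inplace_cast could look roughly like the sketch below. The helper name `fits_and_aligned` and its parameters are hypothetical. It assumes that converting a pointer to `std::uintptr_t` yields the numeric address, which holds on the platforms listed above but is implementation-defined in general.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Returns true if a T could live at data[offset]: the byte range lies
// inside the buffer and the address is suitably aligned for T.
template <class T>
bool fits_and_aligned(const std::uint8_t* data, std::size_t size,
                      std::size_t offset) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    // Bounds check written so that it cannot overflow.
    if (offset > size || size - offset < sizeof(T)) {
        return false;
    }
    // Assumption: the integer value of the pointer is the address.
    const std::uintptr_t addr =
        reinterpret_cast<std::uintptr_t>(data + offset);
    return addr % alignof(T) == 0;
}
```

A caller would run this check before any of the casts or copies shown above and reject the message if it fails.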
1) The first one is UB because it breaks strict aliasing; the second one is probably UB (discussed here) because the standard just says that the resulting object is not initialized and has an indeterminate value (instead of guaranteeing that the underlying memory is untouched). I believe the equivalent C code for the first one is well defined, so compilers might allow this for compatibility with C headers, but I'm unsure of this.
What is the best way to "cast" those bytes into the respective POD type?
I'm aware of the "usual" solution to just copy the stuff into an appropriate local variable
– Unguiculate

it just annoys me to know that I have the right bits at an appropriate location in memory but I just can't use them.
Then, maybe, C++ isn't the right language for you, or at least objects are not the right thing. If you want bits, why use structs/classes for the data at all? Just take the byte array and modify it like you want. – Unguiculate

It gives me enough control... pay the price for the overhead of copying the data.
I understand, but ... sometimes, we just can't have everything. The restrictions in the standard are real, and other than a) getting the standard modified and/or b) ensuring that a specific platform and compiler won't ever have problems with this kind of UB, I'm pretty sure there is no magic-bullet-solution. – Unguiculate

there is no magic-bullet-solution
That may be true, and I asked this question precisely to find that out. It never ceases to amaze me what you can do in C++ that probably wasn't intended by the designers of this or that feature. That being said, I showed one way to achieve pretty much what I want in (what I believe to be) standards-compliant C++ code. So it is possible; the question is now how to do it best. – Exaction

-fno-strict-aliasing is a possibility, but I'd prefer something that works with the default compiler settings, because 3 years from now someone will probably copy the code to another project and forget that the specific settings are necessary (of course the same problem might apply if one relies on a compiler extension). – Exaction

char* to refer to the memory of any object, but you can't use any pointer to refer to an array of chars. But thanks for having a look anyway. – Exaction

void *, then you can later static_cast it to the desired type. Basically you tell the compiler that the memory location is "typeless" before the cast. This should be safe. – Suppose

-flifetime-dse) – Fireman