memmove in-place change of effective type (type-punning)

Asked 14/10, 2017 at 14:46 Answered 23/6, 2021 at 18:18

In the following question: What's a proper way of type-punning a float to an int and vice-versa?, the conclusion is that the way to construct doubles from integer bits and vice versa is via memcpy.

That's fine, and the pseudo_cast conversion method found there is:

template <typename T, typename U>
inline T pseudo_cast(const U &x)
{
    static_assert(sizeof(T) == sizeof(U));    
    T to;
    std::memcpy(&to, &x, sizeof(T));
    return to;
}

and I would use it like this:

int main(){
  static_assert(std::numeric_limits<double>::is_iec559);
  static_assert(sizeof(double)==sizeof(std::uint64_t));
  std::uint64_t someMem = 4614253070214989087ULL;
  std::cout << pseudo_cast<double>(someMem) << std::endl; // 3.14
}
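
For reference, here is a self-contained sketch of the same memcpy approach going in both directions. The helper names double_to_bits, bits_to_double and check_round_trip are made up for this illustration, and the code assumes 64-bit IEEE 754 doubles (checked with static_assert):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

template <typename T, typename U>
inline T pseudo_cast(const U& x)
{
    static_assert(sizeof(T) == sizeof(U), "T and U must have the same size");
    T to;                             // a real T object exists before the copy
    std::memcpy(&to, &x, sizeof(T));  // copy the object representation of x into it
    return to;
}

static_assert(std::numeric_limits<double>::is_iec559, "assumes IEEE 754 doubles");
static_assert(sizeof(double) == sizeof(std::uint64_t), "assumes 64-bit doubles");

// Hypothetical convenience wrappers, not part of the original question.
inline std::uint64_t double_to_bits(double d) { return pseudo_cast<std::uint64_t>(d); }
inline double bits_to_double(std::uint64_t b) { return pseudo_cast<double>(b); }

inline void check_round_trip()
{
    // 4614253070214989087 == 0x40091EB851EB851F, the bit pattern of 3.14
    assert(bits_to_double(4614253070214989087ULL) == 3.14);
    assert(double_to_bits(3.14) == 4614253070214989087ULL);
    assert(double_to_bits(1.0) == 0x3FF0000000000000ULL);
}
```

Both directions copy into a freshly declared local object, so no object is ever accessed through a glvalue of the wrong type.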

My interpretation from just reading the standard and cppreference was that it should also be possible to use memmove to change the effective type in-place, like this:

template <typename T, typename U>
inline T& pseudo_cast_inplace(U& x)
{
    static_assert(sizeof(T) == sizeof(U));
    T* toP = reinterpret_cast<T*>(&x);
    std::memmove(toP, &x, sizeof(T));
    return *toP;
}

template <typename T, typename U>
inline T pseudo_cast2(U& x)
{
    return pseudo_cast_inplace<T>(x); // return by value
}

The reinterpret cast in itself is legal for any pointer (as long as cv is not violated, item 5 at cppreference/reinterpret_cast). Dereferencing however requires memcpy or memmove (§6.9.2), and T and U must be trivially copyable.

Is this legal? It compiles and does the right thing with gcc and clang. The memmove source and destination are explicitly allowed to overlap, according to the cppreference pages for std::memmove and memmove:

The objects may overlap: copying takes place as if the characters were copied to a temporary character array and then the characters were copied from the array to dest.
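
A small sketch of what that overlap guarantee means for ordinary byte data. The helper shift_right_overlapping is hypothetical and only illustrates the "as if via a temporary array" wording; it says nothing about whether the type-changing use above is legal:

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper: copies the first `count` characters of s to position
// `shift` within the same buffer. Source [0, count) and destination
// [shift, shift + count) overlap whenever shift < count.
inline std::string shift_right_overlapping(std::string s, std::size_t shift, std::size_t count)
{
    assert(shift + count <= s.size());  // stay inside the buffer
    // memmove behaves as if the source were first copied to a temporary
    // array, so the overlap cannot corrupt the copied characters.
    std::memmove(&s[0] + shift, &s[0], count);
    return s;
}
```

With std::memcpy the same call would have undefined behaviour, because its source and destination ranges must not overlap.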

Edit: originally the question had a trivial error (causing segfault) spotted by @hvd. Thank you! The question remains the same, is this legal?

Elene answered 14/10, 2017 at 14:46 Comment(16)

Looks like a strict alias violation to me because x and the return value of pseudo_cast_inplace point to the same memory location but have different types. – Mayamayakovski 14/10, 2017 at 14:54

@nwp, ok... But - Reading about effective types and strict aliasing, at en.cppreference.com/w/c/language/object#Effective_type - I thought I changed the effective type and after that I'm not "using" the original, or am I? Compare the effective type "setting" example here: en.cppreference.com/w/c/string/byte/memmove – Elene 14/10, 2017 at 15:4

std::memmove(&toP, ...) -- toP is a pointer already, you don't want a pointer to a pointer at this point. – Pyjamas 14/10, 2017 at 15:12

Trivial error. Thank you. With that the code works. The question remains the same - is it legal? – Elene 14/10, 2017 at 15:16

Worth pointing out is that "effective type" is a C term and the rules in C++ are different, but both with C's rules and with C++'s it's a good question. – Pyjamas 14/10, 2017 at 15:24

An interesting data point is that with MSVS2017 the inplace-version generates a real function call, while the pseudo_cast just becomes vmovsd xmm1,qword ptr [rsp+50h]. memcpy is well known and gets special treatment from many compilers. – Puett 14/10, 2017 at 15:28

@BoPersson Might be that it does not dare to optimize explicit memmove's at all. – Dc 14/10, 2017 at 15:30

Don't you think that effective type of memmove input and output in this case is the same U type? – Housewifery 14/10, 2017 at 15:34

I think you've stumbled into an area of the c++ standard that has disappeared up its own backside. There is a struggle in the c++ community between the "optimisationists" and the "objects are just bytes" camps. At the moment the optimisationists have the upper hand and as a result the second function transgresses the strict alias rule which a lot of code optimisation depends upon. Returning a different object is the way to do it. I know it looks daft, and to an old-timer assembly programmer like me feels wrong, but it gives the optimiser every opportunity to make better code. – Navarro 14/10, 2017 at 16:32

It makes absolutely no sense to move data from a location to the same location! Thus, the line std::memmove(toP, &x, sizeof(T)); makes no sense after that line: T* toP = reinterpret_cast<T*>(&x); std::memmove(toP, &x, sizeof(T));. Don't write such code! – Anisette 14/10, 2017 at 18:15

That kind of code is not portable as the standard allows different binary representation for integers and also for floating points values. – Anisette 14/10, 2017 at 18:28

@Anisette The question is not about taste, but regards what the standard says about a specific construct. Further, constructing doubles out of bits is an extremely useful thing to do in very specific cases for example in a specific kind of high performance double random number construction, and serialisation. Even Java (sic) has a way to do this now, for example Double.longBitsToDouble(0x3FFL << 52 | x >>> 12) - 1.0 and the question I linked to shows how to do it portably (under implementation specified constraints, not undefined), given sufficient asserts, eg ::is_iec559, sizeof, ... – Elene 14/10, 2017 at 18:43
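
As an illustration of the Java expression quoted in this comment, a C++ counterpart built on memcpy might look like the sketch below. The function name bits_to_unit_double is an assumption made for this example, and the code only makes sense for 64-bit IEEE 754 doubles:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper: maps 64 random bits to a double in [0, 1).
inline double bits_to_unit_double(std::uint64_t x)
{
    static_assert(std::numeric_limits<double>::is_iec559, "assumes IEEE 754 doubles");
    static_assert(sizeof(double) == sizeof(std::uint64_t), "assumes 64-bit doubles");

    // Exponent field 0x3FF selects the range [1, 2); the top 52 bits of x
    // become the mantissa.
    const std::uint64_t bits = (std::uint64_t{0x3FF} << 52) | (x >> 12);

    double d;                          // object created first
    std::memcpy(&d, &bits, sizeof d);  // then its bytes are filled in
    return d - 1.0;                    // shift [1, 2) down to [0, 1)
}

inline void check_unit_double()
{
    assert(bits_to_unit_double(0) == 0.0);
    assert(bits_to_unit_double(std::uint64_t{1} << 63) == 0.5);
    assert(bits_to_unit_double(~std::uint64_t{0}) < 1.0);
}
```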

@RichardHodges, yes I follow the endless threads on the cpp standard mailing list. The memcpy version is acceptable but I really thought there would be a way to do this (a la std::launder or placement new) at least for trivial types, but it seems not. I'm working in the context of high performance applications and would also like to do things like std::vector resize_uninitialized(size) for trivial types before filling up the (gigabytes) of data. reserve+push_back works sometimes, but not for example when splitting up processing of the vector in chunks. – Elene 14/10, 2017 at 19:9

@JohanLundberg The fact that a piece of code is well defined by the standard does not mean that it is good practice! Here, the code is misleading, as the move is useless and only makes the code harder to understand. If one really wants to create a double from a bit pattern, one should write a function like double from_bit_pattern(uint64_t bits). – Anisette 14/10, 2017 at 19:18

@Phil1970, yes - and how to do that portably and effectively is partly what inspired this question - can we just leave this in agreement and move on? :D – Elene 14/10, 2017 at 19:22

@Phil1970: What if e.g. one has one function (whose source code one may not change) that populates an array of 64-bit long values, and another (likewise unalterable) that reads an array of 64-bit long long values, and one wants to use the first to produce an array that could be read by the second? How should one go about it in a way that's not likely to result in compilers generating loads of silly and useless code? – Liddie 25/10, 2017 at 20:9

C++ does not allow a double to be constructed merely by copying the bytes. An object first needs to be constructed (which may leave its value uninitialised), and only after that can you fill in its bytes to produce a value. This was underspecified up to C++14, but the current draft of C++17 includes in [intro.object]:

An object is created by a definition (6.1), by a new-expression (8.3.4), when implicitly changing the active member of a union (12.3), or when a temporary object is created (7.4, 15.2).

Although constructing a double with default initialisation does not perform any initialisation, the construction does still need to happen. Your first version includes this construction by declaring the local variable T to;. Your second version does not.

You could modify your second version to use placement new to construct a T in the same location that previously held a U object, but in that case, when you pass &x to memmove, it is no longer required to read the bytes that had made up x's value, because the object x has already been destroyed by the earlier placement new.
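
One way around that ordering problem, sketched here as my own reading rather than something this answer proposes, is to save the bytes before the placement new and copy them back afterwards. The name recreate_as and the particular set of static_asserts are assumptions of this sketch:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <new>
#include <type_traits>

// Hypothetical helper: replaces the U object x with a T object that has the
// same bytes. After the call, x must only be used through the returned reference.
template <typename T, typename U>
T& recreate_as(U& x)
{
    static_assert(sizeof(T) == sizeof(U), "T and U must have the same size");
    static_assert(alignof(T) <= alignof(U), "storage of x must be aligned for T");
    static_assert(std::is_trivially_copyable<T>::value &&
                  std::is_trivially_copyable<U>::value, "bytes must be copyable");
    static_assert(std::is_trivially_default_constructible<T>::value &&
                  std::is_trivially_destructible<U>::value, "trivial types only");

    unsigned char saved[sizeof(U)];
    std::memcpy(saved, &x, sizeof(U));        // read the bytes while x is still alive
    T* p = ::new (static_cast<void*>(&x)) T;  // ends the lifetime of x, creates a T
    std::memcpy(p, saved, sizeof(T));         // give the new T the saved bytes
    return *p;
}

inline void check_recreate_as()
{
    std::uint64_t bits = 4614253070214989087ULL;  // bit pattern of 3.14
    double& d = recreate_as<double>(bits);        // `bits` must not be named again
    assert(d == 3.14);
}
```

Whether the compiler removes the two copies is a quality-of-implementation matter; the sketch only aims at well-defined behaviour, not at guaranteed zero overhead.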

Pyjamas answered 14/10, 2017 at 15:49 Comment(6)

this is not very convincing, as far as I can tell, because pseudo_cast_inplace() returns a reference and hence no lvalue-to-rvalue conversion occurs at the return statement ( *pto it's not an access there, it is when used in pseudo_cast2(), but there your argument does not apply anymore ) that said, I think it's illegal due to strict aliasing violation ... – Liberec 14/10, 2017 at 16:1

@MassimilianoJanes "it is when used in pseudo_cast2(), but there your argument does not apply anymore" -- That's exactly where it does apply. Although it receives a reference to T, the referenced object still has type U (because no T object was ever created), so the lvalue-to-rvalue conversion violates the aliasing rules. – Pyjamas 14/10, 2017 at 16:4

ok, reading your answer I wrongly assumed you were referring to pseudo_cast_inplace(), sorry ... – Liberec 14/10, 2017 at 16:10

Ok. So, does that also mean that the C examples (use of malloc) here are not valid C++? en.cppreference.com/w/c/string/byte/memmove . Could we use such code (for example a serialization library) compiled as C from C++? – Elene 14/10, 2017 at 16:27

@JohanLundberg In C++, the official interpretation is that even int *p = (int *) malloc(sizeof(int)); if (p) *p = 3; is invalid because no int object has been constructed. See p0137r1.html, which introduced the new wording: "Drafting note: this maintains the status quo that malloc alone is not sufficient to create an object." – Pyjamas 14/10, 2017 at 16:33
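
To make the malloc point concrete, here is a sketch of a variant that explicitly creates the int with placement new, which is valid under the C++17 wording quoted above. The names make_int, destroy_int and check_make_int are hypothetical. (The implicit object creation rules adopted for C++20 later made the plain malloc version valid as well.)

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Hypothetical helper: allocates storage with malloc and creates an int in it.
inline int* make_int(int value)
{
    void* raw = std::malloc(sizeof(int));
    if (!raw)
        return nullptr;
    return ::new (raw) int(value);  // placement new starts the lifetime of an int
}

// int is trivially destructible, so releasing the storage is sufficient.
inline void destroy_int(int* p) { std::free(p); }

inline void check_make_int()
{
    int* p = make_int(3);
    assert(p != nullptr && *p == 3);
    destroy_int(p);
}
```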

@JohanLundberg I missed one bit in your comment: "Could we use such code (for example a serialization library) compiled as C from C++?" If the code is actually compiled by a C compiler, instead of just being copied into a C++ project, then C's rules apply, not C++'s, and that should be enough to allow calling such a function from C++ code. – Pyjamas 14/10, 2017 at 16:42

My reading of the standard suggests that both these functions will result in UB.

consider:

int main()
{
    long x = 10;
    something_with_x(x*10);
    double& y = pseudo_cast_inplace<double>(x);
    y = 20;
    something_with_y(y*10);
}

Because of the strict alias rule, it seems to me that there's nothing to stop the compiler from reordering instructions to produce code as-if:

int main()
{
    long x = 10;
    double& y = pseudo_cast_inplace<double>(x);
    y = 20;
    something_with_x(x*10);   // uh-oh!
    something_with_y(y*10);
}

I think the only legal way to write this is:

template <typename T, typename U>
inline T pseudo_cast(U&& x)
{
    static_assert(sizeof(T) == sizeof(U));
    T result;
    std::memcpy(std::addressof(result), std::addressof(x), sizeof(T));
    return result;
}

In practice this results in the exact same assembler output (i.e. none whatsoever: the entire function is elided, as are the variables themselves), at least on gcc with -O2.

Navarro answered 14/10, 2017 at 19:45 Comment(2)

I think, you nailed it. – Andreeandrei 14/10, 2017 at 20:42

Unfortunately, from what I've seen both gcc and clang will sometimes erroneously optimize out code which reads an object with one type and writes back the exact same bit patterns using a different type, and then fail to recognize the aliasing relationship between the two types implied by the operation. While the fact that the memcpy gets optimized out may be good from an efficiency perspective, I'm not sure how effectively one could prove that such omission won't illegitimately alter program semantics. – Liddie 25/10, 2017 at 20:5

This should be legal in C++20. Example in godbolt.

template <typename T, typename U>
requires (
    sizeof(U) >= sizeof(T) and 
    std::alignment_of_v<T> <= std::alignment_of_v<U> and 
    std::is_trivially_copyable_v<T> and
    std::is_trivially_destructible_v<U>
)
[[nodiscard]] T& reinterpret_object(U& obj)
{
    // Get access to object representation
    std::byte* bytes = reinterpret_cast<std::byte*>(&obj); 
    
    // Copy object representation to temporary buffer.
    // Implicitly create a T object in the destination storage. The lifetime of U object ends.
    // Copy temporary buffer back.
    void* storage = std::memmove(bytes, bytes, sizeof(T));
    
    // Storage pointer value is 'pointer to T object', so we are allowed to cast it to the proper pointer type.
    return *static_cast<T*>(storage);
}

reinterpret_cast to a different pointer type is allowed (7.6.1.10):

An object pointer can be explicitly converted to an object pointer of a different type.

Accessing the object representation through an std::byte* pointer is allowed (7.2.1):

If a program attempts to access the stored value of an object through a glvalue whose type is not similar to one of the following types the behavior is undefined
- a char, unsigned char, or std::byte type.

std::memmove behaves as-if copying to a temporary buffer and can implicitly create objects (21.5.3):

The functions memcpy and memmove are signal-safe. Both functions implicitly create objects ([intro.object]) in the destination region of storage immediately prior to copying the sequence of characters to the destination.

Implicit object creation is described in (6.7.2)

Some operations are described as implicitly creating objects within a specified region of storage. For each operation that is specified as implicitly creating objects, that operation implicitly creates and starts the lifetime of zero or more objects of implicit-lifetime types ([basic.types]) in its specified region of storage if doing so would result in the program having defined behavior. If no such set of objects would give the program defined behavior, the behavior of the program is undefined. If multiple such sets of objects would give the program defined behavior, it is unspecified which such set of objects is created. [Note 4: Such operations do not start the lifetimes of subobjects of such objects that are not themselves of implicit-lifetime types. — end note]

Further, after implicitly creating objects within a specified region of storage, some operations are described as producing a pointer to a suitable created object. These operations select one of the implicitly-created objects whose address is the address of the start of the region of storage, and produce a pointer value that points to that object, if that value would result in the program having defined behavior. If no such pointer value would give the program defined behavior, the behavior of the program is undefined. If multiple such pointer values would give the program defined behavior, it is unspecified which such pointer value is produced.

It is not specified that std::memmove is such a function and that its returned pointer value would be a pointer to the implicitly created object. But it makes sense that it is so.

Returning a pointer to the new object is allowed by (7.6.1.9):

A prvalue of type “pointer to cv1 void” can be converted to a prvalue of type “pointer to cv2 T”, where T is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1. If the original pointer value represents the address A of a byte in memory and A does not satisfy the alignment requirement of T, then the resulting pointer value is unspecified. Otherwise, if the original pointer value points to an object a, and there is an object b of type T (ignoring cv-qualification) that is pointer-interconvertible with a, the result is a pointer to b. Otherwise, the pointer value is unchanged by the conversion.

If std::memmove does not return a usable pointer value, std::launder<T>(reinterpret_cast<T*>(bytes)) (17.6.5) should be able to produce such a pointer value.

Additional notes:

I'm not 100% sure whether all the conditions in the requires clause are correct or whether some condition is missing.
To get zero overhead, the compiler must optimize the std::memmove away (gcc and clang seem to do it).
The lifetime of the original object ends (6.7.3):

A program may end the lifetime of any object by reusing the storage which the object occupies or by explicitly calling a destructor or pseudo-destructor ([expr.prim.id.dtor]) for the object.

This means that using the original name or pointers or references to it will result in undefined behaviour.

The object can be "revived" by reinterpreting it back reinterpret_object<U>(reinterpret_object<T>(obj)) and that should allow using the old references (6.7.3)

If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if the original object is transparently replaceable (see below) by the new object. An object o1 is transparently replaceable by an object o2 if:
- the storage that o2 occupies exactly overlays the storage that o1 occupied, and
- o1 and o2 are of the same type (ignoring the top-level cv-qualifiers), and
- o1 is not a complete const object, and
- neither o1 nor o2 is a potentially-overlapping subobject ([intro.object]), and
- either o1 and o2 are both complete objects, or o1 and o2 are direct subobjects of objects p1 and p2, respectively, and p1 is transparently replaceable by p2.
The object representations should be "compatible": interpreting the bytes of the original object as the bytes of the new one can produce "garbage" or even trap representations.

Helmick answered 23/6, 2021 at 18:18 Comment(0)

Accessing a double while the actual type is uint64_t is undefined behavior, because the compiler will never assume that an object of type double can share the address of an object of type uint64_t. From [intro.object]:

Unless an object is a bit-field or a base class subobject of zero size, the address of that object is the address of the first byte it occupies. Two objects a and b with overlapping lifetimes that are not bit-fields may have the same address if one is nested within the other, or if at least one is a base class subobject of zero size and they are of different types; otherwise, they have distinct addresses.

Trachytic answered 14/10, 2017 at 17:32 Comment(0)
