Why would the behavior of std::memcpy be undefined for objects that are not TriviallyCopyable?

From http://en.cppreference.com/w/cpp/string/byte/memcpy:

If the objects are not TriviallyCopyable (e.g. scalars, arrays, C-compatible structs), the behavior is undefined.

At my work, we have used std::memcpy for a long time to bitwise swap objects that are not TriviallyCopyable using:

void swapMemory(Entity* ePtr1, Entity* ePtr2)
{
   static const int size = sizeof(Entity); 
   char swapBuffer[size];

   memcpy(swapBuffer, ePtr1, size);
   memcpy(ePtr1, ePtr2, size);
   memcpy(ePtr2, swapBuffer, size);
}

and never had any issues.

I understand that it is trivial to abuse std::memcpy with non-TriviallyCopyable objects and cause undefined behavior downstream. However, my question:

Why would the behavior of std::memcpy itself be undefined when used with non-TriviallyCopyable objects? Why does the standard deem it necessary to specify that?

UPDATE

The contents of http://en.cppreference.com/w/cpp/string/byte/memcpy have been modified in response to this post and the answers to the post. The current description says:

If the objects are not TriviallyCopyable (e.g. scalars, arrays, C-compatible structs), the behavior is undefined unless the program does not depend on the effects of the destructor of the target object (which is not run by memcpy) and the lifetime of the target object (which is ended, but not started by memcpy) is started by some other means, such as placement-new.

PS

Comment by @Cubbi:

@RSahu if something guarantees UB downstream, it renders the entire program undefined. But I agree that it appears to be possible to skirt around UB in this case and modified cppreference accordingly.

Neotype answered 21/4, 2015 at 16:3 Comment(25)
It would probably be helpful to add fuel to your question by citing the standard section where this is claimed (if at all) in addition to the link on cppreference. The downstream calamity is pretty clear (and apparently the focus of answers that seem to have missed that realization in your question).Thermostat
@WhozCraig, I use a draft standard (N3337) for references and didn't find any such claim for the behavior of std::memcpy.Neotype
@RSahu Why not the FD of C++14? The days of C++11 are long gone ;)Tubercular
@Columbo, I wish I could make that claim for my work. We still use VS2008 :)Neotype
@RSahu Not far ahead of you; we still use VS2010, which at-least has a half-baked C++03x.Thermostat
@Tubercular We don't even have a GCC release that is C++14 feature-complete yet...Plautus
There's an interesting recent paper.Plautus
§3.9/3 [basic.types] "For any trivially copyable type T, if two pointers to T point to distinct T objects obj1 and obj2, where neither obj1 nor obj2 is a base-class subobject, if the underlying bytes making up obj1 are copied into obj2, obj2 shall subsequently hold the same value as obj1". (emphasis mine) The subsequent sample uses std::memcpy.Huerta
I just learned that in C, objects do not have types. The "type of an object" in C is merely a property of an access expression. This causes all kinds of funny issues for C++, where objects do have types. For example, CWG #1116. memcpy will probably suffer from similar issues.Cello
@Cello I don't think that's quite right. Objects in C have declared type and effective type (aliasing checks are worded in terms of effective) and write through lvalue and memcpy have the magic property of imbuing freshly allocated bytes with an effective type IF they have no declared type: 6.5/6 in C11Electron
@Electron I have to admit that I still don't fully understand C's object model. In C++ all objects have a type (subject to CWG 1116). This is different at least to the fact that C admits that there are objects without declared type. It seems to me as if the effective type of an object with a declared type cannot change. If memcpy in C++ truly can create new objects, it might be possible to "change" the type of an object in C++.Cello
@Cello "I just learned that in C, objects do not have types" - the standard uses the term "object of type T" quite often. It seems to me that the object model is not properly defined in either language.Rum
@Cello "In C++ all objects have a type" - what's the type of the object allocated by std::malloc(8) ? (or std::aligned_storage<8,8>)Rum
@MattMcNabb malloc does not create objects. This is brought up in some DRs, and mentioned in N4430. aligned_storage_t is some POD type. It is still unclear to me (even after CWG 1116 and N4430 and the discussion on the UB reflector), if that POD-type-object still exists after you have placement-new'ed a new object on top of its storage.Cello
@Cello OK, N4430's change is substantial to the definition of "object".Rum
@MattMcNabb I don't think this matters for the case of malloc. N4430 explicitly says it maintains the status quo regarding malloc. This probably implies that one should interpret the current (pre-N4430) wording "an object is a region of storage" not as an equivalence (not all regions of storage are objects).Cello
@Cello I don't see how that statement can be a definition if it's not stating an equivalence. So , what is an object exactly?Rum
@MattMcNabb Wouldn't you agree that a dog is an animal? In the same or a similar sense, I think, is an object a region of storage.Cello
...and cppreference has since been edited again to say "unspecified and may be UB" with a link back to this thread. If the Standard doesn't specify something... it's unspecified. You can't just say "well, it wasn't explicitly called out as UB, therefore it's defined behaviour!", which is what this thread seems to amount to. N3797 3.9 points 2~3 do not define what memcpy does for non-trivially-copyable objects, so it's unspecified at best. That's pretty much functionally equivalent to UB in my eyes as both are useless for writing reliable, i.e. portable code.Croaky
@underscore_d, that section of the standard does not say anything about what happens to objects of types that are not trivially copyable. So, you are right in that regard. Using memcpy may lead to UB downstream. By carefully managing use of memcpy on objects of non-trivially copyable types, we have a robust application for close to 20 years (it has been tested on both Linux and Windows, over many versions of compilers). Hence, the behavior of memcpy itself cannot be UB for such types. You run into trouble only if you mismanage those objects.Neotype
@RSahu I misspoke before. If the Standard doesn't define something (including as unspecified a.k.a. 'implementation-defined, but you don't need to tell anyone how') - then that thing is UB, not unspecified. I'm sure someone with your rep knows that UB is a concept, not a manifestation, and so something is still UB even if it's not exploded... yet. That probably won't happen in your case, but you're still relying on choices - admittedly, probably sensible ones - by popular compiler writers to optionally handle this in an implementation-defined way, rather than just optimising it away or worseCroaky
@underscore_d, memcpy is performing dumb copying of bytes. As long as the memory locations it is accessing are valid, there is no reason for it to invoke UB. I think it's overreaching to say that memcpy will invoke UB when used with non-trivially copyable objects. Once again, I want to emphasize that it can lead to UB easily if those objects are not managed carefully by an application.Neotype
@RSahu Again, UB is not an event that one invokes. (Although colloquial usage often implies that.) It's just a description of any behaviour - whatever, and however consistent, that is - resulting from any code doing something the Standard calls "undefined" or simply omits to address. Of course your program doesn't go out of its way to make your life a misery or, I dunno, play an orchestral hit every time you memcpy a non-trivial object. That doesn't mean it's not, by the letter, UB. You're just benefiting from the fact that your compilers 'implementation-define' this - and in a way you like!Croaky
Good related find: #30114897Baecher
@Croaky How do you know which object is "hit" by memcpy? There are several objects at the same address.Cyrilla
T
50

Why would the behavior of std::memcpy itself be undefined when used with non-TriviallyCopyable objects?

It's not! However, once you copy the underlying bytes of one object of a non-trivially copyable type into another object of that type, the target object is not alive. We destroyed it by reusing its storage, and haven't revitalized it by a constructor call.

Using the target object - calling its member functions, accessing its data members - is clearly undefined ([basic.life]/6), and so is a subsequent, implicit destructor call ([basic.life]/4) for target objects having automatic storage duration. Note how undefined behavior is retrospective. [intro.execution]/5:

However, if any such execution contains an undefined operation, this International Standard places no requirement on the implementation executing that program with that input (not even with regard to operations preceding the first undefined operation).

If an implementation spots how an object is dead and necessarily subject to further operations that are undefined, ... it may react by altering your program's semantics, from the memcpy call onward. And this consideration gets very practical once we think of optimizers and certain assumptions that they make.

It should be noted that standard libraries are able and allowed to optimize certain standard library algorithms for trivially copyable types, though. std::copy on pointers to trivially copyable types usually calls memcpy on the underlying bytes. So does swap.
So simply stick to using normal generic algorithms and let the compiler do any appropriate low-level optimizations - this is partly what the idea of a trivially copyable type was invented for in the first place: Determining the legality of certain optimizations. Also, this avoids hurting your brain by having to worry about contradictory and underspecified parts of the language.
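If a byte-wise swap is still wanted for some types, one way to keep it inside the guarantee of [basic.types] is to reject every other type at compile time. The following is only a sketch: swap_bytes, Point and swap_bytes_example are hypothetical names that do not appear in the question, and the static_assert is an assumption about how one might enforce the restriction.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical helper: swaps the object representations of a and b,
// but only compiles for trivially copyable types.
template <typename T>
void swap_bytes(T& a, T& b)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "swap_bytes requires a trivially copyable type");
    if (&a == &b) return;  // memcpy must not be given overlapping ranges

    unsigned char buffer[sizeof(T)];
    std::memcpy(buffer, &a, sizeof(T));
    std::memcpy(&a, &b, sizeof(T));
    std::memcpy(&b, buffer, sizeof(T));
}

// Hypothetical trivially copyable type used for illustration.
struct Point { int x; int y; };

inline void swap_bytes_example()
{
    Point p{1, 2};
    Point q{3, 4};
    swap_bytes(p, q);
    assert(p.x == 3 && p.y == 4);
    assert(q.x == 1 && q.y == 2);
}
```

Calling swap_bytes on a type with a user-provided copy constructor (std::string, for instance) then fails to compile instead of silently relying on the implementation.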

Tubercular answered 21/4, 2015 at 16:12 Comment(15)
I agree with your points but they all point to undefined behavior downstream. None of them should make behavior of memcpy undefined. It's supposed to do a dumb copy of bytes from the source to the destination.Neotype
@RSahu Ok, I made some adjustments. Better?Tubercular
"A program may end the lifetime of any object ..." It is unclear to me whether or not the part about "reusing the storage" also applies to objects with nontrivial destruction. Can you also end the lifetime of an object with trivial destruction by calling its destructor?Cello
@Cello Well, the lifetime of an object, in any case, ends after its storage is "reused or released" ([basic.life]/1.4). The part about the destructor is kinda optional, but the storage thing is mandatory.Tubercular
It seems to me an object of trivially copyable type can have non-trivial initialization. So if memcpy ends the lifetime of the destination object with such a type, it will not have been resurrected. This is inconsistent with your argumentation, I think (though it might be an inconsistency in the Standard itself).Cello
(I think it is possible that this is not entirely well-specified, or that important information is either missing from the Standard or very hard to deduce. For example, what does "reuse the storage" mean?)Cello
@Cello Reusing the storage <=> Directly modifying one or more bytes of the object representation through a glvalue of type char or unsigned char? I dunno. Specified nowhere, goddamit.Tubercular
@Cello Yeah, I ran against that wall as well. However, I believe that the last quote I provided defines this as an exception. I.e. once the last byte is copied, the object is initialized.Tubercular
Ok, after some more thoughts and digging into the std-discussion list: The lifetime of any object is ended when its storage is reused (agreed, but IMHO this is clearer in 3.8p1). Reuse is probably underspecified, but I guess overwriting via memcpy is intended to count as reuse. The triviality of init (or vacuousness) is a property of the init, not of the type. There is no init via ctor of the target object when memcpy, hence the init is always vacuousCello
But this reading of 3.8/1 suggests that memcpying to string does not need 3.8/1.2 - which I think is not intended (it is intended that 3.8/1.2 applies in this case, methinks). Maybe send an email to std-discussion? (Regarding "reusing": groups.google.com/a/isocpp.org/d/topic/std-discussion/… )Cello
@Cello I thought it was that way: I.e. the vacuousness of initialization is dependent on the particular initialization in each case. That seems to follow from the wording. But that doesn't make sense at all, because that implies that we can have any initialization we wish to have, and that suffices. Isn't that also what you're thinking? In that case, yeah, I'd start a discussion.Tubercular
Comments are not for extended discussion; this conversation has been moved to chat.Colorist
"once you copy the underlying bytes of one object of a non-trivially copyable type into another object of that type, the target object is not alive" not if all written bytes are part of a member subobject with trivial typeCyrilla
@Tubercular I disagree that memcpy will count as reuse of storage, and eel.is/c++draft/basic.types#3 seems to as well.Pileate
The behavior of memcpy is undefined, because eel.is/c++draft/basic.types.general#2 and eel.is/c++draft/basic.types.general#3 only specify that memcpy works for trivially copyable bytes, and because everything not explicitly specified is undefined by eel.is/c++draft/intro.defs#defns.undefined.Matti
F
29

It is easy enough to construct a class where that memcpy-based swap breaks:

struct X {
    int x;
    int* px; // invariant: always points to x
    X() : x(), px(&x) {}
    X(X const& b) : x(b.x), px(&x) {}
    X& operator=(X const& b) { x = b.x; return *this; }
};

memcpying such object breaks that invariant.

GNU C++11 std::string does exactly that with short strings.

This is similar to how the standard file and string streams are implemented. The streams eventually derive from std::basic_ios, which contains a pointer to std::basic_streambuf. The streams also contain the specific buffer as a member (or base class sub-object), and that pointer in std::basic_ios points to it.
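To see the broken invariant of struct X without relying on undefined behavior, the effect can be reproduced with a trivially copyable analogue. This is a sketch under that assumption: RawX, invariant_holds and raw_copy_example are hypothetical names, and RawX drops the user-provided copy operations of X so that the memcpy itself is well-defined.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical analogue of X: same members, no user-provided copy
// operations, so copying its bytes is well-defined.
struct RawX {
    int x;
    int* px;  // intended invariant: points to this object's own x
};
static_assert(std::is_trivially_copyable<RawX>::value,
              "RawX may be copied with memcpy");

// True when the self-pointer invariant holds for r.
inline bool invariant_holds(const RawX& r) { return r.px == &r.x; }

inline void raw_copy_example()
{
    RawX a{1, nullptr};
    a.px = &a.x;
    RawX b{2, nullptr};
    b.px = &b.x;
    assert(invariant_holds(a) && invariant_holds(b));

    std::memcpy(&b, &a, sizeof(RawX));  // bitwise copy of a into b

    assert(b.x == 1);             // the value was copied
    assert(b.px == &a.x);         // but the pointer still targets a
    assert(!invariant_holds(b));  // so b's invariant is broken
}
```

The copy constructor of X exists precisely to re-point px at the new object, which is the step a byte copy skips.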

Fern answered 21/4, 2015 at 16:35 Comment(3)
OTOH, I would guess that it is easy to specify that memcpy in such cases simply breaks the invariant, but the effects are strictly defined (recursively memcpys the members until they're trivially copyable).Cello
@dyp: I don't like that because it seems too easy to break encapsulation if this is considered well-defined.Plesiosaur
@Cello That might lead performance freaks to "unwittingly" copy non-copyable objects.Fern
L
25

Because the standard says so.

Compilers may assume that non-TriviallyCopyable types are only copied via their copy/move constructors/assignment operators. This could be for optimization purposes (if some data is private, it could defer setting it until a copy / move occurs).

The compiler is even free to take your memcpy call and have it do nothing, or format your hard drive. Why? Because the standard says so. And doing nothing is definitely faster than moving bits around, so why not optimize your memcpy to an equally-valid faster program?

Now, in practice, there are many problems that can occur when you just blit around bits in types that don't expect it. Virtual function tables might not be set up right. Instrumentation used to detect leaks may not be set up right. Objects whose identity includes their location get completely messed up by your code.

The really funny part is that using std::swap; swap(*ePtr1, *ePtr2); should be able to be compiled down to a memcpy for trivially copyable types by the compiler, and for other types be defined behavior. If the compiler can prove that copy is just bits being copied, it is free to change it to memcpy. And if you can write a more optimal swap, you can do so in the namespace of the object in question.
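The swap idiom mentioned above can be sketched as follows. The namespace app and the members of Entity are assumptions made for illustration, since the question does not show the Entity class; swap_entities is a hypothetical replacement for swapMemory.

```cpp
#include <cassert>
#include <string>
#include <utility>

namespace app {  // hypothetical namespace holding Entity

struct Entity {  // hypothetical members; not trivially copyable
    std::string name;
    int id;
};

// Custom swap placed in Entity's namespace so that
// argument-dependent lookup can find it.
inline void swap(Entity& a, Entity& b)
{
    using std::swap;
    swap(a.name, b.name);
    swap(a.id, b.id);
}

}  // namespace app

// Hypothetical replacement for swapMemory from the question.
inline void swap_entities(app::Entity* ePtr1, app::Entity* ePtr2)
{
    using std::swap;       // fallback for types without their own swap
    swap(*ePtr1, *ePtr2);  // resolves to app::swap for Entity
}
```

With the using-declaration in place, the same call falls back to std::swap for types that do not provide their own overload.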

Landgrave answered 21/4, 2015 at 16:12 Comment(20)
The question to address is: Why does the standard deem it necessary to say so?Neotype
I think the second part of your answer ("Now, in practice..") is much more important and interesting than the first part. In this case, the Standard explicitly makes things more complicated by restricting the spec for memcpy (indirectly) on trivially copyable types. So we can infer that there's probably a reason why the Standard is worded like this, and "because the standard says so" does not go far enough IMHO.Cello
I'm not convinced that the spec actually renders memcpy'ing them undefined. If you use the resulting thing as anything but an array of char or unsigned char, of course, that's another matter.Plautus
@Plautus If you memcpy from one object of type T to another that is not an array of chars, wouldn't the dtor of the target object cause UB?Cello
@Cello Only if you depend on the side effects produced by the dtor. (Or do you mean UB when that dtor runs later?)Plautus
@Plautus Yes, I mean when the dtor runs later. Of course, this is not always the case (since you can omit it for dynamic lifetime), but I'd guess that in the OP's case, the dtor of the object will be called.Cello
@Cello Sure, unless you placement new a new object there in the mean time. My reading is that memcpy'ing into something counts as "reusing the storage", so it ends the lifetime of what was previously there (and since there's no dtor call, you have UB if you depend on the side effect produced by the dtor), but doesn't begin the lifetime of a new object, and you get UB later at the implicit dtor call unless an actual T is constructed there in the mean time.Plautus
@T.C., that seems theoretical or perhaps true in some uncommon platforms. Take a look at ideone.com/04RSIL. Is there any sane reason why such a program would cause undefined behavior?Neotype
@RSahu The easiest case is where the compiler injects identity into objects, which is legal. As an example, bijectively linking iterators to the containers they come from in std so that your code catches invalidated iterator use early instead of by overwriting memory or the like (a kind of instrumented iterator).Landgrave
@RSahu: For node-based containers, it's not uncommon for several nodes to point at the object itself, and the object's copy/move constructors handle that. If you memcpy, you invalidate those node pointers that point to the container. Also the magic iterator invalidation bits that MSVC uses. Also, objects whose location is part of the state, like mutex. The real question is why is it UB to copy the bytes out, then back in, then keep using the object? §3.9/2 only allows that for trivially copyable types, but I don't know why.Huerta
@MooingDuck, those are very valid reasons why using memcpy on those object will cause problems downstream. Is that reason enough to say the behavior of memcpy is undefined for such objects?Neotype
@RSahu if something guarantees UB downstream, it renders the entire program undefined. But I agree that it appears to be possible to skirt around UB in this case and modified cppreference accordingly.Electron
@Cubbi, thanks for updating the page at en.cppreference.com/w/cpp/string/byte/memcpy.Neotype
@Electron I rephrased it again. If you clobber something of dynamic storage duration with memcpy and just leak it afterwards, the behavior should be well-defined (if you don't depend on the effects of the dtor) even if you don't create a new object there, because there's no implicit dtor call that would cause UB.Plautus
@Cubbi: Would there be anything "Undefined" about using memcpy to copy the bytes underlying a non-trivially-copyable object into an array of bytes or an equivalent thereof, if the copied bytes are never used as an object [but might be examined, as bytes, to extract information from them in Implementation-Defined fashion]?Ecosystem
@Ecosystem I think a C++ compiler could legally put a non-memcpy-able object into a non-memcpy-able address space (like, some kind of protected memory), and audit who has the right to read from it (namely, the code that can read from it). The C++ standard leaves a lot of room for latitude. It can also, as I note, detect your memcpy is UB and simply not do it: similar optimizations when someone tried to detect integer overflow via UB have bitten people. Does it happen now? Not to my knowledge.Landgrave
@Yakk: If an object which isn't trivially copyable has e.g. a field of a PODS type, what would the Standard say about passing the address of that field to memcpy? If that would be defined, I don't see how the compiler could locate an object in non-memcpy'able space unless it could ensure that code would never extract the address of a PODS member and memcpy it.Ecosystem
@Ecosystem PODS objects are legally allowed to be copied bytewise. Other objects cannot. The exact specs is covered in the standard (POD is no longer the criteria). There should be a SO Q&A on it somewhere.Landgrave
@Yakk-AdamNevraumont "a C++ compiler could legally put a non-memcpy-able object into a non-memcpy-able address space" only if you start by throwing away the whole C/C++ memory model and start overCyrilla
"if some data is private, it could defer setting it until a copy / move occurs" Really? "Data" has a trivial type, by definition. And of course the implementation can postpone the writing of a data member until it is possibly read. Can you provide an example of an optimization on real code?Cyrilla
C
19

C++ does not guarantee for all types that their objects occupy contiguous bytes of storage [intro.object]/5

An object of trivially copyable or standard-layout type (3.9) shall occupy contiguous bytes of storage.

And indeed, through virtual base classes, you can create non-contiguous objects in major implementations. I have tried to build an example where a base class subobject of an object x is located before x's starting address. To visualize this, consider the following graph/table, where the horizontal axis is address space, and the vertical axis is the level of inheritance (level 1 inherits from level 0). Fields marked by dm are occupied by direct data members of the class.

L | 00 08 16
--+---------
1 |    dm
0 | dm

This is a usual memory layout when using inheritance. However, the location of a virtual base class subobject is not fixed, since it can be relocated by child classes that also inherit from the same base class virtually. This can lead to the situation that the level 1 (base class sub)object reports that it begins at address 8 and is 16 bytes large. If we naively add those two numbers, we'd think it occupies the address space [8, 24) even though it actually occupies [0, 16).

If we can create such a level 1 object, then we cannot use memcpy to copy it: memcpy would access memory that does not belong to this object (addresses 16 to 24). In my demo, this is caught as a stack-buffer-overflow by clang++'s address sanitizer.

How to construct such an object? By using multiple virtual inheritance, I came up with an object that has the following memory layout (virtual table pointers are marked as vp). It is composed through four layers of inheritance:

L  00 08 16 24 32 40 48
3        dm         
2  vp dm
1              vp dm
0           dm

The issue described above will arise for the level 1 base class subobject. Its starting address is 32, and it is 24 bytes large (vptr, its own data members and level 0's data members).

Here's the code for such a memory layout under clang++ and g++ @ coliru:

struct l0 {
    std::int64_t dummy;
};

struct l1 : virtual l0 {
    std::int64_t dummy;
};

struct l2 : virtual l0, virtual l1 {
    std::int64_t dummy;
};

struct l3 : l2, virtual l1 {
    std::int64_t dummy;
};

We can produce a stack-buffer-overflow as follows:

l3  o;
l1& so = o;

l1 t;
std::memcpy(&t, &so, sizeof(t));
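For contrast, here is a sketch of copying the same l1 base class subobject through its copy constructor, which locates the virtual base at run time instead of assuming sizeof(l1) contiguous bytes. The function copy_l1_subobject is a hypothetical helper; the class definitions repeat the ones above.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

struct l0 { std::int64_t dummy; };
struct l1 : virtual l0 { std::int64_t dummy; };
struct l2 : virtual l0, virtual l1 { std::int64_t dummy; };
struct l3 : l2, virtual l1 { std::int64_t dummy; };

// A class with a virtual base is neither trivially copyable nor
// standard-layout, so [intro.object]/5 does not promise contiguity.
static_assert(!std::is_trivially_copyable<l1>::value,
              "l1 must not be copied with memcpy");
static_assert(!std::is_standard_layout<l1>::value,
              "l1 is not a standard-layout type");

// Hypothetical helper: copies the l1 subobject of o into a complete
// l1 object by using the copy constructor.
inline l1 copy_l1_subobject(const l3& o)
{
    const l1& so = o;
    return l1(so);  // copies l1::dummy and the virtual base l0
}
```

The result is a complete l1 object with its own layout, regardless of where l0 sits relative to l1 inside the l3 object.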

Here's a complete demo that also prints some info about the memory layout:

#include <cstdint>
#include <cstring>
#include <iomanip>
#include <iostream>

#define PRINT_LOCATION() \
    std::cout << std::setw(22) << __PRETTY_FUNCTION__                   \
      << " at offset " << std::setw(2)                                  \
        << (reinterpret_cast<char const*>(this) - addr)                 \
      << " ; data is at offset " << std::setw(2)                        \
        << (reinterpret_cast<char const*>(&dummy) - addr)               \
      << " ; naively to offset "                                        \
        << (reinterpret_cast<char const*>(this) - addr + sizeof(*this)) \
      << "\n"

struct l0 {
    std::int64_t dummy;

    void report(char const* addr) { PRINT_LOCATION(); }
};

struct l1 : virtual l0 {
    std::int64_t dummy;

    void report(char const* addr) { PRINT_LOCATION(); l0::report(addr); }
};

struct l2 : virtual l0, virtual l1 {
    std::int64_t dummy;

    void report(char const* addr) { PRINT_LOCATION(); l1::report(addr); }
};

struct l3 : l2, virtual l1 {
    std::int64_t dummy;

    void report(char const* addr) { PRINT_LOCATION(); l2::report(addr); }
};

void print_range(void const* b, std::size_t sz)
{
    std::cout << "[" << (void const*)b << ", "
              << (void*)(reinterpret_cast<char const*>(b) + sz) << ")";
}

// Only prints the ranges a real memcpy would touch; it performs no copy.
void my_memcpy(void* dst, void const* src, std::size_t sz)
{
    std::cout << "copying from ";
    print_range(src, sz);
    std::cout << " to ";
    print_range(dst, sz);
    std::cout << "\n";
}

int main()
{
    l3 o{};
    o.report(reinterpret_cast<char const*>(&o));

    std::cout << "the complete object occupies ";
    print_range(&o, sizeof(o));
    std::cout << "\n";

    l1& so = o;
    l1 t;
    my_memcpy(&t, &so, sizeof(t));
}

Live demo

Sample output (abbreviated to avoid vertical scrolling):

l3::report at offset  0 ; data is at offset 16 ; naively to offset 48
l2::report at offset  0 ; data is at offset  8 ; naively to offset 40
l1::report at offset 32 ; data is at offset 40 ; naively to offset 56
l0::report at offset 24 ; data is at offset 24 ; naively to offset 32
the complete object occupies [0x9f0, 0xa20)
copying from [0xa10, 0xa28) to [0xa20, 0xa38)

Note the two end offsets: the naive end offset of the l1 subobject (56) lies beyond the end of the complete l3 object (48).

Cello answered 25/4, 2015 at 13:58 Comment(10)
That is a great answer. Thank you for the in depth explanation and demo code.Neotype
Only a subobject can be non continuous. A complete object is continuous.Cyrilla
@Cyrilla Is this guaranteed by the standard? What about padding bytes? Would an object consisting of three pages, the middle one inaccessible, be non-compliant?Cello
@Cello Not continuously significant! Not all bytes matter. Bytes that don't matter... don't matter. So you can say there are "holes" in the representation, but the memory occupied by the representation is inside sizeof(T) bytes starting at the address of the complete object, which was my point. You can have an object of a non abstract class type in sufficiently big and aligned storage. It's a strong requirement at the language semantic level and memory access level: all allocated memory is equivalent. Storage can be reused.Cyrilla
Only const objects that are global, or static, that are constantly const (no mutable members and no modification in c/dtor) might be treated specially in practice, because they can be put in read-only memory, and could be put in "special" memory as proposed in other answers. But other objects are not constant in memory and the freedom given by C++ means that memory is not typed: all non constant memory storing user defined objects is generic.Cyrilla
Padding is just a hole in the layout of a datatype. It means some part of memory is not used to determine the value (or behavior) of some object. In a struct, padding would behave like an unnamed member that's never initialized, and for which it never matters that it's uninitialized (no issues of reading that uninitialized value). It doesn't affect the underlying storage of the object: the object is still identified with a continuous region of storage. You can end its lifetime and reuse the storage, that has no type.Cyrilla
"Would an object consisting of three pages, the middle one inaccessible, be non-compliant?" I am for an extension with user defined object layout: the layout would be described as a set of constraints and the compiler would have to obey these or bail out. That would make ABI compatibility of some interface classes very explicit in the source code. So you should be able to create such a silly layout with that hypothetical annotation. You could mmap an area and then poke holes in it by restricting access to pages in the middle. You could fit an object (with the large padding exactly there).Cyrilla
I just don't see how the compiler could ever fit such "discontinuous" object on a discontinuous memory mapping while still allowing the user to reuse its memory. So unless it's a constant object (object in constant memory) which cannot be overwritten, it's impossible. Bytes that don't matter... don't matter in the representation of an object, matter as raw bytes usable for something else.Cyrilla
Virtual inheritance is a unique feature in C++: it's the only case when the "direct subobject" relationship isn't one to one: a virtual base subobject has a strong direct subobject relationship with all indirectly derived class subobjects. This is seen in the initialization process, as the virtual bases are initialized by the most direct object ctor. (Virtual private is nearly meaningless.) Non virtual inheritance is less transparent to derived classes than virtual inheritance: initialization is delegated but overriding is not, and you can override virtual functions of a private base class.Cyrilla
But membership is a non "transparent", non practically transitive relationship: you can't do anything with a private member of a member. You can't act on its initialization or override anything in it. So direct inheritance is a stronger relation than membership, and virtual inheritance is stronger than indirect non virtual inheritance. It means you can't use memcpy on a base class but you can on a member because it has no strong relation with its "super-object" (whatever that means - it isn't clear that member subobject is well defined).Cyrilla
K
6

Many of these answers mention that memcpy could break invariants in the class, which would cause undefined behaviour later (and which in most cases should be reason enough not to risk it), but that doesn't seem to be what you're really asking.

One reason for why the memcpy call itself is deemed to be undefined behaviour is to give as much room as possible to the compiler to make optimizations based on the target platform. By having the call itself be UB, the compiler is allowed to do weird, platform-dependent things.

Consider this (very contrived and hypothetical) example: For a particular hardware platform, there might be several different kinds of memory, with some being faster than others for different operations. There might, for instance, be a kind of special memory that allows extra fast memory copies. A compiler for this (imaginary) platform is therefore allowed to place all TriviallyCopyable types in this special memory, and implement memcpy to use special hardware instructions that only work on this memory.

If you were to use memcpy on non-TriviallyCopyable objects on this platform, there might be some low-level INVALID OPCODE crash in the memcpy call itself.

Not the most convincing of arguments, perhaps, but the point is that the standard doesn't forbid it, which is only possible through making the memcpy call UB.

Khaki answered 22/4, 2015 at 9:32 Comment(4)
Thank you for addressing the core question. It's interesting that the highly upvoted answers talk about the downstream effects but not the core question.Neotype
"there might be several different kinds of memory" Do you have a specific CPU in mind?Cyrilla
"there might be several different kinds of memory" In C/C++? There is only one type of malloc, one type of new.Cyrilla
A compiler can choose to put const global objects in read-only memory, for instance. That's an example of special memory optimization that is not far-fetched. This particular example is more hypothetical and contrived, but it's theoretically possible for the compiler to in the same way place a global non-trivially-copyable in some kind of non-memcopyable memory if it wants to.Khaki
C
4

memcpy will copy all the bytes, or in your case swap all the bytes, just fine. An overzealous compiler could take the "undefined behaviour" as an excuse to do all kinds of mischief, but most compilers won't do that. Still, it is possible.

However, after these bytes are copied, the object that you copied them to may not be a valid object anymore. A simple case is a string implementation where large strings allocate memory, but small strings just use a part of the string object to hold characters, and keep a pointer to that part. After the copy, the pointer will obviously still point into the other object, so things will be wrong. Another example I have seen was a class with data that was used in very few instances only, so that data was kept in a database with the address of the object as a key.
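The self-pointer problem can be sketched without invoking any undefined behaviour. The SmallBuf aggregate below is a hypothetical stand-in for a small-string layout (it is not how any particular std::string is implemented), and it is deliberately kept trivially copyable so that the memcpy itself is well defined. The copy is still logically broken, because its pointer refers to the source object:

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical aggregate mimicking a small-string layout:
// `data` is meant to point at the object's own `inline_buf`.
struct SmallBuf {
    char inline_buf[16];
    char* data;
};

// No user-provided special members, so memcpy on this type is well defined.
static_assert(std::is_trivially_copyable<SmallBuf>::value,
              "SmallBuf must stay trivially copyable for this sketch");

// Returns true if the byte-wise copy still points into the source object.
bool copy_points_into_source() {
    SmallBuf a{};
    std::strcpy(a.inline_buf, "foo");
    a.data = a.inline_buf;  // self-reference

    SmallBuf b{};
    std::memcpy(&b, &a, sizeof(SmallBuf));  // copies the pointer value verbatim

    // b.data was not re-pointed at b.inline_buf; it still refers to a.
    return b.data == a.inline_buf && b.data != b.inline_buf;
}
```

A real string class would fix up the pointer in its copy constructor, which is exactly the step a byte-wise copy skips.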

Now if your instances contain a mutex for example, I would think that moving that around could be a major problem.

Celie answered 22/4, 2015 at 12:12 Comment(1)
Yes but that's a user code problem, not a core language problem.Cyrilla
K
2

Another reason that memcpy is UB (apart from what has been mentioned in the other answers - it might break invariants later on) is that it is very hard for the standard to say exactly what would happen.

For non-trivial types, the standard says very little about how the object is laid out in memory, in which order the members are placed, where the vtable pointer is, what the padding should be, etc. The compiler has huge amounts of freedom in deciding this.

As a result, even if the standard wanted to allow memcpy in these "safe" situations, it would be impossible to state what situations are safe and which aren't, or when exactly the real UB would be triggered for unsafe cases.

I suppose that you could argue that the effects should be implementation-defined or unspecified, but I'd personally feel that would be both digging a bit too deep into platform specifics and giving a little bit too much legitimacy to something that in the general case is rather unsafe.

Khaki answered 22/4, 2015 at 9:55 Comment(2)
I have no problem with saying that use of memcpy to write to such an object invokes UB, since an object could have fields which are constantly changing but will cause bad things to happen if they're changed in ways the compiler doesn't know about. Given T *p, is there any reason why memcpy(buffer, p, sizeof (T)), where buffer is a char[sizeof (T)]; should be allowed to do anything other than write some bytes into the buffer?Ecosystem
The vptr is just another hidden member (or many such members for MI). It doesn't matter where they are located, if you copy a complete object onto another of the same type.Cyrilla
C
2

First, note that it is unquestionable that all memory for mutable C/C++ objects has to be un-typed, un-specialized, usable for any mutable object. (I guess the memory for global const variables could hypothetically be typed, there is just no point in such hyper complication for such a tiny corner case.) Unlike Java, C++ has no typed allocation of a dynamic object: new Class(args) in Java is a typed object creation: creation of an object of a well defined type, that might live in typed memory. On the other hand, the C++ expression new Class(args) is just a thin typing wrapper around type-less memory allocation, equivalent to new (operator new(sizeof(Class))) Class(args): the object is created in "neutral memory". Changing that would mean changing a very big part of C++.
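As a rough illustration of that equivalence, the two steps can be written out by hand. This is only a sketch: Widget and build_in_raw_storage are hypothetical names, and the code ignores exception safety (if the constructor or the string copy threw, the raw storage would leak).

```cpp
#include <new>
#include <string>
#include <utility>

// Hypothetical class with a non-trivial member, used only for illustration.
struct Widget {
    std::string name;
    explicit Widget(std::string n) : name(std::move(n)) {}
};

std::string build_in_raw_storage() {
    // Step 1: obtain untyped storage, as a new-expression does internally.
    void* raw = ::operator new(sizeof(Widget));

    // Step 2: construct the object in that storage (placement new).
    Widget* w = new (raw) Widget("w");

    std::string result = w->name;

    // Tear down in the reverse order: end the lifetime, then free the storage.
    w->~Widget();
    ::operator delete(raw);
    return result;
}
```

The storage returned by operator new carries no type of its own; the Widget exists only between the placement new and the explicit destructor call.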

Forbidding the bit copy operation (whether done by memcpy or the equivalent user defined byte by byte copy) on some types gives a lot of freedom to the implementation for polymorphic classes (those with virtual functions), and other so called "virtual classes" (not a standard term), that is the classes that use the virtual keyword.

The implementation of polymorphic classes could use a global associative map of addresses which associates the address of a polymorphic object with its virtual functions. I believe that was an option seriously considered during the design of the first iterations of the C++ language (or even "C with classes"). That map of polymorphic objects might use special CPU features and special associative memory (such features aren't exposed to the C++ user).

Of course we know that all practical implementations of virtual functions use vtables (a constant record describing all dynamic aspects of a class) and put a vptr (vtable pointer) in each polymorphic base class subobject, as that approach is extremely simple to implement (at least for the simplest cases) and very efficient. There is no global registry of polymorphic objects in any real world implementation except possibly in debug mode (I don't know such debug mode).

The C++ standard made the lack of global registry somewhat official by saying that you can skip the destructor call when you reuse the memory of an object, as long as you don't depend on the "side effects" of that destructor call. (I believe that means that the "side effects" are user created, that is the body of the destructor, not implementation created, as automatically done to the destructor by the implementation.)

In practice, in all implementations, the compiler just uses vptr (pointer to vtable) hidden members, and these hidden members will be copied properly by memcpy, as if you did a plain member-wise copy of the C struct representing the polymorphic class (with all its hidden members). Bit-wise copies, or complete C struct member-wise copies (the complete C struct includes hidden members) will behave exactly as a constructor call (as done by placement new), so all you have to do is let the compiler think you might have called placement new. If you do a strongly external function call (a call to a function that cannot be inlined and whose implementation cannot be examined by the compiler, like a call to a function defined in a dynamically loaded code unit, or a system call), then the compiler will just assume that such constructors could have been called by the code it cannot examine. Thus the behavior of memcpy here is defined not by the language standard, but by the compiler ABI (Application Binary Interface). The behavior of a strongly external function call is defined by the ABI, not just by the language standard. A call to a potentially inlinable function is defined by the language as its definition can be seen (either at compile time or during link time global optimization).

So in practice, given appropriate "compiler fences" (such as a call to an external function, or just asm("")), you can memcpy classes that only use virtual functions.

Of course, you have to be allowed by the language semantic to do such placement new when you do a memcpy: you cannot willy-nilly redefine the dynamic type of an existing object and pretend you have not simply wrecked the old object. If you have a non const global, static, automatic, member subobject, array subobject, you can overwrite it and put another, unrelated object there; but if the dynamic type is different, you cannot pretend that it's still the same object or subobject:

struct A { virtual void f(); };
struct B : A { };

void test() {
  A a;
  if (sizeof(A) != sizeof(B)) return;
  new (&a) B; // OK (assuming alignment is OK)
  a.f(); // undefined
}

The change of polymorphic type of an existing object is simply not allowed: the new object has no relation with a except for the region of memory: the contiguous bytes starting at &a. They have different types.

[The standard is strongly divided on whether *&a can be used (in typical flat memory machines) or (A&)(char&)a (in any case) to refer to the new object. Compiler writers are not divided: you should not do it. This is a deep defect in C++, perhaps the deepest and most troubling.]

But you cannot in portable code perform a bitwise copy of classes that use virtual inheritance, as some implementations implement those classes with pointers to the virtual base subobjects: these pointers, which were properly initialized by the constructor of the most derived object, would have their value copied by memcpy (like a plain member-wise copy of the C struct representing the class with all its hidden members) and wouldn't point to the subobject of the derived object!

Other ABIs use address offsets to locate these base subobjects; they depend only on the type of the most derived object, like final overriders and typeid, and thus can be stored in the vtable. On these implementations, memcpy will work as guaranteed by the ABI (with the above limitation on changing the type of an existing object).

In either case, it is entirely an object representation issue, that is, an ABI issue.

Cyrilla answered 15/6, 2018 at 19:24 Comment(4)
I read your answer but couldn't figure out the essence of what you are trying to say.Neotype
tl; dr: You can use memcpy on polymorphic classes in practice, where the ABI implies you can, so it's inherently implementation dependent. In any case, you need to use compiler barriers to hide what you are doing (plausible deniability) AND you must still respect language semantics (no attempt to change the type of an existing object).Cyrilla
That's a subset of the object types that are not TriviallyCopyable. Just want to make sure that your answer intends to address the behavior of memcpy only for the polymorphic object types.Neotype
I explicitly discuss virtual classes, a super set of polymorphic classes. I think the historical reason to forbid memcpy for some types was the implementation of virtual functions. For non virtual types, I have no idea!Cyrilla
W
2

OK, let's try your code with a little example:

#include <iostream>
#include <string>
#include <string.h>

void swapMemory(std::string* ePtr1, std::string* ePtr2) {
   static const int size = sizeof(*ePtr1);
   char swapBuffer[size];

   memcpy(swapBuffer, ePtr1, size);
   memcpy(ePtr1, ePtr2, size);
   memcpy(ePtr2, swapBuffer, size);
}

int main() {
  std::string foo = "foo", bar = "bar";
  std::cout << "foo = " << foo << ", bar = " << bar << std::endl;
  swapMemory(&foo, &bar);
  std::cout << "foo = " << foo << ", bar = " << bar << std::endl;
  return 0;
}

On my machine, this prints the following before crashing:

foo = foo, bar = bar
foo = foo, bar = bar

Weird, eh? The swap does not seem to be performed at all. Well, the memory was swapped, but std::string uses the small-string-optimization on my machine: It stores short strings within a buffer that's part of the std::string object itself, and just points its internal data pointer at that buffer.

When swapMemory() swaps the bytes, it swaps both the pointers and the buffers. So, the pointer in the foo object now points at the storage in the bar object, which now contains the string "foo". Two levels of swap make no swap.

When std::string's destructor subsequently tries to clean up, more evil happens: The data pointer does not point at the std::string's own internal buffer anymore, so the destructor deduces that that memory must have been allocated on the heap, and tries to delete it. The result on my machine is a simple crash of the program, but the C++ standard would not care if pink elephants were to appear. The behavior is totally undefined.


And that is the fundamental reason why you should not be using memcpy() on non-trivially copyable objects: You do not know whether the object contains pointers/references to its own data members, or depends on its own location in memory in any other way. If you memcpy() such an object, the basic assumption that the object cannot move around in memory is violated, and some classes like std::string do rely on this assumption. The C++ standard draws the line at the distinction between (non-)trivially copyable objects to avoid going into more, unnecessary detail about pointers and references. It only makes an exception for trivially copyable objects and says: Well, in this case you are safe. But do not blame me for the consequences should you try to memcpy() any other objects.
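One way to keep a byte-wise swap honest is to let the compiler enforce that line. The following is a minimal sketch, not a drop-in replacement for the code in the question: swap_bytes and Point are hypothetical names, and the helper assumes its arguments are complete objects rather than base-class subobjects.

```cpp
#include <cstring>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical helper: byte-wise swap that only compiles for types the
// standard allows to be copied with memcpy.
template <typename T>
void swap_bytes(T& a, T& b) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "swap_bytes requires a trivially copyable type");
    if (&a == &b) return;  // memcpy must not be given overlapping ranges

    unsigned char tmp[sizeof(T)];
    std::memcpy(tmp, &a, sizeof(T));
    std::memcpy(&a, &b, sizeof(T));
    std::memcpy(&b, tmp, sizeof(T));
}

// Hypothetical trivially copyable type for the usage example.
struct Point {
    int x;
    int y;
};

void demo() {
    Point p{1, 2}, q{3, 4};
    swap_bytes(p, q);          // fine: Point is trivially copyable

    std::string foo = "foo", bar = "bar";
    // swap_bytes(foo, bar);   // would be rejected by the static_assert
    std::swap(foo, bar);       // the type's own swap fixes up internal pointers
}
```

For std::string and similar classes, std::swap (or the member swap) does the pointer fix-ups that a raw byte swap cannot.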

Wyne answered 3/12, 2019 at 17:10 Comment(0)
B
0

What I can perceive here is that -- for some practical applications -- the C++ Standard may be too restrictive, or rather, not permissive enough.

As shown in other answers, memcpy breaks down quickly for "complicated" types, but IMHO, it actually should work for Standard Layout Types as long as the memcpy doesn't break what the defined copy-operations and destructor of the Standard Layout type do. (Note that even a TC class is allowed to have a non-trivial constructor.) The standard only explicitly calls out TC types wrt. this, however.

A recent draft quote (N3797):

3.9 Types

...

2 For any object (other than a base-class subobject) of trivially copyable type T, whether or not the object holds a valid value of type T, the underlying bytes (1.7) making up the object can be copied into an array of char or unsigned char. If the content of the array of char or unsigned char is copied back into the object, the object shall subsequently hold its original value. [ Example:

  #define N sizeof(T)
  char buf[N];        T obj; // obj initialized to its original value
  std::memcpy(buf, &obj, N); // between these two calls to std::memcpy,       
                             // obj might be modified         
  std::memcpy(&obj, buf, N); // at this point, each subobject of obj of scalar type
                             // holds its original value 

—end example ]

3 For any trivially copyable type T, if two pointers to T point to distinct T objects obj1 and obj2, where neither obj1 nor obj2 is a base-class subobject, if the underlying bytes (1.7) making up obj1 are copied into obj2, obj2 shall subsequently hold the same value as obj1. [ Example:

T* t1p;
T* t2p;       
     // provided that t2p points to an initialized object ...         
std::memcpy(t1p, t2p, sizeof(T));  
     // at this point, every subobject of trivially copyable type in *t1p contains        
     // the same value as the corresponding subobject in *t2p

—end example ]

The standard here talks about trivially copyable types, but as was observed by @dyp above, there are also standard layout types that do not, as far as I can see, necessarily overlap with Trivially Copyable types.
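That the two categories do not coincide can be checked with the standard type traits. The types below are hypothetical and exist only for illustration; the static_asserts compile on a conforming C++11 (or later) compiler:

```cpp
#include <type_traits>

// Standard-layout, but NOT trivially copyable: the copy constructor is
// user-provided, even though it only copies the single member.
struct SlNotTc {
    int x;
    SlNotTc() = default;
    SlNotTc(const SlNotTc& other) : x(other.x) {}
};

// Trivially copyable, but NOT standard-layout: non-static data members are
// declared in both the base class and the derived class.
struct Base {
    int a;
};
struct TcNotSl : Base {
    int b;
};

static_assert(std::is_standard_layout<SlNotTc>::value, "standard-layout");
static_assert(!std::is_trivially_copyable<SlNotTc>::value, "not TC");

static_assert(std::is_trivially_copyable<TcNotSl>::value, "TC");
static_assert(!std::is_standard_layout<TcNotSl>::value, "not standard-layout");
```

Only the second type falls under the memcpy guarantees quoted above; the first is the kind of type this answer argues the standard leaves unaddressed.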

The standard says:

1.8 The C++ object model

(...)

5 (...) An object of trivially copyable or standard-layout type (3.9) shall occupy contiguous bytes of storage.

So what I see here is that:

  • The standard says nothing about non Trivially Copyable types wrt. memcpy. (as already mentioned several times here)
  • The standard has a separate concept for Standard Layout types that occupy contiguous storage.
  • The standard does not explicitly allow nor disallow using memcpy on objects of Standard Layout that are not Trivially Copyable.

So it does not seem to be explicitly called out as UB, but it certainly also isn't what is referred to as unspecified behavior, so one could conclude what @underscore_d did in the comment to the accepted answer:

(...) You can't just say "well, it wasn't explicitly called out as UB, therefore it's defined behaviour!", which is what this thread seems to amount to. N3797 3.9 points 2~3 do not define what memcpy does for non-trivially-copyable objects, so (...) [t]hat's pretty much functionally equivalent to UB in my eyes as both are useless for writing reliable, i.e. portable code

I personally would conclude that it amounts to UB as far as portability goes (oh, those optimizers), but I think that with some hedging and knowledge of the concrete implementation, one can get away with it. (Just make sure it's worth the trouble.)


Side Note: I also think that the standard really should explicitly incorporate Standard Layout type semantics into the whole memcpy mess, because it's a valid and useful use case to do a bitwise copy of non Trivially Copyable objects, but that's beside the point here.

Link: Can I use memcpy to write to multiple adjacent Standard Layout sub-objects?

Baecher answered 18/8, 2016 at 15:11 Comment(5)
It's logical that TC status is needed for a type to be memcpyable as such objects must have default copy/move constructors & assign ops, which are defined as simple bytewise copies - like memcpy. If I say my type is memcpyable but has a non-default copy, I contradict myself & my contract with the compiler, which says that for TC types, only the bytes matter. Even if my custom copy ctor/assign just does a bytewise copy & adds a diagnostic message, ++s a static counter or something - that implies I expect the compiler to analyse my code & prove it doesn't mess with byte representation.Croaky
SL types are contiguous but can have user-provided copy/move ctors/assign ops. Proving all user ops bytewise equivalent to memcpy would mandate the compiler do unrealistic/unfair volumes of static analysis for each type. I don't have on-record this is the motivation, but it seems convincing. But if we believe cppreference - Standard layout types are useful for communicating with code written in other programming languages - are they much use without said languages being able to take copies in a defined way? I guess we can then only pass out a pointer after safely assigning on C++'s side.Croaky
@Croaky - I do not agree that it is logical to require this. TC is only necessary to make sure that a memcpy is semantically equivalent to a logical object copy. The OP example shows that swapping two objects bitwise is an example where no logical copy is performed, IMHO.Baecher
And there is no requirement on the compiler to check anything. If the memcpy messes up the object state, then you should not have used memcpy! What the std should explicitly allow I think, would be exactly a bitwise swap as OP with SL types, even if they are not TC. Course there would be cases where it breaks down (self referencing objects etc.) but that is hardly a reason to leave this in limbo.Baecher
Well, sure, maybe they could say: 'you can copy this if you want, & it's defined to have the same state, but whether that's safe - e.g. doesn't cause pathological sharing of resources - is on you'. Not sure whether I'd side with this. But agree that, whatever is decided... a decision should be made. Most cases like this of the Standard not being specific leave folk wanting the ability uneasy about whether they're safe to use it, & folk like me who read threads like this uneasy about the conceptual acrobatics some people use to put words in the mouth of the Standard where it leaves gaps ;-)Croaky
