memcpy zero bytes into const variable - undefined behavior?

In C and C++, is it undefined behavior to memcpy into a const variable when the number of bytes to be copied is zero?

int x = 0;
const int foo = 0;
memcpy( (void *)&foo, &x, 0 );

This question is not purely theoretical. I have a scenario in which memcpy is called and if the destination pointer points to const memory, then the size argument is guaranteed to be zero. So I'm wondering whether I need to handle it as a special case.

Silkweed answered 9/10, 2022 at 14:35 Comment(21)
Why use memcpy in C++ at all? That's what std::copy is for. The whole (void*) cast will disregard any constness and type safety (that's so important in C++). Also make sure you ask your question specifically for "C" or "C++"; they're different languages with different rules.Piloting
Presumably, if the destination is a pointer to const memory, then it is an invalid pointer and the behaviour is undefined according to cppreference.Statuesque
Does this answer your question? memcpy with destination pointer to const dataGrille
If you are copying 0 bytes, then you are not writing to protected memory. You are not writing anything.Lengthen
Why would this be undefined? The wonky pointer casts are usually legal, it's the dereferencing (or writing to the result of one) that's illegal.Pokey
@PepijnKramer The library is C but should also compile in/be compatible with C++.Silkweed
OK, I see. In that case, you might want to have two overloaded C++ functions calling this library function, one for const and one for non-const foo, and raise an error for the const version, since it is not about whether 0 bytes are copied; it is about const correctness.Piloting
@PepijnKramer the question isn't about copying to a const pointer. It's about passing a pointer that was originally const when there is nothing to copy anyway.Lengthen
@WeatherVane IMO it is. You should not overwrite a const variable, not even with memcpy. Looking at the number of bytes being zero is just "working" around the problem.Piloting
@PepijnKramer it isn't overwriting a const variable, as the question makes quite clear. memcpy() won't do anything. It won't dereference anything, or attempt to write anywhere.Lengthen
@PepijnKramer You may have missed the part of the question that makes it clear that this is not a practical question. It is a hypothetical question designed to explore the details of the language's rules.Kopans
Whether or not this is undefined behavior this conundrum is easily solved simply by adding an if statement that checks the number of bytes to copy and calling memcpy only if it is not 0. I would expect modern C++ compilers to compile the whole thing away, making the whole thing a moot point without any worries of whether this is undefined behavior, or not.Rig
@FrançoisAndrieux Ok right. I think I just get hung up on the "C" style (void*) cast. Nothing in memcpy (C++ standard) seems to mention what happens if number of bytes is 0. Exploring a bit on godbolt, no code is emitted when copying 0 bytes (godbolt.org/z/9b34fPzrb)Piloting
@SamVarshavchik Yes that would be the practical approachPiloting
@Pokey The standard imposes some limitations concerning memcpy that make some things surprisingly undefined behavior. For example memcpy( NULL, NULL, 0 ) is technically undefined behavior because the pointers passed in must be valid, even though no copy is actually occurring. As for my original question, I couldn't find anything in the standard covering this exact scenario, though there may be something in there.Silkweed
@PepijnKramer "Why use memcpy in C++ at all?" - there are several situations/corners in C++ where the only way to do type punning without UB is to go via memcpy, so it's not unreasonable to see it in C++ code.Ideomotor
@PepijnKramer My actual call doesn't use 0 as a literal but another variable that will be 0 if the destination pointer points to const memory or non-zero if it points to writable memory. So it's doubtful that the compiler will simply omit the call altogether, as it would in my trivial example.Silkweed
@JesperJuhl: From C++20 onward, doesn't std::bit_cast take care of most or all of those situations?Magnetize
@NateEldredge It might. I haven't researched in detail personally.Ideomotor
@JacksonAllan I checked on Godbolt and it seems compilers do not omit the zero check: godbolt.org/z/vK4Y4KKnhHorror
@EricMSchmidt: Since gcc calls an implementation of memmove which isn't bundled with gcc, it has no way of knowing whether that function might malfunction if passed invalid pointers with a size of zero. Since ZeroCountCheckedMemcpy would have defined behavior in that case but memcpy would not, omitting the check could adversely affect the behavior of what should be a Strictly Conforming C Program.Levins

The older question "Is it guaranteed to be safe to perform memcpy(0,0,0)?" points out 7.1.4p1 of the C standard:

Each of the following statements applies unless explicitly stated otherwise in the detailed descriptions that follow: If an argument to a function has an invalid value (such as a value outside the domain of the function, or a pointer outside the address space of the program, or a null pointer, or a pointer to non-modifiable storage when the corresponding parameter is not const-qualified) or a type (after promotion) not expected by a function with variable number of arguments, the behavior is undefined.

The prototype for memcpy is

void *memcpy(void * restrict s1, const void * restrict s2, size_t n);

where the first parameter is not const-qualified, and &foo points to non-modifiable storage. So this code is UB unless the description of memcpy explicitly states otherwise, which it does not. It merely says:

The memcpy function copies n characters from the object pointed to by s2 into the object pointed to by s1.

This implies that memcpy with a count of 0 does not copy any characters (which is also confirmed by 7.24.1p2 "copies zero characters", thanks Lundin), but it does not exempt you from the requirement to pass valid arguments.
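
If you would rather not rely on how a particular library or compiler treats the zero-length case, one option is to keep memcpy from ever seeing the pointers when there is nothing to copy. The following is only a minimal sketch: copy_nonempty is a hypothetical helper name, not a standard function, and the assert is merely an optional debug-time precondition check.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the C library): forwards to memcpy only
   when there is something to copy, so the library never receives the
   pointers in the n == 0 case. */
static void *copy_nonempty(void *dst, const void *src, size_t n)
{
    if (n != 0) {
        /* Debug-only precondition check; compiled out when NDEBUG is defined. */
        assert(dst != NULL && src != NULL);
        memcpy(dst, src, n);
    }
    return dst;
}
```

With the variables from the question, copy_nonempty((void *)&foo, &x, 0) performs the cast (which is allowed by itself) but never reaches memcpy, so 7.1.4p1 does not come into play for that call.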

Magnetize answered 9/10, 2022 at 18:36 Comment(22)
Since C++ doesn't relax the requirements of the C library, this is equally true for C++.Declarer
I suspect this will be a Schrödinger's bug - there is absolutely no reason why calling it in the way you describe would cause any problems whatsoever; but as soon as you actually implement it, you will discover that there is exactly one library on the planet that crashes when you do this, and it's the one you are using.Then
@BettyCrokker Or worse, the library would be perfectly fine if you actually called it. But the compiler sees the call as a special cased memcpy, and infers it can't happen and removes the calling code path.Gimcrackery
"This implies that memcpy with a count of 0 does nothing" C17 7.24.1/2 explicitly states that memcpy (a copy function) copies zero bytes in case the size parameter is zero.Scrod
If the question has already been answered somewhere, it should be closed as a dupQp
I will gladly complain to any compiler vendor that makes memcpy(NULL, NULL, 0); not work. Requiring the argument to be readable (let alone writable) on zero bytes is a perverse reading of the standard.Whitechapel
@Joshua: A major goal of the Standard is to allow compilers to be as useful as possible. If on some platform, an implementation that behaved oddly when given memcpy(0,0,0) would be genuinely more useful for some task than one that would treat it as a no-op, I think the Standard would be intended to allow such treatment. If there is no case where such treatment would be useful, then nobody should have any reason to care about whether the Standard would allow such treatment, and thus there would be no reason for the Standard to spend ink forbidding it.Levins
@supercat: git had to wrap memmove already because of a similar bug where the function worked just fine but the compiler's optimizer would occasionally assume the argument to memmove was not null and make improper optimizations on that assumption. Unfortunately I don't know which compiler was the offender.Whitechapel
@Joshua: Ironically, such "optimizations" actually end up forcing less efficient code generation, since a compiler that processes if (size) memcpy(dest, src, size); by calling an external library function can't omit the if, but of course the function will have to do its own size==0 check in addition to the one performed by the calling code. One of the design principles behind C was that if on some platform no machine code would be necessary to ensure acceptable behavior in some case, neither the programmer nor compiler should have to write code to handle that case.Levins
@Joshua: Someone seeking to write a quality compiler, given a choice between having it process some programs in a more useful way or a less useful way, should make a bona fide effort to have it process the programs in the more useful way. The fact that clang and gcc use the Standard as an excuse to behave in gratuitously meaningless fashion doesn't imply the Standard is defective, but merely that the authors aren't making a bona fide effort to produce maximally-useful compilers.Levins
@supercat: You'd expect that a 2022 compiler inlines memmove and thus eliminates any redundant checks. "Expect" as in "file a bug report if it doesn't"Ugaritic
@MSalters: On many platforms, the per-byte cost of a simple in-line memcpy or memmove will be so much greater than that of an optimized one that the former would only be faster when copying fewer than about 16 bytes. A well-designed C implementation should have multiple different library functions which would be employed in different scenarios (e.g. where the destination is known to be word-aligned but the source isn't, and the size is known to be a multiple of 4), but neither clang nor gcc is designed to accommodate such things.Levins
@MSalters: Have you seen the contents of a modern memmove? Inlining it is a mistake unless the compiler has a lot more info than usual. (It rarely knows the actual alignment of a char* pointer for example.)Whitechapel
@MSalters: Also, take a look at godbolt.org/z/v8j8v49E5 and tell me what you think.Levins
@supercat: I know that there are indeed quite some reasonable different implementations, optimal for different cases. That's why I expect memmove itself to be inlined, so your "n<16" check can be optimized for the many cases where that size is some sizeof Foo.Ugaritic
@Joshua: I have - saw it on a SO question not that long ago. It's not uncommon to pass &someStackObject for instance, where the optimizer knows it's aligned and can reasonably guess the object is in L1 cache.Ugaritic
@MSalters: In the linked godbolt example, a compiler would know that n is exactly 7, and would know that the start of the destination address is one byte before the start of the source, but would generate a machine code instruction to call library function memmove.Levins
@supercat: I saw it. Tricky one - either p or p+1 is misaligned, if not both. And the function is basically a special case of memmove. Experimenting a bit more, GCC is the exception. It also calls memmove on ARM. MSVC and clang inline the code for x64 and ARM, and ICC inlines it too.Ugaritic
@MSalters: The "Gratuitously Clever Compiler" is exceptional in quite a few ways. What I find really funny with the example I quoted is that gcc takes code that's written as a loop, and converts it into a call to an outside memmove function. If gcc had come with a multiple-entry library function like memcpyup8: ldr r2,[r0,#0] / str r2,[r1,#0] / memcpyup7: ldr r2,[r0,#1] / str r2,[r1,#1] / ..., calling one of its entry points could offer better performance than using a loop, and be more compact in cases where one function could be called from multiple places in the program, but...Levins
...I am doubtful that a memmove function could be written to be faster than a simple in-line loop, or that the slight space savings from the memcpy call would be appreciated by programmers who wrote the loop instead of a memcpy call.Levins
@supercat: I've seen a memmove implementation that would at first call check whether or not the CPU had SIMD support and swap in an alternate version of itself that could move 16 bytes at a time. The compiler's not emitting code that does that.Whitechapel
@Joshua: Probably not, though in the days before write-protected code segments, both the PC and IIRC the Macintosh had some conventions for code sequences that could be patched automatically if a numeric coprocessor was present.Levins

It's clear that on the vast majority of platforms, an implementation which would process memcpy(anything, anything, 0) as a no-op, regardless of the validity of the source and destination pointers, would be in every way, in essentially every non-contrived scenario, as good or better than one that does anything else.

The way the Standard is written, however, could be interpreted as specifying that compilers are allowed to treat as UB any situation where the destination address is not associated with writable storage.

If one is using an implementation that seeks to process corner cases applying the philosophy documented in the Rationale document published by the authors of the Standard, without regard for whether the Standard unambiguously mandates such behavior, all memcpy and memmove operations where the size is zero will be reliably processed as no-ops. If the size will often be zero, there may be performance advantages to skipping a memcpy or memmove call in the zero-size case, but such a check would never be required for correctness.

If, however, one wishes to ensure reliable compatibility with compiler configurations that aggressively assume that code will never receive inputs that trigger corner cases that aren't 100% unambiguously mandated by the Standard, and are designed to generate nonsensical code if such inputs are received, then it will be necessary to add a size==0 check in any case where a zero size might be accompanied by anything other than a pointer to writable storage, recognizing that such a check may negatively affect performance in situations where the size is very seldom zero.
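
As a rough illustration of where such a size==0 check would sit in the kind of scenario the question describes, here is a sketch in C. Everything in it is assumed for the example rather than taken from the asker's library: byte_buf, empty_sentinel, byte_buf_empty and byte_buf_fill are hypothetical names for a container whose empty state points at shared const storage and therefore always has capacity zero.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container: an empty buffer points at a shared const sentinel
   and has capacity 0, so it never needs an allocation of its own. */
typedef struct {
    unsigned char *data;
    size_t cap;
} byte_buf;

static const unsigned char empty_sentinel = 0;

static byte_buf byte_buf_empty(void)
{
    /* The cast alone is allowed; only writing through the pointer, or handing
       it to a library function as a destination, is the problem. */
    byte_buf b = { (unsigned char *)&empty_sentinel, 0 };
    return b;
}

/* Copies at most dst->cap bytes from src and returns the number copied. */
static size_t byte_buf_fill(byte_buf *dst, const void *src, size_t n)
{
    assert(dst != NULL);
    size_t count = n < dst->cap ? n : dst->cap;
    if (count != 0)   /* size==0 check: memcpy never sees the const sentinel */
        memcpy(dst->data, src, count);
    return count;
}
```

Because the capacity of the sentinel-backed buffer is always zero, count is zero whenever data points at const storage, and the branch keeps that pointer away from memcpy. The cost is one extra comparison per call, which is the trade-off described above.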

Levins answered 10/10, 2022 at 6:40 Comment(24)
@Levins Every single time someone asks a question about undefined behavior in C on this site, you post a version of this rant, and I have to ask, why do you keep posting it here, where none of the people whose minds you need to change will read it? You could write a position paper for the C committee. You could bring it up on the GCC or LLVM development mailing lists. You could do some actual research to back up your assertions and then publish in PLDI or ASPLOS. You have better options than getting ignored here.Farly
@zwol: My answer directly ties in with the question asked, mentioning why guaranteeing defined behavior might potentially increase the cost of straightforward implementation on some platforms. There is no reason why anyone who isn't targeting such platforms should need to worry about such things, and 20 years ago I think pretty much everyone would have agreed that there was no need to worry about them, but a lot of code is processed with compilers that gratuitously make it necessary to worry about such things, lest compiler "optimizations" facilitate arbitrary remote code execution.Levins
@zwol: Can someone be a good C programmer today without understanding both (1) the Standard was not written with the intention of making programmers jump through gratuitous hoops, and (2) some compilers are designed to behave in dangerously nonsensical fashion unless programmers jump through gratuitous hoops not intended by the Standard? While #2 may seem like an outrageous claim that I would not expect people to believe without evidence, since it's outrageous to think that supposedly-general-purpose compilers would be designed in such fashion, the only way one could say that they weren't...Levins
...would be to argue either that the hoops required to avoid nonsensical behavior were required by the authors of the Standard, or be unaware of the hoops that some compilers require. Perhaps what's needed is a retronym to distinguish the family of dialects the C Standard was chartered to describe from the one gcc and clang seek to process, so as to allow most of these questions to be easily and non-controversially answered "defined in all commonplace dialects of [retronym] but not defined in gcc-c or clang-c."Levins
Nope. Nobody here cares. Posters on this site only care about what you actually have to do to get your code to work today, which means accepting current-generation compilers' interpretations of the standard as what C is. Again, you'd be much better served to write up a N-document for the committee, or an academic paper on how much """legacy""" code has actually been broken by current compilers, or really anything besides writing more or less the same 500 words over and over again on a site where nobody cares.Farly
If you want to think that makes people here "bad C programmers", you go right ahead.Farly
@zwol: Perhaps I wasn't quite accurate: people who use tools should be familiar with how those tools work, including potentially unexpected aspects. Further, the authors of clang and gcc have stated that in cases where the Standard doesn't mandate that a piece of code be processed meaningfully, they should not be expected to give notice if the next version of their compiler arbitrarily changes the behavior of such code.Levins
@zwol: I've rewritten the answer to be more focused on practical aspects. Better?Levins
@Levins Yeah, I'll retract my downvote on that basis, but please consider transferring your efforts to a venue where it might actually bring about the change you want.Farly
Two cents about UB. To my knowledge, Linux kernel is built using -fno-strict-aliasing, -fno-delete-null-pointer-checks, and -fno-strict-overflow. Meaning that Linux kernel relies on a particular compiler's behavior in case of UB (e.g. that code like if (i+1 > i) if i is signed integer is NOT folded to if (1)). Perhaps the Linux kernel code can be changed/revised/fixed, so these no- can be removed leading to perf. increase.Disentitle
@pmor: There are many situations where machine code which was agnostic to possibilities of things like integer overflow, thus allowing e.g. (x+y)>y to be replaced with x>0, could be more efficient than code which rigidly defines behavior in all such cases. There is a huge difference between that, however, and the kinds of optimizations gcc performs around integer overflow, allowing even something as benign as uint1 = (ushort1*ushort2) & 0xFFFF; to arbitrarily corrupt memory if ushort1 exceeds INT_MAX/ushort2.Levins
@pmor: BTW, I would hope Linux also includes a flag to prevent the "optimizations" that would cause side-effect-free infinite loops to arbitrarily corrupt memory even if every single operation performed with them would be defined if processed individually.Levins
@Levins I guess that not all C programmers know that "implementations ... may also treat pointers based on different origins as distinct even though they are bitwise identical" (demo, GCC bug 61502). I don't know whether such implementations (e.g. GCC) provide an option to disable such treatments. Re: "side-effect-free infinite loops to arbitrarily corrupt memory": can you elaborate?Disentitle
@pmor: If clang in either C or C++ mode, or gcc in C++ mode, is given something like unsigned test(unsigned x) { unsigned i=1; while((i & 255) != x) i*=3; if (x < 256) arr[x] = 1; return i; }, then in situations where the return value is unused the store to arr[x] will be performed unconditionally. If the Standard were written in a way that could accommodate behavior inconsistent with sequential program execution without defenestrating all requirements, then a good abstraction model would allow x < 256 to be replaced with ((i & 255)==x) && (x < 256), which could...Levins
...then be replaced with ((i & 255)==x), and that could in turn be consolidated with the previous check of that condition in the loop if such check were actually performed, but such consolidation should cause the check outside the loop to be replaced with an artificial data dependency on the computation of (i & 255) == x within the loop, rather than being dropped altogether. This should in turn preclude the elimination of the loop [unless a compiler recognizes that it would be more efficient to drop the loop and retain the post-loop comparison than vice versa].Levins
@pmor: One of the goals of the Standard is to avoid forbidding optimizations that might, individually, be useful in some circumstances. It isn't designed to identify a set of optimizations that may be safely combined in arbitrary ways without limit.Levins
@pmor: Clang and gcc use an abstraction model that allows for the possibility of program executions where pointers might be coincidentally bitwise identical without being able to identify the same object, but does not allow for the possibility that such pointers might be used to access objects after they are observed to be identical.Levins
@Levins Consider this. There per abstract machine no xxx is printed. However, per ICC xxx is printed. Can a conforming C11 compiler eliminate infinite recursion per C11 5.1.2.3/4?Disentitle
@pmor: Practical conforming non-optimizing compilers for some platforms might by chance output xxx if just the right amount of memory happened to be available when the program launched, and the code for main() happened to get placed in memory before the code for f(), such that the stack frames that were generated for calling f() happened to overwrite the code for f() with machine instructions to jump to a spot in main() just past the call to f(). The way the Standard is written, there are very few circumstances where anything an otherwise-conforming implementation might do...Levins
...in response to any particular program would render it non-conforming, and provided that an implementation issues at least one diagnostic for all programs requiring one (a requirement that could be satisfied by unconditionally outputting a "Warning: This program's diagnostics are garbage" message), all such situations would either involve #error directives or source programs that exercise at least one translation limit given in N1570 5.2.4.1. Since the program you linked does neither of those things, an implementation that output at least one diagnostic need not meet any other requirements.Levins
@pmor: The Standard tries to portray itself as a "contract" between programmers and implementations, but if one looks at the actual requirements imposed upon conforming programs and conforming implementations, it exercises almost no meaningful normative authority.Levins
If you specified a bytecount of 0, wouldn't it decrement to 0xFFFFFFFF and have to reach zero again to stop?Bardo
@puppydrum64: The behavior of memcpy is specified as treating a zero length copy as a request to copy zero bytes. If a compiler were bundled with its own memory-copy function(s), and it could determine at a particular call site that the length could never be zero, the compiler could invoke a memory-copy function which copied the first byte unconditionally, and could thus be a tiny bit faster than one which had to exit early in the length-is-zero case.Levins
