Why no sanity checks in legacy strcpy()
P

9

10

Following is the most popular implementation of strcpy in traditional systems. Why are dest and src not checked for NULL at the start? I once heard that in the old days memory was limited, so short code was always preferred. Would you implement strcpy and other similar functions with NULL pointer checks at the start nowadays? Why not?

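char *strcpy(char *dest, const char *src)
{
   char *save = dest;            /* remember the start of dest for the return value */
   while ((*dest++ = *src++))    /* copy each char; the loop ends after '\0' is copied */
      ;
   return save;
}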
Perceptive answered 1/9, 2010 at 8:40 Comment(8)
It may be safer in general. But that also means experienced developers have to pay the cost of safety (that they do not need) just so that inexperienced developers do not snarf up.Typesetting
Experienced developers have to pay the cost of safety (that they do not think they need)... (FTFY).Discreditable
@Brian Hooper - no, if you're using C, you should know exactly what you need. My embedded code never, ever needs NULL checks on strcpy because all buffers are statically allocated and used directly. There is absolutely no way I will ever pass NULL to strcpy. So why would I want to pay the price? There's no "do not think I need" about it.Metaphor
Not crashing on NULL pointers is not safety unless that's what's specified. Handling that case when it's not specified means passing off the problem to another function - which might have unexpected consequences. The only safe program is a terminated one.Shack
Useless NULL checks in functions that do not assign special meaning to NULL arguments are a bane of bad C libraries. They lock you into added waste and encourage bad coders to toss NULL pointers around as if they were a universally-valid "empty string" or something.Intercessor
@detly, perhaps you are right. The point I was half-seriously trying to make was that although inexperienced developers snarf up, so do experienced developers and the difference is chiefly that experienced developers know in advance that they are going to do so, and take steps to mitigate the damage it causes.Discreditable
@Brian Hooper - I concede that is a reasonable point to make :)Metaphor
On some systems NULL could actually be a valid address.Storeroom
S
20

NULL is a bad pointer, but so is (char*)0x1. Should it also check for that? In my opinion (I don't know the definitive reason why), sanity checks in such a low-level operation are uncalled for. strcpy() is so fundamental that it should be treated something like an asm instruction, and you should do your own sanity checks in the caller if needed. Just my 2 cents :)

Selvage answered 1/9, 2010 at 8:45 Comment(5)
I agree: low level routines should be implemented for efficiency and high level routines should add security when applicable.Vaillancourt
+1 for pointing out that NULL is only one example of the 99.9% of pointer space that's likely also invalid.Intercessor
What makes you say (char*)0x1 is necessarily a bad pointer? In C99, the null pointer is a special case in that it "is guaranteed to compare unequal to a pointer to any object or function." (6.3.2.3).Bethanybethe
I'm just illustrating a point, no need to be pedantic. On my machine, 0x1 is a bad pointer. Will you also criticize R for his inaccurate statistic of 99.9%? :PSelvage
If your program's size in memory is 4 megs on a 32-bit machine, then 99.9% of possible pointers are invalid -- and that's assuming char pointers with no alignment restrictions. Change that to int and the threshold goes up by 4x. And of course if you're on a 64-bit machine, 99.999999% of pointer values will be invalid in the vast majority of programs.Intercessor
M
15

There are no sanity checks because one of the most important underlying ideologies of C is that the developer supplies the sanity. When you assume that the developer is sane, you end up with a language that can be used to do just about anything, anywhere.

This is not an explicitly stated goal — it's quite possible for someone to come up with an implementation that does check for this, and more. Maybe they have. But I doubt that many people used to C would clamour to use it, since they'd need to put the checks in anyway if there was any chance that their code would be ported to a more usual implementation.

Metaphor answered 1/9, 2010 at 9:1 Comment(1)
...the developer supplies the sanity. - I like that ;)Diaz
D
11

The whole C language is written with the motto "We'll behave correctly provided the programmer knows what he's doing." The programmer is expected to know to make all the checks he needs to make. It's not just checking for NULL, it's ensuring that dest points to enough allocated memory to hold src, it's checking the return value of fopen to make sure the file really did open successfully, knowing when memcpy is safe and when memmove is required, and so on.
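
As a rough illustration of the caller-side checks listed above, the sketch below confirms that the destination is large enough before calling strcpy, uses memmove where source and destination overlap, and checks the return value of fopen. The helper names copy_if_fits, drop_prefix and can_open are hypothetical, made up for this example only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy src into dest only if it fits, '\0' included.
   Returns 0 on success, -1 if dest is too small. */
static int copy_if_fits(char *dest, size_t dest_size, const char *src)
{
    if (strlen(src) >= dest_size)
        return -1;
    strcpy(dest, src);
    return 0;
}

/* Hypothetical helper: remove the first n chars of s in place.
   Source and destination overlap, so memmove is required, not memcpy. */
static void drop_prefix(char *s, size_t n)
{
    assert(s != NULL);               /* debug-only check, compiled out with NDEBUG */
    size_t len = strlen(s);
    if (n > len)
        n = len;
    memmove(s, s + n, len - n + 1);  /* +1 moves the terminating '\0' too */
}

/* Hypothetical helper: returns 1 if path can be opened for reading, else 0. */
static int can_open(const char *path)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)                   /* fopen reports failure by returning NULL */
        return 0;
    fclose(f);
    return 1;
}
```

None of these checks lives inside strcpy, memmove or fopen; each one is made by the caller, which is the only place that knows the buffer size and what to do on failure.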

Getting strcpy to check for NULL won't change the language paradigm. You will still need to ensure that dest points to enough space -- and this is something that strcpy can't check for without changing the interface. You will also need to ensure that src is '\0'-terminated, which again strcpy can't possibly check.

There are some C standard library functions which do check for NULL: for example, free(NULL) is always safe. But in general, C expects you to know what you're doing.

[C++ generally eschews the <cstring> library in favour of std::string and friends.]

Dionisio answered 1/9, 2010 at 8:59 Comment(1)
..which is so utterly incompatible, it seems to be designed to hurt.Manaus
W
6
  1. It's usually better for the library to let the caller decide what it wants the failure semantics to be. What would you have strcpy do if either argument is NULL? Silently do nothing? Fail an assert (which isn't an option in non-debug builds)?

  2. It's easier to opt-in than it is to opt-out. It's trivial to write your own wrapper around strcpy that validates the inputs and to use that instead. If, however, the library did this itself, you would have no way of choosing not to perform those checks short of re-implementing strcpy. (For example, you might already know that the arguments you pass to strcpy aren't NULL, and it might be something you care about if you're calling it in a tight loop or are concerned about minimizing power usage.) In general, it's better to err on the side of granting more freedom (even if that freedom comes with additional responsibility).
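
A minimal sketch of such an opt-in wrapper, assuming one particular failure policy: assert in debug builds, and return NULL without copying in release builds. The name checked_strcpy and the policy itself are assumptions for illustration; a different caller might prefer to abort or log instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: validates the arguments, then defers to strcpy. */
static char *checked_strcpy(char *dest, const char *src)
{
    assert(dest != NULL && src != NULL);  /* compiled out when NDEBUG is defined */
    if (dest == NULL || src == NULL)
        return NULL;                      /* release-build policy: report and do not copy */
    return strcpy(dest, src);
}
```

Code that already knows its pointers are valid keeps calling plain strcpy and pays nothing for the checks.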

Western answered 1/9, 2010 at 10:45 Comment(1)
+1 for custom wrapper that implements your error handling policy. (Though I probably wouldn't wrap strcpy individually. I use a StrOnBuf class that wraps the core character buffer manipulation routines, and can be configured to truncate silently, truncate with debug assert, or throw).Manaus
F
3

The most likely reason is: Because strcpy is not specified to work with NULL inputs (i.e. its behaviour in this case is undefined).

So, what should a library implementer choose to do if a NULL is passed in? I would argue that the best thing to do is to let the application crash. Think of it this way: A crash is a fairly obvious sign that something has gone wrong... silently ignoring a NULL input, on the other hand, may mask a bug that will be much harder to detect.

Furan answered 1/9, 2010 at 8:46 Comment(3)
No. strcpy on NULL input is undefined behaviour, which may crash, or it may silently do the right thing. You certainly can't rely on a runtime error from using strcpy with NULL.Dionisio
It might do, but the reality is that it won't.Libau
@Philip: Good point -- I've edited the answer to remove the erroneous statement that "the correct thing to do is to crash" (but I would still argue that it's the best thing to do).Furan
C
2

NULL checks were not implemented because C's earliest targets supported strong memory protections. When a process attempted to read from or write to NULL, the memory controller would signal the CPU that an out-of-range memory access was attempted (segmentation violation), and the kernel would kill the offending process.

This was an alright answer, because code attempting to read from or write to a NULL pointer is broken; the only answer is to re-write the code to check return values from malloc(3) and friends and take corrective action. By the time you're trying to use pointers to unallocated memory, it is too late to make a correct decision about how to fix the situation.
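
A small sketch of what checking malloc at the point of allocation might look like, so that a NULL never reaches strcpy in the first place. The helper name dup_string is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap copy of src, or NULL if allocation fails.
   The caller owns the result and must free() it. */
static char *dup_string(const char *src)
{
    assert(src != NULL);         /* caller's precondition, checked in debug builds only */
    size_t n = strlen(src) + 1;  /* +1 for the terminating '\0' */
    char *copy = malloc(n);
    if (copy == NULL)
        return NULL;             /* deal with the failure here, where it can still be handled */
    strcpy(copy, src);
    return copy;
}
```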

Coonan answered 1/9, 2010 at 8:47 Comment(0)
D
0

You should think of the C standard library functions as the thinnest possible additional layer of abstraction above the assembly code that you don't want to churn out to get your stuff out the door. Everything beyond that, like error checking, is your responsibility.

Diann answered 1/9, 2010 at 9:5 Comment(0)
P
0

In my view, any function you define has a precondition and a postcondition. Checking the preconditions should never be the function's own job. The following precondition for using strcpy is taken from the man page:

The strcpy() function copies the string pointed to by src (including the terminating '\0' character) to the array pointed to by dest. The strings may not overlap, and the destination string dest must be large enough to receive the copy.

Now, if the precondition is not met, the behaviour may be undefined.

Would I include a NULL check in my strcpy now? I would rather have a separate safe_strcpy. Since that function gives safety priority, I would definitely include NULL checks and handle overflow conditions in it, and its precondition would be modified accordingly.
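
One possible shape for such a safe_strcpy, as a sketch only. Handling overflow means the function has to be told the destination size, so the signature below (an extra dest_size parameter and an int status return) is an assumption and differs from strcpy's interface.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical safe_strcpy: returns 0 on success, -1 if either pointer is NULL
   or src (with its '\0') does not fit in dest_size bytes.
   dest is left untouched on failure. */
static int safe_strcpy(char *dest, size_t dest_size, const char *src)
{
    if (dest == NULL || src == NULL)
        return -1;
    size_t len = strlen(src);
    if (len >= dest_size)
        return -1;
    memcpy(dest, src, len + 1);  /* length is known, so copy the '\0' with it */
    return 0;
}
```

The remaining preconditions are weaker but not gone: src must still be '\0'-terminated, dest_size must be accurate, and the two buffers must not overlap.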

Partner answered 1/9, 2010 at 9:13 Comment(0)
P
0

There are simply no error semantics defined for it. In particular there is no way for strcpy to return an error value. C99 simply states:

The strcpy function returns the value of s1.

So for a conforming implementation there wouldn't even be a possibility to return the information that something went wrong. So why bother with it?

All this is deliberate, I think, since most compilers replace strcpy directly with very efficient assembler. Error checks are up to the caller.

Papillose answered 1/9, 2010 at 11:18 Comment(3)
Since behavior is undefined if NULL is passed, strcpy could conceivably return something other than s1 when s1 is NULL. Or it could fail to return at all (crash or infinite loop).Intercessor
What if you were the first one, asked to design strcpy from scratch and also to write the C99 standard yourself? Would you change it to return some error value?Perceptive
@user436748: unfortunately this is purely hypothetical, but there are three options. For a "high level" design I would just require it to return NULL on error, to set errno with an indication, and to leave the original data unchanged in such a case. This is done in several other places, but not here for strcpy. If I designed it as "low level" I'd go for "just do the right thing", but I would in addition require it to produce a segfault if one of the pointers is NULL. Finally, since you asked, in real C99 you could go for char* strcpy(char s1[static 1], char const s2[static 1]);Papillose
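
A sketch of how the [static 1] declaration from the last comment could look on a complete function. The name strcpy_nonnull is hypothetical, chosen to avoid clashing with the library's strcpy. In C99, [static 1] states that each argument must point to at least one element, so passing NULL is undefined behaviour; it is a promise made by the caller, not a runtime check, although some compilers can warn when a literal NULL is passed.

```c
/* Hypothetical name; the same kind of copy loop, with C99 [static 1] parameters. */
static char *strcpy_nonnull(char dest[static 1], const char src[static 1])
{
    char *save = dest;
    while ((*dest++ = *src++) != '\0')  /* stops after the '\0' has been copied */
        ;
    return save;
}
```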
