Why does this intentionally incorrect use of strcpy not fail horribly?

5

11

Why does the below C code using strcpy work just fine for me? I tried to make it fail in two ways:

1) I tried strcpy from a string literal into allocated memory that was too small to contain it. It copied the whole thing and didn't complain.

2) I tried strcpy from an array that was not NUL-terminated. The strcpy and the printf worked just fine. I had thought that strcpy copied chars until a NUL was found, but none was present and it still stopped.

Why don't these fail? Am I just getting "lucky" in some way, or am I misunderstanding how this function works? Is it specific to my platform (OS X Lion), or do most modern platforms work this way?

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
    char *src1 = "123456789";
    char *dst1 = (char *)malloc( 5 );

    char src2[5] = {'h','e','l','l','o'};
    char *dst2 = (char *)malloc( 6 );

    printf("src1: %s\n", src1);
    strcpy(dst1, src1);
    printf("dst1: %s\n", dst1);
    strcpy(dst2, src2);
    printf("src2: %s\n", src2);
    dst2[5] = '\0';
    printf("dst2: %s\n", dst2);

    return 0;
}

The output from running this code is:

$ ./a.out   
src1: 123456789
dst1: 123456789
src2: hello 
dst2: hello
Potsdam answered 21/8, 2011 at 17:29 Comment(2)
Try doing strlen for all 4 of them. – Fiftieth
If your source strings were, say, 64 bytes or so long, but the allocated space remained at 5 and 6, you'd probably see less well-behaved output. If you freed the space, you'd also see problems, most likely. Finally, run with valgrind and you will again see problems. – Recha
19

First, copying into an array that is too small:

C has no protection for going past array bounds, so if there is nothing sensitive at dst1[5..9], then you get lucky, and the copy goes into memory that you don't rightfully own, but it doesn't crash either. However, that memory is not safe, because it has not been allocated to your variable. Another variable may well have that memory allocated to it, and later overwrite the data you put in there, corrupting your string later on.
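
For comparison, here is a minimal sketch of a copy that stays inside memory it owns, by sizing the allocation from the source string instead of a fixed 5. The name copy_to_heap is made up for illustration; it is not from the question.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of src, or NULL if malloc fails.
   The caller is responsible for calling free() on the result. */
char *copy_to_heap(const char *src) {
    size_t needed = strlen(src) + 1;   /* +1 for the terminating '\0' */
    char *dst = malloc(needed);
    if (dst == NULL) {
        return NULL;
    }
    memcpy(dst, src, needed);          /* copies the terminator as well */
    return dst;
}
```

With this, copy_to_heap("123456789") asks for 10 bytes, so nothing is written past the end of the allocation.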

Secondly, copying from an array that is not null-terminated:

Even though we're usually taught that memory is full of arbitrary data, huge chunks of it are zeroed out. Even though you didn't put a null terminator in src2, chances are good that src2[5] (the byte just past the end of the array) happens to be \0 anyway. This makes the copy succeed. Note that this is NOT guaranteed, and it could fail on any run, on any platform, at any time. But you got lucky this time (and probably most of the time), and it worked.
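
If you can't be sure a buffer is terminated, one option is to look for the terminator inside the known size before copying, rather than letting strcpy run off the end. This is only a sketch; checked_copy is a hypothetical name, and the -1 error convention is my own choice.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst only if src contains a '\0'
   within its first src_size bytes AND the string (with terminator) fits
   in dst_size bytes. Returns 0 on success, -1 otherwise. */
int checked_copy(char *dst, size_t dst_size,
                 const char *src, size_t src_size) {
    const char *end = memchr(src, '\0', src_size);
    if (end == NULL) {
        return -1;                     /* src is not terminated */
    }
    size_t len = (size_t)(end - src);  /* length without the terminator */
    if (len + 1 > dst_size) {
        return -1;                     /* dst is too small */
    }
    memcpy(dst, src, len + 1);
    return 0;
}
```

Called with the question's src2 and sizeof src2, this refuses to copy, because no terminator exists in those 5 bytes. Declaring the source as char src2[] = "hello"; gives a 6-byte array that passes the check.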

Tremor answered 21/8, 2011 at 17:33 Comment(1)
NULL is (a macro that expands to) a null pointer constant. It should not be used to refer to the null or NUL character, '\0'. The phrase "null terminator" would be ok. – Optative
14

Writing beyond the bounds of allocated memory causes Undefined Behavior.
So in a way, yes, you got lucky.

Undefined behavior means anything can happen. The behavior cannot be explained in terms of the language, because the Standard, which defines the rules of the language, does not define any behavior for this case.

EDIT:
On second thought, I would say you are really unlucky here that the program works fine and does not crash. That it works now does not mean it will always work. In fact, it is a ticking bomb waiting to go off.

As per Murphy's Law:
"Anything that can go wrong will go wrong" ["and most likely at the most inconvenient possible moment"]

[ ] is my addition to the Law :)

Squirmy answered 21/8, 2011 at 17:31 Comment(0)
4

Yes, you're quite simply getting lucky.

Typically, the heap is contiguous. This means that when you write past the malloced memory, you could be corrupting the following memory block, or some internal data structures that may exist between user memory blocks. Such corruption often manifests itself long after the offending code has run, which makes this type of bug difficult to debug.

You're probably getting the NULs because the memory happens to be zero-filled (which isn't guaranteed).
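
If you actually need zeros, ask for them instead of hoping. A small sketch, assuming you know how many characters you have: calloc returns zero-filled memory, so allocating one extra byte leaves a terminator in place after the copy. make_string is a name I made up for this example.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: builds a terminated heap string from n raw chars.
   Returns NULL on failure; the caller must free() the result. */
char *make_string(const char *chars, size_t n) {
    if (n == SIZE_MAX) {
        return NULL;                /* n + 1 would overflow */
    }
    char *s = calloc(n + 1, 1);     /* all n + 1 bytes start as zero */
    if (s == NULL) {
        return NULL;
    }
    memcpy(s, chars, n);            /* byte n stays '\0' */
    return s;
}
```

Passing the question's unterminated src2 with a length of 5 yields a proper 6-byte "hello" string, with no reliance on whatever happened to be in memory.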

Bunny answered 21/8, 2011 at 17:33 Comment(0)
4

As @Als said, this is undefined behaviour. This may crash, but it doesn't have to.

Many memory managers allocate in larger chunks of memory and then hand it to the "user" in smaller chunks, probably a multiple of 4 or 8 bytes. So your write over the boundary probably simply writes into the extra bytes allocated. Or it overwrites one of the other variables you have.
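
To make the rounding idea concrete, here is the arithmetic as a sketch. This is not how any particular malloc is implemented; round_up and the granule sizes are assumptions for illustration only, and real allocators also keep their own bookkeeping data.

```c
#include <stddef.h>

/* Hypothetical: rounds request up to the next multiple of granule.
   Assumes granule > 0 and that request + granule - 1 does not overflow. */
size_t round_up(size_t request, size_t granule) {
    return (request + granule - 1) / granule * granule;
}
```

Under the assumption of a 16-byte granule, a request for 5 bytes would occupy 16, and the 10 bytes written by the strcpy in the question would land entirely inside that slack. That would explain the clean output, but it is still undefined behaviour and nothing you can rely on.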

Generation answered 21/8, 2011 at 17:34 Comment(2)
Or it may overwrite metadata created by the compiler, like the return address of the current function (that's likely to cause a visible crash, though). – Optative
@Keith: malloc-ed memory? I doubt it will overwrite the return address. But the heap is not so densely packed as one may think. Many memory managers have pools for certain small sizes, and only specially allocate large sizes. This means that there are many gaps in such a heap, and overwriting those simply doesn't cause anything. That it is undefined behaviour is clear, and that not every heap is managed like that is also clear, but it explains why often nothing happens when you write or read past such boundaries. – Generation
1

You're not malloc-ing enough bytes there. The first string, "123456789", is 10 bytes (the null terminator is included), and {'h','e','l','l','o'} needs 6 bytes once you make room for the null terminator. You're currently clobbering memory with that code, which leads to undefined (i.e. odd) behavior.
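
The byte counts can be checked with sizeof. A quick sketch; the function names are made up for the example.

```c
#include <stddef.h>

/* Size of the string literal, terminator included: 9 digits + '\0'. */
size_t literal_bytes(void) {
    return sizeof "123456789";
}

/* An array initialized char by char gets no terminator added. */
size_t unterminated_bytes(void) {
    char a[5] = {'h','e','l','l','o'};
    return sizeof a;
}

/* The same text as a string literal initializer includes the '\0'. */
size_t terminated_bytes(void) {
    char a[] = "hello";
    return sizeof a;
}
```

These return 10, 5 and 6 respectively, which is where the 10 and 6 above come from.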

Implantation answered 21/8, 2011 at 17:34 Comment(3)
Well, it was intentional, after all. <g> – Generation
Indeed, which is why I was surprised the behavior was not odd! (i.e., it worked just fine.) Hence the question. – Potsdam
@Gabe: see my answer. The heap is not densely packed with useful data, it contains gaps, and you probably just wrote into such a gap. Then nothing happens. It remains undefined behaviour, of course. In other situations, it may overwrite memory, format your hard disk, cause world war III, cure cancer, or some such. <g> – Generation
