Why are strlcpy and strlcat considered insecure?
Asked Answered
S

8

87

I understand that strlcpy and strlcat were designed as secure replacements for strncpy and strncat. However, some people are still of the opinion that they are insecure, and simply cause a different type of problem.

Can someone give an example of how using strlcpy or strlcat (i.e. a function that always null terminates its strings) can lead to security problems?

Ulrich Drepper and James Antill state this is true, but never provide examples or clarify this point.

Sweetener answered 22/1, 2010 at 4:13 Comment(0)
E
117

Firstly, strlcpy has never been intended as a secure version of strncpy (and strncpy has never been intended as a secure version of strcpy). These two functions are totally unrelated. strncpy is a function that has no relation to C-strings (i.e. null-terminated strings) at all. The fact that it has the str... prefix in its name is just a historical blunder. The history and purpose of strncpy is well-known and well-documented. This is a function created for working with so called "fixed width" strings (not with C-strings) used in some historical versions of Unix file system. Some programmers today get confused by its name and assume that strncpy is somehow supposed to serve as limited-length C-string copying function (a "secure" sibling of strcpy), which in reality is complete nonsense and leads to bad programming practice. C standard library in its current form has no function for limited-length C-string copying whatsoever. This is where strlcpy fits in. strlcpy is indeed a true limited-length copying function created for working with C-strings. strlcpy correctly does everything a limited-length copying function should do. The only criticism one can aim at it is that it is, regretfully, not standard.

Secondly, strncat on the other hand, is indeed a function that works with C-strings and performs a limited-length concatenation (it is indeed a "secure" sibling of strcat). In order to use this function properly the programmer has to take some special care, since the size parameter this function accepts is not really the size of the buffer that receives the result, but rather the size of its remaining part (also, the terminator character is counted implicitly). This could be confusing, since in order to tie that size to the size of the buffer, programmer has to remember to perform some additional calculations, which is often used to criticize the strncat. strlcat takes care of these issues, changing the interface so that no extra calculations are necessary (at least in the calling code). Again, the only basis I see one can criticise this on is that the function is not standard. Also, functions from strcat group is something you won't see in professional code very often due to the limited usability of the very idea of rescan-based string concatenation.

As for how these functions can lead to security problems... They simply can't. They can't lead to security problems in any greater degree than the C language itself can "lead to security problems". You see, for quite a while there was a strong sentiment out there that C++ language has to move in the direction of developing into some weird flavor of Java. This sentiment sometimes spills into the domain of C language as well, resulting in rather clueless and forced criticism of C language features and the features of C standard library. I suspect that we might be dealing with something like that in this case as well, although I surely hope things are not really that bad.

Expecting answered 22/1, 2010 at 4:43 Comment(10)
I do not completely agree. It would be nice if strlcpy and strlcat would report some sort of error condition if they bumped against the destination buffer size limit. Though you can check the returned length to test this, it's not obvious. But I think that's a minor criticism. The 'they encourage the use of C strings, and so they are bad' argument is silly.Nichollenicholls
"how these functions can lead to security problems" - fwiw I think the issue here is that some C functions are harder to use correctly than others. Some people have a mistaken belief that there is a special threshold of difficulty, below which a function is "secure" and above which it is "insecure". Such people are also usually of the belief that strcpy is above the threshold and hence "insecure", and their preferred string-copying function (whether it is strlcpy, strcpy_s or even strncpy) is below the threshold and hence "secure".Godliman
Of course there are are few functions (like gets) that are actually impossible to use correctly, and hence can reasonably be called "insecure" without further qualification of what's insecure about them :-)Godliman
There are plenty of reasons for disliking strlcpy/strlcat, but you don't state any of them. The discussion of C++ and Java is irrelevant. This answer just isn't helpful to the subject matter the question actually asked about.Foretime
@John Ripley: Firstly, I'm not "stating any of them" simply because I'm not aware of any reasons for disliking strlcpy/strlcat. One might "dislike" the general concept of zero-terminated string, but that's not what the question is about. If you know "plenty of reasons to dislike strlcpy/strlcat", you should probably write your own answer instead of expecting me to be able to read someone else's mind.Expecting
@John Ripley: Secondly, the question was specifically referring to some alleged "security problems" with strlcpy/strlcat. While I believe I understand what this is about, I personally refuse to recognize that as "security problems" within the realm of traditional C language, as I know it. That I stated in my answer.Expecting
"C standard library in its current form has no function for limited-length C-string copying whatsoever." Just sayin', snprintf...Flair
Most of this answer is useful, but the screed at the end about security problems is at best one person's (fairly extreme) opinion and at worst nonsense. You can get pedantic with your definition of what "security problem" means, but it is generally recognized that buffer overruns are a major source of software vulnerabilities, and C string functions without length checks are a major source of buffer overruns. I don't know why anyone would want to whitewash that, but it's not helpful.Overseer
It seems Linus Torvalds has a rather strong opinion about strlcpy. @AnT: Do you think this is relevant?Inviolate
@Flair 's comment about snprintf seems relevant. That was part of C99, available when this comment was posted. There was also memccpy as part of POSIX.1-2001, which has more recently become part of C23, section 7.26.2.2 (a standard made well after this answer was first posted), which can do this if the third parameter is passed as (int)'\0'Mikaelamikal
F
32

Ulrich's criticism is based on the idea that a string truncation that is not detected by the program can lead to security issues, through incorrect logic. Therefore, to be secure, you need to check for truncation. To do this for a string concatenation means that you are doing a check along the lines of this:

if (destlen + sourcelen > dest_maxlen)
{
    /* Bug out */
}

Now, strlcat does effectively do this check, if the programmer remembers to check the result - so you can use it safely:

if (strlcat(dest, source, dest_bufferlen) >= dest_bufferlen)
{
    /* Bug out */
}

Ulrich's point is that since you have to have destlen and sourcelen around (or recalculate them, which is what strlcat effectively does), you might as well just use the more efficient memcpy anyway:

if (destlen + sourcelen > dest_maxlen)
{
    goto error_out;
}
memcpy(dest + destlen, source, sourcelen + 1);
destlen += sourcelen;

(In the above code, dest_maxlen is the maximum length of the string that can be stored in dest - one less than the size of the dest buffer. dest_bufferlen is the full size of the dest buffer).

Formally answered 22/1, 2010 at 4:54 Comment(6)
The readability of Drepper's code is bad. With strlcpy (or any str function) I know directly that I'm copying a 0 terminated C string. With memcpy it can be any type of memory and I have a supplemental dimension to check when trying to understand the code. I had a legacy app to debug where everything was done with memcpy, it was a real PITA to correct. After porting to dedicated String function it is much easier to read (and faster because a lot of unnecessary strlen could be removed).Cere
Why wouldn't you just use strcpy instead of memcpy in the last example?Rayborn
@domen: Because the size to copy is already known, so memcpy() is sufficient (and is potentially more efficient than strcpy()).Formally
Well, it's confusing to have it in string operations. And as far as I know efficiency depends on implementation and is not standardized.Rayborn
@domen: memcpy() is a string operation - it's declared in <string.h>, after all.Formally
@Rayborn I agree that there is potential for confusion, but the reality is that working with C strings is pretty much working with raw memory anyway. Arguably, we would all be better off if people just stopped thinking of C as having "strings" (as distinct from any other contiguous memory chunks) at all.Allfired
J
22

When people say, "strcpy() is dangerous, use strncpy() instead" (or similar statements about strcat() etc., but I am going to use strcpy() here as my focus), they mean that there is no bounds checking in strcpy(). Thus, an overly long string will result in buffer overruns. They are correct. Using strncpy() in this case will prevent buffer overruns.

I feel that strncpy() really doesn't fix bugs: it solves a problem that can be easily avoided by a good programmer.

As a C programmer, you must know the destination size before you are trying to copy strings. That is the assumption in strncpy() and strlcpy()'s last parameters too: you supply that size to them. You can also know the source size before you copy strings. Then, if the destination is not big enough, don't call strcpy(). Either reallocate the buffer, or do something else.

Why do I not like strncpy()?

  • strncpy() is a bad solution in most cases: your string is going to be truncated without any notice—I would rather write extra code to figure this out myself and then take the course of action that I want to take, rather than let some function decide for me about what to do.
  • strncpy() is very inefficient. It writes to every byte in the destination buffer. You don't need those thousands of '\0' at the end of your destination.
  • It doesn't write a terminating '\0' if the destination is not big enough. So, you must do so yourself anyway. The complexity of doing this is not worth the trouble.

Now, we come to strlcpy(). The changes from strncpy() make it better, but I am not sure if the specific behavior of strl* warrants their existence: they are far too specific. You still have to know the destination size. It is more efficient than strncpy() because it doesn't necessarily write to every byte in the destination. But it solves a problem that can be solved by doing: *((char *)mempcpy(dst, src, n)) = 0;.

I don't think anyone says that strlcpy() or strlcat() can lead to security issues, what they (and I) are saying that they can result in bugs, for example, when you expect the complete string to be written instead of a part of it.

The main issue here is: how many bytes to copy? The programmer must know this and if he doesn't, strncpy() or strlcpy() won't save him.

strlcpy() and strlcat() are not standard, neither ISO C nor POSIX. So, their use in portable programs is impossible. In fact, strlcat() has two different variants: the Solaris implementation is different from the others for edge cases involving length 0. This makes it even less useful than otherwise.

Jeanmariejeanna answered 22/1, 2010 at 4:54 Comment(7)
strlcpy is faster than memcpy on many architectures, especially if the memcpy copies unnecessary trailing data. strlcpy also returns how much data you missed which might allow you to recover faster and with less code.Homograft
@jbcreix: my point is that there should be no data to miss, and n in my memcpy call is the exact number of bytes to be written, so the efficiency isn't that much of a problem either.Jeanmariejeanna
And how do you get that n? The only n you can know in advance is the buffer size. Of course if you suggest re implementing strlcpy each time you need it using memcpy and strlen that is okay too, but then why stopping at strlcpy, you don't need a memcpy function either, you can copy the bytes one by one. The reference implementation only loops through the data once in the normal case and that is better for most architectures. But even if the best implementation used strlen + memcpy, that is still no reason to not having to re-implement a secure strcpy again and again.Homograft
@jbcreix: I get the n by doing strlen() on the source string, and as I said, I make sure that the destination has enough space. Of course, if you're okay with the fact that your string may be truncated, you can use strlcpy(). Personally, I would rather call strlen() on the source and make sure the destination has enough space. So, I am not reimplementing strlcpy() at all. The trade off is between speed and copying all data. My point is that strcpy() is/can be safe if the programmer is careful. I am not saying that strlcpy() is completely useless. (continued...)Jeanmariejeanna
...but it's also not standard. And strncpy(), although standard, has deficiencies.Jeanmariejeanna
Alok, with strlcpy you sweep over the input string only once in the good case (which should be the majority) and twice in the bad case, with your strlen+memcpy you go over it twice always. If it makes a difference in practice is another story.Cere
"*((char *)mempcpy(dst, src, n)) = 0;" - Correct - Yes, Obviously Correct, No. Fails code review......Prearrange
R
12

I think Ulrich and others think it'll give a false sense of security. Accidentally truncating strings can have security implications for other parts of the code (for example, if a file system path is truncated, the program might not be performing operations on the intended file).

Recapture answered 22/1, 2010 at 5:50 Comment(3)
For example, an email client might truncate an email attachment's filename from malware.exe.jpg to malware.exe.Zack
@ChrisPeterson Which is why a good developer always checks the return values, to, in the case of strl* functions, know if the data was truncated and act accordingly.Footprint
"Ulrich and others think it'll give a false sense of security...." - Lol... meanwhile, Ulrich and friends make regular appearances on BugTraq and Full Disclosure for their one-off's. They should use the safer functions and avoid most of their problems. Then they can start telling others how to write more secure code...Hermy
H
7

There are two "problems" related to using strl functions:

  1. You have to check return values to avoid truncation.

The c1x standard draft writers and Drepper, argue that programmers won't check the return value. Drepper says we should somehow know the length and use memcpy and avoid string functions altogether, The standards committee argues that the secure strcpy should return nonzero on truncation unless otherwise stated by the _TRUNCATE flag. The idea is that people are more likely to use if(strncpy_s(...)).

  1. Cannot be used on non-strings.

Some people think that string functions should never crash even when fed bogus data. This affects standard functions such as strlen which in normal conditions will segfault. The new standard will include many such functions. The checks of course have a performance penalty.

The upside over the proposed standard functions is that you can know how much data you missed with strl functions.

Homograft answered 22/1, 2010 at 4:55 Comment(1)
note that strncpy_s is not a secure version of strncpy but basically a strlcpy replacement.Homograft
F
6

I don't think strlcpy and strlcat are consider insecure or it least it isn't the reason why they're not included in glibc - after all, glibc includes strncpy and even strcpy.

The criticism they got was that they are allegedly inefficient, not insecure.

According to the Secure Portability paper by Damien Miller:

The strlcpy and strlcat API properly check the target buffer’s bounds, nul-terminate in all cases and return the length of the source string, allowing detection of truncation. This API has been adopted by most modern operating systems and many standalone software packages, including OpenBSD (where it originated), Sun Solaris, FreeBSD, NetBSD, the Linux kernel, rsync and the GNOME project. The notable exception is the GNU standard C library, glibc [12], whose maintainer steadfastly refuses to include these improved APIs, labelling them “horribly inefficient BSD crap” [4], despite prior evidence that they are faster is most cases than the APIs they replace [13]. As a result, over 100 of the software packages present in the OpenBSD ports tree maintain their own strlcpy and/or strlcat replacements or equivalent APIs - not an ideal state of affairs.

That is why they are not available in glibc, but it is not true that they are not available on Linux. They are available on Linux in libbsd:

They're packaged in Debian and Ubuntu and other distros. You can also just grab a copy and use in your project - it's short and under a permissive license:

Fanchan answered 26/6, 2016 at 5:27 Comment(0)
C
3

Security is not a boolean. C functions are not wholly "secure" or "insecure", "safe" or "unsafe". When used incorrectly, a simple assignment operation in C can be "insecure". strlcpy() and strlcat() may be used safely (securely) just as strcpy() and strcat() can be used safely when the programmer provides the necessary assurances of correct usage.

The main point with all of these C string functions, standard and not-so-standard, is the level to which they make safe/secure usage easy. strcpy() and strcat() are not trivial to use safely; this is proven by the number of times that C programmers have gotten it wrong over the years and nasty vulnerabilities and exploits have ensued. strlcpy() and strlcat() and for that matter, strncpy() and strncat(), strncpy_s() and strncat_s(), are a bit easier to use safely, but still, non-trivial. Are they unsafe/insecure? No more than memcpy() is, when used incorrectly.

Cabalism answered 21/3, 2019 at 0:53 Comment(0)
F
0

strlcpy may trigger SIGSEGV, if src is not NUL-terminated.

/* Not enough room in dst, add NUL and traverse rest of src */
if (n == 0) {
    if (siz != 0)
        *d = '\0';      /* NUL-terminate dst */
    while (*s++)
        ;
}

return(s - src - 1);    /* count does not include NUL */
Folberth answered 3/6, 2021 at 22:52 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.