Why is strdup considered to be evil
34

I've seen some posters stating that strdup is evil. Is there a consensus on this? I've used it without any guilty feelings and can see no reason why it is worse than using malloc/memcpy.

The only thing I can think of that might earn strdup a bad reputation is that callers might misuse it (e.g., not realise they have to free the memory returned, or try to strcat to the end of a strdup'ed string). But then malloc'ed strings are not free from the possibility of misuse either.


Thanks for the replies and apologies to those who consider the question unhelpful (votes to close). In summary of the replies, it seems that there is no general feeling that strdup is evil per se, but a general consensus that it can, like many other parts of C, be used improperly or unsafely.

There is no 'correct' answer really, but for the sake of accepting one, I accepted @nneoneo's answer - it could equally have been @R..'s answer.

Thromboplastin answered 20/10, 2012 at 3:18 Comment(3)
This question wasn't by any chance prompted by my comment earlier was it?Copeland
In reference to the comments on Can a loop cause issues with assignment in C.Nitrobacteria
@SethCarnegie yes, but I had seen the same sentiment expressed elsewhere, which is why I created a question, rather than just asking you.Thromboplastin
39

Two reasons I can think of:

  1. It's not strictly ANSI C, but rather POSIX. Consequently, some compilers (e.g. MSVC) discourage use (MSVC prefers _strdup), and technically the C standard could define its own strdup with different semantics since str is a reserved prefix. So, there are some potential portability concerns with its use.
  2. It hides its memory allocation. Most other str functions don't allocate memory, so users might be misled (as you say) into believing the returned string doesn't need to be freed.

But, aside from these points, I think that careful use of strdup is justified, as it can reduce code duplication and provides a nice implementation for common idioms (such as strdup("constant string") to get a mutable, returnable copy of a literal string).
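
To make the "mutable, returnable copy of a literal" idiom concrete, here is a minimal sketch. The helper name make_greeting is made up for illustration, and the feature-test macro is an assumption about building on a POSIX system with a strict -std= setting:

```c
#define _POSIX_C_SOURCE 200809L  /* assumption: needed to expose strdup() under strict -std= modes */
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated, writable greeting.
 * The caller owns the result and must pass it to free(). */
char *make_greeting(void)
{
    char *s = strdup("hello, world");  /* allocates strlen + 1 bytes */
    if (s == NULL)
        return NULL;                   /* allocation failed */
    s[0] = 'H';                        /* fine: the copy is writable, the literal is not */
    return s;
}
```

The hidden allocation from point 2 is visible here only in the comment: nothing in the signature of make_greeting (or of strdup) tells the caller that a free() is owed.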

Overstreet answered 20/10, 2012 at 3:25 Comment(1)
strdup() is making it to the C2x Standard (Draft PDF)Pacer
23

My answer is rather in support of strdup: it is no worse than any other function in C.

  1. POSIX is a standard and strdup is not too difficult to implement if portability becomes an issue.

  2. Whether to free the memory allocated by strdup shouldn't be an issue if one takes a little time to read the man page and understand how strdup works. If one doesn't understand how a function works, it's very likely one is going to mess something up. This applies to any function, not just strdup.

  3. In C, memory and most other things are managed by the programmer, so misusing strdup is no worse than forgetting to free malloc'ed memory, failing to null-terminate a string, using an incorrect format string in scanf (and invoking undefined behaviour), accessing a dangling pointer, etc.

(I really wanted to post this as a comment, but couldn't fit it in a single comment. Hence, I posted it as an answer.)

Jala answered 20/10, 2012 at 23:24 Comment(0)
11

I haven't really heard strdup described as evil, but some possible reasons some people dislike it:

  1. It's not standard C (but is in POSIX). However I find this reason silly because it's nearly a one-line function to add on systems that lack it.
  2. Blindly duplicating strings all over the place rather than using them in-place when possible wastes time and memory and introduces failure cases into code that might otherwise be failure-free.
  3. When you do need a copy of a string, it's likely you actually need more space to modify or build on it, and strdup does not give you that.
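
As a rough sketch of point 3, a copy that leaves room to build on could look like the following. The name dup_with_room and its extra parameter are hypothetical, not part of any standard library:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies s into a new buffer that has `extra`
 * spare bytes beyond what the string and its terminator need.
 * Returns NULL on overflow or allocation failure; caller frees. */
char *dup_with_room(const char *s, size_t extra)
{
    size_t len = strlen(s);
    if (extra > SIZE_MAX - len - 1)
        return NULL;                 /* len + extra + 1 would overflow */
    char *p = malloc(len + extra + 1);
    if (p == NULL)
        return NULL;
    memcpy(p, s, len + 1);           /* copies the terminator too */
    return p;
}
```

For example, dup_with_room("foo", 4) returns a buffer into which ".txt" can be appended with strcat, which a plain strdup result has no room for.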
Coercion answered 20/10, 2012 at 3:26 Comment(0)
6

I think the majority of the concern about strdup comes from security concerns regarding buffer overruns and improperly formatted strings. If a non-null-terminated string is passed to strdup, it will read past the end of the buffer and can allocate a string of undefined length. I don't know if this can be specifically leveraged into an attack, but in general it is good secure coding practice to only use string functions which take a maximum length instead of relying on the null character alone.
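
A length-bounded duplicate along those lines might be sketched as follows. The name dup_bounded is hypothetical; POSIX.1-2008 offers strndup() for a similar purpose where it is available:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicates at most `max` bytes of buf and always
 * null-terminates the result, so buf itself need not be terminated.
 * Assumes max < SIZE_MAX. Caller frees the result. */
char *dup_bounded(const char *buf, size_t max)
{
    const char *end = memchr(buf, '\0', max);       /* never reads past max bytes */
    size_t len = end ? (size_t)(end - buf) : max;
    char *p = malloc(len + 1);
    if (p == NULL)
        return NULL;
    memcpy(p, buf, len);
    p[len] = '\0';
    return p;
}
```

The caller still has to know a trustworthy upper bound for max, so this moves the problem rather than removing it.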

Frame answered 20/10, 2012 at 5:5 Comment(6)
If a non-null terminated string is passed to any function that expects a null terminated string, the programmer has goofed badly.Pennon
Yet it does happen, this is the reason to use strncpy instead of strcpy, when security is a serious concern. This can also happen with unexpected user input, or corrupted files. In general it is best security practice to rely on an explicit length rather than the null termination of a string.Frame
For most practical purposes, I don't use strncpy(). It doesn't guarantee null termination. If you copy a 5-byte word into a 20 KiB buffer, it also writes 20475 nulls. Neither behaviour is acceptable to me. Normally, I make sure I know how long the string is and use memmove() or (occasionally) memcpy(); I do have relapses and use strcpy(), but only if I know there's enough space. (If it is any consolation, strncat() is worse than strncpy(); I never use it!) If I don't know the maximum length of the string, I can't manipulate it safely. I can't tell where it can be truncated even.Pennon
A char array without a null termination is not a "string" by definition. You won't expect fopen() to work when you give it an http URL instead of a filepath. Any programmer who gives a normal char array to a function that expects a string should either RTFM or not be allowed to stay within 100 metres of any production code. They will most likely also forget to check the return of malloc() for NULL.Snapper
If you're concerned about the security of your string handling (which you should always be), then it is a better idea not to throw raw str* calls all over your code. Write a string handling library that takes care of all of the typical problems and use it exclusively. Of course, if you're more concerned about silly things like "performance considerations" when using strlen() instead of my_strlen(), then you'll get what you deserve.Snapper
I'm not sure that this answer makes any sense. strdup guarantees to allocate enough space to include the terminator. The alternative is the programmer writing news = malloc(strlen(olds) + 1); strcpy(news,olds) and possibly forgetting the +1 (and using strncpy doesn't necessarily help if he's forgotten to think about the need for the null).Nitrobacteria
3

Many people obviously don't, but I personally find strdup evil for several reasons,

  • the main one being it hides the allocation. The other str* functions and most other standard functions require no free afterwards, so strdup looks innocuous enough and you can forget to clean up after it. dmckee suggested to just add it to your mental list of functions that need cleaning up after, but why? I don't see a big advantage over reducing two medium-length lines to one short one.

  • It allocates memory on the heap always, and with C99's (is it 99?) VLAs, you have yet another reason to just use strcpy (you don't even need malloc). You can't always do this, but when you can, you should.

  • It's not part of the ISO standard (but it is part of the POSIX standard, thanks Wiz), but that's really a small point as R.. mentioned that it can be added easily. If you write portable programs, I'm not sure how you'd tell if it was already defined or not though...
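
For the VLA point, a minimal sketch of a stack-allocated scratch copy could look like this. The function count_words is made up for illustration. VLAs arrived in C99 and became optional in C11, and a long input can overflow the stack with no diagnostic, so this only suits short strings of known, bounded length:

```c
#include <string.h>

/* Hypothetical example: counts space-separated words in s.
 * strtok() modifies its argument, so it works on a scratch copy held
 * in a VLA that disappears on return, with no free() needed. */
size_t count_words(const char *s)
{
    char scratch[strlen(s) + 1];   /* VLA sized to fit s plus the terminator */
    strcpy(scratch, s);

    size_t n = 0;
    for (char *tok = strtok(scratch, " "); tok != NULL; tok = strtok(NULL, " "))
        n++;
    return n;
}
```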

These are of course a few of my own reasons, no one else's. To answer your question, there is no consensus that I'm aware of.

If you're writing programs just for yourself and you find strdup no problem, then there's much less reason not to use it than if you are writing a program to be read by many people of many skill levels and ages.

Copeland answered 20/10, 2012 at 4:13 Comment(8)
Your first point discredits pretty much the entire C language? If you don't free strdup() you don't free your own things. Why would it differ? As for VLAs, using them, especially on arbitrary-size strings, is asking for trouble and undefined behavior with no warnings. As for the last bullet, that it's not standard: yes, it is. It's part of the POSIX standard. It's just not part of the ISO C standard – which is portable enough for most people.Nephograph
@Nephograph your own things are eye-catching, strdup blends in. That was the point. Thanks for the point about the standardness.Copeland
I'm sorely tempted to downvote since I disagree with most of what you say. I won't, but I'll go on record as saying that I think your objections are not really relevant. Given the number of times people make mistakes simulating strdup() — usually by forgetting to allocate enough space for the terminating null — having a library function is far more sensible than making everyone reinvent the (7-line) function themselves.Pennon
7-line? I always thought of it as one or two... char *new = malloc(strlen(old)+1); return new ? strcpy(new, old) : 0;Coercion
@R.. Better with memcpy, no? ie. size_t len = strlen(old) + 1; char *new = malloc(len); return new ? memcpy(new, old, len) : 0;Thromboplastin
Written out without conditionals, it's 7 lines: function prototype (in header), function definition, malloc, if, memcpy, return, close brace.Overstreet
@WilliamMorris memcpy doesn't zero-terminate the string and memory from malloc can contain nonzero bytes...Schlieren
@Schlieren yes that is the reason for the + 1 in size_t len = strlen(old) + 1;Thromboplastin
2

Why is strdup considered to be evil

  1. Conflicts with Future language directions.

  2. Reliance on errno state.

  3. It is easy to make your own strdup() that is not quite like the POSIX one nor the future C2x one.


With C2x on the way with certain inclusion of strdup(), using strdup() before that has these problems.

  • The C2x proposed strdup() does not mention errno whereas POSIX does. Code that relies on setting errno to ENOMEM or EINVAL can have trouble in the future.

  • The C2x proposed char *strdup(const char *s1) uses a const char * as the parameter. User-coded versions of strdup() too often use char *s1, incurring a difference that can break select code that counts on the char * signature, i.e. function pointers.

  • User code that rolled its own strdup() was not following C's Future language directions, which state that "Function names that begin with str, mem, or wcs and a lowercase letter may be added to the declarations in the <string.h> header", and so may incur library conflicts between the new strdup() and the user's strdup().

If user code wants strdup() code before C2x, consider naming it something different like my_strdup() and use a const char * parameter. Minimize or avoid any reliance on the state of errno after the call returns NULL.
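
A minimal sketch of that advice, assuming nothing beyond the standard library (this is an illustration, not the my_strdup() effort mentioned in this answer):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup(): takes const char * like the POSIX
 * and C2x versions, uses a name outside the reserved str* namespace, and
 * reports failure only through a NULL return. It makes no promise about
 * errno. Caller frees the result. */
char *my_strdup(const char *s)
{
    size_t size = strlen(s) + 1;     /* +1 for the terminating null */
    char *copy = malloc(size);
    if (copy == NULL)
        return NULL;
    return memcpy(copy, s, size);    /* memcpy returns its destination */
}
```

Callers should test the result against NULL and not inspect errno afterwards, which keeps them compatible with whichever strdup() eventually replaces it.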

My my_strdup() effort - warts and all.

Telegraphic answered 28/12, 2022 at 21:21 Comment(0)
1

My reason for disliking strdup, which hasn't been mentioned, is that it is resource allocation without a natural pair. Let's try a silly game: I say malloc, you say free. I say open, you say close. I say create, you say destroy. I say strdup, you say ....?

Actually, the answer to strdup is free, of course, and the function would have been better named malloc_and_strcpy to make that clear. But many C programmers don't think of it that way and forget that strdup requires its opposite or "ending" free to deallocate.

In my experience, it is very common to find memory leaks in code which calls strdup. It's an odd function which combines strlen, malloc and strcpy.

Lophophore answered 30/5, 2016 at 3:55 Comment(0)
