strcpy() return value

6

40

A lot of the functions from the standard C library, especially the ones for string manipulation, and most notably strcpy(), share the following prototype:

char *the_function (char *destination, ...)

The return value of these functions is in fact the same as the provided destination. Why would you waste the return value for something redundant? It makes more sense for such a function to be void or return something useful.

My only guess as to why is that it makes it easier and more convenient to nest the function call in another expression, for example:

printf("%s\n", strcpy(dst, src));

Are there any other sensible reasons to justify this idiom?

Hehre answered 24/8, 2010 at 22:7 Comment(4)
Your guess is correct, but of course we all wish these functions returned a pointer to the terminating null byte (which would reduce a lot of O(n) operations to O(1)).Presently
A very correct observation. So many people just don't realize the cost of a strlen().Hehre
POSIX provides stpcpy(3). It is the same as strcpy(3), but returns a pointer to the NUL terminating byte.Member
Make sure to #include <string.h>, else you might run into reading a bad address like I did.Seymour
29

As Evan pointed out, it is possible to do something like

char* s = strcpy(malloc(10), "test");

i.e. assign a value to malloc()ed memory without using a helper variable.

(This example isn't the best one: it will crash on out-of-memory conditions, but the idea is clear.)
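For comparison, here is a minimal sketch of the same idea with the allocation checked. The helper name dup_string is hypothetical (it is not a standard function); it still relies on a copy function returning its destination, in this case memcpy.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of the standard library): copy src into
 * freshly allocated memory, or return NULL if malloc fails. */
char *dup_string(const char *src)
{
    size_t size = strlen(src) + 1;      /* include the terminating '\0' */
    char *copy = malloc(size);
    if (copy == NULL)
        return NULL;                    /* caller must handle out-of-memory */
    return memcpy(copy, src, size);     /* memcpy returns its destination */
}
```

A caller would write char *s = dup_string("test");, check s against NULL, and free(s) when done.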

Horner answered 24/8, 2010 at 22:16 Comment(3)
char *s = strcpy(xmalloc(10, my_jmpbuf), "test"); with an xmalloc that performs longjmp on failure would make this idiom sane.Presently
Thank you Yossarian, this way it makes a lot of sense. In general, if the destination argument is an expression, then the return value could be useful as it would be the evaluated result of that expression.Hehre
Possible, yes, very silly, certainly. The desire to avoid a helper variable is far outweighed by the fact that your program will bomb badly. You'd be better off using (or even writing if you don't have one) strdup: https://mcmap.net/q/98227/-strdup-what-does-it-do-in-c/….Schottische
21

char *stpcpy(char *dest, const char *src); returns a pointer to the end of the string, and is part of POSIX.1-2008. Before that, it was a GNU libc extension since 1992. It first appeared in Lattice C AmigaDOS in 1986.

gcc -O3 will in some cases optimize strcpy + strcat to use stpcpy or strlen + inline copying, see below.
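To illustrate why a pointer to the end is convenient, here is a small sketch that chains copies without rescanning. copy_to_end is a hypothetical stand-in written with strlen + memcpy so that it compiles without POSIX; on a POSIX.1-2008 system stpcpy itself should be usable in its place. join3 is likewise just an illustrative name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in with the same contract as POSIX stpcpy():
 * copy src (including its '\0') to dst and return a pointer to the
 * terminating '\0' written in dst. */
char *copy_to_end(char *dst, const char *src)
{
    size_t len = strlen(src);
    memcpy(dst, src, len + 1);          /* copies the terminator too */
    return dst + len;
}

/* Build "<a><sep><b>" in out, which must have room for
 * strlen(a) + strlen(sep) + strlen(b) + 1 bytes.
 * Returns the length of the resulting string. */
size_t join3(char *out, const char *a, const char *sep, const char *b)
{
    char *end = copy_to_end(out, a);    /* each call starts where the last ended */
    end = copy_to_end(end, sep);
    end = copy_to_end(end, b);
    return (size_t)(end - out);
}
```

Each call resumes at the previous terminator, so no part of the output is scanned again, and the final length falls out as a pointer difference.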


C's standard library was designed very early, and it's very easy to argue that the str* functions are not optimally designed. The I/O functions were definitely designed very early, in 1972 before C even had a preprocessor, which is why fopen(3) takes a mode string instead of a flag bitmap like Unix open(2).

I haven't been able to find a list of functions included in Mike Lesk's "portable I/O package", so I don't know whether strcpy in its current form dates all the way back to there or if those functions were added later. (The only real source I've found is Dennis Ritchie's widely-known C History article, which is excellent but not that in depth. I didn't find any documentation or source code for the actual I/O package itself.)

They do appear in their current form in K&R first edition, 1978.


Functions should return the result of the computation they do, if it's potentially useful to the caller, instead of throwing it away: either a pointer to the end of the string, or an integer length. (A pointer would be natural.)

As @R says:

We all wish these functions returned a pointer to the terminating null byte (which would reduce a lot of O(n) operations to O(1))

e.g. calling strcat(bigstr, newstr[i]) in a loop to build up a long string from many short (O(1) length) strings has approximately O(n^2) complexity, but strlen/memcpy will only look at each character twice (once in strlen, once in memcpy).

Using only the ANSI C standard library, there's no way to efficiently only look at every character once. You could manually write a byte-at-a-time loop, but for strings longer than a few bytes, that's worse than looking at each character twice with current compilers (which won't auto-vectorize a search loop) on modern HW, given efficient libc-provided SIMD strlen and memcpy. You could use length = sprintf(bigstr, "%s", newstr[i]); bigstr+=length;, but sprintf() has to parse its format string and is not fast.
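The two approaches can be sketched side by side. The function names and the parts/count parameters are assumptions made for illustration; both functions expect buf to be large enough for the whole result.

```c
#include <stddef.h>
#include <string.h>

/* Repeated strcat: every call rescans buf from the start to find its end,
 * so the total work grows roughly quadratically with the output length. */
void concat_rescan(char *buf, const char *const *parts, size_t count)
{
    buf[0] = '\0';
    for (size_t i = 0; i < count; i++)
        strcat(buf, parts[i]);
}

/* strlen + memcpy: keep a pointer to the current end, so each input
 * character is looked at twice and the output is never rescanned.
 * Returns the length of the resulting string. */
size_t concat_tracked(char *buf, const char *const *parts, size_t count)
{
    char *end = buf;
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(parts[i]);  /* first pass over this piece */
        memcpy(end, parts[i], len);     /* second pass over this piece */
        end += len;
    }
    *end = '\0';
    return (size_t)(end - buf);
}
```

Both produce the same string; only the second remembers where the end is between iterations.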

There isn't even a version of strcmp or memcmp that returns the position of the difference. If that's what you want, you have the same problem as Why is string comparison so fast in python?: an optimized library function that runs faster than anything you can do with a compiled loop (unless you have hand-optimized asm for every target platform you care about), which you can use to get close to the differing byte before falling back to a regular loop once you get close.
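A rough sketch of that workaround: let the library memcmp skip over equal chunks, then use a plain loop only inside the chunk where the difference lies. The name first_difference and the chunk size of 64 are arbitrary assumptions, not anything from a real library.

```c
#include <stddef.h>
#include <string.h>

/* Returns the index of the first byte where a and b differ,
 * or n if their first n bytes are equal. */
size_t first_difference(const void *a, const void *b, size_t n)
{
    enum { CHUNK = 64 };                /* arbitrary tuning assumption */
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    size_t i = 0;

    /* Let the optimized library memcmp skip whole equal chunks. */
    while (n - i >= CHUNK && memcmp(pa + i, pb + i, CHUNK) == 0)
        i += CHUNK;

    /* Regular loop over the chunk that differs (or over the short tail). */
    while (i < n && pa[i] == pb[i])
        i++;
    return i;
}
```

The differing chunk gets read twice, once by memcmp and once by the byte loop, which is the same kind of double work the library's return values force elsewhere.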

It seems that C's string library was designed without regard to the O(n) cost of any operation, not just finding the end of implicit-length strings, and strcpy's behaviour is definitely not the only example.

They basically treat implicit-length strings as whole opaque objects, always returning pointers to the start, never to the end or to a position inside one after searching or appending.


History guesswork

In early C on a PDP-11, I suspect that strcpy was no more efficient than while(*dst++ = *src++) {} (and was probably implemented that way).

In fact, K&R first edition (page 101) shows that implementation of strcpy and says:

Although this may seem cryptic at first sight, the notational convenience is considerable, and the idiom should be mastered, if for no other reason than that you will see it frequently in C programs.

This implies they fully expected programmers to write their own loops in cases where you wanted the final value of dst or src. And thus maybe they didn't see a need to redesign the standard library API until it was too late to expose more useful APIs for hand-optimized asm library functions.


But does returning the original value of dst make any sense?

strcpy(dst, src) returning dst is analogous to x=y evaluating to x. So it makes strcpy work like a string assignment operator.

As other answers point out, this allows nesting, like foo( strcpy(buf,input) );. Early computers were very memory-constrained. Keeping your source code compact was common practice. Punch cards and slow terminals were probably a factor in this. I don't know historical coding standards or style guides or what was considered too much to put on one line.

Crusty old compilers may also have been a factor. With modern optimizing compilers, char *tmp = foo(); / bar(tmp); is no slower than bar(foo());, but it is with gcc -O0. I don't know if very early compilers could optimize variables away completely (not reserving stack space for them), but hopefully they could at least keep them in registers in simple cases. That makes gcc -O0 a poor model for ancient compilers: it spills and reloads everything on purpose for consistent debugging.


Possible compiler-generated-asm motivation

Given the lack of care about efficiency in the general API design of the C string library, this might be unlikely. But perhaps there was a code-size benefit. (On early computers, code-size was more of a hard limit than CPU time).

I don't know much about the quality of early C compilers, but it's a safe bet that they were not awesome at optimizing, even for a nice simple / orthogonal architecture like PDP-11.

It's common to want the string pointer after the function call. At an asm level, you (the compiler) probably have it in a register before the call. Depending on the calling convention, you either push it on the stack or you copy it to the register where the calling convention says the first arg goes (i.e. where strcpy is expecting it). Or if you're planning ahead, you already had the pointer in the right register for the calling convention.

But function calls clobber some registers, including all the arg-passing registers. (So when a function gets an arg in a register, it can increment it there instead of copying to a scratch register.)

So as the caller, your code-gen options for keeping something across a function call include:

  • store/reload it to local stack memory. (Or just reload it if an up-to-date copy is still in memory).
  • save/restore a call-preserved register at the start/end of your whole function, and copy the pointer to one of those registers before the function call.
  • the function returns the value in a register for you. (Of course, this only works if the C source is written to use the return value instead of the input variable. e.g. dst = strcpy(dst, src); if you aren't nesting it).

All calling conventions on all architectures I'm aware of return pointer-sized return values in a register, so having maybe one extra instruction in the library function can save code-size in all callers that want to use that return value.

You probably got better asm from primitive early C compilers by using the return value of strcpy (already in a register) than by making the compiler save the pointer around the call in a call-preserved register or spill it to the stack. This may still be the case.

BTW, on many ISAs, the return-value register is not the first arg-passing register. And unless you use base+index addressing modes, it does cost an extra instruction (and tie up another reg) for strcpy to copy the register for a pointer-increment loop.

PDP-11 toolchains normally used some kind of stack-args calling convention, always pushing args on the stack. I'm not sure how many call-preserved vs. call-clobbered registers were normal, but only 5 or 6 GP regs were available (R7 being the program counter, R6 being the stack pointer, R5 often used as a frame pointer). So it's similar to but even more cramped than 32-bit x86.

#include <string.h>

char *bar(char *dst, const char *str1, const char *str2)
{
    //return strcat(strcat(strcpy(dst, str1), "separator"), str2);

    // more readable to modern eyes:
    dst = strcpy(dst, str1);
    dst = strcat(dst, "separator");
//    dst = strcat(dst, str2);
    
    return dst;  // simulates further use of dst
}

  # x86 32-bit gcc output, optimized for size (not speed)
  # gcc8.1 -Os  -fverbose-asm -m32
  # input args are on the stack, above the return address

    push    ebp     #
    mov     ebp, esp  #,      Create a stack frame.

    sub     esp, 16   #,      This looks like a missed optimization, wasted insn
    push    DWORD PTR [ebp+12]      # str1
    push    DWORD PTR [ebp+8]       # dst
    call    strcpy  #
    add     esp, 16   #,

    mov     DWORD PTR [ebp+12], OFFSET FLAT:.LC0      # store new args over our incoming args
    mov     DWORD PTR [ebp+8], eax    #  EAX = dst.
    leave   
    jmp     strcat                  # optimized tailcall of the last strcat

This is significantly more compact than a version which doesn't use dst =, and instead reuses the input arg for the strcat. (See both on the Godbolt compiler explorer.)

The -O3 output is very different: gcc for the version that doesn't use the return value uses stpcpy (returns a pointer to the tail) and then mov-immediate to store the literal string data directly to the right place.

But unfortunately, the dst = strcpy(dst, src) -O3 version still uses regular strcpy, then inlines strcat as strlen + mov-immediate.


To C-string or not to C-string

C implicit-length strings aren't always inherently bad, and have interesting advantages (e.g. a suffix is also a valid string, without having to copy it).

But the C string library is not designed in a way that makes efficient code possible, because char-at-a-time loops typically don't auto-vectorize and the library functions throw away results of work they have to do.

gcc and clang never auto-vectorize loops unless the iteration count is known before the first iteration, e.g. for(int i=0; i<n ;i++). ICC can vectorize search loops, but it's still unlikely to do as well as hand-written asm.


strncpy and so on are basically a disaster. e.g. strncpy doesn't copy the terminating '\0' if it reaches the buffer size limit, so you need to manually arr[n] = 0; before or after. But if the source string is shorter, it pads with 0 bytes out to the specified length, potentially touching a page of memory that never needed to be touched. (Also making it very inefficient for copying short strings into a large buffer that still has lots of space left.) It appears to have been designed for writing into the middle of larger strings, not for avoiding buffer overflows.
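A minimal sketch of the manual termination described above, wrapped in a hypothetical helper (copy_truncated is not a standard function). It assumes size is the full capacity of dst and is greater than zero.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: copy src into dst (capacity size, which must be
 * greater than 0), truncating if necessary, and always terminate. */
char *copy_truncated(char *dst, const char *src, size_t size)
{
    /* strncpy writes exactly size - 1 bytes: it zero-pads when src is
     * shorter, and writes no '\0' at all when src is that long or longer. */
    strncpy(dst, src, size - 1);
    dst[size - 1] = '\0';               /* the termination strncpy may skip */
    return dst;
}
```

Note that the zero-padding still happens: copying a 2-byte string into a 4096-byte buffer this way writes all 4096 bytes.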

A few functions like snprintf are usable and do always nul-terminate. Remembering which does which is hard, and a huge risk if you remember wrong, so you have to check every time in cases where it matters for correctness.
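For example, snprintf can serve as a bounded copy that also reports truncation through its return value. This is only a sketch; copy_checked is a hypothetical helper name, and as noted earlier the format-string parsing makes it slower than a plain copy.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copy src into dst (capacity size) with snprintf.
 * Returns 1 if src fit completely, 0 if it was truncated or snprintf
 * reported an error. When size > 0 and no error occurs, dst is always
 * nul-terminated, even on truncation. */
int copy_checked(char *dst, size_t size, const char *src)
{
    /* snprintf returns the length the full output would have had. */
    int needed = snprintf(dst, size, "%s", src);
    return needed >= 0 && (size_t)needed < size;
}
```

With a 4-byte buffer, copying "abc" fits exactly, while "abcdef" is cut to "abc" and the helper returns 0.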

As Bruce Dawson says: Stop using strncpy already!. Apparently some MSVC extensions like _snprintf are even worse.

strncat also exists in POSIX.2001 and is unrelated to strcpy; it does what you'd hope, a bounds-checked strcpy which always 0-terminates. But like strcat it still returns the original pointer so is not useful for efficiently appending strings into a buffer; it has to re-scan the leading part every time to find the current end if you simply call it repeatedly on the same buffer. The man page mentions "Shlemiel the painter".

Munday answered 26/7, 2018 at 21:15 Comment(0)
6

I believe that your guess is correct, it makes it easier to nest the call.

Histone answered 24/8, 2010 at 22:10 Comment(0)
2

It's also extremely easy to code.

The return value is typically left in the AX register (it is not mandatory, but it is frequently the case). And the destination is put in the AX register when the function starts. To return the destination, the programmer needs to do.... exactly nothing! Just leave the value where it is.

The programmer could declare the function as void. But that return value is already in the right spot, just waiting to be returned, and it doesn't even cost an extra instruction to return it! No matter how small the improvement, it is handy in some cases.

Vasos answered 24/8, 2010 at 22:21 Comment(3)
Funny, I can find no mention of an AX register in the ISO C standards documents :-)Schottische
Because that detail belongs to the compiler implementation, something that the ISO standard does not cover. It is part of the x86 function call convention, as noted here: "Integer values and memory addresses are returned in the EAX register"Hashish
I think this is part of the reason; you probably got better asm from primitive early C compilers by using the return value of strcpy (already in a register) than by making the compiler save the pointer around the call in a call-preserved register or spill it to the stack. This may still be the case. BTW, on many ISAs, the return-value register is not the first arg-passing register. And unless you use base+index addressing modes, it does cost an extra instruction (and tie up another reg) for strcpy to copy the register for a pointer-increment loop.Munday
0

Same concept as Fluent Interfaces. Just making code quicker/easier to read.

Buxtehude answered 24/8, 2010 at 22:22 Comment(0)
-2

I don't think this is really set up this way for nesting purposes, but more for error checking. If memory serves, none of the C standard library functions do much error checking on their own, and therefore it makes more sense that this would be to determine if something went awry during the strcpy call.

if(strcpy(dest, source) == NULL) {
  // Something went horribly wrong, now we deal with it
}
Copyedit answered 29/2, 2016 at 13:34 Comment(2)
strcpy doesn't have any way to check errors itself. Also, it's required to always return dest, so it can only return NULL if it already tried to write into a NULL pointer, so a hypothetical system that catches SIGSEGV in strcpy and returns NULL would be violating that contract. Although UB has already happened so there would be room for that extension since ISO C doesn't have anything to say about a program where strcpy does something bad.Munday
strcpy would only return NULL if passed NULL, which is undefined behavior anywayHaygood
