Return a `struct` from a function in C

11

233

Today I was teaching a couple of friends how to use C structs. One of them asked if you could return a struct from a function, to which I replied: "No! You'd return pointers to dynamically malloced structs instead."

As someone who primarily writes C++, I was expecting not to be able to return structs by value. In C++ you can overload operator = for your objects, and it makes complete sense to have a function return your object by value. In C, however, you do not have that option, and so it got me thinking about what the compiler is actually doing. Consider the following:

struct MyObj{
    double x, y;
};

struct MyObj foo(){
    struct MyObj a;
    
    a.x = 10;
    a.y = 10;
    
    return a;
}        

int main () {

    struct MyObj a;
    
    a = foo();    // This DOES work
    struct b = a; // This does not work
      
    return 0;
}    

I understand why struct b = a; should not work -- you cannot overload operator = for your data type. How is it that a = foo(); compiles fine? Does it mean something other than struct b = a;? Maybe the question to ask is: What exactly does the return statement in conjunction with the = sign do?

Spragens answered 11/3, 2012 at 6:59 Comment(3)
struct b = a; is a syntax error. What if you try struct MyObj b = a;?Nagari
@GregHewgill: You are absolutely right. Quite interestingly, however, struct MyObj b = a; does seem to work :)Spragens
It's arrays that you can't return from functions (or assign), since arrays are not first-class types in C. But a struct is a properly first-class type, and can be assigned, passed, and returned with impunity. You don't have to define your own operator= (as indeed you could in C++), because any struct is by definition POD, and a simple memcpy-like assignment, which the compiler is perfectly willing to perform, is sufficient. See also What does impossibility to return arrays actually mean in C?Coenocyte
S
274

You can return a structure from a function (or use the = operator) without any problems. It's a well-defined part of the language. The only problem with struct b = a is that you didn't provide a complete type. struct MyObj b = a will work just fine. You can pass structures to functions as well - a structure is exactly the same as any built-in type for purposes of parameter passing, return values, and assignment.

Here's a simple demonstration program that does all three - passes a structure as a parameter, returns a structure from a function, and uses structures in assignment statements:

#include <stdio.h>

struct a {
   int i;
};

struct a f(struct a x)
{
   struct a r = x;
   return r;
}

int main(void)
{
   struct a x = { 12 };
   struct a y = f(x);
   printf("%d\n", y.i);
   return 0;
}

The next example is pretty much exactly the same, but uses the built-in int type for demonstration purposes. The two programs have the same behaviour with respect to pass-by-value for parameter passing, assignment, etc.:

#include <stdio.h>

int f(int x) 
{
  int r = x;
  return r;
}

int main(void)
{
  int x = 12;
  int y = f(x);
  printf("%d\n", y);
  return 0;
}
Sausauce answered 11/3, 2012 at 7:1 Comment(18)
That's quite interesting. I was always under the impression you need pointers for these. I was wrong :)Spragens
You certainly don't need pointers. That said, most of the time you would want to use them - the implicit memory copies that take place flinging structures around by value can be a real waste of CPU cycles, not to mention memory bandwidth.Sausauce
Absolutely. I always pass variables as pointers or by reference (in C++) myself :)Spragens
@CarlNorum how large does a structure have to get that a copy costs more than malloc + free?Flapper
@josefx, a single copy? Probably huge. The thing is, normally if you're passing structures around by value you're copying them a lot. Anyway it's not really as simple as that. You could be passing around local or global structures, in which case their allocation cost is pretty much free.Sausauce
It would be good to point out that the default special member functions were defined to be compatible with C. For example in C++ struct S { int i; }; gets a default assignment operator which has the same behavior as C specifies for the same definition. So the interesting thing is not just that these things work in C, but that C++'s behavior is based on C. In fact it's been argued that C++ would be better off if it didn't have to be compatible with C so it could force people to explicitly request special member functions rather than implicitly generate them with sometimes harmful results.Sheepskin
You need pointers and allocation of memory for the returned value outside the function body as soon as the amount of memory allocated for a value isn't known at compile time. It is for structs, so C functions have no problem returning them.Doglike
Return Value Optimization seems to only be discussed with respect to C++, but isn't it just as valid for C? If I define a struct in a function and return it, will my C compiler use RVO to avoid the copy?Etymon
@Zboson According to this answer, RVO and NRVO are implicitly allowed by the C standard.Antimonous
@JustinTime, that's an answer to my question.Etymon
It's shocking that this is allowed. When I tried it, I fully expected the memory to be overwritten with garbage by the time the function exited, just like a regular char[] would, seeing as a struct has to be allocated on the stack. But instead, I managed to get this to compile and work: struct wtf{int array[42];}; ... struct wtf WHAT(){return (struct wtf){{1, 2, 5, 7}};} ... int *array = WHAT().array; ... printf("array[2] = %i", array[2]);, which brings me here, looking for reasons why returning structs would be a horrible idea. (I do kinda like it, though)Vincenz
@BradenBest: calling WHAT().array is not allowed by standard C, although most compilers will handle it by placing the result of WHAT() in an implicit stack variable, ensuring that the pointer to the array points to valid stack data until you leave the scope. But sometimes it makes perfect sense to return structs rather than pointers to structs, e.g. when dealing with a large number of small structs (like struct Point{int x; int y;}).Telegraphic
@BradenBest, right, and if you're using known sizes, like working with data from a database where the maximum sizes are predefined, there's no need to use pointers, malloc, etc.Produce
@DavidPetersonHarvey There is one thing; if sizeof (struct mystruct) > sizeof (struct mystruct *), then passing mystruct objects around by value would be inherently slower than passing them around by address, and you'd need a good reason for the tradeoff, lest you bottleneck your program. Which is probably why the majority of use cases involve passing struct pointers around instead of raw structs. Like Groo said, raw structs make sense if they're small and you have a lot of them--sizeof (struct Point) == sizeof (struct Point *) is not guaranteed, but is more likely to be true than not.Vincenz
Also, I've read that said bottleneck can be optimized away with copy elision (which is a thing in C but is never talked about), but compilers aren't all that consistent when it comes to this optimization, that means it depends on the compiler, so if performance is a concern, it's safer to pass an object that has constant size (struct ... *) than an object that may or may not vary in size (struct ...) and where the performance hit of copying said object may or may not be optimized away. Doing so also preserves predictable C semantics, so it may prevent bugs in the long run as well.Vincenz
Please provide a citation supporting, "It's a well-defined part of the language." In N2176, I read, "The return type of a function shall be void or a complete object type other than array type." However, I have not yet found (or do not understand) a definition of "complete object type" therein that would guarantee that a function may return a struct. Perhaps I should be reading a more definitive reference.Obedience
The C FAQ has some references and citations here: c-faq.com/struct/firstclass.html Is that what you were looking for?Sausauce
In my opinion, this is an important detail. @CarlNorum Isn't it worth adding to your answer, since some people still consider K&R (The C Programming Language) a sort of “standard” while it was in fact just a proto-standard? Before finding this comment, I read about it here: The Development of the C Language by Dennis M. RitchieNucleoside
N
43

When making a call such as a = foo();, the compiler might push the address of the result structure on the stack and pass it as a "hidden" pointer to the foo() function. Effectively, it could become something like:

void foo(struct MyObj *r) {
    struct MyObj a;
    // ...
    *r = a;
}

foo(&a);

However, the exact implementation of this is dependent on the compiler and/or platform. As Carl Norum notes, if the structure is small enough, it might even be passed back completely in a register.
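
To make the idea concrete, here is a rough sketch that puts the two forms side by side. The names foo_by_value and foo_hidden_pointer are made up for this illustration, and the second function only approximates what a compiler might do; a real ABI may use registers or a different convention.

```c
struct MyObj {
    double x, y;
};

/* What the source code says: return the struct by value. */
struct MyObj foo_by_value(void)
{
    struct MyObj a;
    a.x = 10;
    a.y = 10;
    return a;
}

/* Roughly what a compiler may generate instead: the caller supplies
   the storage and the callee fills it in through a pointer.
   (Hypothetical name; real calling conventions differ in detail.) */
void foo_hidden_pointer(struct MyObj *result)
{
    struct MyObj a;
    a.x = 10;
    a.y = 10;
    *result = a;   /* copy the local object into the caller's object */
}
```

Calling struct MyObj v = foo_by_value(); and calling foo_hidden_pointer(&p); on a caller-owned struct MyObj p should leave v and p with the same member values.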

Nagari answered 11/3, 2012 at 7:2 Comment(7)
That's totally implementation dependent. For example, armcc will pass small enough structures in the regular parameter passing (or return value) registers.Sausauce
Wouldn't that be returning a pointer to a local variable? The memory for the returned structure can't be part of foo stack frame. It has to be in a place that survives past the return of foo.Clamper
@AndersAbel: I think what Greg means is that compiler takes a pointer to the variable in the main function and passes it to the function foo. Inside the function foo, you just do the assignmentSpragens
@AndersAbel: The *r = a at the end would (effectively) do a copy of the local variable to the caller's variable. I say "effectively" because the compiler might implement RVO and eliminate the local variable a entirely.Nagari
Of course you're right, I misread it as an assignment of the pointer. Too long since I actually wrote any C code I guess...Clamper
Although this does not directly answer the question, this is the reason why many people will fall here via google c return struct: they know that in cdecl eax is returned by value and that structs in general don't fit inside eax. This is what I was looking for.Changsha
Thank you so much. I was about to pull my hair out working on a legacy project. btw, I am bald already. Again, THANK YOU very much. Your example is so elegantly simple. Its all that is needed.Status
A
19

As far as I can remember, the first versions of C only allowed returning a value that could fit into a processor register, which means that you could only return a pointer to a struct. The same restriction applied to function arguments.

More recent versions allow passing around larger data objects like structs. I think this feature was already common during the eighties or early nineties.

Arrays, however, can still be passed and returned only as pointers.
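
If a fixed-size array really has to travel by value, the usual workaround is to wrap it in a struct. The following is only a sketch; the names int_array, ARR_LEN and make_squares are invented for the example.

```c
#include <stddef.h>

#define ARR_LEN 4

/* The array is a member, so it is copied along with the struct. */
struct int_array {
    int items[ARR_LEN];
};

/* Returns the whole array by value, inside its wrapper struct. */
struct int_array make_squares(void)
{
    struct int_array result;
    for (size_t i = 0; i < ARR_LEN; i++) {
        result.items[i] = (int)(i * i);
    }
    return result;
}
```

A caller can then write struct int_array s = make_squares(); and read s.items[3] without any pointer to the callee's storage being involved.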

Artiste answered 11/3, 2012 at 9:53 Comment(5)
You can return an array by value if you put it inside a struct. What you can't return by value is a variable-length array.Ive
Yes, I can put an array inside a struct, but I cannot e.g. write typedef char arr[100]; arr foo() { ... } An array cannot be returned, even if the size is known.Artiste
Not that it probably matters any more, but I would guess that any initial downvotes were for the speculation in the first paragraph. Stack Overflow likes facts, not speculation. (Also I believe the speculation is somewhat incorrect. My memory is that the very first versions of C couldn't return structs at all, and the text of K&R I said so, although an improved version of the compiler that could pass and return structs (of any size) was released at just about the same time K&R I actually hit bookstores.)Coenocyte
@SteveSummit The first official standard, ANSI C in 1989, included assignment between, return of, and passing of struct types. See c-faq.com/struct/firstclass.html or csapp.cs.cmu.edu/3e/docs/chistory.html -- So it was not a trivial decision to use these features before 1989.Nucleoside
@Artiste I know exactly what you mean. Maybe it would be better to speak of C compilers here instead of versions of C.Nucleoside
H
18

The struct b line doesn't work because it's a syntax error. If you expand it out to include the type it will work just fine

struct MyObj b = a;  // Runs fine

What C is doing here is essentially a memcpy from the source struct to the destination. This is true for both assignment and return of struct values (and really every other value in C)
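
As a small illustration of that mental model (not a claim about what any particular compiler emits), the two helper functions below should produce the same member values. The function names are hypothetical.

```c
#include <string.h>

struct MyObj {
    double x, y;
};

/* Plain struct assignment: the language copies every member. */
struct MyObj copy_by_assignment(struct MyObj src)
{
    struct MyObj dst = src;
    return dst;
}

/* The "as if" view: copy the object's bytes with memcpy.
   memcpy also copies any padding bytes, which plain assignment
   is not required to preserve. */
struct MyObj copy_by_memcpy(struct MyObj src)
{
    struct MyObj dst;
    memcpy(&dst, &src, sizeof dst);
    return dst;
}
```

For a source struct such as { 1.5, -2.0 }, both functions hand back a struct whose x and y compare equal to the original's.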

Hearst answered 11/3, 2012 at 7:3 Comment(8)
+1, in fact, many compilers will actually emit a literal call to memcpy in this case - at least, if the structure is reasonably large.Sausauce
So, during initialization of a datatype, the memcpy function works??Boyne
@Boyne I'm not quite sure what you're asking here. Could you elaborate a bit?Hearst
If struct MyObj b = a calls the memcpy function and u said it happens for any other value in C. Then I mean does this code int b = a also calls the memcpy function, I am quite confused here??Boyne
@Boyne i don't mean it literally calls the memcpy function. I was suggesting that you visualize the assignment as if memcpy was used because it has roughly the same behavior.Hearst
@Hearst - compilers often do literally call the memcpy function for the structure situations. You can make a quick test program and see GCC do it, for example. For built-in types that won't happen - they're not large enough to trigger that kind of optimization.Sausauce
@CarlNorum after reading your initial comment I was planning on playing around tonight to see when it would get called.Hearst
It's definitely possible to make it happen - the project I'm working on doesn't have a memcpy symbol defined, so we often run into "undefined symbol" linker errors when the compiler decides to spit one out on its own.Sausauce
D
12

There is no issue in passing back a struct. It will be passed by value.

But what if the struct contains a member that holds the address of a local variable?

#include <stdio.h>

struct emp {
    int id;
    char *name;
};

struct emp get() {
    char *name = "John";

    struct emp e1 = {100, name};

    return (e1);
}

int main() {

    struct emp e2 = get();

    printf("%s\n", e2.name);
}

Now, here e1.name contains a memory address local to the function get(). Once get() returns, the local address for name will have been freed. So, if the caller tries to access that address, it may cause a segmentation fault, as it is accessing freed memory. That is bad.

Whereas e1.id will be perfectly valid, as its value will be copied to e2.id.

So, we should always avoid returning addresses of memory local to a function.

Anything malloced can be returned as and when wanted
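
One way to avoid depending on the lifetime of pointed-to memory at all is to keep the characters inside the struct, so they are copied together with it. This is only a sketch, assuming names fit in a fixed 32-byte buffer; the type emp_owned and the function get_owned are made up for the illustration.

```c
#include <string.h>

struct emp_owned {
    int id;
    char name[32];   /* storage lives inside the struct itself */
};

struct emp_owned get_owned(void)
{
    char local_name[] = "John";   /* automatic array: gone after return */
    struct emp_owned e;

    e.id = 100;
    /* Copy the characters into the struct instead of storing a
       pointer to local_name, which would dangle after return. */
    strncpy(e.name, local_name, sizeof e.name - 1);
    e.name[sizeof e.name - 1] = '\0';
    return e;
}
```

After struct emp_owned e2 = get_owned();, e2.name is the caller's own copy of the characters and stays valid for as long as e2 does.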

Dogfight answered 28/3, 2017 at 19:26 Comment(3)
This is wrong, assigning a string literal to a pointer forces the string to be static and it lives for the whole program. In fact this static string is not allowed to be written to, so it should be const (char const *name). What you want is a local array.Foilsman
It's not a matter of returning a struct or a pointer. The member name still point to a local variable that is not available outside the get() function even if you malloc e1 and return its pointerPearlypearman
@RamonLaPietra string literals are static, see e.g. #2590449 i.e. the value of name (pointer) is copied to .name (struct offset) and points to a static char array.Erratum
S
9

Yes, you can both pass a structure to a function and return one. Your code was almost right; you just left out the type name, which should look like this: struct MyObj b = a.

I found this out myself while looking for a better way to return more than one value from a function without using pointers or global variables.

Below is an example that calculates the deviation of a student's marks from the average.

#include<stdio.h>
struct marks{
    int maths;
    int physics;
    int chem;
};

struct marks deviation(struct marks student1 , struct marks student2 );

int main(){
    
    struct marks student;
    student.maths= 87;
    student.chem = 67;
    student.physics=96;
    
    struct marks avg;
    avg.maths= 55;
    avg.chem = 45;
    avg.physics=34;
    //struct marks dev;
    struct marks dev= deviation(student, avg );
    printf("%d %d %d" ,dev.maths,dev.chem,dev.physics);

    return 0;
 }

struct marks deviation(struct marks student , struct marks student2 ){
    struct marks dev;

    dev.maths = student.maths-student2.maths;
    dev.chem = student.chem-student2.chem;
    dev.physics = student.physics-student2.physics; 

    return dev;
}
Scallion answered 30/12, 2014 at 13:26 Comment(0)
V
4

You can assign structs in C. a = b; is valid syntax.

You simply left off part of the type -- the struct tag -- in your line that doesn't work.

Viewless answered 11/3, 2012 at 7:3 Comment(0)
P
3
#include <stdio.h>

struct emp {
    int id;
    char *name;
};

struct emp get() {
    char *name = "John";

    struct emp e1 = {100, name};

    return (e1);
}

int main() {

    struct emp e2 = get();

    printf("%s\n", e2.name);
}

works fine with newer versions of compilers. Just like id, the value of name (a pointer to the string literal, which has static storage duration) gets copied to the assigned structure variable.
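
The same result can also be written more compactly with a C99 compound literal. This is a sketch only: the function name get_emp is made up, and name is declared const here because string literals must not be modified.

```c
struct emp {
    int id;
    const char *name;
};

struct emp get_emp(void)
{
    /* C99 compound literal. The string literal has static storage
       duration, so the pointer stays valid after the function returns. */
    return (struct emp){ 100, "John" };
}
```

A caller can then write struct emp e2 = get_emp(); and print e2.name as before.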

Polygamous answered 14/5, 2017 at 19:2 Comment(3)
Even simpler: struct emp get() { return {100, "john"}; }Thorite
@ChrisReid I'd expect this to be valid C++, not C. Even with C17, I get a compiler error here: error: expected expression before ‘{’ tokenNucleoside
perhaps return (struct emp) {100, "john"};Thorite
G
1

In a typical x86 implementation, the address of the struct variable e2 is pushed as an argument onto the callee's stack, and the values get assigned there. In fact, get() returns e2's address in the eax register. This works like call by reference.

Gelatinous answered 21/6, 2019 at 19:24 Comment(0)
S
1

C does not have reference types, everything is a value!

Bad Example

This code works, but it's not the way you should do it, as the function might obscure whether the struct is allocated on the stack or manually on the heap. It also might be costly performance-wise, as the struct might get copied during the return process.

#include <stdio.h>

struct MyStruct{
    int a;
};

struct MyStruct MyFunction(){
    struct MyStruct someData;
    someData.a = 10;
    return someData;
}

int main(){

    struct MyStruct firstData;
    struct MyStruct secondData;

    firstData = MyFunction();
    printf("first %d\n", firstData.a);

    secondData = firstData;
    printf("second %d\n", secondData.a);
    
    return 0;
}

Output:

first 10
second 10

Good Example

Instead of creating a struct inside a function and returning it, it is recommended to pass a pointer to a struct into the function. This lets the caller manage the memory for the struct in a fully transparent way. Structs often hold larger amounts of data and as such are often allocated dynamically. This also has the advantage that you can use the return value of the function to give the caller additional information about the outcome of the execution in the style of a status flag (for example 0=ok, 1=error, etc). Furthermore, working with pointers is also more performant, as there is no need to copy the whole struct to pass it into the function as a value.

Besides the performance impact of passing structs around by value, there is also a pitfall: if the function is later changed to allocate the struct dynamically and the calling code is not changed to release the memory manually, a memory leak is introduced.

#include <stdio.h>
#include <stdlib.h>

struct MyStruct{
    int a;
};

int MyFunction(struct MyStruct * pointer){
    pointer->a = 10;
    return 0;
}

int main(){

    struct MyStruct* firstData;
    int statusFlag;

    firstData = malloc(sizeof(struct MyStruct));
    if(firstData == NULL){
        return 1; /* allocation failed */
    }

    statusFlag = MyFunction(firstData);
    
    if(statusFlag == 0){
        printf("OKAY %d\n", firstData->a);
    }
    else{
        printf("ERROR!\n");
    }

    free(firstData);

    return 0;
}

Output:

OKAY 10

Schottische answered 12/6, 2023 at 15:33 Comment(2)
You know what I learnt? Creating, initializing, and copying structures takes less than 10 milliseconds. If I remember well, I can even say less than 3 milliseconds. And this is on a rather mature desktop computer. Furthermore, it is often much neater to declare and initialize the variable.Bourgogne
Whether to initialize inline is a question of the standard, style, and semantics. I have found that listing all variables at the beginning achieves consistency and transparency. It also implicitly ensures that functions are crisp because otherwise, there is a temptation to declare subroutines inside a current function. Expressing computational costs in time is very subjective to the individual application and system you run on. The actual time to copy an array depends heavily on its size and cannot be generalized. What can be said is that stack allocations are faster than heap allocations.Schottische
R
0
#include <stdio.h>

struct emp {
    int id;
    char *name; /* This must point to valid memory, or be replaced with an array that holds the data, like this: char name[128] */
};

struct emp bad() {
    static char name[] = {'J', 'o', 'h', 'n', '\0'}; /* static enforces this array to be stored globally and not in the local stack which would not be valid after the function returns */
    struct emp e1 = {404, name};
    return (e1);
}

int main() {
    struct emp e2 = bad();
    printf("%s\n", e2.name);
}
Ramify answered 19/1, 2023 at 12:41 Comment(1)
As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.Ramrod
