What do linkers actually do with multiply-defined `inline` functions?

5

10

In both C and C++, inline functions with external linkage can of course have multiple definitions available at link-time, the assumption being that these definitions are all (hopefully) identical. (I am of course referring to functions declared with the inline function specifier, not to functions that the compiler or link-time optimizer actually inlines.)

So what do common linkers typically do when they encounter multiple definitions of a function? In particular:

  • Are all definitions included in the final executable or shared-library?
  • Do all invocations of the function link against the same definition?
  • Are the answers to the above questions required by one or more of the C and C++ ISO standards, and if not, do most common platforms do the same thing?

P.S. Yes, I know C and C++ are separate languages, but they both support inline, and their compiler output can typically be linked by the same linker (e.g. GNU ld), so I believe there cannot be any difference between them in this aspect.

Stipendiary answered 5/2, 2016 at 21:3 Comment(9)
Can you make clear whether you are referring to functions with external or internal linkage? Obviously names with internal linkage mean something else in every translation unit.Pannell
Though C and C++ both support inline it is not safe to suppose (and I do not know whether) the semantics are equivalent. That it may be the case that the same linker can link object files derived from C sources and from C++ sources does not imply that the rules for inline must be the same in C and C++, as standalone linkers do not take either C source code or C++ source code as input.Fresh
@500-InternalServerError Does the "compiler-theory" tag really fit here? There's no tag wiki, but I would expect "compiler theory" to be about formal theory regarding compilers, e.g. the limitations of different parsing models.Stipendiary
@KerrekSB Edited to clarify that I'm talking about functions with external linkage.Stipendiary
@JohnBollinger But regardless of the source language, the linker must be able to handle multiply-defined functions.Stipendiary
@KyleStrand, no that does not follow. Linkers may indeed be able to do that, but the existence of inline functions does not require it, not any more than C++ function overloading requires it. It is typical for C++ systems to use name mangling specifically to avoid multiply-defined functions. There is no reason why a C or C++ system could not do the same to support internal and inline functions.Fresh
Do note, by the way, that your premise "inline functions with external linkage can of course have multiple definitions available at link-time" is deceptive, at least for C. C inline functions are explicitly not external, even if they meet the standard's criteria for being designated as having external linkage. As detailed in my answer, there are never more than two choices of function definition available for satisfying a C function call: one external and one internal.Fresh
@KyleStrand: Perhaps not - I was just trying to give your question more exposure to people who actually have this kind of expertise.Sunset
@500-InternalServerError Thanks, but I think I'll remove the tag since I don't think it's appropriate.Stipendiary
8

If the function is, in fact, inlined, then there's nothing to link. It's only when, for whatever reason, the compiler decides not to expand the function inline that it has to generate an out-of-line version of the function. If the compiler generates an out-of-line version of the function for more than one translation unit you end up with more than one object file having definitions for the same "inline" function.

The out-of-line definition gets compiled into the object file, and it's marked so that the linker won't complain if there is more than one definition of that name. If there is more than one, the linker simply picks one. Usually the first one it saw, but that's not required, and if the definitions are all the same, it doesn't matter. And that's why it's undefined behavior to have two or more different definitions of the same inline function: there's no rule for which one to pick. Anything can happen.
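A minimal single-file sketch of when an out-of-line copy is needed at all. The names twice, apply, direct_call and indirect_call are hypothetical, and in a real project twice would sit in a header included by several translation units. A direct call can be expanded in place, while a call through a pointer generally needs a real function body to point at, which is the copy that gets marked as "duplicates allowed" for the linker.

```cpp
#include <cassert>

// Hypothetical header content: every .cpp file that includes it gets
// its own copy of this definition.
inline int twice(int x) { return 2 * x; }

// A direct call: the compiler is free to expand it in place, in which
// case this call needs no out-of-line copy of twice().
int direct_call(int x) { return twice(x); }

// An indirect call: the pointer has to refer to a real function, so an
// out-of-line copy generally has to be emitted for it.
int apply(int (*fn)(int), int x) { return fn(x); }
int indirect_call(int x) { return apply(&twice, x); }

// Both routes must agree, whichever copy the linker ends up keeping.
void check_calls() {
    assert(direct_call(21) == 42);
    assert(indirect_call(21) == 42);
}
```

Whether a symbol for twice actually shows up in the object file still depends on the compiler and the optimization level; this sketch only shows the two kinds of use.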

Varick answered 5/2, 2016 at 21:15 Comment(1)
Re: your first sentence, can the compiler (for either language) simply discard the definition if every call is inlined, then, unless something else (such as taking the address of the function) comes into play? Also, you've answered the second question but not the first: when you say "the compiler picks one," do you mean that the other definition is completely elided from the executable?Stipendiary
3

The linker just has to figure out how to deduplicate all the definitions. That is of course provided that any function definitions have been emitted at all; inline functions may well be inlined. But should you take the address of an inline function with external linkage, you always get the same address (cf. [dcl.fct.spec]/4).
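As a rough illustration of that guarantee, here is a single-file sketch. The namespaces unit_a and unit_b are hypothetical stand-ins for two separate .cpp files that include the same header; within one file the result is trivially true, and the point of the rule is that it also holds across translation units. The same rule means a static local variable inside the inline function is one object program-wide.

```cpp
#include <cassert>

// Hypothetical header content, shared by every translation unit that includes it.
inline int& call_count() {
    static int count = 0;  // one object for the whole program
    return count;
}

using CounterFn = int& (*)();

// Stand-ins for code that would normally live in two separate .cpp files.
namespace unit_a {
    CounterFn address() { return &call_count; }
    void bump() { ++call_count(); }
}
namespace unit_b {
    CounterFn address() { return &call_count; }
    void bump() { ++call_count(); }
}

// Both "units" see the same function address and the same counter object.
void check_single_entity() {
    assert(unit_a::address() == unit_b::address());
    const int before = call_count();
    unit_a::bump();
    unit_b::bump();
    assert(call_count() == before + 2);
}
```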

Inline functions aren't the only construct that requires linker support; templates are another, as are inline variables (in C++17).

Pannell answered 5/2, 2016 at 21:9 Comment(6)
I thought templates were just implicitly-inline, and that template instantiations aren't different at link-time from non-template functions?Stipendiary
@KyleStrand: No, template functions and inline function both allow multiple identical definitions, but the behavior is not the same. In particular, "An inline function shall be defined in every translation unit in which it is odr-used and shall have exactly the same definition in every case." but template functions do not have to be defined in every translation unit.Significative
@BenVoigt Template explicit specializations don't need to be defined in every translation unit, because as I said they're not different at link-time from non-template functions, but how would it be possible to use a template function if its definition isn't present at the call/instantiation site?Stipendiary
This is the correct answer... it's worth noting that the linker support is often called weak symbols. In Microsoft's implementation, there are two kinds: noduplicates and selectany.Significative
@KyleStrand: With a forward-declaration, same as any other function which is defined in another TU. Specialization not required, it works perfectly well for unspecialized cases if some other TU performs instantiation (maybe explicitly).Significative
@BenVoigt Okay, that makes sense. I hadn't realized that was possible, but I realize it makes sense to permit that.Stipendiary
3

inline or no inline, C does not permit multiple external definitions of the same name among the translation units contributing to the same program or library. Furthermore, it does not permit multiple definitions of the same name in the same translation unit, whether internal, external, or inline. Therefore, there can be at most two available definitions of a given function in scope in any given translation unit: one internal and/or inline, and one external.

C 2011, 6.7.4/7 has this to say:

Any function with internal linkage can be an inline function. For a function with external linkage, the following restrictions apply: If a function is declared with an inline function specifier, then it shall also be defined in the same translation unit. If all of the file scope declarations for a function in a translation unit include the inline function specifier without extern, then the definition in that translation unit is an inline definition. An inline definition does not provide an external definition for the function, and does not forbid an external definition in another translation unit. An inline definition provides an alternative to an external definition, which a translator may use to implement any call to the function in the same translation unit. It is unspecified whether a call to the function uses the inline definition or the external definition.

(Emphasis added.)

In specific answer to your questions, then, as they pertain to C:

Are all definitions included in the final executable or shared-library?

Inline definitions are not external definitions. They may or may not be included as actual functions, as inlined code, both, or neither, depending on the foibles of the compiler and linker and on details of their usage. They are not in any case callable by name by functions from different translation units, so whether they should be considered "included" is a bit of an abstract question.

Do all invocations of the function link against the same definition?

C does not specify, but it allows for the answer to be "no", even for different calls within the same translation unit. Moreover, inline functions are not external, so no inline function defined in one translation unit is ever called (directly) by a function defined in a different translation unit.

Are the answers to the above questions required by one or more of the C and C++ ISO standards, and if not, do most common platforms do the same thing?

My answers are based on the current C standard to the extent that it addresses the questions, but as you will have seen, those answers are not entirely prescriptive. Moreover, the standard does not directly address any question of object code or linking, so you may have noticed that my answers are not, for the most part, couched in those terms.

In any case, it is not safe to assume that any given C system is consistent even with itself in these regards for different functions or in different contexts. Under some circumstances it may inline every call to an internal or inline function, so that that function does not appear as a separate function at all. At other times it may indeed emit a function with internal linkage, but that does not prevent it from inlining some calls to that function anyway. In any case, internal functions are not eligible to be linked to functions from other translation units, so the linker is not necessarily involved with linking them at all.

Fresh answered 5/2, 2016 at 22:16 Comment(7)
This is helpful, but I'm going to need some time to process it. I'm also not sure how "abstract" the question about functions definitions being included multiple times in the final executable is; I simply meant, are multiple copies of the actual binary code included in the final output, not including inlined code at various call-sites? (Can I word this more clearly?) I'm now confused about why nsilent's code fails to compile with gcc, though, since it looks like it follows the rules in the bit of the standard you quoted.Stipendiary
@KyleStrand, I called the question "abstract" in the sense that considering whether having code in a binary or library for a function that cannot be called means that function is "included" seems more an exercise in semantics than one of any practical value.Fresh
@KyleStrand, in any event, it is impossible for any function definition to appear multiple times, as any appearance of any function, whatever its name, is a separate function from every other one. If you're asking whether separate functions with identical code might appear then yes, it's possible, and you don't need to bring inline functions into the discussion to get there -- static ones are sufficient to introduce the possibility. How any particular compiler or linker will behave in this regard, however, is rather idiosyncratic.Fresh
It's bloat if inaccessible code is present in the final binary. I realize this isn't much of a concern in most cases, but I think it's still a valid question.Stipendiary
@KyleStrand, who said anything about inaccessible code? Compilers and linkers may indeed put inaccessible code into binaries, and the standard has nothing to say about the matter, but good ones don't. There is nothing about inline functions in particular that makes them any more likely to be included as dead code than any other function is.Fresh
I don't understand why you're acting like that's such a weird question. You said, in italics, "a function that cannot be called". Isn't that "inaccessible code"?Stipendiary
@KyleStrand, I see now how we got off on this tangent. If what you're really worried about is the physical size of binaries -- and that's not entirely clear to me from your question alone -- then the standard has no answers for you whatever, and, as I already answered, you should not assume that there is a general rule even within a given implementation. Moreover, there is nothing special in this regard to distinguish (C) inline functions from any other internal function. If anything makes your question weird, it is your apparent assumption to the contrary.Fresh
1

I think the correct answer to your question is "it depends".

Consider the following pieces of code:

File x.c (or x.cc):

#include <stdio.h>

void otherfunction(void);

inline void inlinefunction(void) {
    printf("inline 1\n");
}

int main(void) {
    inlinefunction();
    otherfunction();
    return 0;
}

File y.c (or y.cc)

#include <stdio.h>

inline void inlinefunction(void) {
    printf("inline 2\n");
}

void otherfunction(void) {
    printf("otherfunction\n");
    inlinefunction();
}

Because the inline keyword is only a "suggestion" to the compiler to inline the function, different compilers with different flags behave differently. E.g. it looks like gcc, compiling as C in its default mode here, always "exports" inline functions and does not allow multiple definitions:

$ gcc x.c y.c && ./a.out 
/tmp/ccy5GYHp.o: In function `inlinefunction':
y.c:(.text+0x0): multiple definition of `inlinefunction'
/tmp/ccQkn7m4.o:x.c:(.text+0x0): first defined here
collect2: ld returned 1 exit status

while C++ allows it:

$ g++ x.cc y.cc && ./a.out 
inline 1
otherfunction
inline 1

More interestingly, let's switch the order of the files (and hence the order of linking):

$ g++ y.cc x.cc && ./a.out 
inline 2
otherfunction
inline 2

Well... it looks like the first one counts! But... let's add some optimization flags:

$ g++ y.cc x.cc -O1 && ./a.out 
inline 1
otherfunction
inline 2

And that's the behavior we'd expect: the function got inlined. A different order of files changes nothing:

$ g++ x.cc y.cc -O1 && ./a.out 
inline 1
otherfunction
inline 2

Next we can extend our x.c (x.cc) source with a prototype of void anotherfunction(void) and call it in our main function. Let's place the definition of anotherfunction in a z.c (z.cc) file:

#include <stdio.h>

void inlinefunction(void);

void anotherfunction(void) {
    printf("anotherfunction\n");
    inlinefunction();
}

We do not define the body of inlinefunction this time. Compiling and running as C++ gives the following results:

$ g++ x.cc y.cc z.cc && ./a.out 
inline 1
otherfunction
inline 1
anotherfunction
inline 1

Different order:

$ g++ y.cc x.cc z.cc && ./a.out 
inline 2
otherfunction
inline 2
anotherfunction
inline 2

Optimization:

$ g++ x.cc y.cc z.cc -O1 && ./a.out 
/tmp/ccbDnQqX.o: In function `anotherfunction()':
z.cc:(.text+0xf): undefined reference to `inlinefunction()'
collect2: ld returned 1 exit status

So the conclusion is: it is best to declare inline functions static as well, which restricts the function to its own translation unit, because "exporting" a function that we'd like to be used inline makes no sense.
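A minimal sketch of that recommendation, with hypothetical names clamp_to_byte and brighten. Because the helper has internal linkage, its name is private to the file that defines it, so the linker never has to merge copies or choose between differing ones.

```cpp
#include <cassert>

// Hypothetical helper with internal linkage: invisible to other
// translation units, so no multiple-definition question can arise.
static inline int clamp_to_byte(int v) {
    if (v < 0) return 0;
    if (v > 255) return 255;
    return v;
}

// Ordinary external function that uses the private helper.
int brighten(int pixel, int delta) { return clamp_to_byte(pixel + delta); }

// Small self-check covering both clamping edges and the pass-through case.
void check_brighten() {
    assert(brighten(250, 10) == 255);
    assert(brighten(5, -10) == 0);
    assert(brighten(100, 20) == 120);
}
```

The cost is that every translation unit which cannot inline the call keeps its own private copy, so function addresses and static locals are no longer shared across files.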

Confirmand answered 5/2, 2016 at 21:46 Comment(6)
You're invoking undefined behavior here, because you violate the rule that "An inline function shall be defined in every translation unit in which it is odr-used and shall have exactly the same definition in every case"Significative
@BenVoigt: Of course. But as you can see g++ allows to do it. That's the "problem". If several programmers work on a project and write a piece of code as shown, the results could be undefined.Confirmand
It's not really "allowing" it, it is not generating a diagnostic (Standard doesn't require one). The result is still undefined behavior.Significative
That's what I wrote.Confirmand
@BenVoigt I think the point is to use the specific consequences of the undefined behavior in various configurations (C, C++, optimization, no optimization, etc) to gain some insight about the compiler/linker behavior. This seems perfectly valid to me, since no conclusions are drawn about what the Standards require, nor, for the most part, are there guesses about how consistent/regular this behavior is (except for the comment saying "looks like C compiler always...").Stipendiary
By the way, playing around with GCC 5, it appears that the "multiple definition" error is only triggered by the default mode and -std=gnu90. Other modes either don't support inline functions (e.g. -ansi) or give undefined reference errors (which I don't really understand, since there are at least two valid definitions for inlinefunction).Stipendiary
1

When inline functions don't end up being inlined, behavior differs between C++ and C. In C++ they behave like regular functions, but with an additional symbol flag that allows duplicate definitions, and the linker can select any one of them. In C, the body of the inline definition is not emitted as a function, and a call that is not inlined behaves just like a call to an external function.

On ELF targets, linker behavior needed for C++ is implemented with weak symbols.

Note that weak symbols are often used in combination with regular (strong) symbols, where a strong symbol overrides a weak one (this is the main use case mentioned in the Wikipedia article on weak symbols). They can also be used to implement optional references (the linker inserts a null value for a weak symbol reference if no definition is found). But for C++ inline functions they provide exactly what we need: given multiple weak symbols defined with the same name, the linker will select one of them, in my tests always the one from the file appearing first in the list of files passed to the linker.
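The "optional reference" use of weak symbols can be sketched in a single file. This assumes GCC or Clang attribute syntax and an ELF target such as Linux; optional_hook is a hypothetical name that is deliberately declared but never defined, so the linker resolves it to a null address instead of reporting an undefined reference.

```cpp
#include <cassert>

// Declared weak and deliberately never defined anywhere in this sketch.
// Assumption: GCC/Clang on an ELF platform; other toolchains differ.
extern "C" void optional_hook() __attribute__((weak));

// True only if some object file or library supplied a definition.
bool hook_available() { return &optional_hook != nullptr; }

// Calls the hook when it is present and reports whether it ran.
bool run_hook_if_present() {
    if (!hook_available()) return false;
    optional_hook();
    return true;
}

// With no definition linked in, the hook must be reported as absent.
void check_hook_absent() {
    assert(!hook_available());
    assert(!run_hook_if_present());
}
```

Linking in another object file that defines optional_hook would make hook_available return true without any change to this file.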

Here are some examples showing the behavior in C++ and then in C:

$ cat c1.cpp
void __attribute__((weak)) func_weak() {}

void func_regular() {}

void func_external();

void inline func_inline() {}

void test() {
  func_weak();
  func_regular();
  func_external();
  func_inline();
}
                              
$ g++ -c c1.cpp
$ readelf -s c1.o | c++filt  | grep func
11: 0000000000000000    11 FUNC    WEAK   DEFAULT    2 func_weak()
12: 000000000000000b    11 FUNC    GLOBAL DEFAULT    2 func_regular()
13: 0000000000000000    11 FUNC    WEAK   DEFAULT    6 func_inline()
16: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND func_external()

We're compiling without an optimization flag, so the inline function does not get inlined. We see that the inline function func_inline gets emitted as a weak symbol, the same as func_weak, which is explicitly defined as weak using a GCC attribute.

Compiling the same program in C, we see that func_inline is a regular external function, same as func_external:

$ cp c1.cpp c1.c
$ gcc -c c1.c
$ readelf -s c1.o | grep func
 9: 0000000000000000    11 FUNC    WEAK   DEFAULT    1 func_weak
10: 000000000000000b    11 FUNC    GLOBAL DEFAULT    1 func_regular
13: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND func_external
14: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND func_inline

So in C, in order to resolve this external reference, one has to designate a single file that contains the actual function definition.

When we use an optimization flag, the inline function actually gets inlined, and no symbol is emitted at all:

$ g++ -O1 -c c1.cpp
$ readelf -s c1.o | c++filt | grep func_inline
$ gcc -O1 -c c1.c
$ readelf -s c1.o | grep func_inline
$
Dumb answered 24/6, 2021 at 18:41 Comment(0)
