What does it mean for a name or type to have a certain language linkage?

According to (c) ANSI ISO/IEC 14882:2003, page 127:

Linkage specifications nest. When linkage specifications nest, the innermost one determines the language. A linkage specification does not establish a scope. A linkage-specification shall occur only in namespace scope (3.3). In a linkage-specification, the specified language linkage applies to the function types of all function declarators, function names, and variable names introduced by the declaration(s).

extern "C" void f1(void(*pf)(int));
// the name f1 and its function type have C language
// linkage; pf is a pointer to a C function

extern "C" typedef void FUNC();
FUNC f2;
// the name f2 has C++ language linkage and the
// function's type has C language linkage

extern "C" FUNC f3;
// the name of function f3 and the function's type
// have C language linkage

void (*pf2)(FUNC*);
// the name of the variable pf2 has C++ linkage and
// the type of pf2 is pointer to C++ function that
// takes one parameter of type pointer to C function

What does all this mean? For example, what linkage does the f2() function have, C or C++ language linkage?

As pointed out by @Johannes Schaub, there is no real explanation of what this means in the Standard so it can be interpreted differently in different compilers.

Please explain the differences in the object file:

  • a function's name with C language linkage and C++ language linkage.
  • a function's type with C language linkage and C++ language linkage.
Michele answered 23/4, 2011 at 11:29 Comment(11)
related topic: #5589881 - Elyseelysee
@artyom.stv:: Read this as well. https://mcmap.net/q/14733/-what-is-the-effect-of-extern-quot-c-quot-in-c/1041880#1041880 - Romansh
@Acme, that answer says "When you state that a function has extern "C" linkage in C++, the C++ compiler does not add argument/parameter type information to the name used for linkage.", but that appears to be an educated guess. The Standard doesn't say any such thing anywhere. It just doesn't define how one can give a function said linkage (unless I'm missing it, but that's what I'm asking you about: since you are recommending that answer, you must have an opinion about its correctness). - Elyseelysee
@litb:: Frankly, I am not as well acquainted with the Standard as you are. However, can you go through this article as well - publib.boulder.ibm.com/infocenter/comphelp/v8v101/… - and tell me if it has some missing links? I feel the answer I linked before is kind of OK considering what I read at the IBM site. - Romansh
@Acme it doesn't seem to contain an answer to the question "Which linkage has f2() function, C language linkage or C++ language linkage?", at least how I interpret the question. Perhaps it was meant to ask about a different thing. - Elyseelysee
@Johannes Schaub - litb: As I understand it now, my question ("Which linkage has f2() function ...?") was not posed correctly. There seems to be no difference in object files between C and C++ language linkage except calling convention and name mangling. I didn't understand this when I asked the question, which is why it wasn't quite right :). - Michele
@artyom since even the Standard itself claims functions have a certain linkage, but does not describe what that means, the question was entirely valid. Implementations make name linkage affect mangling, and type linkage affect calling convention. But type linkage has no effect on name mangling, and name linkage has no effect on calling convention. Saying a "function has linkage" or even "a template has linkage" seems nonsensical, as the Standard does not seem to say anywhere what that means. The most intuitive (IMO) interpretation is that it means the name has a certain linkage. - Elyseelysee
But that is not how some implementations interpret it, according to the tests I made and described in my other question I linked to. - Elyseelysee
I realize that there might not be a definitive answer here found in the standard, but personally, I'd be ok with a convincing explanation backed by concrete info. I'll probably be following suit with you @artyom. If I can't decide who to give the bounty to, show your support for any decent answer that comes up and you'll probably be my tiebreaker. :) - Sublingual
Thank you @Jeff for the bounty!! It greatly increased the interest in this issue :) But I still don't understand the point of view of some people that there are differences between C and C++ calling conventions :| (they don't give examples explaining their point of view) - Michele
@artyom: The answers and comments mention calling conventions a couple of times. The calling conventions used are defined by your CPU's architecture (e.g., x86) and operating system. So there's no real difference between C and C++ except maybe the default calling convention they use. As it turns out, the default calling convention is what is usually called the "C calling convention", and it doesn't necessarily belong to C. Likewise, there isn't a "C++ calling convention." - Sublingual
D
18

Language linkage is the term used for linkage between C++ and non-C++ code fragments. Typically, in a C++ program, all function names, function types and even variable names have the default C++ language linkage.

Object code produced by a C++ compiler can be linked with object code produced from another source language (such as C) by using a linkage specification.

You are probably aware of name mangling, which encodes function names, function types and variable names to generate a unique symbol name for each. This lets the linker tell apart entities that share a name (as with function overloading). Name mangling is undesirable when linking C modules with libraries or object files compiled with a C++ compiler. To prevent mangling in such cases, a linkage specification is used; here it is extern "C". Let's take an example (C++ code):

typedef int (*pfun)(int);  // line 1
extern "C" void foo(pfun); // line 2
extern "C" int g(int);     // line 3
...
foo( g ); // Error!        // line 5

Line 1 declares pfun to point to a C++ function, because it lacks a linkage specifier.

Line 2 therefore declares foo to be a C function that takes a pointer to a C++ function.

Line 5 attempts to call foo with a pointer to g, a C function, which is a type mismatch.
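One way to make such a call well-formed under the Standard's rules is to give the pointer type C language linkage as well. The following is only a minimal sketch: c_pfun, add_one and apply_c_callback are hypothetical names, and the callback-taking function returns int (unlike foo above) purely so that the result can be observed.

```cpp
// Declaring the typedef inside a linkage specification gives the
// pointed-to function type C language linkage.
extern "C" typedef int (*c_pfun)(int);

// The name add_one and its function type both have C language linkage.
extern "C" int add_one(int x) { return x + 1; }

// Takes a pointer to a C function, so passing add_one matches exactly.
extern "C" int apply_c_callback(c_pfun f) { return f(41); }
```

Calling apply_c_callback(add_one) is then a call through a pointer to a C function. Note that GCC and Clang also accept the original snippet, because they do not treat language linkage as part of the function type.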

Difference in function name linkage:

Let's take two different files:

One with extern "C" linkage (file1.cpp):

#include <iostream>
using namespace std;

extern "C"
{
void foo (int a, int b)
{
    cout << "here";
}
}

int main ()
{
    foo (10,20);
    return 0;
}

One without extern "C" linkage (file2.cpp):

#include <iostream>
using namespace std;

void foo (int a, int b)
{
    cout << "here";
}

int main ()
{
    foo (10,20);
    return 0;
}

Now compile these two and check the objdump.

# g++ file1.cpp -o file1
# objdump -Dx file1

# g++ file2.cpp -o file2
# objdump -Dx file2

With extern "C" linkage, there is no name mangling for the function foo. So any program that uses it (assuming we build a shared library out of it) can look foo up directly by name (with helper functions such as dlopen and dlsym) without having to consider name mangling.

0000000000400774 <foo>:
  400774:   55                      push   %rbp
  400775:   48 89 e5                mov    %rsp,%rbp
....
....
  400791:   c9                      leaveq 
  400792:   c3                      retq   

0000000000400793 <main>:
  400793:   55                      push   %rbp
  400794:   48 89 e5                mov    %rsp,%rbp
  400797:   be 14 00 00 00          mov    $0x14,%esi
  40079c:   bf 0a 00 00 00          mov    $0xa,%edi
  4007a1:   e8 ce ff ff ff          callq  400774 <foo>
  4007a6:   b8 00 00 00 00          mov    $0x0,%eax
  4007ab:   c9                      leaveq 

On the other hand, when extern "C" is not used, foo is mangled according to predefined rules (known to the compiler/linker in use), so an application cannot look it up simply by the name foo. You can still refer to it by its mangled name (_Z3fooii in this case) if you want, but nobody does so, for obvious reasons.

0000000000400774 <_Z3fooii>:
  400774:   55                      push   %rbp
  400775:   48 89 e5                mov    %rsp,%rbp
 ...
...
  400791:   c9                      leaveq 
  400792:   c3                      retq   

0000000000400793 <main>:
  400793:   55                      push   %rbp
  400794:   48 89 e5                mov    %rsp,%rbp
  400797:   be 14 00 00 00          mov    $0x14,%esi
  40079c:   bf 0a 00 00 00          mov    $0xa,%edi
  4007a1:   e8 ce ff ff ff          callq  400774 <_Z3fooii>
  4007a6:   b8 00 00 00 00          mov    $0x0,%eax
  4007ab:   c9                      leaveq 
  4007ac:   c3                      retq   
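To read a mangled symbol such as _Z3fooii without leaving C++, GCC and Clang expose a demangler through <cxxabi.h>. The helper below is a small sketch that assumes one of those toolchains (the header is not part of standard C++); the name demangle is hypothetical.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang specific)
#include <cstdlib>    // std::free
#include <string>

// Returns the human-readable form of a mangled symbol, or the input
// unchanged if the demangler rejects it.
std::string demangle(const char* symbol) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result
    // with malloc, so the caller must free it.
    char* readable = abi::__cxa_demangle(symbol, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return symbol;
    }
    std::string result(readable);
    std::free(readable);
    return result;
}
```

For instance, demangle("_Z3fooii") yields "foo(int, int)", which shows that the parameter types are encoded in the symbol itself.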

This page is also a good read for this particular topic.

A nice and clearly explained article about calling convention: http://www.codeproject.com/KB/cpp/calling_conventions_demystified.aspx

Disproportionate answered 19/5, 2011 at 14:56 Comment(1)
That definitely makes sense. It never occurred to me that a type name needs its name mangled too. - Sublingual
C
2

"the name f2 has C++ language linkage" In C++ language linkage not only the name of the function defines it but also the type of it arguments and the return value. in this case you have: void f2(void); but you can define with it: void f2(int a); without conflict because the linkage will see them as different types, a thing you wouldn't be able to do in C language.

"the function's type has C language linkage" I don't know the details but I know the high level of it. Basically it makes a C++ compiled function linkable from C. If I remember correctly In C and in C++ the way the parameters are passed to a function is different. In this case the function f2 will pass the parameters as C compiler does this. this way the function will be linkable both from C and C++.

Catnip answered 18/5, 2011 at 8:22 Comment(7)
What is the difference between the C and C++ cdecl calling conventions? (I thought that cdecl is standardized and behaves the same in C and C++) - Michele
@Michele The extern "C" line tells the compiler that the external information sent to the linker should use C calling conventions and name mangling (e.g., preceded by a single underscore). - Catnip
@Roee Gavirel: Mangling, yes. But as for calling convention: that's how it is often written. What does "C calling convention" mean? Is there any difference between the C and C++ calling conventions? - Michele
@artyom: Calling conventions are all platform specific, and yes many platforms do use different calling conventions (by default) for C and C++. - Vertigo
@Michele calling convention is the term used to define how a function/method/etc. is called - it deals with how the parameters are passed and returned - which register or stack position holds which parameter and return value, how non-primitives (structs etc.) are passed to/returned from functions, and who cleans up the stack. They can be different between C and C++ on a given platform. This is different from name mangling, which deals more with how a function/method is named and looked up when it's needed (e.g. during linking). - Antitrades
@nos: It would be very interesting to see an example (with different cdecl for C and C++, given that cdecl is standardized). Could you show one, please? - Michele
I think we really need an example to show the differences between the C and C++ calling conventions, because I still don't see any. If it is possible on stackoverflow, I'll give a special bounty for such an example. - Michele
M
2
extern "C" typedef void FUNC();
FUNC f2;
// the name f2 has C++ language linkage and the
// function's type has C language linkage

The function type named by FUNC is declared with "C" linkage because it says extern "C" on the first line.

The name f2 has C++ linkage because that is the default, and no other linkage is given on line two.

The fact that the name f2 is used to refer to a function with C linkage doesn't change the linkage of the name.

Mothy answered 19/5, 2011 at 15:21 Comment(0)
M
2

It has to do with the ABI (Application Binary Interface) of the program.

Just as an API specifies the external interface of the source code of a program, an ABI specifies the external interface of the binary code of the program (the compiled version).


Originally, C functions simply had a few different forms. Something like

int foo(int);

would be prefixed by an underscore by the compiler, to form _foo, and then exported to be made available to other applications.

However, that wasn't enough. If you look at the Windows API, for instance, you will see things like:

HWND CreateWindowW(...);        //Original parameters
HWND CreateWindowExW(..., ...); //More parameters

This is because there's no way to distinguish between the overloads of a function simply by looking at the name of the function, so people started changing them by adding an Ex suffix (or the like).

This grew to be pretty ugly, and it still didn't allow for operator overloading, which C++ featured. Because of this, C++ compilers came up with name mangling: putting extra information into the name of the function, such as the data types of its parameters, which makes it something cryptic (with lots of @ symbols in Microsoft's scheme).

It was all well, except that it wasn't completely standardized.

Of course, as new languages and compilers came about, each came up with its own scheme, some incompatible with others. So if you need to import or export an external function, you need to specify what kind of ABI the compiler should look for, hence the extern "C" you have there.

Medication answered 19/5, 2011 at 19:20 Comment(0)
C
2

What does all this mean? For example, what linkage does the f2() function have, C or C++ language linkage?

extern "C" typedef void FUNC();
FUNC f2;
// the name f2 has C++ language linkage and the 
// function's type has C language linkage 

What you're calling the "f2() function" has two aspects to its linkage:

  • the mangling or not of its name in the symbol table (which has C++ language linkage), and
  • the C or C++ calling convention necessary should the function be called (C).

To call f2() you find its name aka symbol in the object file, which will be a mangled version of "function named f2 taking no arguments". You can verify this trivially by compiling the above code and inspecting the object (e.g. w/ GNU tools nm --demangle).

But to call the function, the conventions for pre- and post-conditions regarding register usage, stack setup etc. are those of a C function. It is legal for C and C++ functions to have different calling conventions, and this might be done, for example, to facilitate C++ exception handling.

Please explain the differences in the object file: a function's name with C language linkage and C++ language linkage.

  • for C linkage, "f2" would be the symbol in the object file resulting from f2()
  • for C++ linkage, some mangled version of "function named f2 taking no arguments" (for GNU, _Z2f2v which demangles to f2())

a function's type with C language linkage and C++ language linkage.

As discussed above, this is about the register/stack usage convention for calling the code at the function's address. This meta-information is not necessarily stored in the symbol table information of the object (and certainly isn't part of the symbol name key itself).

Further, because each function adopts one of the calling conventions, a compiler needs to know the calling convention to use when following a pointer to a function: with that insight, I think the remaining code in the question becomes clear.
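A familiar place where this shows up is the C library's qsort, which expects a pointer to a comparison function. The sketch below gives the comparator C language linkage so that its type matches what a C library would call; compare_ints and sorted_copy are hypothetical names chosen for illustration.

```cpp
#include <cstdlib>   // std::qsort
#include <vector>

// Comparator with C language linkage: both its name and its function
// type follow the C conventions of the platform.
extern "C" int compare_ints(const void* a, const void* b) {
    const int lhs = *static_cast<const int*>(a);
    const int rhs = *static_cast<const int*>(b);
    return (lhs > rhs) - (lhs < rhs);   // negative, zero or positive
}

// Sorts a copy of the input by handing the C comparator to qsort.
std::vector<int> sorted_copy(std::vector<int> values) {
    if (!values.empty()) {
        std::qsort(values.data(), values.size(), sizeof(int), compare_ints);
    }
    return values;
}
```

On implementations where C and C++ functions share a calling convention, a comparator without extern "C" behaves the same way; the linkage specification matters on platforms where the two conventions differ.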

There's an excellent discussion at http://developers.sun.com/solaris/articles/mixing.html - in particular I recommend the section Working with Pointers to Functions.

Chargeable answered 20/5, 2011 at 2:41 Comment(0)
T
1

As we all know, in C/C++ code translation consists of two principal phases: compilation and linking. When the compiler generates object files, it passes information to the linker specifying which functions each object file defines, calls or references. In C it is as simple as that: a function has a name and a matching definition.

// file1.c
void foo(void) {}

And after compilation file1.obj stores the code and information about the definition of the foo symbol.

But when C++ comes in, symbol names become more complicated. A function may be overloaded or be a member of a class, but the linker does not want to know about that. To preserve the simplicity and re-usability of older linkers, each function still needs a single symbol name, whichever of these foo is:

void foo(void) {}
void foo(int) {}
void ClassA::foo(void) {}

But it cannot be called just foo anymore so here comes name mangling. And we may get from the compiler some variations like foo_void, foo_int, foo_void_classa. And finally the linker is happy as all those look to it like simple symbols.

When we want to call the foo function compiled with the C compiler in C++ code we must tell the compiler that we want foo to be C style foo and not foo_void as C++ compiler might assume. It is done using:

extern "C" void foo();

Now the compiler knows that foo was compiled by a C compiler and will tell the linker that this code calls foo. The linker will match it with the foo definition in file1.obj. That's all there is to it, I think.
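In practice the declaration usually sits in a header shared by C and C++ code, guarded by the __cplusplus macro. The sketch below uses a hypothetical function c_add; its definition is placed in the same file only to keep the example self-contained, whereas normally it would live in a .c file built by a C compiler.

```cpp
// Declaration as it might appear in a header used from both languages.
#ifdef __cplusplus
extern "C" {
#endif

int c_add(int a, int b);

#ifdef __cplusplus
}
#endif

// Stand-in for the definition that a C compiler would normally provide.
extern "C" int c_add(int a, int b) { return a + b; }

// C++ caller: the reference passed on to the linker is to plain c_add.
int add_from_cpp(int a, int b) { return c_add(a, b); }
```

A C compiler never sees the extern "C" lines because __cplusplus is not defined there, so the same header works on both sides.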

Other directives such as cdecl or stdcall are Windows-specific and specify how parameters are passed in function calls. Yes, for C and C++ the default is cdecl. But Windows API functions use stdcall, which, like the older Pascal convention, has the callee clean up the stack (for simplicity, and because historically Microsoft once provided a Windows dev environment in Pascal).

Tricyclic answered 19/5, 2011 at 15:3 Comment(0)
A
0

Every function, function type, and object has a language linkage, which is specified as a simple character string. By default, the linkage is "C++". The only other standard language linkage is "C". All other language linkages and the properties associated with different language linkages are implementation-defined.
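Both standard linkage strings can be written explicitly and nested, with the innermost specification deciding, as the passage quoted in the question says. A minimal sketch with hypothetical function names:

```cpp
extern "C" {
    // C language linkage: the name is emitted without C++ mangling.
    int c_twice(int x) { return 2 * x; }

    extern "C++" {
        // The innermost specification wins, so these two functions have
        // C++ language linkage and may be overloaded.
        int cpp_twice(int x)       { return 2 * x; }
        double cpp_twice(double x) { return 2.0 * x; }
    }
}
```

Writing extern "C++" explicitly is rarely needed, but it is useful inside an extern "C" block, for example when a header wrapped in extern "C" has to declare overloads.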

Anglicanism answered 10/4, 2012 at 16:51 Comment(0)
