Why doesn't including headers with templates cause linker errors?
S

4

16

If I include <string> or <vector> in multiple translation units (different .cpp files), why doesn't it break the ODR?

As far as I know, each .cpp is compiled separately, so std::vector's member functions will be generated for each object file separately, right?

The linker should detect it and raise an error. Even if it doesn't (I suspect it's a special case for templates), will it reuse the same machine code, or keep a separate cloned copy for each translation unit, when I link everything together?

Sanguine answered 31/12, 2015 at 23:23 Comment(3)
Essentially, the compiler and linker conspire to make it work, using the same mechanism that inline functions use.Nearsighted
As you suspect, it is a special case for templates, as it is for inline functions. The definitions in different files must be exactly the same so as not to violate the ODR.Diacaustic
barney: Why don't you try picking out a specific phrase in the ODR that you think is violated, and why the stated exceptions don't apply?Carse
S
22

The same way any template definitions don't break the ODR — the ODR specifically says that template definitions may be duplicated across translation units, as long as they are literally duplicates (and, since they are duplicates, no conflict or ambiguity is possible).

There can be more than one definition of a class type (Clause [class]), enumeration type ([dcl.enum]), inline function with external linkage ([dcl.fct.spec]), class template (Clause [temp]), non-static function template ([temp.fct]), static data member of a class template ([temp.static]), member function of a class template ([temp.mem.func]), or template specialization for which some template parameters are not specified ([temp.spec], [temp.class.spec]) in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. [...]

- C++14 Standard, [basic.def.odr] p6

Multiple inclusions of <vector> within the same translation unit are expressly permitted and effectively elided, more than likely by "#ifndef" header guards.
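As a minimal sketch of how an include guard turns a second inclusion into a no-op, the snippet below pastes the body of a hypothetical header my_max.hpp twice into one file, which is roughly what the preprocessor sees when a header is included twice. The names MY_MAX_HPP, my_max and check_my_max are made up for illustration; a real standard library header may use a different mechanism.

```cpp
#include <cassert>

// First "inclusion" of the hypothetical header my_max.hpp.
#ifndef MY_MAX_HPP
#define MY_MAX_HPP
template <typename T>
T my_max(T a, T b) { return a < b ? b : a; }
#endif

// Second "inclusion": MY_MAX_HPP is already defined, so the preprocessor
// skips everything up to #endif and no redefinition error occurs.
#ifndef MY_MAX_HPP
#define MY_MAX_HPP
template <typename T>
T my_max(T a, T b) { return a < b ? b : a; }
#endif

// Hypothetical helper exercising the single surviving definition.
inline void check_my_max() {
    assert(my_max(1, 2) == 2);
    assert(my_max(2.5, 1.5) == 2.5);
}
```

Removing the guard lines would make the second copy a redefinition of my_max within the same translation unit, which the compiler rejects.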

Submerged answered 31/12, 2015 at 23:30 Comment(11)
I see... But what about code duplication? The vector class contains code (templatized and abstract enough, but still code). So it would generate code for each translation unit where I use it, right? So even if I use std::vector<int> everywhere, identical code would be generated separately for each module... that looks suboptimal...Sanguine
@barney: Yes, it is sub-optimal. C++'s compilation model (mostly inherited from C) has its flaws, and the addition of templates made them worse. That's one of the big reasons that C++ compilation is seen as being so remarkably slow — it has to parse each definition for every compilation unit. And then it has to resolve all those duplicates and elide them at link time. Nobody's saying that this is the best way it can be done, only that it is the way C++ does it. :)Submerged
@barney: The linker removes all the duplicates during the link.Pearle
The compiler will in fact generate a vector<int> implementation in every compilation unit where it's needed. Sometimes that means inline code (most methods in vector are quite lightweight) and other times it means standalone functions. The linker will sort it out and elide any duplicate copies of standalone functions.Adenovirus
@LightnessRacesinOrbit They keep trying to come up with a workable "modules" implementation...no luck yet.Pearle
@barney: the linker is supposed to eliminate those duplicates, so that if you take an address &f where f is a class member function of some template like vector<int> or something, it should give the same address no matter what compilation unit you are inCarse
I see... so the compiler makes (theoretically) redundant code and linker removes all the duplicates, right? Yes, translation modules would be awesome feature to resolve it. Anyway I love c++ for its zero cost abstractions approach! )Sanguine
@LightnessRacesinOrbit btw if templates are not classes but "instructions for the compiler to generate a class's code", why not add full-fledged meta-code there? The current template syntax is weird and cryptic (especially the matching of overloaded/templated/specialized versions and the type traits tricks). Wouldn't it be nice to have syntax for compile-time runnable C++ code that explicitly defines all template generation rules in source code? Maybe a crazy idea :) maybe not C++ but a special DSL for code generation. :)Sanguine
@barney: Well then that would be a different language wouldn't itSubmerged
Formally, the last sentence is not 100% correct. Multiple inclusion of the same standard header in the same translation unit is covered by § 17.6.2.2/2, which says: "Each may be included more than once, with no effect different from being included exactly once (...)". Header guards are but an implementation of this rule :)Carboni
@ChristianHackl: GrantedSubmerged
P
7

The standard has a special exception for templates that allows for duplication of functions that would otherwise violate the ODR (such as functions with external linkage and non-inline member functions). From C++11 3.2/5:

If D is a template and is defined in more than one translation unit, then the preceding requirements shall apply both to names from the template’s enclosing scope used in the template definition (14.6.3), and also to dependent names at the point of instantiation (14.6.2). If the definitions of D satisfy all these requirements, then the program shall behave as if there were a single definition of D. If the definitions of D do not satisfy these requirements, then the behavior is undefined.
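One observable consequence of "behave as if there were a single definition" is that a static local variable inside a function template specialization is a single object for the whole program. The sketch below illustrates this; it collapses what would be a header and several .cpp files into one file so that it compiles standalone, and the names counter, touch_from_a, touch_from_b and check_shared_counter are hypothetical.

```cpp
#include <cassert>

// --- would live in a hypothetical header counter.hpp ---
template <typename T>
int& counter() {
    static int value = 0;  // one object per specialization, program-wide
    return value;
}

// --- would live in a hypothetical a.cpp that includes counter.hpp ---
void touch_from_a() { ++counter<int>(); }

// --- would live in a hypothetical b.cpp that includes counter.hpp ---
void touch_from_b() { ++counter<int>(); }

// --- would live in a hypothetical main.cpp ---
void check_shared_counter() {
    touch_from_a();
    touch_from_b();
    // Both increments hit the same counter<int>() object.
    assert(counter<int>() == 2);
}
```

With the pieces split into real files, both translation units would still increment the same object, because the program must behave as if counter<int> were defined once.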

Pinite answered 31/12, 2015 at 23:57 Comment(0)
A
1

The ODR doesn't state that a struct may only be defined once across all compilation units; it states that if you define a struct in multiple compilation units, it has to be the same struct. Violating the ODR would be if you had two separate vector types with the same name but different contents. At that point the linker would get confused and you'd get mixed up code and/or errors.

Adenovirus answered 31/12, 2015 at 23:32 Comment(1)
And this is exactly what happens when you modify a definition in a header but don't rebuild all translation units including that header.Submerged
E
0

The ODR is relaxed for templates

The ODR is "relaxed" for templates and for inline functions/variables. It is possible for a template to appear in multiple translation units, such as:

// a.cpp
template <int N> int foo() { return N; }
// b.cpp
template <int N> int foo() { return N; }

Normally, this is not the result of copy/paste, but a consequence of including headers. The relevant wording is in [basic.def.odr] p14:

For any definable item D with definitions in multiple translation units,

  • if D is a non-inline non-templated function or variable, or
  • if the definitions in different translation units do not satisfy the following requirements,

the program is ill-formed; [...]
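The practical upshot of the first bullet can be sketched as follows: templated and inline functions may be defined in a header that many .cpp files include, while an ordinary function may only be declared there and needs its definition in exactly one .cpp file. The file names util.hpp and util.cpp and the functions bar, baz and check_util are assumptions for illustration, and everything is collapsed into one file so that it compiles standalone.

```cpp
#include <cassert>

// --- contents of a hypothetical header util.hpp ---
template <int N>
int foo() { return N; }         // templated: may be defined in every TU

inline int bar() { return 1; }  // inline: may be defined in every TU

int baz();                      // non-inline, non-templated: declare only

// --- contents of exactly one hypothetical source file, util.cpp ---
int baz() { return 2; }         // the single definition of baz

// --- hypothetical caller ---
void check_util() {
    assert(foo<3>() == 3);
    assert(bar() == 1);
    assert(baz() == 2);
}
```

If the definition of baz were moved into the header and that header were included from two .cpp files, the program would have two definitions of a non-inline, non-templated function, and a typical linker would report a multiple-definition error.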

ODR violations are still possible

Note that the definition can appear in multiple translation units without violating the ODR, but this definition needs to be identical everywhere.

// a.cpp
template <int N> int foo() { return N; }
// b.cpp
template <int N> int foo() { return 0; } // IFNDR

The program is ill-formed, no diagnostic required, because the definitions are not the same. This is also why it's important to use headers: a header ensures that the same definitions are copied into every translation unit.

What about code duplication?

This issue is resolved by the linker. A program would be ill-formed if any of the definitions weren't the same, so in a well-formed program foo<0> in a.cpp and foo<0> in b.cpp are guaranteed to be exactly the same. As a result, the linker can arbitrarily pick one of the two and remove the other from the executable.

On common toolchains this is implemented with weak symbols or COMDAT sections; GCC's documentation calls it vague linkage.
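Because only one copy of each instantiation survives, taking the address of foo<0> should give the same pointer in every translation unit. The layout below is a sketch of how one might check that. It is collapsed into a single file so that it compiles standalone, which means it does not exercise the linker itself; with real separate files the same comparison is expected to hold. The names FooPtr, address_seen_by_a, address_seen_by_b and check_same_address are hypothetical.

```cpp
#include <cassert>

// --- hypothetical header foo.hpp ---
template <int N>
int foo() { return N; }

using FooPtr = int (*)();

// --- hypothetical a.cpp that includes foo.hpp ---
FooPtr address_seen_by_a() { return &foo<0>; }

// --- hypothetical b.cpp that includes foo.hpp ---
FooPtr address_seen_by_b() { return &foo<0>; }

// --- hypothetical main.cpp ---
void check_same_address() {
    // Both translation units refer to the one surviving foo<0>.
    assert(address_seen_by_a() == address_seen_by_b());
    assert(address_seen_by_a()() == 0);
}
```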

Exoenzyme answered 2/9, 2023 at 19:30 Comment(0)
