Why does the one definition rule exist in C/C++?

In C and C++, you can't have a function with two definitions. For example, say we have the following two files:

1.c:

int main(){ return 0;}

2.c:

int main(){ return 0;}

Issuing the command gcc 1.c 2.c will give you a duplicate symbol linker error. Why doesn't the same happen with structs and classes? Why are we allowed to have multiple definitions of the same struct as long as they have the same tokens?

Vociferant answered 5/2, 2020 at 23:13 Comment(10)
class/struct definitions are not functions. They are types. The "exactly one definition in the whole program" part of the ODR applies to non-inline functions and objects; a type only needs one definition per translation unit. - Previse
But why? When would you want the same definition of a class in another file? Aren't header files used for that? - Vociferant
And what do you think header files are doing? The #include directive reads a file and copies its whole content in place of the directive. By using a header file, you get a copy of the struct definition in every file that includes that header. - Acicula
When a header file is included by several .cpp files (translation units), it in effect makes the class/struct definitions available in multiple source files. - Previse
You don't fully define a class in a header file though; you do that in one .cpp file with class_name::function_name. - Vociferant
@Josh Those are (member) function definitions. The class definition is the part between { and }; after the class/struct keyword and a name. - Acicula
The key point is that it's a linker error. A C compiler is not intended to be a "whole program" compiler. It is intended to compile each "translation unit" (i.e. .c file) separately. So the compiler won't "remember" that main is defined in 1.c when it's compiling 2.c. And the linker doesn't see the source code of main, so it doesn't know that those two definitions are identical. So if the linker sees duplicate symbols, it throws an error. - Pass
Short answer: Without static, functions are visible to other compilation units, while structs aren't. - Saccharo
C and C++ are different languages with different exact rules (but mostly the same for the common subset). - Sargeant
Try to define a struct with the same name twice in one file. - Metaphrase
A
5

To answer this question, one has to delve into the compilation process and what is needed in each part of it (the question of why these steps are performed is more historical, going back to the beginning of C, before its standardization).

C and C++ programs are compiled in multiple steps:

  1. Preprocessing
  2. Compilation
  3. Linkage

Preprocessing handles every directive that starts with #. The only part that matters here is that #include copies the contents of a header into the including file.

Compilation is performed on each and every translation unit (typically a single .c or .cpp file plus the headers it includes). The compiler takes one translation unit at a time, reads it, and produces an internal list of classes and their members, and then the assembly code of each function in that unit (based on that list). If a function call is not inlined (e.g. the function is defined in a different TU), the compiler produces a "link", i.e. a note saying "please insert the address of function X here", for the linker to read.

Then the linker takes all of the compiled translation units and merges them into one binary, resolving all the links left by the compiler.
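
The "link" described above can be illustrated with a minimal sketch. The names helper and caller are hypothetical, and both functions sit in one file only to keep the snippet self-contained; in a real project the definition of helper would typically live in a different .cpp file.

```cpp
// Declaration only: enough for the compiler to generate a call.
int helper(int x);

// At this point only the declaration of helper is visible. For a
// function defined in another translation unit, the compiler emits a
// call to the symbol and leaves the address for the linker to fill in.
int caller() {
    return helper(20) + 1;
}

// The single definition. It could just as well live in another
// translation unit; the linker would resolve the call the same way.
int helper(int x) {
    return x * 2;
}
```

If the definition of helper were missing from every translation unit, compilation would still succeed and the error would only appear at link time.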


Now, what is needed at each phase?

For the compilation phase, you need:

  • the definition of every class used in this file - the compiler needs to know the size of the class and the offset of each member to produce assembly
  • the declaration of every function used in this file - to produce those "links".

Since function definitions are not needed for producing the assembly of their callers (as long as they are compiled somewhere), they are not needed in the compilation phase, only in the linking phase.
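
A small sketch of why the class definition has to be visible during compilation. The struct Sample and the function names are made up for illustration; the exact size and offsets depend on the platform.

```cpp
#include <cstddef>  // offsetof, std::size_t

// The full definition must be visible wherever Sample is used by value:
// the compiler needs its size and the offset of each member.
struct Sample {
    int id;
    double value;
};

// Both results are compile-time constants that end up baked into the
// object code of this translation unit.
std::size_t sample_size() { return sizeof(Sample); }
std::size_t value_offset() { return offsetof(Sample, value); }

// Reading a member compiles to "load from base address + offset".
double get_value(const Sample& s) { return s.value; }
```

None of this information reaches the linker as a symbol, which is why the linker has nothing to compare when two translation units disagree about Sample.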


To sum up:

The One Definition Rule is there to protect programmers from themselves. If they accidentally define a function twice, the linker will notice and no executable is produced.

However, class definitions are required in every translation unit that uses the class, and therefore such a rule cannot be set up for them. Since the toolchain cannot enforce it (the linker never sees class definitions), programmers have to be responsible beings and not define the same class in different ways.

The ODR also has other consequences, e.g. you have to define function templates (or member functions of class templates) in header files. You can also take the responsibility and say to the compiler "Every definition of this function will be the same, trust me dude" and make the function inline.
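
As a rough sketch, this is what such header-only definitions might look like. The header name and both functions are hypothetical.

```cpp
// Hypothetical header contents (say, math_utils.h).

// inline: every translation unit that includes this header gets its own
// copy of the definition, and the programmer promises they are identical.
inline int square(int x) {
    return x * x;
}

// Templates get the same allowance: the compiler needs the body
// wherever the template is instantiated with a new type.
template <typename T>
T larger(T a, T b) {
    return (a < b) ? b : a;
}
```

Dropping the inline keyword from square and including the header from two .cpp files would bring back the duplicate symbol error from the question.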

Acicula answered 5/2, 2020 at 23:42 Comment(2)
Thanks for your answer. When you say class definitions are required in every translation unit, that's just how C/C++ is designed, right? And since classes and structs cannot be just declared, they are declared and defined at the same time (unlike functions). Am I correct? - Vociferant
You can declare a class without defining it: class myclass;. Subsequent functions can then use references or pointers to it, but they can't hold values of that type or access any of its members/methods. It's usually not useful. - Culbert
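
To make the last comment concrete, here is a minimal sketch of a forward declaration. The functions pick_first and read_value are invented for the example.

```cpp
// Declaration only: myclass is an incomplete type from here on.
class myclass;

// Allowed: pointers and references do not need the size or layout.
myclass* pick_first(myclass* a, myclass* b);
int read_value(const myclass& m);

// Not allowed until the definition is visible:
//   myclass m;      // error: size unknown
//   ptr->value;     // error: members unknown

// The definition completes the type.
class myclass {
public:
    int value = 7;
};

myclass* pick_first(myclass* a, myclass*) { return a; }
int read_value(const myclass& m) { return m.value; }
```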
R
3

There is no use case for a function with 2 definitions. Either the two definitions would have to be the same, making it useless, or the compiler wouldn't be able to tell which one you meant.

This is not the case with classes or structures. There is also a large advantage to allowing multiple definitions of them, i.e. if we want to use a class or struct in multiple files. (This leads indirectly to multiple definitions because of includes.)
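
A sketch of how includes lead to this, using a hypothetical header point.h and source file point.cpp shown together as the compiler would see them after preprocessing:

```cpp
// Contents of a hypothetical header, point.h.
#ifndef POINT_H
#define POINT_H

// Every file that includes point.h receives this class definition.
struct Point {
    int x;
    int y;
};

// Declaration only; the single definition belongs in one .cpp file.
int manhattan(const Point& p);

#endif  // POINT_H

// Contents of a hypothetical point.cpp, after its #include "point.h"
// has been replaced by the text above.
int manhattan(const Point& p) {
    return (p.x < 0 ? -p.x : p.x) + (p.y < 0 ? -p.y : p.y);
}
```

The include guard only prevents a second copy of Point within one translation unit; other translation units still get their own copy of the definition.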

Rance answered 5/2, 2020 at 23:18 Comment(0)
O
0

Structures, classes, unions and enumerations define types that can be used in several compilation units to define objects of these types. So each compilation unit needs to know how the types are defined, for example to allocate memory correctly for an object or to be sure that a specified member of a class does indeed exist.

For functions (if they are not inline functions), it is enough to have their declaration, without their definition, to generate, for example, a function call.

But the function definition shall be single. Otherwise the linker will not know which function to call, or the object code will be too big due to duplication and will be error-prone.

Ogilvy answered 5/2, 2020 at 23:38 Comment(0)
S
0

It's quite simple: it's a question of visibility (strictly speaking, linkage). Non-static functions are seen (callable) by every compilation unit linked together, while structures are only seen in the compilation unit where they are defined.

For example, it's valid to link the following together because it's clear which definition of struct Foo and which definition of f is being used:

1.c:

struct Foo { int x; };
static void f(void) { struct Foo foo; ... }

2.c:

struct Foo { double d; };
static void f(void) { struct Foo foo; ... }
int main(void) { ... }

But it isn't valid to link the following together because the linker wouldn't know which f to call.

1.c:

void f(void) { ... }

2.c:

void f(void) { ... }
int main(void) { f(); }
Saccharo answered 6/2, 2020 at 0:8 Comment(0)
B
0

Actually, every programming element is associated with a scope of applicability, and within this scope you cannot have the same name associated with multiple definitions of an element. In the compiled world:

  1. You cannot have more than one class definition with the same name within a single file. But you can have it in different compilation units.
  2. You cannot have the same function or global variable name within a single link unit (library or executable), but you can potentially have functions named the same within different libraries.
  3. You cannot have shared libraries with the same name situated in the same directory, but you can have them in different directories.

C/C++ compilation is very much about compilation performance. Checking two entities such as functions or classes for identity is a time-consuming task, so it is not done; only names are compared. It is better to assume that the two are different and error out than to check them for identity. The only exception to this rule is text macros.

Macros are a preprocessor concept, and historically it has been allowed to have multiple identical definitions of a macro. If a definition changes, a warning is generated. Comparing macro contents is easy, just a simple string comparison, although some macro definitions can be huge.
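
A minimal sketch of that macro behaviour. BUFFER_SIZE and buffer_size are made-up names.

```cpp
// Identical redefinition: accepted, because the replacement text matches.
#define BUFFER_SIZE 64
#define BUFFER_SIZE 64

// A different replacement text would draw a diagnostic:
//   #define BUFFER_SIZE 128

int buffer_size() { return BUFFER_SIZE; }
```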

Types are a compiler concept and they are resolved by the compiler. Types do not exist in object libraries and are represented only by the sizes of the corresponding variables. So, there is no reason for checking type name collisions at this scope.

Functions and variables, on the other hand, are named pointers to executable code or data. They are the building blocks of applications. Applications are assembled from code and libraries that in some cases come from all around the world. In order to use someone else's function you'd better know its name, and you do not want the same name to be used by someone else. Within a shared library, the names of functions and variables are usually stored in a hash table. There is no place for duplicates there.

And as I already mentioned, checking functions for identical contents is seldom done. There are some cases where it is, but not in C or C++.

Bankston answered 6/2, 2020 at 0:40 Comment(0)
E
0

The reason for forbidding two different definitions of the same thing is to avoid the ambiguity of deciding which definition to use at run time.

If you need two different implementations of the same thing to coexist in a program, you can give each a different name and alias them through a common reference, deciding at run time which of the two to use.
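
One way to read "a common reference" is a function pointer. The following is a sketch under that assumption; all names are hypothetical.

```cpp
// Two implementations of the same operation under distinct names.
int sum_loop(int n) {
    int total = 0;
    for (int i = 1; i <= n; ++i) total += i;
    return total;
}

int sum_formula(int n) {
    return n * (n + 1) / 2;
}

// The common reference: a pointer to whichever implementation is chosen.
using SumFn = int (*)(int);

// The choice is made at run time, and each name still has one definition.
SumFn choose_sum(bool prefer_formula) {
    return prefer_formula ? sum_formula : sum_loop;
}
```

A caller would write something like choose_sum(true)(10), so the ambiguity is resolved explicitly by the program rather than left to the linker.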

Anyway, in order to distinguish the two, you have to be able to tell the compiler which one you want to use. In C++ you can overload a function, giving it the same name and a different list of parameters, so the compiler can distinguish which of the two you want to use. But in C, compilers preserve only the name of the function, which is used at link time to find the definition matching the name you use in a different compilation unit. If the linker ends up with two different definitions with the same name, it is incapable of deciding for you which one to use, so it emits an error and gives up on the build.
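
A short sketch of the contrast, with invented function names. It assumes a typical implementation where C++ encodes parameter types into the symbol name and extern "C" turns that off.

```cpp
// Same name, different parameter lists: the compiler picks by arguments.
int area(int side) { return side * side; }
int area(int width, int height) { return width * height; }

// With C linkage only the bare name reaches the linker, as in C.
// A second extern "C" function named c_area with another parameter
// list would be rejected.
extern "C" int c_area(int side) { return side * side; }
```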

What would be the point of using this ambiguity in a productive way? That is the question you actually have to ask yourself.

Edema answered 6/2, 2020 at 10:25 Comment(0)
