Introduction
There are three forms of declaration of concern here:1
extern int x; // Declares x but does not define it.
int x; // Tentative definition of x.
int x = 0; // Defines x.
A declaration makes an identifier (a name, like x) known.
A definition creates an object (such as an int).2 A definition is also a declaration, since it makes the name known.
A tentative definition without a regular definition in the same translation unit (the source file being compiled, with all of its included files) acts like a definition with an initializer of zero.
The way you should use these normally is:
- For an object you will access by name in multiple files, write exactly one definition of it in one source file. (It can be a tentative definition3 if you would like it to be initialized with zero, or it can be a regular definition with an initializer you choose.)
- In an associated header file (such as foo.h for the source file foo.c), declare the name, using extern as shown above.
- Include the header file in each file that uses the name, including its associated source file. (The latter is important; when foo.c includes foo.h, the compiler sees both the declaration and the definition in the same compilation and will issue a diagnostic if a typo makes the two declarations incompatible.)
Actually, the way you should use these normally is not to use them at all. Programs generally do not need external identifiers for objects, so you should design the program without them. The rules above are for when you do use them.
Tentative Definitions
In Unix and some other systems, it has been possible to put a tentative definition, int x;, in a header file and include it in multiple source files. Because a tentative definition acts like a definition in the absence of a regular definition, this results in there being multiple definitions in multiple translation units. The C standard does not define the behavior of this. So how does it work in Unix?
Until recently, when you compiled with GCC (as built by default), it created an object file that marked tentatively defined identifiers differently from regularly defined identifiers. The tentatively defined identifiers were marked as “common.” When the linker found multiple definitions of a “common” identifier, it coalesced them into a single definition. Remember, the C standard does not define the behavior. But Unix tools4 did. So you could put int x; in a header and include it in lots of places, and you would get one int x out of it when linking the entire program.
In version 10 and later, GCC does not do this by default. Tentative definitions are, in the absence of regular definitions, treated more like regular definitions, and linking with multiple definitions of the same identifier will result in an error, even if the definitions arose from tentative definitions. GCC has a switch to select the old behavior, -fcommon.
This is information you should know so that you can understand old source files and headers that took advantage of the “common” behavior. It is not needed in new source code, and you should write only non-definition declarations (using extern) in headers and regular definitions in source files.
Miscellaneous
You do not need extern with a function declaration because a function declaration without a body (the compound statement that contains the code for the function) is automatically a declaration and behaves the same as if it had extern. Functions do not have tentative definitions.
Footnotes
1 This answer addresses only external declarations and external definitions for identifiers of objects, with external linkage. The full rules for C declarations are somewhat complicated, partly due to the history of C’s evolution.
2 This is for definitions of identifiers that refer to objects. For other kinds of identifiers, what is a definition may be different. For example, typedef int foo is said to define foo as an alias for the type int, but no object is created.
3 It may be preferable to also include an initializer, even if it is zero, as this will make it a regular definition and avoid a potential problem where the same name is used in tentative definitions in two different source files for two different things, resulting in the linker not complaining even though this is an error.
4 I may be being sloppy with terminology here; somebody else could identify precisely where this behavior was specified and what tools it applied to.
extern int a is only a declaration, but int b; is a tentative definition. You can look it up. – Oro
extern used in header files that makes sense is with the global variables defined in the corresponding C file, which should also be accessible in other C files. – Peripheral
int b; appears in this file (translation unit), and int b=5; appears in another file, then the former acts as a definition. So you have two definitions of the same object in different files, and thus the behavior is undefined. Am I mistaken? – Oro
int b; is a tentative definition whether int b = 5; appears later in the file or not. If it does, then int b; acts only as a declaration, but if it does not, then the tentative definition becomes a definition. – Oro
int b; meets all those criteria. It does not become an actual definition when int b = 5; is elsewhere in the translation unit. – Oro
int b = 5; is not in the same translation unit as int b; but in a different one, which OP also wanted to address ("... or another file"). Do we agree that this produces undefined behavior? And that if int b; were changed to extern int b; the behavior would be defined? – Oro
extern does what the standard says: introduces a declaration of an object that is defined elsewhere, so that code in the translation unit can access it without having to see the actual definition. – Oneself
.c files have int b; the .o files have a C type symbol [common]. If they're linked with a fourth file that has int b = 5;, that .o puts it in .data (i.e. type D). Links fine and final executable has a D symbol in the .data section. If nobody does int b = 5;, the symbol in the final is .bss (type B). This linking mechanism is to support Fortran style common declarations. extern int b; is similar but the .o has a type U symbol. If nobody does int b; or int b = 5; it produces a link error. – Erasion