Extern in multiple files and possible double definition

I compiled the following two files together with: gcc A.c B.c -o combined

Program A:

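#include <stdio.h>

int fun(void);      /* defined in Program B */

int a = 1;          /* initialized definition */
int b;              /* no initializer */

int main(void)
{
    extern int a, b;    /* refers to the file-scope a and b */
    fun();
    printf("%d %d\n", a, b);
    return 0;
}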

Program B:

#include <stdio.h>

int a;              /* no initializer */
int b = 2;          /* initialized definition */

int fun(void)
{
    printf("%d %d\n", a, b);
    return 0;
}

On running the "combined" program the output was:

1 2
1 2

Now, I have a few questions about this:

  1. Why isn't the output:

    0 2

    1 0

  2. Aren't a and b defined twice?

Please explain these clearly. I've had a lot of trouble understanding extern, and some of these doubts keep coming back from time to time.

Thanks in Advance.

Squier answered 1/7, 2013 at 4:58 Comment(2)
You tried to trick the compiler and the compiler tricked you. - Anemology
It's not about tricking the compiler, it's about getting the concepts right. - Squier

So, I am answering my own question after a long time. Although the statement:

int b; is a declaration and int b = 2; is the definition

is correct, the reasons everyone has given are not clear.

Had there been no int b = 2;, int b; would have been a definition, so what is the difference?

The difference lies in the way the linker handles multiple symbol definitions. There is a concept of weak and strong symbols.

The assembler encodes this information implicitly in the symbol table of the relocatable object file. Functions and initialized global variables get strong symbols. Uninitialized global variables get weak symbols.

So in Program A, int a = 1 is a strong symbol while int b; is a weak symbol. Similarly, in Program B, int b = 2 is a strong symbol while int a; is weak.

Given this notion of strong and weak symbols, Unix linkers use the following rules for dealing with multiply defined symbols:

  1. Multiple strong symbols are not allowed.
  2. Given a strong symbol and multiple weak symbols, choose the strong symbol.
  3. Given multiple weak symbols, choose any of the weak symbols.

So, now we can argue about what is happening in the above case.

  1. Among int b = 2 and int b, the former is a strong symbol while the latter is weak so b is defined with value 2.
  2. Among int a = 1 and int a, a is defined as 1 (same reasoning).

Hence, the output 1 2.
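The three rules can be sketched as a small C model. This is only an illustration, not how any real linker is implemented: the names symdef, strength, resolve, defs_a and defs_b are made up for this sketch, and "A.o" and "B.o" stand for the object files built from the two programs. One caveat: GCC 10 and later default to -fno-common, under which uninitialized globals are no longer merged this way, so the two-file program above fails to link with a multiple-definition error unless -fcommon is passed.

```c
#include <stddef.h>

/* Hypothetical model of one definition of a symbol as the linker sees it. */
enum strength { WEAK, STRONG };

struct symdef {
    const char *file;     /* object file providing the definition */
    enum strength kind;   /* STRONG if initialized, WEAK otherwise */
    int value;            /* initial value (0 when uninitialized) */
};

/* Apply the three rules. Returns the index of the chosen definition,
   or -1 if there is more than one strong definition (rule 1) or if
   there are no definitions at all. */
int resolve(const struct symdef *defs, size_t n)
{
    int chosen = -1;
    size_t strong_count = 0;

    for (size_t i = 0; i < n; i++) {
        if (defs[i].kind == STRONG) {
            strong_count++;
            chosen = (int)i;      /* rule 2: a strong symbol beats weak ones */
        } else if (chosen < 0) {
            chosen = (int)i;      /* rule 3: any weak one will do; take the first */
        }
    }
    return strong_count > 1 ? -1 : chosen;
}

/* The definitions of a and b from Program A and Program B. */
const struct symdef defs_a[] = { { "A.o", STRONG, 1 }, { "B.o", WEAK, 0 } };
const struct symdef defs_b[] = { { "A.o", WEAK, 0 }, { "B.o", STRONG, 2 } };
```

With these tables, resolve(defs_a, 2) picks the entry from A.o (value 1) and resolve(defs_b, 2) picks the entry from B.o (value 2), which matches the output 1 2 printed by both functions.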

Squier answered 29/6, 2016 at 4:43 Comment(0)

A variable may be declared many times, as long as the declarations are consistent with each other and with the definition. It may be declared in many modules, including the module where it was defined, and even many times in the same module.

An external variable may also be declared inside a function. In this case the extern keyword must be used, otherwise the compiler will consider it a definition of a local variable, which has a different scope, lifetime and initial value. This declaration will only be visible inside the function instead of throughout the function's module.
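A minimal single-file sketch of that difference, assuming two helper functions (read_global and read_local, both names invented for this example): with extern the block-scope declaration refers to the file-scope variable, and without it a new local variable is defined that hides the global one.

```c
#include <stdio.h>

int a = 1;   /* file-scope definition with external linkage */

/* Hypothetical helper: 'extern' inside the function only declares a;
   the name refers to the file-scope variable defined above. */
int read_global(void)
{
    extern int a;
    return a;
}

/* Hypothetical helper: without 'extern' this defines a new automatic
   variable that hides the file-scope a until the end of the block. */
int read_local(void)
{
    int a = 42;
    return a;
}
```

Calling read_global() returns 1 and read_local() returns 42, and the file-scope a is still 1 afterwards, because the local variable is a separate object.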

Now let me repeat the definition of extern: "an external variable is a variable DEFINED outside any function block" (note the emphasized word). In Program A, a has a definition but b is only declared, so the extern declaration looks for the definition of b, which is given in Program B. That is why Program A prints 1 2. Program B has a declaration of a and a definition of b, so it prints the value of a from Program A and the value of b from its own file.

Unfamiliar answered 1/7, 2013 at 5:53 Comment(4)
As far as I know, int a; is both a declaration and a definition, and in fact outside any function it also initializes a to 0. Consider: int main() { int a; /* both declaration and definition */ int a = 5; /* gives error: multiple definition of a */ } You can try the same for variables outside functions as well. - Squier
@TapanAnand Default initialization occurs when no definition is available. But in your case a and b are both declared and defined properly, so the compiler will not initialize them again; all this is to avoid a multiple-definition error. - Unfamiliar
OK, so int b; in Program A may not be an initialization, but it has to be a definition, so why doesn't int b = 2 in Program B cause an error? - Squier
@TapanAnand int b; in Program A is not an initialization, it is just a declaration. External variables are allocated and initialized when the program starts, and the memory is released only when the program ends. As soon as the program starts, it finds the definition of b given in Program B. - Unfamiliar

Because the variables aren't defined twice here; they are declared twice, though. The functions take their values from the definitions of the variables, not from the declarations.

A declaration introduces an identifier and describes its type. Through a declaration we assure the compiler that this variable or function has been defined somewhere else in the program and will be provided at link time. For example, this is a declaration:

extern int a;

A definition actually instantiates or implements the identifier. A definition looks like: int a=5; or int a;
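As a small single-file sketch of these forms (the helper sum is invented for this example): a file-scope int b; with no initializer is what the C standard calls a tentative definition. It may be repeated within one file, and if no initialized definition of b appears in that file, it behaves as a definition with value 0.

```c
extern int a;   /* declaration only: no storage is reserved here */
int a = 5;      /* definition: reserves storage and initializes it */

int b;          /* tentative definition: no initializer */
int b;          /* repeating it at file scope is allowed in C */
/* No initialized definition of b appears in this file, so the
   tentative definitions act as one definition with value 0. */

/* Hypothetical helper that reads both variables. */
int sum(void)
{
    return a + b;
}
```

Here a is 5, b is 0, and sum() returns 5. Repeating int a = 5; a second time in the same file would be rejected by the compiler as a redefinition.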

Just read this link for further reference.

There is this wonderful post on Stack Overflow too.

extern tells the compiler that the variable is defined outside, so it looks outside the function and there it finds:

int a=1 in Program A and int b=2 in Program B

For auto variables:

int a; // both definition and declaration

For more on storage classes you can follow this link.

int a; outside main or any other function (i.e. global) is a declaration; only inside a function is it called a definition.

Gono answered 1/7, 2013 at 5:52 Comment(5)
As far as I know, int a; is both a declaration and a definition, and in fact outside any function it also initializes a to 0. Consider: int main() { int a; /* both declaration and definition */ int a = 5; /* gives error: multiple definition of a */ } You can try the same for variables outside functions as well. - Squier
Yes, I know the difference, but what I'm saying is that writing int a; both declares and defines a, i.e. allocates memory for it. Outside any function it also initializes it with a well-defined value (0) and not a garbage value. - Squier
Sorry for the late reply, there was a power problem. What you are saying may be true for auto variables but not for extern variables; just take a look at the link in my answer. - Gono
@TapanAnand int a; outside main or any other function is a declaration; only inside a function is it called a definition. - Gono
@TapanAnand Sorry, I didn't pay attention to your example: int main() { int a; /* both declaration and definition */ int a = 5; /* gives error: multiple definition of a */ } For an auto variable the definition and the declaration are the same, but not for extern variables. - Gono

As far as I know, the output will be 1 2 and 1 2 because you are declaring a and b as external variables in the main function, so it will also try to take values from the other file. As for the second question, I think the compiler takes the initialized values of the variables and merges them, because both a and b are defined as global variables in both files. The case could be different if both were defined inside a function. Any suggestions or other inputs are welcome.

Stability answered 1/7, 2013 at 5:46 Comment(0)
