bss segment in C
In one of the answers to the question "Regarding the bss segment and data segment in Unix", I see the explanation on bss as follows:

Bss is special: .bss objects don't take any space in the object file, and by grouping all the symbols that are not specifically initialized together, they can be easily zeroed out at once.

But when I run size on the object file generated from this code:

#include <stdio.h>
int uninit_global_var;
int init_global_var=5;

int main()
{
   int local_var;
   return 0;
}

I get the following:

text    data      bss    dec     hex filename
1231     280      12    1523     5f3 a.out

and I see bss grow as I add uninitialized variables with global scope. So can anyone reconcile this with the quoted statement?

Osseous answered 9/10, 2012 at 10:52 Comment(0)
S
10

If you remove stdio.h, your output will probably be more meaningful. Let's ignore that library, since it contains internal variables.

In your specific case, the following happens:

int uninit_global_var;

Since this variable is declared at file scope, it has static storage duration, just like any variable declared as static. The C standard requires that a variable with static storage duration that is not explicitly initialized by the programmer, as in this case, must be set to zero before the program starts. All such variables are put in the .bss segment.

int init_global_var=5;

This variable is also allocated at file scope, so it will also have static storage duration. But in this case it is initialised by the programmer. The C standard demands that such variables are set to the value given, before the program starts. Such variables are placed in the .data segment.

   int local_var;

This variable has automatic storage duration (it is local). The compiler will most likely optimize it away, since it serves no purpose. But let's assume that no such optimization takes place. The variable is then allocated at run time, when execution enters the scope (block) it resides in, and it ceases to exist once that scope is finished (it goes out of scope). It will be allocated either on the stack or in a CPU register. In other words, at link time this variable exists only as program code, in the form of some assembler instruction saying "push an int on the stack" and then later "pop an int from the stack".

How these different kinds of variables are initialized depends on the system. But typically there will be some code injected by the compiler before main is called. This is an over-simplification, but for pedagogy's sake, you can imagine that your program actually looks like this:

bss
{
  int uninit_global_var;
}

data
{
  int init_global_var;
}

rodata
{
  5;
}


int start_of_program (void) // called by OS
{
  memset(bss, 0, bss_size);
  memcpy(data, rodata, data_size);

  return main(); 
}

data:4 bss:4

Embedded systems with true non-volatile memory will work exactly like the above code, while RAM-based systems may solve the data initialization part differently. bss works the same on all systems.


You can easily verify that they are stored in different segments by running the following program:

#include <stdio.h>

char uninit1;
char uninit2;
char init1 = 1;
char init2 = 2;

int main (void)
{
  char local1 = 1;
  char local2 = 2;

  printf("bss\t%p\t%p\n", &uninit1, &uninit2);
  printf("data\t%p\t%p\n", &init1, &init2);
  printf("auto\t%p\t%p\n", &local1, &local2);
}

You will see that the "uninit" variables are allocated at adjacent addresses, but at different addresses from the other variables. The same goes for the "init" variables. The "local" variables can be allocated anywhere, so you may get any kind of strange address as a result for those two.

Sucker answered 9/10, 2012 at 11:48 Comment(4)
%p requires an argument of type void*. Cast those char*s into void*.Pogge
@CoolGuy Yes, every pointer to data can be safely cast to void*. In fact, casts between void* and another pointer type should not require an explicit cast.Sucker
Then why do the answers of this post tell to cast it?Pogge
@CoolGuy Apparently printf is some special case. I very much doubt you'll get any actual problems on any platform if you don't cast to void*.Sucker
P
7

I don't know the answer for sure, but my educated guess is:

The SIZE of the bss segment is recorded in the object file and shown by size, since the memory must be allocated at some point, after all.

But the object file won't grow when the bss segment grows.

Patsis answered 9/10, 2012 at 10:57 Comment(2)
exactly right! Note that the memory referred to by the BSS is of course allocated as data pages (and initialized to zeros) in the process by the operating system when the program is first started. So it's just a single number in the object file or program executable, but it takes up real space in the process when it runs.Corbitt
Exactly. If you declare a huge array without initializing it, the size command would show the space required, but the real size of the elf file wouldn't change.Indubitability
D
3

The bss segment grows, but you don't need this segment in your binary (see objcopy).

So eventually if you were to put this code into some kind of ROM, it would take no space there, but would require space in RAM (and code to initialize it to 0).

Diathermic answered 9/10, 2012 at 11:56 Comment(0)
B
2

a.out is probably not an object file; it is probably a full ELF executable. Relocatable objects, generally named name.o, are intermediate files produced before linking occurs. See the -c option to gcc.

Blacklist answered 9/10, 2012 at 10:58 Comment(2)
Cdarke. You are right. Based on your input I tried the following experiment: with int a[10000]={5}; int b[10000]={10}; the size of the .o file is 80698, whereas with int a[10000]; int b[10000]; the size of the .o file is only 698. So basically the object file size does not grow with the bss data. Can you please elaborate on a few more points about this?Osseous
I see that others (@Lundin) have elaborated on this far better than I can.Blacklist
