Difference between data section and the bss section in C

When I inspect the section headers of the object file with readelf, I see that the data and bss sections have the same offset. The data section contains the initialized global and static variables. BSS contains the uninitialized global and static variables.

#include <stdio.h>

static void display(int i, int* ptr);

int main(){
    int x = 5;
    int* xptr = &x;
    printf("\n In main() program! \n");
    printf("\n x address : 0x%x x value : %d  \n",(unsigned int)&x,x);
    printf("\n xptr points to : 0x%x xptr value : %d \n",(unsigned int)xptr,*xptr);
    display(x,xptr);
    return 0;
}

void display(int y,int* yptr){
    char var[7] = "ABCDEF";
    printf("\n In display() function  \n");
    printf("\n y value : %d y address : 0x%x  \n",y,(unsigned int)&y);
    printf("\n yptr points to : 0x%x yptr value : %d  \n",(unsigned int)yptr,*yptr);
}

output:

   SSS:~$ size a.out 
   text    data     bss     dec     hex filename
   1311     260       8    1579     62b a.out

Here in the above program I don't have any uninitialized data, but the BSS occupies 8 bytes. Why does it occupy 8 bytes? Also, when I dump the section headers of the object file, I get the following.

EDITED :

  [ 3] .data             PROGBITS        00000000 000110 000000 00  WA  0   0  4
  [ 4] .bss              NOBITS          00000000 000110 000000 00  WA  0   0  4
  [ 5] .rodata           PROGBITS        00000000 000110 0000cf 00   A  0   0  4

data, rodata and bss have the same offset. Does that mean that rodata, data and bss refer to the same address? Do the data, rodata and bss sections keep their values at the same address, and if so, how can the three sections be told apart?

Conglobate answered 15/5, 2013 at 5:42 Comment(2)
In C, pointers are printed with %p and an argument cast to (void*). – Fanchet
"I dont have any un-intialised data but the BSS has occupied 8 bytes" - the libraries that get linked in also have data. In fact, since your program has only local variables and literals, I expect that all the data and bss sections are from libraries. – Visby

The .bss section is guaranteed to be all zeros when the program is loaded into memory. So any global or static data that is uninitialized, or explicitly initialized to zero, is placed in the .bss section. For example:

static int g_myGlobal = 0;     // <--- in .bss section

The nice part about this is that the .bss section's data doesn't have to be included in the ELF file on disk (i.e., there isn't a whole region of zeros in the file for the .bss section). Instead, the loader knows from the ELF headers (strictly, from the program headers, where the segment's memory size is larger than its file size) how much to allocate for .bss, and simply zeroes it out before handing control over to your program.
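To make that concrete, here is a rough sketch of what "copy what's in the file, zero the rest" amounts to. This is a simplified model with a hypothetical function name (load_segment), not the actual code of any loader:

```c
#include <stddef.h>
#include <string.h>

/*
 * Simplified model of loading one segment (hypothetical helper, not real
 * loader code): the first file_size bytes come from the file image, and
 * the remaining bytes up to mem_size (the .bss part) are zero-filled.
 * Returns 0 on success, -1 if the sizes are inconsistent.
 */
int load_segment(unsigned char *dest, const unsigned char *file_bytes,
                 size_t file_size, size_t mem_size)
{
    if (mem_size < file_size)
        return -1;

    memcpy(dest, file_bytes, file_size);                /* bytes stored on disk */
    memset(dest + file_size, 0, mem_size - file_size);  /* zero-filled tail     */
    return 0;
}
```

The point of the sketch is only that the zero-filled tail costs memory but no file bytes.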

Notice the readelf output:

[ 3] .data PROGBITS 00000000 000110 000000 00 WA 0 0 4
[ 4] .bss NOBITS 00000000 000110 000000 00 WA 0 0 4

.data is marked as PROGBITS. That means there are "bits" of program data in the ELF file that the loader needs to read out into memory for you. .bss on the other hand is marked NOBITS, meaning there's nothing in the file that needs to be read into memory as part of the load.
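A small sketch of the same idea in code. The struct and function names are hypothetical stand-ins for illustration (a real program would use the section header type from elf.h); only the two type values, 1 for PROGBITS and 8 for NOBITS, are the ones the ELF specification defines:

```c
#include <stdint.h>

/* Section type values as defined by the ELF specification. */
enum { SEC_PROGBITS = 1, SEC_NOBITS = 8 };

/* Hypothetical, stripped-down stand-in for an ELF section header. */
struct section_info {
    uint32_t type;    /* SEC_PROGBITS, SEC_NOBITS, ... */
    uint32_t offset;  /* file offset of the section    */
    uint32_t size;    /* size of the section           */
};

/* Number of bytes the section occupies in the file on disk. */
uint32_t bytes_in_file(const struct section_info *s)
{
    return s->type == SEC_NOBITS ? 0 : s->size;
}

/* First file offset that is free after this section (alignment padding ignored). */
uint32_t next_free_offset(const struct section_info *s)
{
    return s->offset + bytes_in_file(s);
}
```

With the values from your object file (.data and .bss both at offset 0x110 with size 0), next_free_offset gives 0x110 for both, which is why .rodata starts at the same offset: neither of the two sections before it takes up any bytes in the file.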


Example:

// bss.c
static int g_myGlobal = 0;

int main(int argc, char** argv)
{
   return 0;
}

Compile it with $ gcc -m32 -Xlinker -Map=bss.map -o bss bss.c

Look at the section headers with $ readelf -S bss

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
   :
  [13] .text             PROGBITS        080482d0 0002d0 000174 00  AX  0   0 16
   :
  [24] .data             PROGBITS        0804964c 00064c 000004 00  WA  0   0  4
  [25] .bss              NOBITS          08049650 000650 000008 00  WA  0   0  4
   :

Now we look for our variable in the symbol table: $ readelf -s bss | grep g_myGlobal

37: 08049654     4 OBJECT  LOCAL  DEFAULT   25 g_myGlobal

Note that g_myGlobal is shown to be a part of section 25. If we look back in the section headers, we see that 25 is .bss.


To answer your real question:

Here in the above program I don't have any uninitialized data, but the BSS has occupied 8 bytes. Why does it occupy 8 bytes?

Continuing with my example, we look for any symbol in section 25:

$ readelf -s bss | grep 25
     9: 0804825c     0 SECTION LOCAL  DEFAULT    9 
    25: 08049650     0 SECTION LOCAL  DEFAULT   25 
    32: 08049650     1 OBJECT  LOCAL  DEFAULT   25 completed.5745
    37: 08049654     4 OBJECT  LOCAL  DEFAULT   25 g_myGlobal

The third column is the size. We see our expected 4-byte g_myGlobal, and this 1-byte completed.5745. This is probably a function-static variable from somewhere in the C runtime initialization - remember, a lot of "stuff" happens before main() is ever called.

4+1=5 bytes. However, the addresses above show that g_myGlobal does not start directly after completed.5745: the 1-byte variable sits at 0x08049650, and the int sits at 0x08049654. An int has to be 4-byte aligned (the last column, Al, of the .bss section header is 4, the section's alignment), so 3 bytes of padding end up between the two. 1 + 3 + 4 = 8, and that's why the .bss section is 8 bytes.
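The rounding involved can be reproduced in a few lines of C. This is just a sketch of the arithmetic with hypothetical helper names, not what the linker actually runs: align_up rounds an offset up to a multiple of a power-of-two alignment, and place_object lays objects out one after another the way the symbol addresses above suggest (1 byte at offset 0, then a 4-byte int at the next 4-byte boundary).

```c
#include <stddef.h>

/* Round offset up to the next multiple of alignment (must be a power of two). */
size_t align_up(size_t offset, size_t alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}

/*
 * Place an object of the given size and alignment after everything laid
 * out so far, and return the new end offset of the section.
 */
size_t place_object(size_t end, size_t size, size_t alignment)
{
    return align_up(end, alignment) + size;
}
```

Starting from 0, placing a 1-byte object (alignment 1) and then a 4-byte object (alignment 4) ends at offset 8, which matches the 0x8 size in the section header.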


Additionally, we can look at the map file generated by the linker to see which object files got placed where in the final output.

.bss            0x0000000008049650        0x8
 *(.dynbss)
 .dynbss        0x0000000000000000        0x0 /usr/lib/gcc/x86_64-redhat-linux/4.7.2/../../../../lib/crt1.o
 *(.bss .bss.* .gnu.linkonce.b.*)
 .bss           0x0000000008049650        0x0 /usr/lib/gcc/x86_64-redhat-linux/4.7.2/../../../../lib/crt1.o
 .bss           0x0000000008049650        0x0 /usr/lib/gcc/x86_64-redhat-linux/4.7.2/../../../../lib/crti.o
 .bss           0x0000000008049650        0x1 /usr/lib/gcc/x86_64-redhat-linux/4.7.2/32/crtbegin.o
 .bss           0x0000000008049654        0x4 /tmp/ccKF6q1g.o
 .bss           0x0000000008049658        0x0 /usr/lib/libc_nonshared.a(elf-init.oS)
 .bss           0x0000000008049658        0x0 /usr/lib/gcc/x86_64-redhat-linux/4.7.2/32/crtend.o
 .bss           0x0000000008049658        0x0 /usr/lib/gcc/x86_64-redhat-linux/4.7.2/../../../../lib/crtn.o

Again, the third column is the size.

We see 4 bytes of .bss came from /tmp/ccKF6q1g.o. In this trivial example, we know that is the temporary object file from the compilation of our bss.c file. The other 1 byte came from crtbegin.o, which is part of the C runtime.


Finally, since we know that this 1-byte mystery bss variable is from crtbegin.o, and it's named completed.xxxx, its real name is completed and it's probably a static inside some function. Looking at crtstuff.c we find the culprit: a static _Bool completed inside of __do_global_dtors_aux().
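For anyone wondering what such a variable looks like, here is an illustrative sketch of the general "run once" pattern. It is not the code from crtstuff.c, and the function and counter names are made up. GCC appends a numeric suffix to function-local statics to keep their symbol names unique, which is where a name like completed.5745 comes from.

```c
#include <stdbool.h>

static int cleanup_runs;   /* hypothetical counter: how often the body really ran */

/* Hypothetical run-once routine guarded by a function-local static flag. */
void run_cleanup_once(void)
{
    /* No initializer, so the flag starts out false and typically lives in .bss. */
    static bool completed;

    if (completed)
        return;

    cleanup_runs++;        /* the real work would go here */
    completed = true;
}

int cleanup_run_count(void)
{
    return cleanup_runs;
}
```

Calling run_cleanup_once() any number of times runs the body only once, and the flag costs one byte of .bss.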

Bulbar answered 15/5, 2013 at 5:50 Comment(6)
I would not initialize the example static int g_myGlobal; (to be sure it goes to .bss), and it should be used in the code, otherwise the compiler could optimize by removing it entirely. – Bullyrag
More likely, the .bss comes from some internal variable in the included standard libraries, probably from printf. Which is why printf et al. aren't thread safe. – Prothalamion
@Prothalamion printf is going to be part of libc, which is dynamically linked, so its bss won't show up in this executable. Also, I can't think of any reason printf in particular would keep any state. – Bulbar
@Prothalamion Besides, I explicitly showed that the other 1 byte was from crtbegin.o. Some minimal CRT startup code is always going to be statically linked. – Bulbar
@ Jonathon Reinhart: stdout is used by printf and obviously has some state. – Bullyrag
Perhaps I'm being pedantic, but that's stdout, not printf. Regardless, this is all libc so it doesn't matter. – Bulbar

By definition, the bss segment takes up some space in memory (when the program starts) but doesn't need any disk space. You need to define some variables to get it filled, so try

int bigvar_in_bss[16300];
int var_in_data[5] = {1,2,3,4,5};

Your simple program might not have any data in .bss, and shared libraries (like libc.so) may have "their own .bss".
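As a sketch, the two definitions above can be dropped into a small file together with a helper that actually reads them, so the compiler has no reason to discard them. The helper name sum_ints is made up for this example:

```c
#include <stddef.h>

int bigvar_in_bss[16300];              /* no initializer: zero-filled, typically in .bss */
int var_in_data[5] = {1, 2, 3, 4, 5};  /* nonzero initializer: typically in .data        */

/* Hypothetical helper: add up count ints starting at values. */
long sum_ints(const int *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}
```

Summing bigvar_in_bss gives 0 because static storage without an initializer starts out zeroed, and summing var_in_data gives 15. If you link this into a program and run size on it, the bss column should grow by roughly 16300 * sizeof(int) bytes, while the file on disk grows far less.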

File offsets and memory addresses are not easily related.

Read more about the ELF specification, and also use /proc/ (e.g. cat /proc/self/maps displays the address space of the cat process running that command). See also proc(5).
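The same thing can be done from inside your own program. A minimal sketch, assuming a Linux system where /proc is mounted; dump_maps is a made-up helper name:

```c
#include <stdio.h>

/*
 * Copy the given maps file (e.g. "/proc/self/maps" on Linux) to out.
 * Returns the number of lines copied, or -1 if the file cannot be opened.
 */
int dump_maps(const char *path, FILE *out)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    int c;
    int lines = 0;
    while ((c = fgetc(f)) != EOF) {
        fputc(c, out);
        if (c == '\n')
            lines++;       /* one mapping per line */
    }

    fclose(f);
    return lines;
}
```

Calling dump_maps("/proc/self/maps", stdout) from main() prints the mappings of the running program itself, so you can compare the addresses with what readelf reports.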

Bullyrag answered 15/5, 2013 at 5:44 Comment(0)
