Here's the answer:
TL;DR:
.interp + .note.ABI-tag + .note.gnu.build-id + .gnu.hash + .dynsym + .dynstr
+ .gnu.version + .gnu.version_r + .rela.dyn + .rela.plt + .init + .plt
+ .plt.got + .text + .fini + .rodata + .eh_frame_hdr + .eh_frame
= text
.init_array + .fini_array + .data.rel.ro + .dynamic + .got + .data
= data
.bss = bss
See also the image with yellow, blue, and red boxes at the end, for a quick visual summary.
Details:
First, let's print the berkeley size information in hex, with size -x --format=berkeley /bin/ls
or size -x /bin/ls
(same thing, since berkeley is the default format):
$ size -x /bin/ls
text data bss dec hex filename
0x1e48a 0x1278 0x12e0 133602 209e2 /bin/ls
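As a quick sanity check, the dec and hex columns are simply the sum of the text, data, and bss columns. A minimal sketch, with the three values copied from the output above (the variable names are mine):

```python
# Values copied from the `size -x /bin/ls` output above.
text, data, bss = 0x1E48A, 0x1278, 0x12E0

total = text + data + bss

# Compare against the "dec" and "hex" columns printed by `size`.
print(total, hex(total))  # 133602 0x209e2
```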
And here's the sysv size output in hex, obtained with size -x --format=sysv /bin/ls
:
$ size -x --format=sysv /bin/ls
/bin/ls :
section size addr
.interp 0x1c 0x238
.note.ABI-tag 0x20 0x254
.note.gnu.build-id 0x24 0x274
.gnu.hash 0xec 0x298
.dynsym 0xdf8 0x388
.dynstr 0x682 0x1180
.gnu.version 0x12a 0x1802
.gnu.version_r 0x70 0x1930
.rela.dyn 0x1350 0x19a0
.rela.plt 0xa68 0x2cf0
.init 0x17 0x3758
.plt 0x700 0x3770
.plt.got 0x18 0x3e70
.text 0x124d9 0x3e90
.fini 0x9 0x1636c
.rodata 0x4e1d 0x16380
.eh_frame_hdr 0x884 0x1b1a0
.eh_frame 0x2cc0 0x1ba28
.init_array 0x8 0x21eff0
.fini_array 0x8 0x21eff8
.data.rel.ro 0xa38 0x21f000
.dynamic 0x200 0x21fa38
.got 0x3c8 0x21fc38
.data 0x268 0x220000
.bss 0x12e0 0x220280
.gnu_debuglink 0x34 0x0
Total 0x20a16
Next, if you run objdump -h /bin/ls
, you get the following, which shows all output sections in the /bin/ls
object file, or executable. These output sections match the output from the size -x --format=sysv /bin/ls
command, but have more-detailed information such as the VMA (Virtual Memory Address) and LMA (Load Memory Address), among other things:
$ objdump -h /bin/ls
/bin/ls: file format elf64-x86-64
Sections:
Idx Name Size VMA LMA File off Algn
0 .interp 0000001c 0000000000000238 0000000000000238 00000238 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
1 .note.ABI-tag 00000020 0000000000000254 0000000000000254 00000254 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
2 .note.gnu.build-id 00000024 0000000000000274 0000000000000274 00000274 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
3 .gnu.hash 000000ec 0000000000000298 0000000000000298 00000298 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
4 .dynsym 00000df8 0000000000000388 0000000000000388 00000388 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
5 .dynstr 00000682 0000000000001180 0000000000001180 00001180 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
6 .gnu.version 0000012a 0000000000001802 0000000000001802 00001802 2**1
CONTENTS, ALLOC, LOAD, READONLY, DATA
7 .gnu.version_r 00000070 0000000000001930 0000000000001930 00001930 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
8 .rela.dyn 00001350 00000000000019a0 00000000000019a0 000019a0 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
9 .rela.plt 00000a68 0000000000002cf0 0000000000002cf0 00002cf0 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
10 .init 00000017 0000000000003758 0000000000003758 00003758 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
11 .plt 00000700 0000000000003770 0000000000003770 00003770 2**4
CONTENTS, ALLOC, LOAD, READONLY, CODE
12 .plt.got 00000018 0000000000003e70 0000000000003e70 00003e70 2**3
CONTENTS, ALLOC, LOAD, READONLY, CODE
13 .text 000124d9 0000000000003e90 0000000000003e90 00003e90 2**4
CONTENTS, ALLOC, LOAD, READONLY, CODE
14 .fini 00000009 000000000001636c 000000000001636c 0001636c 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
15 .rodata 00004e1d 0000000000016380 0000000000016380 00016380 2**5
CONTENTS, ALLOC, LOAD, READONLY, DATA
16 .eh_frame_hdr 00000884 000000000001b1a0 000000000001b1a0 0001b1a0 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
17 .eh_frame 00002cc0 000000000001ba28 000000000001ba28 0001ba28 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
18 .init_array 00000008 000000000021eff0 000000000021eff0 0001eff0 2**3
CONTENTS, ALLOC, LOAD, DATA
19 .fini_array 00000008 000000000021eff8 000000000021eff8 0001eff8 2**3
CONTENTS, ALLOC, LOAD, DATA
20 .data.rel.ro 00000a38 000000000021f000 000000000021f000 0001f000 2**5
CONTENTS, ALLOC, LOAD, DATA
21 .dynamic 00000200 000000000021fa38 000000000021fa38 0001fa38 2**3
CONTENTS, ALLOC, LOAD, DATA
22 .got 000003c8 000000000021fc38 000000000021fc38 0001fc38 2**3
CONTENTS, ALLOC, LOAD, DATA
23 .data 00000268 0000000000220000 0000000000220000 00020000 2**5
CONTENTS, ALLOC, LOAD, DATA
24 .bss 000012e0 0000000000220280 0000000000220280 00020268 2**5
ALLOC
25 .gnu_debuglink 00000034 0000000000000000 0000000000000000 00020268 2**2
CONTENTS, READONLY
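If you would rather work with this listing programmatically than by eye, it can be parsed with a short script. This is only a sketch: the `parse_sections` helper and the `HEADER` regex are my own illustrative names, the sample is a three-section excerpt of the listing above, and it assumes each section prints as one header line followed by one line of flags, as in that listing.

```python
import re

# Three-section excerpt of the `objdump -h /bin/ls` listing shown above.
SAMPLE = """\
 13 .text         000124d9  0000000000003e90  0000000000003e90  00003e90  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 24 .bss          000012e0  0000000000220280  0000000000220280  00020268  2**5
                  ALLOC
 25 .gnu_debuglink 00000034  0000000000000000  0000000000000000  00020268  2**2
                  CONTENTS, READONLY
"""

# Idx, Name, Size, VMA, LMA, File off, Algn (assumed column layout).
HEADER = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)"
    r"\s+([0-9a-f]+)\s+2\*\*(\d+)\s*$"
)

def parse_sections(listing):
    """Return one dict per section found in `objdump -h` style text."""
    sections = []
    for line in listing.splitlines():
        match = HEADER.match(line)
        if match:
            _idx, name, size, vma, _lma, _off, _align = match.groups()
            sections.append({"name": name, "size": int(size, 16),
                             "vma": int(vma, 16), "flags": set()})
        elif sections and line.strip():
            # The line after a header holds that section's flags.
            sections[-1]["flags"] = {flag.strip() for flag in line.split(",")}
    return sections

for section in parse_sections(SAMPLE):
    print(section["name"], hex(section["size"]), sorted(section["flags"]))
```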
A Google search for "vma and lma meaning" brings me to this site, which has a useful quote from the GNU ld
linker manual. Searching for that quote leads me here, which conveniently has the source for the quote. So let's just cite the quote directly from its original source:
Every loadable or allocatable output section has two addresses. The first is the VMA, or virtual memory address. This is the address the section will have when the output file is run. The second is the LMA, or load memory address. This is the address at which the section will be loaded. In most cases the two addresses will be the same. An example of when they might be different is when a data section is loaded into ROM, and then copied into RAM when the program starts up (this technique is often used to initialize global variables in a ROM based system). In this case the ROM address would be the LMA, and the RAM address would be the VMA.
You can see the sections in an object file by using the objdump program with the ‘-h’ option.
(Source: GNU linker script ld
manual)
This means that any output section shown by objdump -h
which does NOT have the ALLOC flag (and which, here, also has a VMA of 0) is never loaded into memory, so it is not part of the running program. That eliminates the .gnu_debuglink
section.
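The numbers above are consistent with this: the sysv Total line counts every section, including .gnu_debuglink, while the berkeley total does not. A small check, using only values copied from the two outputs (the variable names are mine):

```python
sysv_total = 0x20A16      # "Total" line of the sysv-format output
gnu_debuglink = 0x34      # size of .gnu_debuglink, which has no ALLOC flag
berkeley_total = 0x209E2  # "hex" column of the berkeley-format output

# Dropping the one non-loaded section accounts for the whole difference.
difference = sysv_total - berkeley_total
print(hex(difference))  # 0x34
```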
Next, we can see that the .bss
section has the exact same size (0x12e0) as the berkeley bss
section, so that's a match:
.bss = bss
bss
contains the zero-initialized global and static variables.
So, what about the berkeley data
category, which contains all NON-zero-initialized (ie: initialized with some non-zero value) global and static variables?
And, what about the berkeley text
category, which contains all program code and constant (read-only) static and global variables?
Well, through logical deduction and analysis, and using my prior knowledge about which sections go into Flash vs RAM vs both on microcontrollers, I determined that all loadable (ALLOC) sections which are marked READONLY
in the objdump -h
output are counted in the berkeley text
size. Some of these are marked DATA
(const, ie: read-only, static and global variables and other read-only tables) and some are marked CODE
(the actual program logic, which is also read-only).
So:
.interp + .note.ABI-tag + .note.gnu.build-id + .gnu.hash + .dynsym + .dynstr
+ .gnu.version + .gnu.version_r + .rela.dyn + .rela.plt + .init + .plt
+ .plt.got + .text + .fini + .rodata + .eh_frame_hdr + .eh_frame
= text
You can confirm that in the math by summing all their sizes. In hex:
1c + 20 + 24 + ec + df8 + 682 + 12a + 70 + 1350 + a68 + 17 + 700 + 18 + 124d9 + 9 + 4e1d
+ 884 + 2cc0 = 1e48a
...which is exactly the value in the text
column of the berkeley size output.
You can see them boxed in yellow in the image below.
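If you don't want to add hex numbers by hand, the same sum can be reproduced in a few lines. The sizes below are copied from the sysv output above; the dictionary name is just for illustration.

```python
# Sizes of the READONLY, ALLOC sections, copied from the sysv output above.
text_sections = {
    ".interp": 0x1c, ".note.ABI-tag": 0x20, ".note.gnu.build-id": 0x24,
    ".gnu.hash": 0xec, ".dynsym": 0xdf8, ".dynstr": 0x682,
    ".gnu.version": 0x12a, ".gnu.version_r": 0x70, ".rela.dyn": 0x1350,
    ".rela.plt": 0xa68, ".init": 0x17, ".plt": 0x700, ".plt.got": 0x18,
    ".text": 0x124d9, ".fini": 0x9, ".rodata": 0x4e1d,
    ".eh_frame_hdr": 0x884, ".eh_frame": 0x2cc0,
}

text_total = sum(text_sections.values())
print(hex(text_total))  # 0x1e48a
```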
So, the remaining loadable sections, which are marked DATA
and NOT READONLY
, make up the berkeley data
size:
.init_array + .fini_array + .data.rel.ro + .dynamic + .got + .data
= data
Again, the hex size summation confirms this:
8 + 8 + a38 + 200 + 3c8 + 268 = 1278
...which is exactly the value in the data
column of the berkeley size output.
You can see them boxed in blue in the image below.
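The whole deduction can be written as a small flag-based rule and checked against all three berkeley columns at once. This is a sketch of the reasoning in this answer, not a claim about how size is implemented internally: the `classify` function and the flag-set names are mine, and the DATA/CODE flags are left out because the rule does not need them.

```python
# Flag sets as printed by `objdump -h /bin/ls` above (DATA/CODE omitted).
RO = {"CONTENTS", "ALLOC", "LOAD", "READONLY"}
RW = {"CONTENTS", "ALLOC", "LOAD"}
NOBITS = {"ALLOC"}
DEBUG = {"CONTENTS", "READONLY"}

SECTIONS = [
    (".interp", 0x1c, RO), (".note.ABI-tag", 0x20, RO),
    (".note.gnu.build-id", 0x24, RO), (".gnu.hash", 0xec, RO),
    (".dynsym", 0xdf8, RO), (".dynstr", 0x682, RO),
    (".gnu.version", 0x12a, RO), (".gnu.version_r", 0x70, RO),
    (".rela.dyn", 0x1350, RO), (".rela.plt", 0xa68, RO),
    (".init", 0x17, RO), (".plt", 0x700, RO), (".plt.got", 0x18, RO),
    (".text", 0x124d9, RO), (".fini", 0x9, RO), (".rodata", 0x4e1d, RO),
    (".eh_frame_hdr", 0x884, RO), (".eh_frame", 0x2cc0, RO),
    (".init_array", 0x8, RW), (".fini_array", 0x8, RW),
    (".data.rel.ro", 0xa38, RW), (".dynamic", 0x200, RW),
    (".got", 0x3c8, RW), (".data", 0x268, RW),
    (".bss", 0x12e0, NOBITS), (".gnu_debuglink", 0x34, DEBUG),
]

def classify(flags):
    """Map a section's flags to a berkeley category, or None if not loaded."""
    if "ALLOC" not in flags:
        return None      # e.g. .gnu_debuglink: not part of the running program
    if "READONLY" in flags:
        return "text"    # code and read-only data
    if "CONTENTS" in flags:
        return "data"    # writable, with initial values stored in the file
    return "bss"         # writable, occupies no space in the file

totals = {"text": 0, "data": 0, "bss": 0}
for name, size, flags in SECTIONS:
    category = classify(flags)
    if category is not None:
        totals[category] += size

print({key: hex(value) for key, value in totals.items()})
```

For the /bin/ls listing above, the three totals come out equal to the text, data, and bss columns of the berkeley output.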
In this image, you can see all 3 berkeley size categories boxed in different colors:
- The berkeley-format
text
output sections (read-only, program logic and const static and global variables) are boxed in yellow.
- The berkeley-format
data
output sections (non-zero-initialized [ie: other-than-zero initialized] static and global variables) are boxed in blue.
- The berkeley-format
bss
output sections (zero-initialized static and global variables) are boxed in red.
When looking at a microcontroller object file, such as one for an STM32 MCU:
- Flash memory usage =
text
+ data
, and
- RAM memory usage from static and global variables =
bss
+ data
.
- That means the RAM left over for stack (local variables) and heap (dynamic memory allocation) =
RAM_total - (bss + data)
.
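The three formulas above can be sketched as a tiny helper. The function name `memory_usage` and every number in the example call are hypothetical, chosen only to illustrate the arithmetic; substitute the berkeley size values for your own firmware and the RAM size from your part's datasheet.

```python
def memory_usage(text, data, bss, ram_total):
    """Apply the formulas above to berkeley-format sizes (all in bytes)."""
    flash_used = text + data   # data's initial values are stored in Flash too
    ram_static = data + bss    # static and global variables living in RAM
    ram_free = ram_total - ram_static  # left for stack and heap
    return flash_used, ram_static, ram_free

# Hypothetical firmware sizes and a hypothetical 20 KiB of RAM.
flash_used, ram_static, ram_free = memory_usage(
    text=0x8000, data=0x400, bss=0x1000, ram_total=20 * 1024
)
print(flash_used, ram_static, ram_free)  # 33792 5120 15360
```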
Primary References:
- GNU Linker (
ld
) manual, section "3.1 Basic Linker Script Concepts": https://sourceware.org/binutils/docs/ld/Basic-Script-Concepts.html#Basic-Script-Concepts
- [my own question here] https://electronics.stackexchange.com/questions/363931/how-do-i-find-out-at-compile-time-how-much-of-an-stm32s-flash-memory-and-dynami