I am trying to understand the ELF format and right now there are some thing that I don't get about the segments defined in the program header. I have this little code that I convert to an ELF file with g++ (x86_x64 on Linux):
#include <stdlib.h>
#include <iostream>
using namespace std;
int main(int argc, char *argv[])
{
if (argc == 1)
{
cout << "Hello world!" << endl;
}
return 0;
}
With g++ -c -m64 -D ACIS64 main.cpp -o main.o
and g++ -s -O1 -o Main main.o
.
Now, with readelf I get this list of segments:
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000400040 0x0000000000400040
0x00000000000001f8 0x00000000000001f8 R E 8
INTERP 0x0000000000000238 0x0000000000400238 0x0000000000400238
0x000000000000001c 0x000000000000001c R 1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x0000000000000afc 0x0000000000000afc R E 200000
LOAD 0x0000000000000df8 0x0000000000600df8 0x0000000000600df8
0x0000000000000270 0x00000000000003a0 RW 200000
DYNAMIC 0x0000000000000e18 0x0000000000600e18 0x0000000000600e18
0x00000000000001e0 0x00000000000001e0 RW 8
NOTE 0x0000000000000254 0x0000000000400254 0x0000000000400254
0x0000000000000044 0x0000000000000044 R 4
GNU_EH_FRAME 0x00000000000009a4 0x00000000004009a4 0x00000000004009a4
0x0000000000000044 0x0000000000000044 R 4
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 10
GNU_RELRO 0x0000000000000df8 0x0000000000600df8 0x0000000000600df8
0x0000000000000208 0x0000000000000208 R 1
With Bless Hex Editor I am looking at the code and try to find each one of these segments.
I find the PHDR segment just after the ELF header and having the size of this entire program header. It has an alignment of 8 bytes and is readable/executable. [!]I don't understand why executable.
I find the segment where the interpreter is declared, just after the PHDR. It has the size of the interpreter's path and an alignment of 1 byte. Correct
Now I have a segment that is readable and executable, which [!]I suppose is the code segment. I don't understand why does it start at 0x0000000000000000. Shouldn't this start where the entry point is located? Why does it have a size of 0xafc bytes? Isn't the size only the size of the code? How much of the file is executable? Also, I don't understand why the alignment is 0x200000 bytes. Is that how much space is reserved for a LOAD segment in memory?. This is where this segment ends and an amout of 764
0x0
bytes follows it:
- The next one (readable and writable) [!]I suppose is a segment where variables are stored. It ends just where something like the sections header might be starting.
- Now the next one is a DYNAMIC header. It starts at 0xe18, which is inside the one above. [!]I thought this was a segment where references to external functions and variables are stored but I am not sure. It is readable and writable. I just don't know what segment is this and why it is "inside" the LOAD segment above
- A NOTE segment, containing some info that I suppose is not important right now
- GNU specific segments, one of them having any offsets and sizes equal to
0x0000000000000000
, others interfering with other segments, which I don't get, either.
I come from the PE world, where each thing has its own well defined offset and size and here I see these weird addresses and sizes and I am confused.