I'm trying to develop some very low-level x86 code following this document. I wrote the following C program:
void main()
{
    char* video_memory = (char*) 0xb8000;
    *video_memory = 'X';
}
I compile and link it like so:
gcc -m32 -fno-pie -c main.c -o main.o
ld -m elf_i386 -o main.bin -Ttext 513 --oformat binary main.o
This produces a binary called main.bin which is over a hundred megabytes. I disassembled that binary and it's basically my code (ten or so lines), then a hundred meg of zeros, and then some kind of footer.
The extra bytes are all unnecessary, because I used head to snip off the ones that weren't my code and it still ran fine.
I'm using 32-bit flags because my test machine is an old 32-bit laptop, but you can get similar (but less extreme) behavior in 64-bit. This script:
gcc -fno-pie -c main.c -o main.o
ld -o main.bin -Ttext 513 --oformat binary main.o
produces a main.bin of over 4 MB. Again the pattern is the same: my code, 4 meg of zeros, and then a footer, with a little bit of noise in between my code and the zeros. Here's the disassembled 4MB file:
0: f3 0f 1e fa endbr64
4: 55 push %ebp
5: 48 dec %eax
6: 89 e5 mov %esp,%ebp
8: 48 dec %eax
9: c7 45 f8 00 80 0b 00 movl $0xb8000,-0x8(%ebp)
10: 48 dec %eax
11: 8b 45 f8 mov -0x8(%ebp),%eax
14: c6 00 58 movb $0x58,(%eax)
17: 90 nop
18: 5d pop %ebp
19: c3 ret
...
aea: 00 00 add %al,(%eax)
aec: 00 14 00 add %dl,(%eax,%eax,1)
aef: 00 00 add %al,(%eax)
af1: 00 00 add %al,(%eax)
af3: 00 00 add %al,(%eax)
af5: 01 7a 52 add %edi,0x52(%edx)
af8: 00 01 add %al,(%ecx)
afa: 78 10 js 0xb0c
afc: 01 1b add %ebx,(%ebx)
afe: 0c 07 or $0x7,%al
b00: 08 90 01 00 00 1c or %dl,0x1c000001(%eax)
b06: 00 00 add %al,(%eax)
b08: 00 1c 00 add %bl,(%eax,%eax,1)
b0b: 00 00 add %al,(%eax)
b0d: f3 f4 repz hlt
b0f: ff (bad)
b10: ff 1a lcall *(%edx)
b12: 00 00 add %al,(%eax)
b14: 00 00 add %al,(%eax)
b16: 45 inc %ebp
b17: 0e push %cs
b18: 10 86 02 43 0d 06 adc %al,0x60d4302(%esi)
b1e: 51 push %ecx
b1f: 0c 07 or $0x7,%al
b21: 08 00 or %al,(%eax)
...
3ffaeb: 00 00 add %al,(%eax)
3ffaed: 04 00 add $0x0,%al
3ffaef: 00 00 add %al,(%eax)
3ffaf1: 10 00 adc %al,(%eax)
3ffaf3: 00 00 add %al,(%eax)
3ffaf5: 05 00 00 00 47 add $0x47000000,%eax
3ffafa: 4e dec %esi
3ffafb: 55 push %ebp
3ffafc: 00 02 add %al,(%edx)
3ffafe: 00 00 add %al,(%eax)
3ffb00: c0 04 00 00 rolb $0x0,(%eax,%eax,1)
3ffb04: 00 03 add %al,(%ebx)
3ffb06: 00 00 add %al,(%eax)
3ffb08: 00 00 add %al,(%eax)
3ffb0a: 00 00 add %al,(%eax)
...
The giant binary file works, but it's ugly and I'd like to understand what's going on.
I'm doing the compilation/linking on Ubuntu 20.04 on a 64-bit machine. Tool versions:
gcc version 9.3.0 (Ubuntu 9.3.0-10ubuntu2)
GNU ld (GNU Binutils for Ubuntu) 2.34
[...] --oformat binary) and run objdump -h, you see there is a section .note.gnu.property which is located at address 0x080480f4, which is your 130 MB. In binary format the only way to accomplish that is to write all the zeros in between. Why the section has that address, or whether it ought to be present at all, I don't know. – Impressure

[...] site:stackoverflow.com linker script flat binary – Advancement

[...] objcopy as suggested here. I'll be sure to leave an answer if I get to the bottom of this some day. – Bove

void main() is only valid in free-standing environments, so I guess you need the -ffreestanding flag. – Sundaysundberg

[...] gcc -ffreestanding -c -fno-pie -m32 and ld -m elf_i386 -Ttext 0x1000 --oformat binary. However, right now I use a custom linker script, and it's just much better. – Theomachy

0x0804???? would be a typical Linux address for 32-bit code. That address is part of the linker script used by your GCC. You can create your own linker script (what many people do) or you will have to exclude the .note.gnu.property section from your binary. You should compile with -ffreestanding and I recommend -fno-asynchronous-unwind-tables. Rather than having LD output binary, I recommend outputting as ELF and then using OBJCOPY to convert the file to flat binary and have it remove the unneeded section. – Hyphenate

[...]
gcc -ffreestanding -fno-asynchronous-unwind-tables -m32 -fno-pie -c main.c -o main.o
ld -nostartfiles -m elf_i386 -o main.elf -Ttext 513 main.o
objcopy -O binary main.elf main.bin --remove-section .note.gnu.property
On a side note, -Ttext 513 is a rather peculiar VMA to use. I'd be curious how you chose it? – Hyphenate

[...] ...0, not ...1, and you left out a 0 in the offset relative to segment base=0. Therefore 513 leaves a gap of 1 byte. (Or if you're talking about segment bases, 0x7c0+513 leaves a gap of 513*16 - 512 bytes because you forgot to scale down for the size of a paragraph (16 bytes).) – Advancement

[...] 0x7c00 + 0x200, then enters protected mode, and then jumps to the C code. – Bove

[...] -Ttext 0x7e00 – Hyphenate

[...] objcopy anyway so right now it's a moot point. The code works right now anyway because I got lucky this time and the compiler compiled my loops and ifs as IP relative jumps. – Bove