Starting from an answer of mine about the "real" entrypoint of an ELF executable on Linux and "raw" syscalls, we can strip it down to
bits 64
global _start
_start:
mov di,42 ; only the low byte of the exit code is kept,
; so we can use di instead of the full edi/rdi
xor eax,eax
mov al,60 ; shorter than mov eax,60
syscall ; perform the syscall
I don't think you can get it to be any smaller without going out of specs - in particular, the psABI doesn't guarantee anything about the state of eax
. This gets assembled to precisely 10 bytes (as opposed to the 7 bytes of the 32 bit payload):
66 bf 2a 00 31 c0 b0 3c 0f 05
The straightforward way (assemble with nasm
, link with ld
) produces me a 352 bytes executable.
The first "real" transformation he does is building the ELF "by hand"; doing this (with some modifications, as the ELF header for x86_64 is a bit bigger)
bits 64
org 0x08048000
ehdr: ; Elf64_Ehdr
db 0x7F, "ELF", 2, 1, 1, 0 ; e_ident
times 8 db 0
dw 2 ; e_type
dw 62 ; e_machine
dd 1 ; e_version
dq _start ; e_entry
dq phdr - $$ ; e_phoff
dq 0 ; e_shoff
dd 0 ; e_flags
dw ehdrsize ; e_ehsize
dw phdrsize ; e_phentsize
dw 1 ; e_phnum
dw 0 ; e_shentsize
dw 0 ; e_shnum
dw 0 ; e_shstrndx
ehdrsize equ $ - ehdr
phdr: ; Elf64_Phdr
dd 1 ; p_type
dd 5 ; p_flags
dq 0 ; p_offset
dq $$ ; p_vaddr
dq $$ ; p_paddr
dq filesize ; p_filesz
dq filesize ; p_memsz
dq 0x1000 ; p_align
phdrsize equ $ - phdr
_start:
mov di,42 ; only the low byte of the exit code is kept,
; so we can use di instead of the full edi/rdi
xor eax,eax
mov al,60 ; shorter than mov eax,60
syscall ; perform the syscall
filesize equ $ - $$
we get down to 130 bytes. This is a tad bigger than the 91 bytes executable, but it comes from the fact that several fields become 64 bits instead of 32.
We can then apply some tricks similar to his; the partial overlap of phdr
and ehdr
can be done, although the order of fields in phdr
is different, and we have to overlap p_flags
with e_shnum
(which however should be ignored due to e_shentsize
being 0).
Moving the code inside the header is slightly more difficult, as it's 3 bytes larger, but that part of header is just as big as in the 32 bit case. We overcome this by starting 2 bytes earlier, overwriting the padding byte (ok) and the ABI version field (not ok, but still works).
So, we reach:
bits 64
org 0x08048000
ehdr: ; Elf64_Ehdr
db 0x7F, "ELF", 2, 1, ; e_ident
_start:
mov di,42 ; only the low byte of the exit code is kept,
; so we can use di instead of the full edi/rdi
xor eax,eax
mov al,60 ; shorter than mov eax,60
syscall ; perform the syscall
dw 2 ; e_type
dw 62 ; e_machine
dd 1 ; e_version
dq _start ; e_entry
dq phdr - $$ ; e_phoff
dq 0 ; e_shoff
dd 0 ; e_flags
dw ehdrsize ; e_ehsize
dw phdrsize ; e_phentsize
phdr: ; Elf64_Phdr
dw 1 ; e_phnum p_type
dw 0 ; e_shentsize
dw 5 ; e_shnum p_flags
dw 0 ; e_shstrndx
ehdrsize equ $ - ehdr
dq 0 ; p_offset
dq $$ ; p_vaddr
dq $$ ; p_paddr
dq filesize ; p_filesz
dq filesize ; p_memsz
dq 0x1000 ; p_align
phdrsize equ $ - phdr
filesize equ $ - $$
which is 112 bytes long.
Here I stop for the moment, as I don't have much time for this right now. You now have the basic layout with the relevant modifications for 64 bit, so you just have to experiment with more audacious overlaps
int 0x80
system calls in your 64-bit executable? If so, your probably don't need to change much. I know there's some overlap of ELF header fields being interpreted as part of the machine code, so some change might be needed for ELF64. – Panicstricken