Avoiding 0xFF bytes in shellcode using CALL to read RIP?
I'm trying to write a decoder stub and I'm running into a restriction on 0xFF as a bad character. I'm using the jmp-call-pop method to get the address of my encoded shellcode into a register. Here's the relevant snippet:

401012: e8 eb ff ff ff          call   0x401002

It seems like call will always use 0xFF in its bytes. Is there another instruction that, when executed, will push rip onto the stack and jump to another section of code? I've tried just pushing the address onto the stack manually, but that results in a null byte because my addresses are 3 bytes long and need to be padded.


Disallowed bytes in my machine code are:

  • 00
  • FF
Gawky answered 21/4, 2019 at 0:17 Comment(4)
What other bytes are forbidden? - Returnable
@fuz: I'm assuming just 0 and FF based on the question not mentioning any others. But good point; I added a section to the question that the OP should edit if there are more. My lea/sub answer does use a couple 8x bytes that are outside the low-ASCII range. - Alben
@PeterCordes Please don't add sections like your “disallowed bytes” section to questions based on assumptions. As long as OP does not state what exact bytes are allowed and what not, you cannot assume that all other characters are fine. - Returnable
@fuz: I was thinking it would be better to make the implicit statement of disallowed bytes explicit, but I think you're right, that's more likely to just waste people's time answering if it turns out the real requirement is different. So if people do write answers based on that, and it's not what the OP needed, they'll probably edit and may invalidate the question. OTOH I thought this specific set of restrictions was interesting enough to answer, whether it helps the OP or not. If they had other requirements, they should have said so in the first place. - Alben
A
6

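call rel32 is the only relative encoding of call (and indirect or far transfers are rarely useful here), so yes, the high byte(s) of the displacement will always be 00 or FF unless you're jumping very far away: a short forward displacement sign-extends with 00 bytes and a short backward one with FF bytes, because that's how 2's complement works.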
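To make the 2's complement point concrete, here is a small Python sketch that computes the bytes of a call rel32 from the instruction address and the target. The helper name `encode_call_rel32` and the forward target 0x401040 are made up for illustration; the backward call is the one from the question.

```python
import struct

def encode_call_rel32(insn_addr: int, target: int) -> bytes:
    """Encode `call rel32` (opcode E8) placed at insn_addr and jumping to target."""
    # The displacement is measured from the end of the 5-byte instruction.
    rel = target - (insn_addr + 5)
    # Signed 32-bit little-endian, as x86 stores it.
    return b"\xe8" + struct.pack("<i", rel)

backward = encode_call_rel32(0x401012, 0x401002)  # the call from the question
forward = encode_call_rel32(0x401012, 0x401040)   # hypothetical nearby forward target

print(backward.hex(" "))  # e8 eb ff ff ff
print(forward.hex(" "))   # e8 29 00 00 00
```

A short backward call fills the upper bytes with FF and a short forward call fills them with 00, so neither direction avoids both forbidden bytes.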

Self-modifying code would be one option (but then you have a chicken/egg problem of getting a pointer to your code). Depending on the exploit mechanism, you might have a pointer to (near) your code in RSP. So you could maybe just lea rax, [rsp+44] / push rax / jmp ...

But x86-64 has no need for the jmp/call/pop idiom. Normally you can just jmp over your data and then use RIP-relative LEA with a negative rel32, but that will of course also have 0xFF bytes.


You can use RIP-relative LEA with a safe rel32 then correct it:

    lea    rsi, [rel anchor + 0x66666666]      ; or  [RIP + 0x66666666]
    sub    rsi, 0x66666666
    ;...
    xor    eax,eax
    mov    al,1        ; __NR_write = 1  x86-64 Linux
    mov    edi, eax
    lea    edx, [rax-1 + msglen]
    syscall            ; write(1, msg, msglen)

    lea    eax, [rdi-1 + 60]       ; __NR_exit
    syscall            ; sys_exit(1)

anchor:
    msg: db     "Hello World", 0xa
    msglen equ $-msg

Machine code from assembling with NASM and disassembling with objdump -drwC -Mintel:

$ asm-link -dn rel.asm                   # a helper script to assemble+link and disassemble
+ nasm -felf64 -Worphan-labels rel.asm
+ ld -o rel rel.o
ld: warning: cannot find entry symbol _start; defaulting to 0000000000401000

rel:     file format elf64-x86-64


Disassembly of section .text:

0000000000401000 <anchor-0x1e>:
  401000:       48 8d 35 7d 66 66 66    lea    rsi,[rip+0x6666667d]        # 66a67684 <__bss_start+0x66665684>
  401007:       48 81 ee 66 66 66 66    sub    rsi,0x66666666
  40100e:       31 c0                   xor    eax,eax
  401010:       b0 01                   mov    al,0x1
  401012:       89 c7                   mov    edi,eax
  401014:       8d 50 0b                lea    edx,[rax+0xb]
  401017:       0f 05                   syscall 
  401019:       8d 47 3b                lea    eax,[rdi+0x3b]
  40101c:       0f 05                   syscall 

000000000040101e <anchor>:
  40101e:       48                      rex.W
   ... ASCII data that isn't real machine code
  401029:       0a                      .byte 0xa

peter@volta:/tmp$ ./rel 
Hello World

$ strace ./rel 
execve("./rel", ["./rel"], 0x7ffd09467720 /* 55 vars */) = 0
write(1, "Hello World\n", 12Hello World
)           = 12
exit(1)                                 = ?
+++ exited with 1 +++

Amusingly, 0x66 is the ASCII code for the letter 'f'. I didn't intentionally pick 'f' when trying to avoid 0xFF :P But anyway, choose whatever 4-byte string you like.

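The low byte(s) of the rel32 grow with the distance the LEA has to reach: the assembler adds the real displacement to the padding constant, so in this example 0x66 + 0x17 = 0x7d. If that sum reaches 0xFF, or carries and leaves a 0x00 byte behind, you're back to a forbidden byte, so choose the constant wisely.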
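As a rough way to check a padding constant before assembling, the following Python sketch computes the disp32 bytes for a given distance and scans for the first distance that produces a forbidden byte. The names `padded_disp32`, `is_clean` and `BAD` are illustrative, and the bad-byte set is assumed to be just 00 and FF as in the question.

```python
import struct

BAD = {0x00, 0xFF}  # assumed forbidden bytes, taken from the question

def padded_disp32(distance: int, pad: int = 0x66666666) -> bytes:
    """Bytes of the disp32 emitted for `lea reg, [rel label + pad]`."""
    return struct.pack("<I", (distance + pad) & 0xFFFFFFFF)

def is_clean(encoded: bytes) -> bool:
    """True if none of the bytes is in the forbidden set."""
    return not any(b in BAD for b in encoded)

# Distance from the end of the LEA (0x401007) to anchor (0x40101e) in the listing.
print(padded_disp32(0x17).hex(" "))  # 7d 66 66 66

# Smallest distance at which the 0x66666666 padding stops being safe.
first_bad = next(d for d in range(0x1000) if not is_clean(padded_disp32(d)))
print(hex(first_bad))  # 0x99
```

With 0x66666666 the data can sit up to 0x98 bytes past the end of the LEA; a different constant moves that limit.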


Actually doing a call to somewhere nearby:

You can use the above RIP-relative LEA + fixup trick to get a pointer for self-modifying code, e.g. inc byte [rax] to turn a stored 0xFE into 0xFF. Or a dword sub with an immediate like 0x11111111 could be useful to fix up a whole rel32 at once.
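The dword fix-up arithmetic can be sketched in Python as follows. This only models the bytes, it does not patch real code; the helper names and the choice of 0x11111111 as the key are illustrative, and the wanted rel32 is the one from the question's call.

```python
KEY = 0x11111111  # immediate for the run-time `sub dword [mem], imm32`; an arbitrary clean value

def store_for_sub_fixup(rel32: bytes, key: int = KEY) -> bytes:
    """Bytes to place in the payload so that subtracting key at run time yields rel32."""
    value = int.from_bytes(rel32, "little")
    return ((value + key) & 0xFFFFFFFF).to_bytes(4, "little")

def apply_sub_fixup(stored: bytes, key: int = KEY) -> bytes:
    """Model of what `sub dword [mem], key` leaves in memory."""
    value = int.from_bytes(stored, "little")
    return ((value - key) & 0xFFFFFFFF).to_bytes(4, "little")

wanted = bytes.fromhex("ebffffff")    # rel32 of the call in the question (-0x15)
stored = store_for_sub_fixup(wanted)  # what actually goes into the shellcode
restored = apply_sub_fixup(stored)    # what the CPU sees after the fix-up

print(stored.hex(" "))    # fc 10 11 11
print(restored.hex(" "))  # eb ff ff ff
```

The stored form contains neither 00 nor FF for this particular displacement; other displacements may need a different key, so it is worth checking the stored bytes each time.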

call r/m64 and jmp r/m64 are both unusable directly, because the opcodes themselves are FF /2 and FF /4.
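For reference, this Python sketch builds the register-direct forms of those two encodings and shows how a stored FE opcode byte becomes FF after an inc byte [mem] patch. The helper `ff_group_reg` is a made-up name, and it only covers the low eight registers (no REX prefix).

```python
def ff_group_reg(ext: int, reg: int) -> bytes:
    """Encode opcode FF /ext with a register-direct operand (ModRM mod=11)."""
    modrm = 0xC0 | (ext << 3) | reg
    return bytes([0xFF, modrm])

RAX = 0  # register number of rax in the ModRM r/m field

call_rax = ff_group_reg(2, RAX)  # FF /2
jmp_rax = ff_group_reg(4, RAX)   # FF /4
print(call_rax.hex(" "))  # ff d0
print(jmp_rax.hex(" "))   # ff e0

# Store the opcode byte one lower, then let `inc byte [mem]` repair it at run time.
stored = bytes([call_rax[0] - 1]) + call_rax[1:]
patched = bytes([(stored[0] + 1) & 0xFF]) + stored[1:]
print(stored.hex(" "))   # fe d0
print(patched.hex(" "))  # ff d0
```

The stored bytes must not be executed before they are patched, since FE D0 is not the intended instruction.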

If you want to return, it's probably easiest to fix up a call rel32 or call rax. But it would also be possible to use RIP-relative LEA to calculate a return address in a register and push it, then jmp rel8 or whatever (jmp rax would itself need patching, since it's FF /4).

Alben answered 21/4, 2019 at 13:58 Comment(0)
