x86 Assembly: Data in the Text Section

I don't quite understand how variables can be stored in the text section and how they can be modified there. Shouldn't all variables be in the .data section, and isn't the whole .text section read-only? How does this code work, then?

[Code taken from Shellcoder's Handbook]

Section .text
global _start

_start:
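    jmp short GotoCall          ; skip ahead to the call placed just before the string

shellcode:
    pop esi                     ; esi = return address pushed by call = address of the string
    xor eax, eax                ; eax = 0
    mov byte [esi + 7], al      ; overwrite 'J' with NUL to terminate "/bin/sh"
    lea ebx, [esi]              ; ebx = address of the string
    mov long [esi + 8], ebx     ; overwrite 'AAAA' with that address (argv[0])
    mov long [esi + 12], eax    ; overwrite 'KKKK' with NULL (end of argv, empty envp)
    mov byte al, 0x0b           ; eax = 11, the execve syscall number on 32-bit Linux
    mov ebx, esi                ; ebx = filename
    lea ecx, [esi + 8]          ; ecx = argv
    lea edx, [esi + 12]         ; edx = envp
    int 0x80                    ; make the system call

GotoCall:
    call shellcode              ; pushes the address of the following bytes, then jumps back
    db '/bin/shJAAAAKKKK'       ; J, AAAA and KKKK are placeholders patched at run time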

Incunabula answered 13/9, 2017 at 16:58 Comment(5)

A common attack vector for exploits is the target application's stack memory. So this code is very likely delivered onto the stack by overflowing a buffer allocated there as a local variable. Writing into that part of memory is then not a problem; executing data on the stack can be more problematic, since some OSes mark stack memory as non-executable (at least for apps that ask for it), and such an exploit would need to get around that first. If this code landed in an ordinary .text section on a modern OS, it would fail at mov [esi + 7], al. – Frankfrankalmoign 13/9, 2017 at 17:28

But then again, the app itself (or an exploit payload that works without writing anything) may ask the OS for write access to an executable part of memory. Complex "corporate" applications especially love to load additional modules in weird ways, sometimes even dynamically downloading pieces of code over the network, so poisoning those is usually a bit easier than attacking a straightforward "hello world" example that does just what it is asked to do. – Frankfrankalmoign 13/9, 2017 at 17:30

And a final note: that code can easily be rewritten to copy the string onto the stack first and to patch and use it there. Then all you need is for your payload to be delivered into executable memory; the stack is 100% writeable and ready to be used for temporary data, so the problem is solved. – Frankfrankalmoign 13/9, 2017 at 17:33

Related: segmentation fault with .text .data and main (main in .data section) for the opposite problem. (And note that unless you're careful in using NASM, .data might actually be executable). – Phenacaine 25/9, 2020 at 4:31

Related: x86_64 Assembly - Segfault when trying to edit a byte within an array in x64 assembly for segfaults trying to write into arrays put in the .text section. – Phenacaine 15/10, 2021 at 22:14

The top-level answer is that an x86 machine is not aware of ".text" and ".data" sections. A modern x86 CPU gives the OS the tools to create a virtual address space with specific access rights (such as read-only, no-exec and read-write).

But the content of memory is just bytes, which can be read, written, or executed. The CPU has no means of guessing which parts of memory are data and which are code, and will happily execute anything you point it at.

The .text/.data/... sections are a logical construct supported by the compiler, linker, and OS (executable loader), which cooperate to prepare the runtime environment for the code in such a way that .text is read-only nowadays, and you need to put writeable variables into .data, .bss or similar. Some OSes and configurations also provide a non-executable stack.

The OS usually also has an API that lets an application change the access rights or memory mapping, or allocate further memory with the attributes it needs (JIT compilers, for example, would get nowhere if they could not first write compiled code into memory and then execute it).
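As a rough illustration of such an API, here is a minimal sketch using POSIX mmap and mprotect. The function name write_after_mprotect is made up for this example, and it assumes a Linux-like system where MAP_ANONYMOUS is available. It maps one page read-only, asks the OS to make it writeable, and only then stores a byte into it.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper, not part of any library.
 * Maps one anonymous page read-only, asks the OS to make it writeable,
 * stores `value` into it and returns the byte read back (-1 on failure). */
int write_after_mprotect(unsigned char value)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    unsigned char *p = mmap(NULL, (size_t)page, PROT_READ,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* A store through p at this point would fault: the page is read-only. */

    if (mprotect(p, (size_t)page, PROT_READ | PROT_WRITE) != 0) {
        munmap(p, (size_t)page);
        return -1;
    }

    p[0] = value;                /* permitted after the mprotect call */
    int result = p[0];

    munmap(p, (size_t)page);
    return result;
}
```

Calling write_after_mprotect(0x90) should simply return 0x90. A JIT would do much the same, but it would also request PROT_EXEC at some point before jumping into the page.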

So if you run your code example on a common Linux in its default configuration, it will very likely segfault, because .text will be read-only. Many of those "exploit" books have a whole chapter dedicated to compiling the examples and setting up the runtime environment so that several protections (ASLR, NX, ...) are switched off, which allows their samples to work.
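The same kind of fault can be reproduced without any shellcode. The sketch below is only an assumption-laden demo (store_faults is a name invented here, and it relies on POSIX fork, mmap and waitpid): a child process stores one byte into a freshly mapped page, and the parent reports whether the child was killed by a signal.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not part of any library.
 * Forks a child that stores one byte into a page mapped with `prot`.
 * Returns 1 if the child died from SIGSEGV or SIGBUS, 0 if the store
 * succeeded, and -1 if the experiment could not be set up. */
int store_faults(int prot)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    volatile unsigned char *p = mmap(NULL, (size_t)page, prot,
                                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        munmap((void *)p, (size_t)page);
        return -1;
    }
    if (pid == 0) {
        p[0] = 0;                /* faults here if the page is not writeable */
        _exit(0);
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    munmap((void *)p, (size_t)page);
    if (waited != pid)
        return -1;

    if (WIFSIGNALED(status)) {
        int sig = WTERMSIG(status);
        return (sig == SIGSEGV || sig == SIGBUS) ? 1 : -1;
    }
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

On a typical Linux system, store_faults(PROT_READ) should return 1 and store_faults(PROT_READ | PROT_WRITE) should return 0. That is the same difference as between the code in the question running from a read-only .text mapping and running from writeable memory.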

A real exploit in the wild will then usually use some bug or weak spot in the application to inject its payload somewhere. Depending on how hostile that "somewhere" is, the real exploit may first have to elevate its rights to get writeable and executable memory (or it must be written so that it does not write into code and uses other memory for its variables), unless the app itself already provides a friendly environment for the exploit because of its own internal needs.

Keep in mind that the OS and applications are not written to make sure exploits will work; quite the opposite. Each exploit usually targets a particular vulnerable version of an application on a particular version of an OS, and it is expected to break with a later security update. So if you know you have writeable and executable memory, you just exploit it as is, without worrying about what will happen in the next version, when the app is fixed to keep its code memory read-only.

Frankfrankalmoign answered 13/9, 2017 at 17:56 Comment(1)

x86 might not be aware of sections, but it is aware of segments (in the ELF sense, not the x86 sense) and can allow different things for different segments. Since sections are mapped to segments by the linker, the effect is virtually as if the processor is aware of sections. – Koressa 19/9, 2017 at 19:22

Well, data and code are just bytes. Only how you interpret them makes them what they are. Code can be interpreted as data and vice versa. In most cases the result will be something invalid, but it is possible anyway.
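A tiny sketch of that idea in C (bytes_as_u32 is a name made up for this example): the four placeholder bytes 'AAAA' from the question's string are text when read as characters and the number 0x41414141 when read as a 32-bit integer. Likewise, the single byte 0x5E is the character '^' in ASCII and the encoding of pop esi when the CPU fetches it as code.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of any library.
 * Reads the 4 bytes starting at p as one 32-bit integer. */
uint32_t bytes_as_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);     /* copy the raw bytes, no conversion */
    return v;
}

/* Returns 1 if the character c has the byte value 0x5E,
 * which is also the one-byte x86 encoding of `pop esi`. */
int is_pop_esi_byte(char c)
{
    return (unsigned char)c == 0x5E;
}
```

Here bytes_as_u32("AAAA") gives 0x41414141 regardless of byte order, because all four bytes are equal, and is_pop_esi_byte('^') is true on any ASCII system.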

The attributes of a section depend on the linker, and most linkers make the .text section read-only by default, but that doesn't mean it can't be changed.

The whole example is a clever way to obtain the address of /bin/sh just by using call. Basically, call pushes the address of the next instruction (the next bytes) onto the stack, and in this case that is the address of the string, so pop esi retrieves that address from the stack for later use.
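What the stores in the question then do to those 16 bytes can be sketched in C. This is only a model of the memory layout, not the shellcode itself: patch_block is a name invented here, and addr is a stand-in for whatever 32-bit address pop esi would actually yield.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of any library.
 * Applies the question's three stores to a copy of '/bin/shJAAAAKKKK'.
 * `addr` stands in for the value that would be in esi. */
void patch_block(unsigned char block[16], uint32_t addr)
{
    const uint32_t zero = 0;

    block[7] = 0;                            /* mov byte [esi + 7], al   */
    memcpy(block + 8, &addr, sizeof addr);   /* mov long [esi + 8], ebx  */
    memcpy(block + 12, &zero, sizeof zero);  /* mov long [esi + 12], eax */
}
```

After the call, bytes 0..7 hold the NUL-terminated string "/bin/sh", bytes 8..11 hold a pointer to it (argv[0]), and bytes 12..15 hold a null pointer that ends argv and doubles as an empty envp. That is the layout the execve call in the question expects in ebx, ecx and edx.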

Wonted answered 13/9, 2017 at 17:5 Comment(2)

Compilers put read-only data (like static const char[] = "/bin/sh";) in the .rodata section, which the linker places in the text segment of the executable, along with the .text section. You don't normally want to mix code and constants (because that wastes space in the split L1I/L1D caches), but putting them in the same page of the same segment is good, as you point out. – Phenacaine 14/9, 2017 at 23:32

Update on that: GNU ld changed recently to put .rodata in its own segment with read but not exec permission, for better defence against ROP and Spectre attacks. – Phenacaine 15/6, 2020 at 13:15
