How to get c code to execute hex machine code?

Asked 31/3, 2012 at 23:53 Answered 29/7, 2021 at 13:27

I want a simple C method to be able to run hex bytecode on a Linux 64 bit machine. Here's the C program that I have:

char code[] = "\x48\x31\xc0";
#include <stdio.h>
int main(int argc, char **argv)
{
        int (*func) ();
        func = (int (*)()) code;
        (int)(*func)();
        printf("%s\n","DONE");
}

The code that I am trying to run ("\x48\x31\xc0") I obtained by writing this simple assembly program (it's not supposed to really do anything):

.text
.globl _start
_start:
        xorq %rax, %rax

and then compiling and objdump-ing it to obtain the bytecode.

However, when I run my C program I get a segmentation fault. Any ideas?

Ludivinaludlew answered 31/3, 2012 at 23:53 Comment(6)

Even if your data segment is executable or you don't have NX enabled, what do you expect this to do? It executes one instruction and then the instruction afterwards (which you don't control) and then the instruction after that, until it reaches memory which doesn't represent legitimate code or code that triggers a segfault. – Janiculum 31/3, 2012 at 23:55

You will need to add byte code for a ret because the indirect function call you do should be a call which pushes the return address onto the stack. At least, this is my best educated guess, I have never seen anything like this. – Supersensible 31/3, 2012 at 23:56

I expect this to do nothing, but I want it to be able to run without crashing. – Ludivinaludlew 31/3, 2012 at 23:58

Do you mind about the \0 at the end of the string ? – Copestone 1/4, 2012 at 0:4

char code[] = "\x48\x31\xc0\xc3\0"; – Mundy 23/7, 2014 at 12:14

don't use xorq %rax, %rax. Use xor eax, eax instead – Dewitt 24/7, 2018 at 15:10

Machine code has to be in an executable page. Your char code[] is in the read+write data section, without exec permission, so the code cannot be executed from there.

Here is a simple example of allocating an executable page with mmap:

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main ()
{
  char code[] = {
    0x8D, 0x04, 0x37,           //  lea eax,[rdi+rsi]
    0xC3                        //  ret
  };

  int (*sum) (int, int) = NULL;

  // allocate executable buffer                                             
  sum = mmap (0, sizeof(code), PROT_READ|PROT_WRITE|PROT_EXEC,
              MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);

  // copy code to buffer
  memcpy (sum, code, sizeof(code));
  // doesn't actually flush cache on x86, but ensure memcpy isn't
  // optimized away as a dead store.
  __builtin___clear_cache ((char*)sum, (char*)sum + sizeof(code));  // GNU C

  // run code
  int a = 2;
  int b = 3;
  int c = sum (a, b);

  printf ("%d + %d = %d\n", a, b, c);
}

See another answer on this question for details about __builtin___clear_cache.

Sawmill answered 1/4, 2012 at 12:43 Comment(22)

Yes, the ret is important to return back into the calling function. – Supersensible 1/4, 2012 at 19:22

Thanks for the help. I just want to add that objdump -d <filename> can get you the byte code for an executable. – Nadenenader 24/10, 2013 at 19:0

static const char code[] is normally linked into the text segment of your executable, which is already mapped read-only + executable. You don't actually need to copy it. (Making it non-const would be a problem, though; the data segment isn't always executable.) The important part of this answer vs. the question is the ret. See also Why does const int main = 195 result in a working program but without the const it ends in a segmentation fault?. (195 = 0xC3 = ret). – Standford 24/7, 2018 at 15:24

This technique is independent of where the machine code comes from. I included a fragment known at compile time for simplicity, but the concept works with machine code generated at runtime, which is the main use case. You are quick to comment but I think you missed the point. – Sawmill 24/7, 2018 at 18:1

I removed the 'static const' in order not to give the impression that the code has to be known at compile time. – Sawmill 24/7, 2018 at 18:12

@PeterCordes Are you sure it is not in data segment, which is read-only and not executable? – Erepsin 24/7, 2018 at 19:38

@MaximEgorushkin: data is read/write. That's where the .data section goes, which contains things like int global_var = 2; And yes, I'm sure. Look at compiler output yourself, and / or use readelf -a on binaries produced by gcc + ld. – Standford 24/7, 2018 at 19:44

@PeterCordes I meant the read-only data segment. Will double check. – Erepsin 24/7, 2018 at 19:53

@MaximEgorushkin: Current toolchains don't use a separate segment for that. section .rodata (or section .rdata on Windows) gets linked as part of the text segment, so it's part of the same read+exec mapping as the code. What's the difference of section and segment in ELF file format – Standford 24/7, 2018 at 19:57

@MaximEgorushkin and Antoine: update: This answer is still fine, but my static const code[] = ... suggestion is no longer sufficient. Current (2019) GNU Binutils ld now links .rodata into a separate segment that is read only without exec permission. It's still easier to use gcc -z execstack or mprotect() than to mmap+memcpy, though. I added an answer of my own with full details + working examples. – Standford 28/4, 2019 at 19:23

@AntoineMathys I thought __builtin___clear_cache() isn't required because sum is called and so there is a dependency to the result of memcpy(), which depends on the result of mmap. Thought? – Tragopan 19/11, 2019 at 12:43

@AntoineMathys: Yes, it's true that GCC doesn't "know" that mmap is being used to get anonymous memory, where writing + freeing it has no permanent side effect. GCC does do dead-store elimination if you use malloc and compile with gcc -z execstack. In theory it could recognize mmap(MAP_ANONYMOUS) and behave the same way. Also, the compiler doesn't "see" the dependency between actually doing the memcpy and calling sum(), so if you later stored other bytes to the same buffer you'd get only one memcpy, potentially after the first call! – Standford 19/11, 2019 at 18:28

And yes __builtin___clear_cache() expands to zero instructions once we finally get to that stage of code-gen. As I explained in my edit and comments, and in my own answer; it's there to make other code compile correctly by preventing compile-time reordering of things, not to emit any instructions. It's like a compile-time-only barrier for that region of code. Does my answer not explain that clearly enough? Would that example of two memcpy into the same buffer help? – Standford 19/11, 2019 at 18:36

@HCSF: That is not correct. See my answer on this question. Part of the point of __builtin___clear_cache() (and the only point on ISAs like x86) is that sum() does not get treated as having a dependency on the memcpy(sum, ...). This is like strict-aliasing where you can think of machine code as having a different type from any kind of data, and even char* alias-anything isn't sufficient to define the behaviour. – Standford 19/11, 2019 at 18:41

@AntoineMathys: That would be sufficient, yes. If we're talking about mixing multiple memcpy and sum() calls, I'd do it after every memcpy instead of before each sum(), in case it had any deleterious effects on optimization of anything else. (Like in case its memory-barrier effect applies generally). I think you meant one clear_cache before all the subsequent sum() calls, not one before each, but "every" might imply the other meaning. – Standford 19/11, 2019 at 18:46

@PeterCordes I meant adding __builtin__clear_cache() before the first call to sum(), and after that, for each subsequent call to sum(), adding it before it if and only if the memory has been written to since the last call to sum(). – Sawmill 19/11, 2019 at 19:3

AntoineMathys and @HCSF: I tried to reproduce the effect I was suggesting by modifying the buffer again after the first call. godbolt.org/z/nEpaQX. But GCC has to assume that sum() reads or modifies arbitrary global memory as data, so it can't optimize away the store between the first and second call. GCC doesn't know about mmap(MAP_ANONYMOUS) so it has to assume that mmap return value might be a pointer to memory that's accessible via other methods. That's also why order is respected for the first memcpy and call to sum(). – Standford 19/11, 2019 at 19:3

@AntoineMathys: I'd suggest rolling back to revision 10; I think the English text explaining that you need an executable page is at least as useful as this convoluted recipe using mmap. Also, I agree that __builtin___clear_cache() is distracting clutter for people who really just want to test shellcode, not actually write a JIT. Especially when non-inline-function potential data dependence on writes to non-local memory in practice mean there's no problem unless you use malloc + mprotect. Using #ifdef __GNUC__ would make the clutter even worse, and would let it work on non-GNU with mmap – Standford 19/11, 2019 at 19:9

@HCSF: and Antoine: I was able to provoke GCC into breaking something by using __attribute__((const)) to tell the optimizer it's a pure function (that only reads its args, not global memory). Then dead-store elimination can happen on a memcpy before and after the call, resulting in no store before the first call = 00 00 add [rax], al = segfault. godbolt.org/z/6VNsav (Godbolt's ./a.out option to run the program still seems to fail, but it works on my desktop with __clear_cache and crashes without.) However attribute(const) still CSEs calls over modifying the machine code. – Standford 19/11, 2019 at 19:41

I expanded the __builtin___clear_cache section in my answer with what I had added as a footnote to this answer. I included a link for readers of this answer that wonder why removing __clear_cache doesn't actually break this code. – Standford 19/11, 2019 at 20:18

@PeterCordes This answer is now just the way I like it: simple and correct. If you have yet other suggestions, please leave a comment. – Sawmill 19/11, 2019 at 22:0

Looks good to me, and good improvement to the comments in the code to remove the word "necessary", which might be taken as implying it would break in practice. I tried to sneak in a mention of less convoluted ways to do this (gcc -z execstack), but if you really don't want that then I don't insist. A pointer to another answer for details on __clear_cache is more than fine, and what I should have done in the first place instead of inlining a big footnote. – Standford 19/11, 2019 at 22:7

Until recent Linux kernel versions (sometime before 5.4), you could simply compile with gcc -z execstack - that would make all pages executable, including read-only data (.rodata), and read-write data (.data) where char code[] = "..." goes.

Now -z execstack only applies to the actual stack, so it currently works only for non-const local arrays. i.e. move char code[] = ... into main. Modern systems make as few pages executable as possible as hardening against exploits.

See Linux default behavior against `.data` section for the kernel change, and Unexpected exec permission from mmap when assembly files included in the project for the old behaviour: enabling Linux's READ_IMPLIES_EXEC process personality flag for that program. (In Linux 5.4, that Q&A shows you'd only get READ_IMPLIES_EXEC for a missing PT_GNU_STACK, like a really old binary; modern GCC -z execstack would set PT_GNU_STACK = RWX metadata in the executable, which Linux 5.4 would handle as making only the stack itself executable. At some point before that, PT_GNU_STACK = RWX did result in READ_IMPLIES_EXEC.)

The other option is to make system calls at runtime to copy into an executable page, or change permissions on the page it's in. That's still more complicated than using a local array to get GCC to copy code into executable stack memory.

(I don't know if there's an easy way to enable READ_IMPLIES_EXEC under modern kernels. Having no GNU-stack attribute at all in an ELF binary does that for 32-bit code, but not 64-bit.)

Yet another option is __attribute__((section(".text"))) const char code[] = ...;
Working example: https://godbolt.org/z/draGeh.
If you need the array to be writeable, e.g. for shellcode that inserts some zeros into strings, you could maybe link with ld -N. But probably best to use -z execstack and a local array.

Two problems in the question:

No exec permission on the page: you used an array that will go in the noexec read+write .data section.
Your machine code doesn't end with a ret instruction, so even if it did run, execution would fall into whatever was next in memory instead of returning.

And BTW, the REX prefix is redundant. "\x31\xc0" xor eax,eax has exactly the same effect as xor rax,rax.
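A minimal sketch of the question's bytes with both byte-level points applied: the redundant REX prefix dropped and a ret appended. The names question_code, fixed_code and last_byte_is_ret are made up for illustration, and checking the final byte is only a rough sanity test (0xC3 can also appear inside an immediate or displacement). It says nothing about whether the page is executable.

```c
#include <assert.h>
#include <stddef.h>

/* The question's bytes: xor rax,rax with nothing after it. */
static const unsigned char question_code[] = { 0x48, 0x31, 0xc0 };

/* Shorter equivalent that also returns: xor eax,eax ; ret */
static const unsigned char fixed_code[] = { 0x31, 0xc0, 0xc3 };

/* Hypothetical helper: rough check that a snippet ends in a near ret (0xC3).
   Returns 1 if the last byte is 0xC3, otherwise 0. */
static int last_byte_is_ret(const unsigned char *code, size_t len)
{
    assert(code != NULL);
    return len > 0 && code[len - 1] == 0xc3;
}
```

Brace initializers are used here so that sizeof gives the exact byte count; a string literal such as "\x31\xc0\xc3" would add a trailing 0 byte, which is harmless after the ret but would defeat this particular check.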

You need the page containing the machine code to have execute permission. x86-64 page tables have a bit for execute permission that is separate from read permission, unlike legacy 386 page tables.

The easiest way to get static arrays to be in read+exec memory was to compile with gcc -z execstack. (Used to make the stack and other sections executable, now only the stack).

typedef int (*intfunc_int)(int);

int main(void)
{
    unsigned char execbuf[] = {   // compile with -zexecstack
        0x8d, 0x47, 0x01,     // lea 0x1(%rdi),%eax
        0xc3                  // ret
    };
    // a string initializer like  char execbuf[] = "\xc3"; also works

    // Tell GCC we're about to run this data as code.  x86 has coherent I-cache,
    // but this also stops optimization from removing the initialization as dead stores.
    __builtin___clear_cache (execbuf, execbuf+sizeof(execbuf));  // end pointer is exclusive
    // Without this, the store disappears

    intfunc_int fptr = (intfunc_int) execbuf;  // cast to function pointer.
    int res = fptr(2);           // deref the function pointer
    
    return res;    // returns 3 under the x86-64 System V calling convention (non-Windows), where the first arg is in EDI
}

Compiles to simple asm (Godbolt - also showing that it's broken without the __builtin___clear_cache - it will skip the store and just jump to uninitialized stack space.) This runs correctly with -z execstack, will segfault without it.

# GCC -O3 for x86-64
main:
    sub     rsp, 24              # GCC reserves 16 bytes more stack space than it needed
    mov     edi, 2               # function arg
    mov     DWORD PTR [rsp+12], -1023326323  # store 4 bytes of machine code
    lea     rax, [rsp+12]        # pointer into a register
    call    rax                  # call through the function pointer
    add     rsp, 24
    ret

Older GNU `ld` linker used to make `.rodata` read+exec

Until recently (2018 or 2019), the standard toolchain (binutils ld) would put section .rodata into the same ELF segment as .text, so they'd both have read+exec permission. Thus using const char code[] = "..."; was sufficient for executing manually-specified bytes as data, without execstack.

But on my Arch Linux system with GNU ld (GNU Binutils) 2.31.1, that's no longer the case. readelf -a shows that the .rodata section went into an ELF segment with .eh_frame_hdr and .eh_frame, and it only has Read permission. .text goes in a segment with Read + Exec, and .data goes in a segment with Read + Write (along with the .got and .got.plt). (What's the difference of section and segment in ELF file format)

I assume this change is to make ROP and Spectre attacks harder by not having read-only data in executable pages where sequences of useful bytes could be used as "gadgets" that end with the bytes for a ret or jmp reg instruction.

// See above for char code[] = {...} inside main with -z execstack, for current Linux

// This is broken on recent Linux, used to work without execstack.
#include <stdio.h>

// can be non-const if you use gcc -z execstack.  static is also optional
static const char code[] = {
  0x8D, 0x04, 0x37,           //  lea eax,[rdi+rsi]       // retval = a+b;                    
  0xC3                        //  ret                                         
};

static const char ret0_code[] = "\x31\xc0\xc3";   // xor eax,eax ;  ret
                     // the compiler will append a 0 byte to terminate the C string,
                     // but that's fine.  It's after the ret.

int main () {
  // void* cast is easier to type than a cast to function pointer,
  // and in C can be assigned to any other pointer type.  (not C++)

  int (*sum) (int, int) = (void*)code;
  int (*ret0)(void) = (void*)ret0_code;

  // run code                                                                   
  int c = sum (2, 3);
  return ret0();
}

On older Linux systems: gcc -O3 shellcode.c && ./a.out (Works because of const on global/static arrays)

On Linux before 5.5 (or so) gcc -O3 -z execstack shellcode.c && ./a.out (works because of -zexecstack regardless of where your machine code is stored). Fun fact: gcc allows -zexecstack with no space, but clang only accepts clang -z execstack.

These also work on Windows, where read-only data goes in .rdata instead of .rodata.

The compiler-generated main looks like this (from objdump -drwC -Mintel). You can run it inside gdb and set breakpoints on code and ret0_code

(I actually used gcc -no-pie -O3 -zexecstack shellcode.c, hence the addresses near 401000.)
0000000000401020 <main>:
  401020:       48 83 ec 08             sub    rsp,0x8           # stack aligned by 16 before a call
  401024:       be 03 00 00 00          mov    esi,0x3
  401029:       bf 02 00 00 00          mov    edi,0x2           # 2 args
  40102e:       e8 d5 0f 00 00          call   402008 <code>     # note the target address in the next page; that's where .rodata goes
  401033:       48 83 c4 08             add    rsp,0x8
  401037:       e9 c8 0f 00 00          jmp    402004 <ret0_code>    # optimized tailcall

Or use system calls to modify page permissions

Instead of compiling with gcc -zexecstack, you can instead use mmap(PROT_EXEC) to allocate new executable pages, or mprotect(PROT_EXEC) to change existing pages to executable. (Including pages holding static data.) You also typically want at least PROT_READ and sometimes PROT_WRITE, of course.

Using mprotect on a static array means you're still executing the code from a known location, maybe making it easier to set a breakpoint on it.

On Windows you can use VirtualAlloc or VirtualProtect.
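As a rough sketch of the mmap + mprotect route on Linux, the helper below copies the bytes into a fresh anonymous mapping while it is read+write and then switches it to read+exec, so the page is never writable and executable at the same time. The names sum_code and make_executable_copy are hypothetical, and the bytes assume x86-64 with the System V calling convention.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std=c99/c11 */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* lea eax,[rdi+rsi] ; ret  (same bytes as the examples above) */
static const unsigned char sum_code[] = { 0x8d, 0x04, 0x37, 0xc3 };

/* Hypothetical helper: copy len bytes of machine code into a new anonymous
   mapping, then change it from read+write to read+exec.
   Returns the mapping, or NULL on failure. */
static void *make_executable_copy(const void *code, size_t len)
{
    assert(code != NULL && len > 0);

    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return NULL;

    memcpy(buf, code, len);
    /* GNU C: tell the optimizer these bytes will be run as code. */
    __builtin___clear_cache(buf, buf + len);

    if (mprotect(buf, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(buf, len);
        return NULL;
    }
    return buf;
}
```

A caller would cast the result, e.g. int (*sum)(int, int) = (int (*)(int, int)) make_executable_copy(sum_code, sizeof sum_code); and then call sum(2, 3). Dropping write permission before executing is a design choice that suits systems which refuse pages that are both writable and executable; shellcode that modifies itself would need PROT_WRITE kept.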

Telling the compiler that data is executed as code

Normally compilers like GCC assume that data and code are separate. This is like type-based strict aliasing, but even using char* doesn't make it well-defined to store into a buffer and then call that buffer as a function pointer.

In GNU C, you also need to use __builtin___clear_cache(buf, buf + len) after writing machine code bytes to a buffer, because the optimizer doesn't treat dereferencing a function pointer as reading bytes from that address. Dead-store elimination can remove the stores of machine code bytes into a buffer, if the compiler proves that the store isn't read as data by anything. https://codegolf.stackexchange.com/questions/160100/the-repetitive-byte-counter/160236#160236 and https://godbolt.org/g/pGXn3B has an example where gcc really does do this optimization, because gcc "knows about" malloc. Also the first code block in this answer, where we use a local array in executable stack space.

(And on non-x86 architectures where I-cache isn't coherent with D-cache, it actually will do any necessary cache syncing. On x86 it's purely a compile-time optimization blocker and doesn't expand to any instructions itself, because a jump or call is sufficient on paper for JIT or self-modifying code, and in practice it's completely impossible to observe stale code after a store on real x86 CPUs.)

Re: the weird name with three underscores: It's the usual __builtin_name pattern, but name is __clear_cache.

My edit on @AntoineMathys's answer added this.

In practice GCC/clang don't "know about" mmap(MAP_ANONYMOUS) the way they know about malloc. So in practice the optimizer will assume that the memcpy into the buffer might be read as data by the non-inline function call through the function pointer, even without __builtin___clear_cache(). (Unless you declared the function type as __attribute__((const)).)

On x86, where I-cache is coherent with data caches, having the stores happen in asm before the call is sufficient for correctness. On other ISAs, __builtin___clear_cache() will actually emit special instructions as well as ensuring the right compile-time ordering.

It's good practice to include it when copying code into a buffer because it doesn't cost performance, and stops hypothetical future compilers from breaking your code. (e.g. if they do understand that mmap(MAP_ANONYMOUS) gives newly-allocated anonymous memory that nothing else has a pointer to, just like malloc.)
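One way to follow that practice is to pair every write of machine code with the builtin inside a single helper, so the two can't drift apart. This is only a sketch assuming GCC or clang; install_code is a made-up name, and the helper does not make the buffer executable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: write machine code into buf and immediately tell
   GNU C that the range will be executed, so the stores can't be treated
   as dead. buf must already be writable; making it executable is a
   separate step (mmap/mprotect or -z execstack). */
static void install_code(void *buf, const void *code, size_t len)
{
    assert(buf != NULL && code != NULL);
    memcpy(buf, code, len);
    __builtin___clear_cache((char *)buf, (char *)buf + len);
}
```

Calling install_code again with new bytes before the next call through the function pointer keeps each store ordered before the call that is meant to run it.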

With current GCC, I was able to provoke GCC into really doing an optimization we don't want by using __attribute__((const)) to tell the optimizer sum() is a pure function (that only reads its args, not global memory). GCC then knows sum() can't read the result of the memcpy as data.

With another memcpy into the same buffer after the call, GCC does dead-store elimination into just the 2nd store after the call. This results in no store before the first call so it executes the 00 00 add [rax], al bytes, segfaulting.

// demo of a problem on x86 when not using __builtin___clear_cache
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main ()
{
  char code[] = {
    0x8D, 0x04, 0x37,           //  lea eax,[rdi+rsi]
    0xC3                        //  ret                                         
  };

  __attribute__((const)) int (*sum) (int, int) = NULL;

  // copy code to executable buffer                                             
  sum = mmap (0,sizeof(code),PROT_READ|PROT_WRITE|PROT_EXEC,
              MAP_PRIVATE|MAP_ANON,-1,0);
  memcpy (sum, code, sizeof(code));
  //__builtin___clear_cache(sum, sum + sizeof(code));

  int c = sum (2, 3);
  //printf ("%d + %d = %d\n", a, b, c);

  memcpy(sum, (char[]){0x31, 0xc0, 0xc3, 0}, 4);  // xor-zero eax, ret, padding for a dword store
  //__builtin___clear_cache(sum, sum + 4);
  return sum(2,3);
}

Compiled on the Godbolt compiler explorer with GCC9.2 -O3

main:
        push    rbx
        xor     r9d, r9d
        mov     r8d, -1
        mov     ecx, 34
        mov     edx, 7
        mov     esi, 4
        xor     edi, edi
        sub     rsp, 16
        call    mmap
        mov     esi, 3
        mov     edi, 2
        mov     rbx, rax
        call    rax                  # call before store
        mov     DWORD PTR [rbx], 12828721    #  0xC3C031 = xor-zero eax, ret
        add     rsp, 16
        pop     rbx
        ret                      # no 2nd call, CSEd away because const and same args

Passing different args would have gotten another call reg, but even with __builtin___clear_cache the two sum(2,3) calls can CSE. __attribute__((const)) doesn't respect changes to the machine code of a function. Don't do it. It's safe if you're going to JIT the function once and then call many times, though.

Uncommenting the first __clear_cache results in

        mov     DWORD PTR [rax], -1019804531    # lea; ret
        call    rax
        mov     DWORD PTR [rbx], 12828721       # xor-zero; ret
       ... still CSE and use the RAX return value

The first store is there because of __clear_cache and the sum(2,3) call. (Removing the first sum(2,3) call does let dead-store elimination happen across the __clear_cache.)

The second store is there because the side-effect on the buffer returned by mmap is assumed to be important, and that's the final value main leaves.

Godbolt's ./a.out option to run the program still seems to always fail (exit status of 255); maybe it sandboxes JITing? It works on my desktop with __clear_cache and crashes without.

`mprotect` on a page holding existing C variables.

You can also give a single existing page read+write+exec permission. This is an alternative to compiling with -z execstack

You don't need __clear_cache on a page holding read-only C variables because there's no store to optimize away. You would still need it for initializing a local buffer (on the stack). Otherwise GCC will optimize away the initializer for this private buffer that a non-inline function call definitely doesn't have a pointer to. (Escape analysis). It doesn't consider the possibility that the buffer might hold the machine code for the function unless you tell it that via __builtin___clear_cache.

#include <stdio.h>
#include <sys/mman.h>
#include <stdint.h>

// can be non-const if you want, we're using mprotect
static const char code[] = {
  0x8D, 0x04, 0x37,           //  lea eax,[rdi+rsi]       // retval = a+b;                    
  0xC3                        //  ret                                         
};

static const char ret0_code[] = "\x31\xc0\xc3";

int main () {
  // void* cast is easier to type than a cast to function pointer,
  // and in C can be assigned to any other pointer type.  (not C++)
  int (*sum) (int, int) = (void*)code;
  int (*ret0)(void) = (void*)ret0_code;

   // hard-coding x86's 4k page size for simplicity.
   // also assume that `code` doesn't span a page boundary and that ret0_code is in the same page.
  uintptr_t page = (uintptr_t)code & -4096ULL;                  // round down to a 4k boundary
  mprotect((void*)page, 4096, PROT_READ|PROT_EXEC|PROT_WRITE);  // +write in case the page holds any writeable C vars that would crash later code.

  // run code                                                                   
  int c = sum (2, 3);
  return ret0();
}

I used PROT_READ|PROT_EXEC|PROT_WRITE in this example so it works regardless of where your variable is. If it was a local on the stack and you left out PROT_WRITE, call would fail after making the stack read only when it tried to push a return address.

Also, PROT_WRITE lets you test shellcode that self-modifies, e.g. to edit zeros into its own machine code, or other bytes it was avoiding.
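The example above hard-codes a 4k page and assumes the arrays don't cross a page boundary. A sketch that drops both assumptions could look like the following; page_floor, page_span and make_range_rwx are hypothetical names, and the page size is taken from sysconf(_SC_PAGESIZE) rather than assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round addr down to a multiple of pagesize (which must be a power of two). */
static uintptr_t page_floor(uintptr_t addr, uintptr_t pagesize)
{
    assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);
    return addr & ~(pagesize - 1);
}

/* Number of bytes from the start of the first page to the end of the
   last page touched by [addr, addr+len). Requires len > 0. */
static size_t page_span(uintptr_t addr, size_t len, uintptr_t pagesize)
{
    uintptr_t first = page_floor(addr, pagesize);
    uintptr_t last  = page_floor(addr + len - 1, pagesize);
    return (size_t)(last - first + pagesize);
}

/* Hypothetical wrapper: make every page holding [p, p+len) read+write+exec.
   Returns mprotect's result (0 on success, -1 on error). */
static int make_range_rwx(const void *p, size_t len)
{
    assert(len > 0);
    uintptr_t pagesize = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t start = page_floor((uintptr_t)p, pagesize);
    return mprotect((void *)start, page_span((uintptr_t)p, len, pagesize),
                    PROT_READ | PROT_WRITE | PROT_EXEC);
}
```

With this, make_range_rwx(code, sizeof code) would stand in for the hand-computed mprotect call, and a second call would cover ret0_code if it happened to land in a different page.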

$ gcc -O3 shellcode.c           # without -z execstack
$ ./a.out 
$ echo $?
0
$ strace ./a.out
...
mprotect(0x55605aa3f000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC) = 0
exit_group(0)                           = ?
+++ exited with 0 +++

If I comment out the mprotect, it does segfault with recent versions of GNU Binutils ld which no longer put read-only constant data into the same ELF segment as the .text section.

If I did something like ret0_code[2] = 0xc3;, I would need __builtin___clear_cache(ret0_code+2, ret0_code+3) after that to make sure the store wasn't optimized away, but if I don't modify the static arrays then it's not needed after mprotect. It is needed after mmap+memcpy or manual stores, because we want to execute bytes that have been written in C (with memcpy).

Standford answered 28/4, 2019 at 19:20 Comment(1)

@AntoineMathys: The other reason is to test out a snippet of shellcode to make sure you've correctly turned an assembly program into part of an exploit payload. If you were JITing, you wouldn't have a C string like the OP's "\x48\x31\xc0", you'd just have bytes. – Standford 29/4, 2019 at 17:15

You need to include the assembly in-line via a special compiler directive so that it'll properly end up in a code segment. See this guide, for example: http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html

Democracy answered 31/3, 2012 at 23:57 Comment(0)

Your machine code may be all right, but your CPU objects.

Modern CPUs manage memory in segments. In normal operation, the operating system loads a new program into a program-text segment and sets up a stack in a data segment. The operating system tells the CPU never to run code in a data segment. Your code is in code[], in a data segment. Thus the segfault.

Heiskell answered 31/3, 2012 at 23:57 Comment(0)

This will take some effort.

Your code variable is stored in the .data section of your executable:

$ readelf -p .data exploit

String dump of section '.data':
  [    10]  H1À

H1À is the value of your variable.

The .data section is not executable:

$ readelf -S exploit
There are 30 section headers, starting at offset 0x1150:
Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
[...]
  [24] .data             PROGBITS         0000000000601010  00001010
       0000000000000014  0000000000000000  WA       0     0     8

All 64-bit processors I'm familiar with support non-executable pages natively in the pagetables. Most newer 32-bit processors (the ones that support PAE) provide enough extra space in their pagetables for the operating system to emulate hardware non-executable pages. You'll need to run either an ancient OS or an ancient processor to get a .data section marked executable.

Because these are just flags in the executable, you ought to be able to set the X flag through some other mechanism, but I don't know how to do so. And your OS might not even let you have pages that are both writable and executable.
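To see which permissions a page actually ends up with at run time on Linux, one option is to look the address up in /proc/self/maps. The sketch below is Linux-only; page_perms and sample_data are hypothetical names, and the parsing assumes the usual "start-end perms ..." line layout of that file.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Non-const global like the question's array: lands in .data. */
static char sample_data[] = "\x48\x31\xc0";

/* Hypothetical helper (Linux only): copy the permission field of the
   mapping containing addr (e.g. "rw-p" or "r-xp") into perms.
   Returns 0 on success, -1 if the address was not found. */
static int page_perms(uintptr_t addr, char perms[5])
{
    assert(perms != NULL);
    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;

    char line[512];
    int found = -1;
    while (fgets(line, sizeof line, f) != NULL) {
        uintptr_t lo, hi;
        char p[5];
        if (sscanf(line, "%" SCNxPTR "-%" SCNxPTR " %4s", &lo, &hi, p) == 3
            && addr >= lo && addr < hi) {
            memcpy(perms, p, sizeof p);
            found = 0;
            break;
        }
    }
    fclose(f);
    return found;
}
```

Looking up (uintptr_t)sample_data should report a writable mapping, while the address of a function such as page_perms itself should report one with the x bit set; comparing the two makes the cause of the segfault visible without reading ELF headers.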

Gutsy answered 1/4, 2012 at 0:3 Comment(0)

You may need to mark the page executable before you can call into it. On MS-Windows, see the VirtualProtect function.

URL: http://msdn.microsoft.com/en-us/library/windows/desktop/aa366898%28v=vs.85%29.aspx

River answered 5/12, 2012 at 12:6 Comment(0)

Sorry, I couldn't follow above examples which are complicated. So, I created an elegant solution for executing hex code from C. Basically, you could use asm and .word keywords to place your instructions in hex format. See below example:

asm volatile(".rept 1024\n"
             CNOP
           ".endr\n");

where CNOP is defined as below: #define CNOP ".word 0x00010001 \n"

The c.nop instruction was not supported by my current assembler, so I defined CNOP as the hex equivalent of c.nop in .word syntax and used it inside asm, which I was already familiar with. The .rept <NUM> ... .endr pair repeats the enclosed instruction NUM times.

This solution is working and verified.

Winter answered 29/7, 2021 at 13:27 Comment(1)

Normally when people want to test shellcode, they have a string like "\x31\xc0\xc3", not broken up into .byte 0x??, 0x??, ... or .word chunks. But yes, if you do that, that's another way to turn it into machine code in an executable page. (In the middle of a function, so most instructions other than NOP will need clobber declarations to tell the compiler what it does to registers, unless it exits or does execve so execution doesn't leave the asm statement). – Standford 29/7, 2021 at 20:35
