How to find load relocation for a PIE binary?

I need to get base address of stack inside my running process. This would enable me to print raw stack traces that addr2line can resolve (the running binary is stripped, but addr2line has access to the symbols). I managed to do this by examining the ELF header of argv[0]: I read the entry point and subtract it from &_start:

#include <stdio.h>
#include <execinfo.h>
#include <unistd.h>
#include <elf.h>
#include <string.h>
void* entry_point = NULL;
void* base_addr = NULL;
extern char _start;

/// Given argv[0], populates the globals entry_point and base_addr.
void read_elf_header(const char* elfFile) {
  // switch to Elf32_Ehdr for x86 architecture.
  Elf64_Ehdr header;
  FILE* file = fopen(elfFile, "rb");
  if(file) {
    // Only trust the header if it was read completely and carries the ELF magic.
    if (fread(&header, 1, sizeof(header), file) == sizeof(header) &&
        memcmp(header.e_ident, ELFMAG, SELFMAG) == 0) {
        printf("Entry point from file: %p\n", (void *) header.e_entry);
        entry_point = (void*)header.e_entry;
        base_addr = (void*) ((long)&_start - (long)entry_point);
    }
    fclose(file);
  }
}

/// print stacktrace
void bt() {
    static const int MAX_STACK = 30;
    void *array[MAX_STACK];
    int size = backtrace(array, MAX_STACK);  // backtrace() returns int
    for (int i = 0; i < size; ++i) {
        // %p expects a pointer, so cast the difference back to void*
        printf("%p ", (void *)((long)array[i] - (long)base_addr));
    }
    printf("\n");
}

int main(int argc, char* argv[])
{
    read_elf_header(argv[0]);
    printf("&_start = %p\n", (void *)&_start);
    printf("base address is: %p\n", base_addr);
    bt();

    // elf header is also in memory, but to find it I have to already have base address
    Elf64_Ehdr * ehdr_addr = (Elf64_Ehdr *) base_addr;
    printf("Entry from memory: %p\n", (void *) ehdr_addr->e_entry);

    return 0;
}

Sample output:

Entry point from file: 0x10c0
&_start = 0x5648eeb150c0
base address is: 0x5648eeb14000
0x1321 0x13ee 0x29540f8ed09b 0x10ea 
Entry from memory:  0x10c0

And then I can run:

$ addr2line -e a.out 0x1321 0x13ee 0x29540f8ed09b 0x10ea
/tmp/elf2.c:30
/tmp/elf2.c:45
??:0
??:?

How can I get the base address without access to argv? I may need to print traces before main() (during initialization of globals). Turning off ASLR or PIE is not an option.

Anatomist answered 8/3, 2019 at 15:55 Comment(7)
What's your C library? Which link editor do you use? - Dimeter
You can at least get the program name with getauxval(AT_EXECFN) on Linux - Sarilda
Possibly dladdr? - Sarilda
Oddly, I can't find [quality] similar questions. It seems like this should have a canonical answer on Stack Overflow. "Get the size of heap and stack per process in Linux" and "current_thread_info() inline function in Linux kernel?" are of interest, but they are not as mature as they could be. I think another problem you may have is: which stack? Do you want the primary stack, a shadow stack, or the stack of a different thread? - Foreshow
@Foreshow good point, I have no idea. I want the above pair bt() + addr2line to work on any thread - Anatomist
If all you want is a stack trace, then see "How to automatically generate a stacktrace when my program crashes", "How to get a stack trace for C++ using gcc with line number information?" and friends. The stack trace is a different problem than the top of the stack and its address, however. - Foreshow
Answers I found so far all need debug symbols present during execution. My solution didn't, but it broke down when we started using ASLR - Anatomist

How can I get base address without access to argv? I may need to print traces before main()

There are a few ways:

  1. If /proc is mounted (which it almost always is), you could read the ELF header from /proc/self/exe.
  2. You could use dladdr1(), as Antti Haapala's answer shows.
  3. You could use _r_debug.r_map, which points to the linked list of loaded ELF images. The first entry in that list corresponds to a.out, and its l_addr contains the relocation you are looking for. This solution is equivalent to dladdr1, but doesn't require linking against libdl.
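
Option 1 can be sketched roughly as follows. This is a minimal sketch, not code from the answer: the names load_bias_from_proc and print_bias_early are made up for illustration, and it assumes Linux with /proc mounted, glibc's ElfW() macro from <link.h>, and that the ELF entry point is _start (the default unless the link was customized). A result of 0 is ambiguous: it means either a read failure or a binary that was not relocated.

```c
#include <elf.h>
#include <link.h>   /* ElfW() picks Elf32_* or Elf64_* for the target */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

extern char _start;  /* entry symbol supplied by the C runtime startup code */

/* Hypothetical helper: returns the load relocation of the main executable,
 * or 0 if /proc/self/exe could not be read or is not an ELF file. */
uintptr_t load_bias_from_proc(void)
{
    ElfW(Ehdr) header;
    uintptr_t bias = 0;
    FILE *file = fopen("/proc/self/exe", "rb");
    if (!file)
        return 0;
    if (fread(&header, 1, sizeof(header), file) == sizeof(header) &&
        memcmp(header.e_ident, ELFMAG, SELFMAG) == 0) {
        /* e_entry is the link-time address of _start (assumed default entry),
         * so its distance to the run-time address is the relocation. */
        bias = (uintptr_t)&_start - (uintptr_t)header.e_entry;
    }
    fclose(file);
    return bias;
}

/* Runs before main(), showing that argv is not needed.
 * __attribute__((constructor)) is a GCC/Clang extension. */
__attribute__((constructor))
static void print_bias_early(void)
{
    printf("relocation from /proc/self/exe: %#lx\n",
           (unsigned long)load_bias_from_proc());
}
```

Built as a PIE and linked into any program, the constructor prints the relocation before main() starts.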

Could you provide sample code for 3?

Sure:

#include <link.h>
#include <stdio.h>

extern char _start;
int main()
{
  uintptr_t relocation = _r_debug.r_map->l_addr;
  printf("relocation: %p, &_start: %p, &_start - relocation: %p\n",
         (void*)relocation, (void*)&_start, (void*)(&_start - relocation));
  return 0;
}

gcc -Wall -fPIE -pie t.c && ./a.out
relocation: 0x555d4995e000, &_start: 0x555d4995e5b0, &_start - relocation: 0x5b0
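
For the "before main()" part of the question, the same lookup can presumably be done from an initializer, since the dynamic loader builds the link map list before it runs any constructors. A sketch under that assumption, combining it with the bt() idea from the question (main_relocation, bt_relative and trace_before_main are hypothetical names, and __attribute__((constructor)) is a GCC/Clang extension):

```c
#include <execinfo.h>
#include <link.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: relocation of the main executable, taken from the
 * first entry of the loader's link map list (0 if the list is empty). */
uintptr_t main_relocation(void)
{
    struct link_map *map = _r_debug.r_map;
    return map ? (uintptr_t)map->l_addr : 0;
}

/* Hypothetical helper: print the current call stack as addresses relative
 * to the main executable. Returns the number of frames printed. */
int bt_relative(void)
{
    enum { MAX_STACK = 30 };
    void *frames[MAX_STACK];
    uintptr_t relocation = main_relocation();
    int size = backtrace(frames, MAX_STACK);
    for (int i = 0; i < size; ++i)
        printf("%#lx ", (unsigned long)((uintptr_t)frames[i] - relocation));
    printf("\n");
    return size;
}

/* Runs during initialization, before main() and without argv. */
__attribute__((constructor))
static void trace_before_main(void)
{
    bt_relative();
}
```

Frames that belong to shared libraries (libc, for instance) still come out as meaningless values, like the 0x29540f8ed09b in the question's output, because only the main executable's relocation is subtracted.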

Are both 2 and 3 equally portable?

I think they are about equally portable: dladdr1 is a GLIBC extension that is also present on Solaris. _r_debug predates Linux and would also work on Solaris (I haven't actually checked, but I believe it will). It may work on other ELF platforms as well.

Homeward answered 10/3, 2019 at 0:37 Comment(1)
Could you provide sample code for 3? Are both 2 and 3 equally portable? - Anatomist

This piece of code produces the same value as your base_addr on Linux:

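#define _GNU_SOURCE
#include <dlfcn.h>
#include <link.h>
#include <stdio.h>

extern char _start;

int main(void)
{
    Dl_info info;
    void *extra = NULL;
    // Ask for the link map of the object that contains _start.
    if (dladdr1(&_start, &info, &extra, RTLD_DL_LINKMAP) == 0)
        return 1;
    struct link_map *map = extra;
    printf("%#llx\n", (unsigned long long)map->l_addr);
    return 0;
}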

The dladdr1 manual page says the following of RTLD_DL_LINKMAP:

RTLD_DL_LINKMAP

Obtain a pointer to the link map for the matched file. The extra_info argument points to a pointer to a link_map structure (i.e., struct link_map **), defined in <link.h> as:

  struct link_map {
      ElfW(Addr) l_addr;  /* Difference between the
                             address in the ELF file and
                             the address in memory */
      char      *l_name;  /* Absolute pathname where
                             object was found */
      ElfW(Dyn) *l_ld;    /* Dynamic section of the
                             shared object */
      struct link_map *l_next, *l_prev;
                          /* Chain of loaded objects */
      /* Plus additional fields private to the
         implementation */
  };

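Note that -ldl is required to link against the dynamic loading routines on glibc versions older than 2.34; from 2.34 on they are part of libc itself, so the flag is harmless but unnecessary.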
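
Because dladdr1 accepts any address, the same call can be applied to every frame of a backtrace, so that frames inside shared libraries are made relative to their own object rather than to the main executable. A rough sketch, assuming glibc (resolve_frame and bt_per_object are hypothetical names, and the "(main executable)" label is just a placeholder for the typically empty l_name of the program itself):

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <execinfo.h>
#include <link.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: find the loaded object that contains addr and compute
 * the address relative to that object's l_addr. Returns 1 on success. */
int resolve_frame(void *addr, const char **object, uintptr_t *offset)
{
    Dl_info info;
    void *extra = NULL;
    if (dladdr1(addr, &info, &extra, RTLD_DL_LINKMAP) == 0 || extra == NULL)
        return 0;
    struct link_map *map = extra;
    /* l_name is typically an empty string for the main executable */
    *object = (map->l_name && map->l_name[0]) ? map->l_name
                                              : "(main executable)";
    *offset = (uintptr_t)addr - (uintptr_t)map->l_addr;
    return 1;
}

/* Hypothetical helper: print one "object+offset" line per stack frame. */
void bt_per_object(void)
{
    enum { MAX_STACK = 30 };
    void *frames[MAX_STACK];
    int size = backtrace(frames, MAX_STACK);
    for (int i = 0; i < size; ++i) {
        const char *object;
        uintptr_t offset;
        if (resolve_frame(frames[i], &object, &offset))
            printf("%s+%#lx\n", object, (unsigned long)offset);
        else
            printf("??+%p\n", frames[i]);
    }
}
```

Each offset can then be passed to addr2line together with the matching file, for example addr2line -e a.out for the main executable or addr2line -e with the library path printed on that line.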

Haag answered 8/3, 2019 at 23:57 Comment(1)
I am not sure there is any advantage to using dladdr1 with RTLD_DL_LINKMAP over just using _r_debug directly. - Homeward
