An alternative for the deprecated __malloc_hook functionality of glibc

3

39

I am writing a memory profiler for C, and for that I am intercepting calls to the malloc, realloc and free functions via the glibc malloc hooks (__malloc_hook and friends). Unfortunately, these are deprecated because of their poor behavior in multithreaded environments. I could not find a document describing the recommended alternative for achieving the same thing. Can someone enlighten me?

I've read that a simple #define malloc(s) malloc_hook(s) would do the trick, but that does not work with the system setup I have in mind, because it is too intrusive to the original code base to be suitable for use in a profiling / tracing tool. Having to manually change the original application code is a killer for any decent profiler. Optimally, the solution I am looking for should be enabled or disabled just by linking to an optional shared library. For example, my current setup uses a function declared with __attribute__ ((constructor)) to install the intercepting malloc hooks.

Thanks

Portend answered 23/7, 2013 at 7:2 Comment(2)
Was it deprecated? See sourceware.org/ml/libc-alpha/2011-07/msg00136.html (2011, "glibc malloc hook deprecation considered harmful"). The hooks are still there: sourceware.org/git/?p=glibc.git;a=blob;f=malloc/hooks.c;hb=HEAD. Only the __malloc_initialize_hook variable has been marked as deprecated since glibc 2.24; check the current man page, man7.org/linux/man-pages/man3/malloc_hook.3.html, section NOTES. - Orientate
The man page states that these functions are deprecated; only __malloc_initialize_hook has been removed since then. - Portend
62

After trying some things, I finally managed to figure out how to do this.

First of all, in glibc, malloc is defined as a weak symbol, which means that it can be overridden by the application or a shared library. Hence, LD_PRELOAD is not necessarily needed. Instead, I implemented the following function in a shared library:

void*
malloc (size_t size)
{
  [ ... ]
}

This function gets called by the application instead of glibc's malloc.

Now, to be equivalent to the __malloc_hooks functionality, a couple of things are still missing.

1.) the caller address

In addition to the original parameters to malloc, glibc's __malloc_hooks also provide the address of the calling function, which is actually the return address that malloc will return to. To achieve the same thing, we can use the __builtin_return_address function that is available in gcc. I have not looked into other compilers, because I am limited to gcc anyway, but if you happen to know how to do such a thing portably, please drop me a comment :)

Our malloc function now looks like this:

void*
malloc (size_t size)
{
  void *caller = __builtin_return_address(0);
  [ ... ]
}

2.) accessing glibc's malloc from within your hook

As I am limited to glibc in my application, I chose to use __libc_malloc to access the original malloc implementation. Alternatively, dlsym(RTLD_NEXT, "malloc") can be used, with the pitfall that dlsym may itself call calloc on its first call; if calloc is hooked as well, this can result in infinite recursion and eventually a segfault.
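A common way around the dlsym pitfall is to serve the allocations made during symbol lookup from a small static buffer. The following is only a sketch under assumptions: the names real_calloc, real_free, bootstrap and resolve_real are made up for illustration, the buffer size is arbitrary, the code is not thread-safe, and realloc is not covered.

```c
#define _GNU_SOURCE   /* for RTLD_NEXT on glibc; must precede all includes */
#include <dlfcn.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names: pointers to the next calloc/free in lookup order. */
static void *(*real_calloc)(size_t, size_t);
static void  (*real_free)(void *);

/* Zero-initialised arena, used only while dlsym() is still running. */
static _Alignas(16) unsigned char bootstrap[4096];
static size_t bootstrap_used;
static int resolving;   /* nonzero while dlsym() is in progress */

static void resolve_real(void)
{
    resolving = 1;
    real_calloc = (void *(*)(size_t, size_t))dlsym(RTLD_NEXT, "calloc");
    real_free   = (void (*)(void *))dlsym(RTLD_NEXT, "free");
    resolving = 0;
}

static int from_bootstrap(const void *p)
{
    uintptr_t a = (uintptr_t)p, lo = (uintptr_t)bootstrap;
    return a >= lo && a < lo + sizeof bootstrap;
}

void *calloc(size_t nmemb, size_t size)
{
    if (real_calloc == NULL) {
        if (resolving) {
            /* Re-entered from dlsym(): hand out arena memory instead. */
            size_t left = sizeof bootstrap - bootstrap_used;
            if (size != 0 && nmemb > left / size)
                return NULL;
            size_t bytes = (nmemb * size + 15) & ~(size_t)15;
            if (bytes == 0)
                bytes = 16;
            if (bytes > left)
                return NULL;
            void *p = bootstrap + bootstrap_used;
            bootstrap_used += bytes;
            return p;
        }
        resolve_real();
        if (real_calloc == NULL)
            return NULL;        /* lookup failed */
    }
    return real_calloc(nmemb, size);
}

void free(void *ptr)
{
    if (ptr == NULL || from_bootstrap(ptr))
        return;                 /* arena memory is never handed back */
    if (real_free == NULL) {
        if (resolving)
            return;             /* leak rather than re-enter dlsym() */
        resolve_real();
        if (real_free == NULL)
            return;
    }
    real_free(ptr);
}
```

On older glibc versions this needs -ldl at link time. Using __libc_malloc, as in the rest of this answer, avoids the problem entirely at the cost of being glibc-specific.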

complete malloc hook

My complete hooking function now looks like this:

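#include <stddef.h>

/* glibc's real allocator; exported by glibc but not declared in a public header */
extern void *__libc_malloc(size_t size);

void *my_malloc_hook(size_t size, void *caller);

/* 0 = pass straight through to glibc; set to 1 (for example from a
   constructor function) to route calls through my_malloc_hook */
int malloc_hook_active = 0;

void*
malloc (size_t size)
{
  void *caller = __builtin_return_address(0);
  if (malloc_hook_active)
    return my_malloc_hook(size, caller);
  return __libc_malloc(size);
}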

where my_malloc_hook looks like this:

void*
my_malloc_hook (size_t size, void *caller)
{
  void *result;

  // deactivate hooks for logging
  malloc_hook_active = 0;

  result = malloc(size);

  // do logging
  [ ... ]

  // reactivate hooks
  malloc_hook_active = 1;

  return result;
}
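Note that malloc_hook_active has to be set to 1 somewhere before my_malloc_hook is ever reached. One way to do that, in line with the __attribute__ ((constructor)) setup mentioned in the question, is sketched below. The function names profiler_init and profiler_fini are made up for illustration, and the flag is redefined here only to keep the snippet self-contained.

```c
/* Mirrors the malloc_hook_active flag used by the malloc wrapper. */
int malloc_hook_active = 0;

/* Runs when the library (or executable) is loaded, before main(). */
__attribute__((constructor))
static void profiler_init(void)
{
    malloc_hook_active = 1;
}

/* Runs at unload or normal exit: stop tracing from here on. */
__attribute__((destructor))
static void profiler_fini(void)
{
    malloc_hook_active = 0;
}
```

This way tracing is switched on simply by linking against the library, without touching the application code.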

Of course, the hooks for calloc, realloc and free work similarly.
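As a rough sketch of what "similarly" could look like, the wrappers below forward to glibc's __libc_calloc, __libc_realloc and __libc_free. Everything beyond those three glibc symbols is an assumption made for illustration: the per-thread guard in_hook (used here instead of the global flag so that threads do not trip over each other), the record helper, and the counters that stand in for real logging.

```c
#include <stddef.h>

/* glibc entry points; exported by glibc but not declared in a public header. */
extern void *__libc_calloc(size_t nmemb, size_t size);
extern void *__libc_realloc(void *ptr, size_t size);
extern void  __libc_free(void *ptr);

/* Per-thread guard (GCC/Clang __thread): nonzero while inside a hook.
   initial-exec avoids lazy TLS allocation, which could itself call malloc. */
static __thread int in_hook __attribute__((tls_model("initial-exec"))) = 0;

/* Hypothetical counters standing in for real logging. */
unsigned long calloc_calls, realloc_calls, free_calls;

static void record(unsigned long *counter, void *caller)
{
    if (in_hook)
        return;             /* allocation made while logging: do not log it */
    in_hook = 1;
    __atomic_fetch_add(counter, 1, __ATOMIC_RELAXED);
    (void)caller;           /* real logging of the caller address goes here */
    in_hook = 0;
}

void *calloc(size_t nmemb, size_t size)
{
    void *caller = __builtin_return_address(0);
    void *result = __libc_calloc(nmemb, size);
    record(&calloc_calls, caller);
    return result;
}

void *realloc(void *ptr, size_t size)
{
    void *caller = __builtin_return_address(0);
    void *result = __libc_realloc(ptr, size);
    record(&realloc_calls, caller);
    return result;
}

void free(void *ptr)
{
    void *caller = __builtin_return_address(0);
    __libc_free(ptr);
    record(&free_calls, caller);
}
```

Because the guard is checked inside record, any allocation triggered by the logging code itself goes straight to glibc without being logged again.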

dynamic and static linking

With these functions, dynamic linking works out of the box. Linking the .so file containing the malloc hook implementation results in all calls to malloc, both from the application and from libraries, being routed through my hook. Static linking is problematic though. I have not yet wrapped my head around it completely, but in static linking malloc is not a weak symbol, resulting in a multiple definition error at link time.

If you need static linking for whatever reason, for example translating function addresses in 3rd party libraries to code lines via debug symbols, then you can link these 3rd party libs statically while still linking the malloc hooks dynamically, avoiding the multiple definition problem. I have not yet found a better workaround for this. If you know one, feel free to leave me a comment.

Here is a short example:

gcc -o test test.c -lmalloc_hook_library -Wl,-Bstatic -l3rdparty -Wl,-Bdynamic

3rdparty will be linked statically, while malloc_hook_library will be linked dynamically. This gives the expected behaviour, and the addresses of functions in 3rdparty can be translated via the debug symbols in test. Pretty neat, huh?

Conclusion

The techniques above describe a non-deprecated, pretty much equivalent approach to __malloc_hooks, but with a couple of mean limitations:

__builtin_return_address is a gcc extension (compatible compilers such as Clang also provide it)

__libc_malloc only works with glibc

dlsym(RTLD_NEXT, [...]) is a GNU extension in glibc

the linker flags -Wl,-Bstatic and -Wl,-Bdynamic are specific to the GNU binutils.

In other words, this solution is utterly non-portable and alternative solutions would have to be added if the hooks library were to be ported to a non-GNU operating system.

Portend answered 25/7, 2013 at 6:15 Comment(7)
Have you had any issues using this technique with valgrind? I've seen strange problems when the two are combined. - Goles
@meowsqueak I didn't try that, but valgrind tends to do strange things. - Portend
Hey @AndreasGrapentin, I've used your method to write a general purpose memory heap checker called, for lack of a better name, MEM_debug. Thanks for sharing it! - Prober
@Prober hey, thanks! I'm happy to hear something useful came out of this :) - Portend
@AndreasGrapentin "in glibc, malloc is defined as a weak symbol": which document says so? The code (malloc.c) does not seem to define it that way. - Gassman
I think int malloc_hook_active = 0; should be int malloc_hook_active = 1;, otherwise my_malloc_hook will never be called. - Marasmus
See also #71882926 - Sensibility
2

You can use LD_PRELOAD and dlsym. See "Tips for malloc and free" at http://www.slideshare.net/tetsu.koba/presentations

Armored answered 23/7, 2013 at 8:5 Comment(3)
This is cool, I'll definitely try it. Although the need to set LD_PRELOAD explicitly bugs me. - Portend
Also, this does not seem to work for statically linked binaries :( - Portend
Unfortunately, dlsym calls calloc in certain cases. - Kayceekaye
1

Just managed to NDK build code containing __malloc_hook.

Looks like it's been re-instated in Android API v28, according to https://android.googlesource.com/platform/bionic/+/master/libc/include/malloc.h, esp:

extern void* (*volatile __malloc_hook)(size_t __byte_count, const void* __caller) __INTRODUCED_IN(28);
Samalla answered 28/2, 2019 at 6:45 Comment(0)
