GNU gcc/ld - wrapping a call to symbol with caller and callee defined in the same object file
G

8

47

To clarify, my question refers to wrapping/intercepting calls from one function/symbol to another function/symbol when the caller and the callee are defined in the same compilation unit, using the GCC compiler and linker.

I have a situation resembling the following:

/* foo.c */
void bar(void); /* forward declaration so foo() can call bar() */

void foo(void)
{
  /* ... some stuff */
  bar();
}

void bar(void)
{
  /* ... some other stuff */
}

I would like to wrap calls to these functions, and I can do that (to a point) with ld's --wrap option: I implement __wrap_foo and __wrap_bar, which in turn call __real_foo and __real_bar, as ld's --wrap option expects.

gcc -Wl,--wrap=foo -Wl,--wrap=bar ...

The problem I'm having is that this only takes effect for references to foo and bar from outside of this compilation unit (and resolved at link time). That is, calls to foo and bar from other functions within foo.c do not get wrapped.

Calls from within the compilation unit get resolved before the linker's wrapping takes place.

I tried using objcopy --redefine-sym, but that only renames the symbols and their references.

I would like to replace calls to foo and bar (within foo.o) with calls to __wrap_foo and __wrap_bar (just as they get resolved in other object files by the linker's --wrap option) BEFORE I pass the *.o files to the linker with its --wrap options, and without having to modify foo.c's source code.

That way, the wrapping/interception takes place for all calls to foo and bar, and not just the ones taking place outside of foo.o.

Is this possible?

Greer answered 19/12, 2012 at 21:45 Comment(13)
You could probably solve your problem with find/replace in your editor, or using sed... - Kornegay
Are you suggesting simply hacking the obj with an editor? - Greer
I'm suggesting you bulk-modify the source code to replace the calls to the function with those to a wrapper, or with something that you can #define to be either the real function or the wrapper. - Kornegay
Ok, so how would I go about using sed and an editor to modify the object file so the calls to, say, foo, get replaced with an offset to a symbol, say __real_foo, that will be resolved later by the linker? I ask in earnest, btw. - Greer
I'd recommend that you modify the source code, not the object file. - Kornegay
If you must do it to the object file, you'd probably need to overwrite the start of the function with a call to some wrapping logic, but this would require understanding the platform-specific function call, register save, etc. sequence and hoping that it doesn't change. Just a find-and-replace on the address won't work since addresses are often relative. You could pattern-match whatever call instructions you think the compiler will use, work out their targets and change them, but this gets ugly fast. - Kornegay
Sorry if I was being sarcastic (frustration got the best of me). If there are no ready-made tools for this (like objcopy), then I'm afraid I will have to follow this route (I will have to decide if the ROI is sufficient to justify going this way). Thanks. - Greer
If you can modify the source code / build commands to implement the sort of fix you were hoping for, why can't you simply solve it at the level of the function name in the source? Or move the function to its own compilation unit? - Kornegay
Contractual/process/red-tape problems. We need to perform black-box testing of a subsystem A to be linked with another subsystem B (with the latter to be linked as-is, in pre-compiled form). And we need to get some tracing of calls a bit different from what we can get with gprof or callgrind. Changing the source is easy, but so procedurally/red-tape painful that I'm actually considering whether it is worth the trouble of hacking the objs in a manner that is automated and cheap. Let's just say that is not the type of thing normal-thinking people would do under a sensible, normal-looking process :P - Greer
I'm not sure I see the difference between a script which automatically alters a working copy of the source and one that does a much harder-to-prove-out modification of the object. #618054 presents some variations. If it's just for profiling, can you do something with breakpoint debugger functionality? - Kornegay
I agree with you. It's just one of those client/contract combos that contractually demand things be done a certain way even when it makes no sense :/ The debugger option might be a possible way to go (break at given points, print the stack, resume...) - Greer
This is not exactly what you asked, but I came here looking for a slightly different problem: How do I replace a function in an already compiled object file so that callers inside the existing object file refer to a new function from another file? The answer is to use objcopy --weaken-symbol=called_function and link with a new object that defines called_function(). - Diarrhoea
It would be interesting to know if someone managed to achieve the goal using --wrap. I didn't. But I found that the goal may be achieved with run-time function wrapping, using the LD_PRELOAD run-time function replacement technique. - Camara
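The source-level approach suggested in the comments above (replace the calls with something that can be #defined to be either the real function or a wrapper) can be sketched as follows. This is only an illustration and assumes the source may be edited, which was not the case for the asker. The names bar_impl, bar_traced, bar_calls_seen and the NO_TRACE switch are hypothetical and do not come from the thread.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical counter, used only to observe that the wrapper ran. */
int bar_calls_seen = 0;

/* The original body of bar(), renamed at the source level. */
void bar_impl(void)
{
    printf("bar: real work\n");
}

/* Hypothetical tracing wrapper that forwards to the original. */
void bar_traced(void)
{
    bar_calls_seen++;
    printf("trace: entering bar\n");
    bar_impl();
}

/* Compile with -DNO_TRACE to call the original directly. */
#ifdef NO_TRACE
#define bar() bar_impl()
#else
#define bar() bar_traced()
#endif

/* A caller in the same compilation unit: the macro redirects this call
 * too, which is the case ld's --wrap does not cover. */
void foo(void)
{
    bar();
}
```

In the default build, each call to foo() goes through bar_traced() and increments bar_calls_seen by one before the real body runs.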
H
23

You have to weaken and globalize the symbol using objcopy.

-W symbolname
--weaken-symbol=symbolname
    Make symbol symbolname weak. This option may be given more than once.
--globalize-symbol=symbolname
    Give symbol symbolname global scoping so that it is visible outside of the file in which it is defined. This option may be given more than once.

This worked for me

bar.c:

#include <stdio.h>

/* Replacement for foo(); the signature matches the original in foo.c. */
void foo(void){
  printf("Wrap-FU\n");
}

foo.c:

#include <stdio.h>

void foo(){
  printf("foo\n");
}

int main(){
  printf("main\n");
  foo();
}

Compile it

$ gcc -c foo.c bar.c 

Weaken the foo symbol and make it global, so that it is available to the linker again.

$ objcopy foo.o --globalize-symbol=foo --weaken-symbol=foo foo2.o

Now you can link your new obj with the wrap from bar.c

$ gcc -o nowrap foo.o #for reference
$ gcc -o wrapme foo2.o bar.o

Test

$ ./nowrap 
main
foo

And the wrapped one:

$ ./wrapme 
main
Wrap-FU
Haubergeon answered 7/9, 2017 at 19:28 Comment(1)
I tried this trick in the following case: 1- I have an SDK for an embedded platform which has a function I need to replace with another declaration. 2- I made the symbol weak and global again in the object file of the target library, using gcc-objcopy after compilation. The problem is that the build process includes creating an archive file (called core.a) which includes the old library object file. 3- I added a step that uses gcc-ar to delete the object file from core.a and replace it with the new one (with the weak symbol). As a result, the trick didn't succeed (multiple definition of ..). Help? - Unchartered
V
10

You can use __attribute__((weak)) before the implementation of the callee in order to let someone reimplement it without GCC yelling about multiple definitions.

For example suppose you want to mock the world function in the following hello.c code unit. You can prepend the attribute in order to be able to override it.

#include "hello.h"
#include <stdio.h>

__attribute__((weak))
void world(void)
{
    printf("world from lib\n");
}

void hello(void)
{
    printf("hello\n");
    world();
}

And you can then override it in another compilation unit. Very useful for unit testing/mocking:

#include <stdio.h>
#include "hello.h"

/* overrides the weak definition in hello.c */
void world(void)
{
    printf("world from main.c\n");
}

int main(void)
{
    hello();
    return 0;
}
Venn answered 9/11, 2017 at 12:5 Comment(2)
That's a nice idea. Will use next time. Unfortunately, at the time I asked the question, I was dealing with software that I could not modify to add such an attribute. This is good, however, and I will certainly keep it in my toolbox in the future. - Greer
Well yes, if you cannot modify the source then @PeterHuewe's answer is the solution, using objcopy. If you can modify the source then this one seems easier to set up. - Venn
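A small single-file sketch of the default half of the weak-attribute technique: when no other object file supplies a strong definition, the weak one is simply used. The names world_text and hello_text are hypothetical, chosen so the result can be checked without capturing output.

```c
#include <assert.h>
#include <string.h>

/* Weak default: a strong world_text() in another object file, if linked,
 * replaces this one without a multiple-definition error. */
__attribute__((weak))
const char *world_text(void)
{
    return "world from lib";
}

/* Caller in the same compilation unit as the weak definition. */
const char *hello_text(void)
{
    return world_text();
}
```

To override it, define a non-weak const char *world_text(void) in a separate source file and link both objects; hello_text() should then return the overriding string, in the same way the world() override works in the answer above.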
D
7
#include <stdio.h>
#include <stdlib.h>

//gcc -ggdb -o test test.c -Wl,-wrap,malloc
void* __real_malloc(size_t bytes);

int main()
{
   int *p = NULL;
   int i = 0;

   p = malloc(100*sizeof(int));

   for (i=0; i < 100; i++)
       p[i] = i;

   free(p);
   return 0;
}

void* __wrap_malloc(size_t bytes)
{
      return __real_malloc(bytes);
}

And then just compile this code and debug. When you call malloc, the function actually called will be __wrap_malloc, and __real_malloc will call the real malloc.

I think this is the way to intercept the calls.

Basically it's the --wrap option provided by ld.

Drupe answered 27/2, 2015 at 22:23 Comment(2)
I know this option. It is pretty much what I use. This does not work in the scenario I mentioned. See my original question again. - Greer
The example in this answer shows how to use --wrap, but it does not show the case where the wrapped function (malloc in this case) is defined in the same compilation unit as the call, which is the core of the original question. So it's not really an answer to the question and I'll downvote this answer. - Haynes
R
6

This appears to be working as documented:

 --wrap=symbol
       Use a wrapper function for symbol. 
       Any undefined reference to symbol will be resolved to "__wrap_symbol". ...

Note the word "undefined" above. When the linker processes foo.o, bar() is not undefined, so the linker does not wrap it. I am not sure why it's done that way, but there probably is a use case that requires this.

Rocker answered 23/12, 2012 at 15:43 Comment(3)
I use this to wrap calls across compilation units (see my original question for an example). However, it does not work for intercepting/wrapping calls from within compilation units (which is what I'm interested in intercepting). Apparently, within the compilation units, the references are already resolved. By the time the linker comes in, it is already too late to wrap those calls using the --wrap linker option. - Greer
@Greer "it is already too late" -- no, it isn't. The linker could easily change the call target; it just doesn't (for reasons I don't know). - Rocker
Well, when I say "it is too late", I say so within the context of GNU ld (not within the context of linkers in general). Yes, a linker could easily change that call target. But the linker in question (GNU ld) does not. And the reason is that it limits itself to replacing/rewriting the references that are not resolved within the compilation unit. Because of that, I say the linking stage is already too late for GNU ld (though it would not be too late for a smarter linker). - Greer
H
5

I have tried the solution from @PeterHuewe and it works, but it doesn't allow calling the original function from the wrapper. To allow this, my solution is the following:

foo.c


#include <stdio.h>

void foo(){
    printf("This is real foo\n");
}

int main(){
    printf("main\n");
    foo();
}

foo_hook.c

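#include <stdio.h>

/* Alias for the original foo(), added to foo_hooked.o by objcopy --add-symbol. */
void real_foo(void);

/* Hook with the same signature as the original foo(). */
void foo(void){
  printf("HOOK: BEFORE\n");
  real_foo();
  printf("HOOK: AFTER\n");
}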

Makefile

all: link

link: hook
    gcc -o wo_hook foo.o
    gcc -o w_hook foo_hooked.o foo_hook.o

hook: build_o
    objcopy \
    foo.o \
    --add-symbol real_foo=.text:$(shell  objdump -t foo.o | grep foo | grep .text | cut -d' ' -f 1),function,global \
    --globalize-symbol=foo \
    --weaken-symbol=foo \
    foo_hooked.o

build_o:
    gcc -c foo.c foo_hook.c

clean:
    -rm w_hook wo_hook *.o

Example

virtualuser@virtualhost:~/tmp/link_time_hook$ make
gcc -c foo.c foo_hook.c
objcopy foo.o \
--add-symbol real_foo=.text:0000000000000000,function,global \
--globalize-symbol=foo \
--weaken-symbol=foo \
foo_hooked.o
gcc -o wo_hook foo.o
gcc -o w_hook foo_hooked.o foo_hook.o
virtualuser@virtualhost:~/tmp/link_time_hook$ ls
Makefile  foo.c  foo.o  foo_hook.c  foo_hook.o  foo_hooked.o  w_hook  wo_hook
virtualuser@virtualhost:~/tmp/link_time_hook$ ./w_hook
main
HOOK: BEFORE
This is real foo
HOOK: AFTER
virtualuser@virtualhost:~/tmp/link_time_hook$
virtualuser@virtualhost:~/tmp/link_time_hook$ ./wo_hook
main
This is real foo
virtualuser@virtualhost:~/tmp/link_time_hook$
Hunkydory answered 9/3, 2020 at 16:1 Comment(2)
Thanks! I haven't touched this problem in ages :) - Greer
This script has a bug, since it was tested only with a function at offset 0. objcopy does not interpret the bare value as hexadecimal, while objdump prints hexadecimal, so "0x" must be prepended, e.g. --add-symbol real_foo=.text:0x$(shell objdump -t foo.o | grep foo | grep .text | cut -d' ' -f 1),function,global, which expands to --add-symbol real_foo=.text:0x0000000000000000,function,global here. That makes the script work outside of this special zero case. - Braxy
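The hexadecimal issue raised in the comment above can be illustrated with the C library's strtoull. This is only an analogy, under the assumption that objcopy parses the value with automatic base detection in the way strtoull does with base 0; objcopy's actual parser is not shown in this thread. parse_address is a hypothetical helper.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: parse an address string in the given base.
 * Base 0 means "detect from the prefix": 0x is hexadecimal, a leading 0
 * is octal, anything else is decimal. */
unsigned long long parse_address(const char *text, int base)
{
    return strtoull(text, NULL, base);
}
```

For a function at offset 0x40, objdump -t prints 0000000000000040. Read as hexadecimal, or with a 0x prefix and automatic detection, parse_address yields 64. With automatic detection and no prefix, the leading zero makes strtoull read the string as octal, which yields 32. The all-zero address has the same value in every base, which would explain why the original test did not expose the problem.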
J
4

You can achieve what you want if you use --undefined with --wrap

  -u SYMBOL, --undefined SYMBOL
                              Start with undefined reference to SYMBOL
Judsonjudus answered 18/11, 2013 at 2:43 Comment(1)
Where would you add this option? Could you maybe show a more complete example? I did a quick try adding -u bar on the linker command line along with -Wl,--wrap=bar, but that did not seem to change anything. It probably makes foo undefined at the start, but not inside foo.c... - Haynes
S
1

Although this thread was started over 11 years ago, I wanted to share a solution for anyone who might encounter a similar issue. I've written a tool that can wrap all symbols, regardless of whether they are undefined in the current compilation unit or not.

The tool does this using the LIEF Python module (https://github.com/lief-project/LIEF) to modify symbols and relocations in the object files.

Here is the link to the GitHub repo: https://github.com/wafgo/WrapMaster/tree/master

Saari answered 10/4 at 12:59 Comment(0)
G
0

With linker

$ /usr/bin/ld --version
GNU ld (GNU Binutils for Ubuntu) 2.30

I was able to solve the problem using the defsym option:

--defsym SYMBOL=EXPRESSION  Define a symbol

Instead of

gcc -Wl,--wrap=foo -Wl,--wrap=bar ...

try

gcc -Wl,--defsym,foo=__wrap_foo -Wl,--defsym,bar=__wrap_bar ...

I did not try to define the __real_* symbols as well.

Gallinaceous answered 11/8, 2020 at 2:40 Comment(1)
Interesting, it seems like --defsym just allows overriding existing symbols in the .o files (i.e. there's no multiple definition error here from the --defsym and the foo defined in the .o file). It seems like --defsym is essentially handled the same as assignments in the linker script, which may behave the same. However, I believe this approach does not allow also defining __real_* symbols: as soon as you override the foo symbol, I think you'll lose access to the original symbol... - Haynes
