Interposing part of a shared object by soname

3

8

I’ve written a shared object that modifies the arguments to FreeType’s FT_Load_Glyph and FT_Render_Glyph functions, currently by interposing it with LD_PRELOAD and dlsym.
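For context, an interposer of this kind might look roughly like the sketch below. It is not the actual code: the typedefs are stand-ins for the FreeType ones so that the block builds without the FreeType headers, and `adjust_load_flags` (with its "force no hinting" policy) is a hypothetical helper invented for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes RTLD_NEXT in <dlfcn.h> on glibc */
#endif
#include <dlfcn.h>
#include <stddef.h>

/* Fallback in case _GNU_SOURCE was not in effect when <dlfcn.h> was
   first included; glibc and musl both use this value. */
#ifndef RTLD_NEXT
#define RTLD_NEXT ((void *) -1L)
#endif

/* Stand-ins for the FreeType typedefs so the sketch builds on its own;
   a real interposer would include the FreeType headers instead. */
typedef int FT_Error;
typedef unsigned int FT_UInt;
typedef int FT_Int32;
typedef struct FT_FaceRec_ *FT_Face;

#define FT_LOAD_NO_HINTING (1L << 1) /* same value as in freetype.h */

/* Hypothetical policy for illustration: always load glyphs unhinted. */
static FT_Int32 adjust_load_flags(FT_Int32 flags)
{
    return flags | (FT_Int32)FT_LOAD_NO_HINTING;
}

typedef FT_Error (*load_glyph_fn)(FT_Face, FT_UInt, FT_Int32);

/* Exported under the same name as the real function, so a preloaded
   copy of this object is found first during symbol resolution. */
FT_Error FT_Load_Glyph(FT_Face face, FT_UInt glyph_index, FT_Int32 load_flags)
{
    static load_glyph_fn real; /* cached lookup; not thread-safe */

    if (real == NULL)
        real = (load_glyph_fn)dlsym(RTLD_NEXT, "FT_Load_Glyph");
    if (real == NULL)
        return 1; /* FreeType is not loaded; any nonzero FT_Error is a failure */

    return real(face, glyph_index, adjust_load_flags(load_flags));
}
```

Built with something like `cc -shared -fPIC -o libftshim.so shim.c -ldl` (file names are placeholders), it would be activated with `LD_PRELOAD=./libftshim.so some-program`. FT_Render_Glyph can be wrapped the same way.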

This works fine, but I’m curious to know whether or not there’s a way to make these changes:

  • to all programs that use FreeType on a given host (running e.g. Debian);
  • without clobbering any programs that aren’t actually linked to FreeType;
  • without simply applying an LD_PRELOAD to all programs on the host;
  • without requiring any maintenance unless FreeType’s soname is changed; and
  • without modifying any of FreeType’s files, nor those of any programs on the host.

The only two “solutions” that I’ve been able to come up with are ugly hacks:

  • to LD_PRELOAD all programs, all of the time, which seems slow and fragile; or
  • to copy e.g. libfreetype.so.6.12.3 to libxxxxtype.so.6.12.3; then
    • patch the soname in libxxxxtype.so.6.12.3 to libxxxxtype.so.6;
    • link the interposing shared object against libxxxxtype.so.6; and
    • install the shared object as e.g. libfreetype.so.6.999.

I’d essentially like to transparently patch a couple of functions in a shared object, while letting the remaining functions through, without necessarily having access to the source of the shared object or the programs that use it. However, if I make a fake shared object with the soname libfreetype.so.6, I can’t see a clean way to link it to (or dlopen) the real libfreetype.so.6.

This is my first real experiment with shared libraries, so please bear with me if this question makes some incorrect assumptions, or just makes no sense.

Ced answered 10/5, 2016 at 21:0 Comment(2)
The solution based on renaming libfreetype.so.x.y.z seems to be the right way of doing this. Why do you describe it as ugly? - Gonick
I think my reasons were (a) I would have to maintain a copy of the real libfreetype.so that had a patched soname, especially keeping it up to date when a new (or the same) version of the libfreetype6 package is installed, and (b) I would be polluting the global “sonamespace” with a libxxxxtype.so, which is at least theoretically fragile because it would be impossible to come up with a name that absolutely no library author would ever use. (a) is mitigated by glorpen’s answer (in exchange for relying on an absolute path), and (b) can be mitigated to the point where it’s only theoretical. - Ced
3

Can you try to use uprobes to dynamically steal control from some functions?

Check http://www.brendangregg.com/blog/2015-06-28/linux-ftrace-uprobe.html

uprobes: user-level dynamic tracing, which was added to Linux 3.5 and improved in Linux 3.14. It lets you trace user-level functions; for example, the return of the readline() function from all running bash shells, with the returned string:

# ./uprobe 'r:bash:readline +0($retval):string'
Tracing uprobe readline (r:readline /bin/bash:0x8db60 +0($retval):string). Ctrl-C to end.
 bash-11886 [003] d... 19601837.001935: readline: (0x41e876 <- 0x48db60) arg1="ls -l"
 bash-11886 [002] d... 19601851.008409: readline: (0x41e876 <- 0x48db60) arg1="echo "hello world""
 bash-11886 [002] d... 19601854.099730: readline: (0x41e876 <- 0x48db60) arg1="df -h"
 bash-11886 [002] d... 19601858.805740: readline: (0x41e876 <- 0x48db60) arg1="cd .."
 bash-11886 [003] d... 19601898.378753: readline: (0x41e876 <- 0x48db60) arg1="foo bar"
^C
Ending tracing...
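As far as I understand it, the uprobe wrapper script ends up appending a one-line probe definition to the kernel's uprobe_events file. The sketch below builds and registers such a line from C. The function names are hypothetical, the tracefs path is an assumption (it may be /sys/kernel/tracing on newer systems), and registering needs root.

```c
#include <stdio.h>
#include <string.h>

/* Build one probe definition in the uprobe_events format:
     p:EVENT PATH:OFFSET [FETCHARGS]   (entry probe)
     r:EVENT PATH:OFFSET [FETCHARGS]   (return probe)
   Returns 0 on success, -1 if the buffer is too small. */
static int format_uprobe_spec(char *buf, size_t len, char type,
                              const char *event, const char *path,
                              unsigned long offset, const char *fetchargs)
{
    int has_args = fetchargs != NULL && strlen(fetchargs) > 0;
    int n = snprintf(buf, len, "%c:%s %s:0x%lx%s%s",
                     type, event, path, offset,
                     has_args ? " " : "", has_args ? fetchargs : "");

    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}

/* Append the definition to the kernel's list of uprobe events.
   The path assumes tracefs is reachable through debugfs. */
static int register_uprobe(const char *spec)
{
    FILE *f = fopen("/sys/kernel/debug/tracing/uprobe_events", "a");
    int ok;

    if (f == NULL)
        return -1;
    ok = fprintf(f, "%s\n", spec) > 0;
    return (fclose(f) == 0 && ok) ? 0 : -1;
}
```

Calling `format_uprobe_spec(buf, sizeof buf, 'r', "readline", "/bin/bash", 0x8db60, "+0($retval):string")` reproduces the definition shown in the trace header above. The new event still has to be enabled (events/uprobes/readline/enable under the same tracing directory) before anything is recorded. Note that this interface only traces; it does not by itself change what the probed function does.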

See also http://www.brendangregg.com/blog/2015-07-03/hacking-linux-usdt-ftrace.html

There are also other tools for tracing user-space functions, such as ftrace, SystemTap, DTrace, and LTTng. Some of them require recompiling the program with statically defined tracepoints, whereas uprobes provide "user-level dynamic tracing".

Some links about uprobes:

A uprobe handler receives a pt_regs structure. As the last link puts it: "Uprobes thus implements a mechanism by which a kernel function can be invoked whenever a process executes a specific instruction location." It also suggests that uprobes may replace some ptrace/gdb-based solutions, so it may be possible to change the execution of any program that hits an active uprobe by changing its eip/rip (PC) register.

You could also try other dynamic instrumentation tools, such as Pin or Dyninst, but they are designed for per-process use.

Bicentenary answered 29/6, 2016 at 2:54 Comment(5)
uprobes look very interesting; I’ll give them a shot! All three of the answers to this question are on point, but I’ll give you the bounty because yours is about a technique that not only has nothing to do with ld.so, but it’s also one that I’ve never heard of before. - Ced
I don’t know exactly how to implement such interposing with uprobes, but Gregg is here: stackoverflow.com/users/2603561/brendan-gregg and he may have some ideas. @Brendan Gregg, what do you think? - Bicentenary
I’ve seen a lot of people @tag other users on Stack Overflow. Does that (or writing a link to their profile) even do anything in terms of notifications? - Ced
See meta.stackexchange.com/questions/43019/… (found by searching the internet for “stackoverflow.com notify user”). My comment did not generate a notification, though; notifications are usually limited to users who have already been active on the page (answered or commented). You can notify him. - Bicentenary
Delan, I’ve just started experimenting with uprobes: I was able to uprobe libc.so functions (readdir) on Ubuntu 16.04 and get information about how applications use them with perf probe (http://linux.die.net/man/1/perf-probe) / trace-cmd. The documentation at redhat.com/archives/utrace-devel/2009-June/msg00024.html says: "A probe handler can modify the environment of the probed function -- e.g., by modifying data structures, or by modifying the contents of the pt_regs struct ... So Uprobes can be used, for example, to install a bug fix or to inject faults for testing". stap (SystemTap) may help too. - Bicentenary
2

Another solution would be to create a system-wide "overlay" for the library: a custom libfreetype that proxies the unmodified functions to the real library.

You have to make the custom library compatible with the real one. You can do that by using dlopen with an absolute path (e.g. dlopen("/usr/lib64/libfreetype.so.6")), copying the definitions of the real, exported functions and proxying them with dlsym. I think that for ease of maintenance you could even replace the proxied pointer argument types with a simple void*. You would only need to make changes when the FreeType functions change (argument count, function names).
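A minimal sketch of such a proxy follows, under a few assumptions: the path is the example one from above (Debian keeps the library in a multiarch directory instead), `resolve_from` and `PROXY_FAILURE` are hypothetical names, and only two of the many exported functions are shown.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Example path from this answer; adjust it for the actual system. */
#define REAL_FREETYPE "/usr/lib64/libfreetype.so.6"

/* Arbitrary nonzero value: FreeType treats any nonzero FT_Error as failure. */
#define PROXY_FAILURE 1

/* Open a library by absolute path and look up one symbol in it.
   Returns NULL if the library or the symbol cannot be found.
   Repeated dlopen calls on the same file only bump a reference count. */
static void *resolve_from(const char *path, const char *symbol)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);

    if (handle == NULL)
        return NULL;
    return dlsym(handle, symbol);
}

/* Proxy for FT_Error FT_Done_Face(FT_Face face), with the pointer
   argument declared as void * as suggested above. */
int FT_Done_Face(void *face)
{
    static int (*real)(void *); /* cached lookup; not thread-safe */

    if (real == NULL)
        real = (int (*)(void *))resolve_from(REAL_FREETYPE, "FT_Done_Face");
    return real != NULL ? real(face) : PROXY_FAILURE;
}

/* Proxy for FT_UInt FT_Get_Char_Index(FT_Face face, FT_ULong charcode).
   Integer arguments keep their real widths; only pointers become void *. */
unsigned int FT_Get_Char_Index(void *face, unsigned long charcode)
{
    static unsigned int (*real)(void *, unsigned long);

    if (real == NULL)
        real = (unsigned int (*)(void *, unsigned long))
            resolve_from(REAL_FREETYPE, "FT_Get_Char_Index");
    return real != NULL ? real(face, charcode) : 0; /* 0 = undefined character code */
}
```

Calling through a pointer whose parameter types differ from the real declaration is not strictly portable C, although it behaves as expected on the common Linux ABIs; including the real FreeType headers avoids the question at the cost of more maintenance.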

To create the library "overlay", you could install the custom library as e.g. "/opt/myapp/lib64/libfreetype.so.6", then add this directory to the dynamic linker's run-time search paths. You may have to create symlinks for other versions, or compile a new custom library if the original implementation changes. Whatever is needed to shadow the real library and keep other apps working :)

Google says that to change the run-time loading paths on Debian you simply have to edit /etc/ld.so.conf. Add the /opt/myapp/lib64 path at the beginning so that it is checked first, then run ldconfig to rebuild the linker cache. Now any app searching for FreeType should load your library; you can check this with ldd <path to app>.

I can think of just one case where this solution will not work: if an app loads a bundled libfreetype, or loads it by full path rather than by name.

Burglar answered 28/6, 2016 at 10:2 Comment(1)
Fair enough! I thought I had tried that already, and dlopen(3) did nothing when opening a library with the same soname, but I haven’t touched this project in over six weeks, so I could be making that up. It would be nice if dlopen(3) provided a way to open a library by soname “from the next path in the resolution order to the last one”, so that I don’t have to use an absolute path, but by that point I’d be scraping at the bottom of the barrel for ways to critique your approach. - Ced
2

“to LD_PRELOAD all programs, all of the time, which seems slow and fragile”

That's a good solution (for what you want). I don't see a better one.

  • It's not fragile. It provides information to the runtime linker in a documented way. You're not bonking on anything, pretending something isn't what it is. You're just altering the preference hierarchy for function-name resolution.

  • It's not slow. The linker has to do something sometime. It's got to check if LD_PRELOAD is defined, which in any case is a user-space operation. So it will follow that path, and load your library before doing a bunch of other work. I'd be astonished if the time was even measurable under normal circumstances.

There are two concerns I'd have, but they're orthogonal to the technique. The code actually has to work in all cases, and you have to dig into the process-creation framework a bit to make sure LD_PRELOAD really is defined everywhere. Other than that, ld.so defines its environment variables precisely for your intended use. Who's to argue?

Ourself answered 29/6, 2016 at 2:49 Comment(2)
You’re right. I’m glad I covered my rear with “seems”, because I didn’t get around to benchmarking execution times to get a concrete idea of the time it takes to load a library. My only other aversion was the thought that I’d be polluting the global namespace (as in “oh no, now every program has an FT_Load_Glyph!”), but I think that only matters if a program is (a) relying on the lack of existence of an FT_Load_Glyph to operate correctly, or (b) relying on having an FT_Load_Glyph that has nothing to do with FreeType, both of which seem like very contrived scenarios. - Ced
As for your concerns with the approach, I’ve been using /etc/ld.so.preload to avoid looking up how to configure PAM (or systemd or …) to globally define an environment variable, though I’m pretty sure that those are no more “standard” (in a POSIX sense) than /etc/ld.so.preload, which seems to be a glibc-specific feature. What do you mean by making sure that the code “[works] in all cases”? - Ced
