Why won't LD_PRELOAD work with Python?

Using function interposition for open() with Python doesn't seem to work after the first few calls. I suspect Python is doing some kind of initialization, or something is temporarily bypassing my function.

Here the open call is clearly hooked:

$ cat a
hi
$ LD_PRELOAD=./libinterpose_python.so cat a
sandbox_init()
open()
hi

Here it happens once during Python initialization:

$ LD_PRELOAD=./libinterpose_python.so python
sandbox_init()
Python 2.7.2 (default, Jun 12 2011, 20:20:34) 
[GCC 4.6.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
open()
>>> 
sandbox_fini()

Here it doesn't happen at all, and there's no error to indicate the file handle had write privileges removed:

$ LD_PRELOAD=./libinterpose_python.so python3 -c 'b = open("a", "w"); b.write("hi\n"); b.flush()'
sandbox_init()
sandbox_fini()

The code is here. Build with make -f Makefile.interpose_python.

A full solution is given here.

Barbour answered 21/6, 2011 at 7:26 Comment(12)
One question, though this gets you no closer to solving your problem... Why don't you set up next_open in sandbox_init?Groundspeed
Is it possible that Python is statically compiled?Pasta
@Omnifarious: I was so paranoid that I was doing something wrong that I basically copied an example from the net verbatim. I definitely intended to do it that way, however.Barbour
@X-Istence: If that's the default mode for Python, then you've nailed it, but it seems unlikely. Not much statically compiles libc, but I'll check.Barbour
@Matt Joiner: I was just thinking out loud =). Seems @zvrba has figured it out =)Pasta
Linux really needs a way to interpose your own system call handling layer when launching a process. I've come to the conclusion that the system call API is a singleton with all the attendant headaches and security risks.Groundspeed
@Omnifarious: What do you mean? You want to interpose without using LD_PRELOAD?Barbour
@Matt Joiner: Yes. LD_PRELOAD is an unreliable way to interpose. Someone could just invoke the system call directly using the appropriate assembly instructions. I want OSes to be more capability-based: a program shouldn't have access to anything it wasn't given by its runtime environment, and that should be enforced at the OS level.Groundspeed
@Omnifarious: Ptrace... Also I've seen sandboxing that hooks the system call interface somehow.Barbour
@Omnifarious: And here is a project that uses it, bathe in teh glory: fakeroot-ng.lingnu.com/index.php/PTRACE_LD_PRELOAD_comparisonBarbour
@MattJoiner: Your Solution section should go in an Answer.Johnnie
@bukzor: Done, thanks. https://mcmap.net/q/1193912/-why-won-39-t-ld_preload-work-with-pythonBarbour
P
8

There are both open() and open64() functions; you might need to redefine both.
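A minimal sketch of what redefining both could look like, assuming glibc on Linux and the common dlsym(RTLD_NEXT, ...) forwarding pattern. This is not the asker's libinterpose_python.so: the helper names forward_open and needs_mode are made up for illustration, and the wrappers only log the call and pass it through unchanged.

```c
#define _GNU_SOURCE          /* RTLD_NEXT, open64(), O_TMPFILE (glibc) */
#include <dlfcn.h>
#include <errno.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
#include <sys/types.h>

typedef int (*open_fn)(const char *, int, ...);

/* Hypothetical helper: open()/open64() take a third argument only when
 * a file may be created. */
static int needs_mode(int flags)
{
#ifdef O_TMPFILE
    if ((flags & O_TMPFILE) == O_TMPFILE)
        return 1;
#endif
    return (flags & O_CREAT) != 0;
}

/* Hypothetical helper: log the call, then hand it to the next definition
 * of `symbol` in the lookup order (normally libc's). */
static int forward_open(const char *symbol, const char *path,
                        int flags, mode_t mode)
{
    open_fn next = (open_fn)dlsym(RTLD_NEXT, symbol);
    if (next == NULL) {
        errno = ENOSYS;
        return -1;
    }
    fprintf(stderr, "%s()\n", symbol);
    return next(path, flags, mode);
}

/* Callers linked against the plain open symbol land here. */
int open(const char *path, int flags, ...)
{
    mode_t mode = 0;
    if (needs_mode(flags)) {
        va_list ap;
        va_start(ap, flags);
        mode = (mode_t)va_arg(ap, int);
        va_end(ap);
    }
    return forward_open("open", path, flags, mode);
}

/* Callers built with large file support reference open64 by name, so it
 * needs its own wrapper even though it ends up in the same system call. */
int open64(const char *path, int flags, ...)
{
    mode_t mode = 0;
    if (needs_mode(flags)) {
        va_list ap;
        va_start(ap, flags);
        mode = (mode_t)va_arg(ap, int);
        va_end(ap);
    }
    return forward_open("open64", path, flags, mode);
}
```

Built as a shared object (for example with gcc -shared -fPIC, plus -ldl on older glibc) and loaded through LD_PRELOAD, this would print a line for each name the target program actually calls. Other entry points such as openat() or fopen() are not covered by this sketch.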

Pathless answered 21/6, 2011 at 10:33 Comment(1)
This is totally going to be what it is!Barbour
C
2

You should be able to find out what your python process is actually doing by running it under strace (probably without your pre-load).

My python3.1 (on AMD64) does appear to use open:

axa@ares:~$ strace python3.1 -c 'open("a","r+")'
...
open("a", O_RDWR)                       = -1 ENOENT (No such file or directory)
Caulfield answered 21/6, 2011 at 10:44 Comment(5)
Curiously, it also attempts to open a file called <string>, and attempts to use it as a TTY if it exists...Caulfield
I've actually tried this already. It still uses open as expected, and none of my changes appear to take effect either. They do take effect in some other programs, however (and amusing crashes abound).Barbour
That sounds extremely strange; mine doesn't. You should have that looked at.Pintsize
Strace shows system calls, while LD_PRELOAD affects library functions. They are often closely related (the open() libc function makes the open() syscall), but nothing prevents the open() syscall from being made by some other library function. E.g. opendir() uses the open() syscall too.Sidewheel
@Jacek Konieczny: That explains a lot. The open64 calls are converted into system calls as well. This is why the open64 library call maps to the open system call.Barbour
B
1

It turns out there is an open64() function:

$ objdump -T /lib32/libc.so.6  | grep '\bopen'
00064f10 g    DF .text  000000fc  GLIBC_2.4   open_wmemstream
000cc010 g    DF .text  0000007b  GLIBC_2.0   openlog
000bf6d0  w   DF .text  000000b6  GLIBC_2.1   open64
00094460  w   DF .text  00000055  GLIBC_2.0   opendir
0005f9b0 g    DF .text  000000d9  GLIBC_2.0   open_memstream
000bf650  w   DF .text  0000007a  GLIBC_2.0   open
000bf980  w   DF .text  00000081  GLIBC_2.4   openat
000bfb90  w   DF .text  00000081  GLIBC_2.4   openat64

The open64() function is a part of the large file extensions, and is equivalent to calling open() with the O_LARGEFILE flag.
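As a rough way to check that equivalence, the sketch below opens the same file both ways and compares the status flags reported by fcntl(F_GETFL). It assumes glibc on Linux; the function names flags_via_open64 and flags_via_open_largefile are invented for this example.

```c
#define _GNU_SOURCE          /* exposes open64() and O_LARGEFILE in glibc */
#include <fcntl.h>
#include <unistd.h>

/* Open `path` read-only through open64() and return the descriptor's
 * status flags, or -1 on failure. */
int flags_via_open64(const char *path)
{
    int fd = open64(path, O_RDONLY);
    if (fd < 0)
        return -1;
    int status = fcntl(fd, F_GETFL);
    close(fd);
    return status;
}

/* Same, but through open() with O_LARGEFILE added explicitly. */
int flags_via_open_largefile(const char *path)
{
    int fd = open(path, O_RDONLY | O_LARGEFILE);
    if (fd < 0)
        return -1;
    int status = fcntl(fd, F_GETFL);
    close(fd);
    return status;
}
```

If the two functions return the same value for the same path, the descriptors were opened with the same flags. The symbol names still differ, though, so an LD_PRELOAD library that only defines open() will not see callers that were linked against open64().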

Running the example code with the open64 section uncommented gives:

$ LD_PRELOAD=./libinterpose_python.so python3 -c 'b = open("a", "w"); b.write("hi\n"); b.flush()'
sandbox_init()
open64()
open64()
open64()
Traceback (most recent call last):
  File "<string>", line 1, in <module>
open64()
open64()
open64()
open64()
open64()
open64()
open64()
IOError: [Errno 9] Bad file descriptor
sandbox_fini()

This clearly shows all of Python's open calls, along with several propagated errors caused by the write flag being stripped from the calls.

Barbour answered 18/12, 2011 at 4:46 Comment(0)
