C extensions - how to redirect printf to a python logger?
Asked Answered
D

3

7

I have a simple C-extension(see example below) that sometimes prints using the printf function. I'm looking for a way to wrap the calls to the function from that C-extensions so that all those printfs will be redirected to my python logger.

hello.c:

#include <Python.h>

static PyObject* hello(PyObject* self)
{
   printf("example print from a C code\n");
   return Py_BuildValue("");
}

static char helloworld_docs[] =
   "helloworld(): Any message you want to put here!!\n";

static PyMethodDef helloworld_funcs[] = {
   {"hello", (PyCFunction)hello,
   METH_NOARGS, helloworld_docs},
   {NULL}
};

static struct PyModuleDef cModPyDem =
{
    PyModuleDef_HEAD_INIT,
    "helloworld",
    "Extension module example!",
    -1,
    helloworld_funcs
};

PyMODINIT_FUNC PyInit_helloworld(void)
{
    return PyModule_Create(&cModPyDem);
};

setup.py:

from distutils.core import setup, Extension
setup(name = 'helloworld', version = '1.0',  \
   ext_modules = [Extension('helloworld', ['hello.c'])])

to use first run

python3 setup.py install

and then:

import helloworld
helloworld.hello()

I want to be able to do something like this:

with redirect_to_logger(my_logger)
   helloworld.hello()

EDIT: I saw a number of posts showing how to silence the prints from C, but I wasn't able to figure out from it how can I capture the prints in python instead. Example of such post: Redirect stdout from python for C calls

I assume that this question didn't get much traction because I maybe ask too much, so I don't care about logging anymore... how can I capture the C prints in python? to a list or whatever.

EDIT So I was able to achieve somewhat a working code that does what I want - redirect c printf to python logger:

import select
import threading
import time
import logging
import re

from contextlib import contextmanager

from wurlitzer import pipes
from helloworld import hello


logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
ch = logging.StreamHandler()
ch.setLevel(logging.DEBUG)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
ch.setFormatter(formatter)
logger.addHandler(ch)

    

class CPrintsHandler(threading.Thread):
    def __init__(self, std, poll_std, err, poll_err, logger):
        super(CPrintsHandler, self).__init__()
        self.std = std
        self.poll_std = poll_std
        self.err = err
        self.poll_err = poll_err
        self.logger = logger
        self.stop_event = threading.Event()

    def stop(self):
        self.stop_event.set()

    def run(self):
        while not self.stop_event.is_set():
            # How can I poll both std and err at the same time?
            if self.poll_std.poll(1):
                line = self.std.readline()
                if line:
                    self.logger.debug(line.strip())

            if self.poll_err.poll(1):
                line = self.err.readline()
                if line:
                    self.logger.debug(line.strip())


@contextmanager
def redirect_to_logger(some_logger):
    handler = None
    try:
        with pipes() as (std, err):
            poll_std = select.poll()
            poll_std.register(std, select.POLLIN)
            poll_err = select.poll()
            poll_err.register(err, select.POLLIN)
            handler = CPrintsHandler(std, poll_std, err, poll_err, some_logger)
            handler.start()
            yield
    finally:
        if handler:
            time.sleep(0.1) # why do I have to sleep here for the foo prints to finish?
            handler.stop()
            handler.join()


def foo():
    logger.debug('logger print from foo()')
    hello()


def main():
    with redirect_to_logger(logger):
        # I don't want the logs from here to be redirected as well, only printf.
        logger.debug('logger print from main()')
        foo()


main()

But I have a couple of issues:

  1. The python logs are also being redirected and caught by the CPrintsHandler. Is there a way to avoid that?

  2. The prints are not exactly in the correct order:

    python3 redirect_c_example_for_stackoverflow.py

    2020-08-18 19:50:47,732 - root - DEBUG - example print from a C code

    2020-08-18 19:50:47,733 - root - DEBUG - 2020-08-18 19:50:47,731 - root - DEBUG - logger print from main()

    2020-08-18 19:50:47,733 - root - DEBUG - 2020-08-18 19:50:47,731 - root - DEBUG - logger print from foo()

Also, the logger prints all go to err, perhaps the way I poll them causes this order.

  1. I'm not that familiar with select in python and not sure if there is a way to poll both std and err at the same time and print whichever has something first.
Dimmer answered 27/7, 2020 at 10:0 Comment(2)
It's very easy to redirect Python stdout but that doesn't help you here. You can redirect C stdout to a file fairly easily but I think redirecting them to somewhere in memory is quite a bit harder. Are you using Windows/Mac/Linux - I suspect there are workable single-platform answers?Hafnium
I'm using Linux.Dimmer
E
5

On Linux you could use wurlitzer which would capture the output from fprint, e.g.:

from wurlitzer import pipes
with pipes() as (out, err):
    helloworld.hello()
out.read()
#'example print from a C code\n'

wurlitzer is based on this article of Eli Bendersky, the code from which you can use if you don't like to depend on third-party libraries.

Sadly, wurlitzer and the code from the article work only for Linux (and possible MacOS).

Here is a prototype (an improved version of the prototype can be installed from my github) for Windows using Eli's approach as Cython-extension (which probably could be translated to ctypes if needed):

%%cython

import io
import os

cdef extern from *:
    """
    #include <windows.h>
    #include <io.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <fcntl.h>

    int open_temp_file() {
        TCHAR lpTempPathBuffer[MAX_PATH+1];//path+NULL

        //  Gets the temp path env string (no guarantee it's a valid path).
        DWORD dwRetVal = GetTempPath(MAX_PATH,          // length of the buffer
                                     lpTempPathBuffer); // buffer for path 
        if(dwRetVal > MAX_PATH || (dwRetVal == 0))
        {
            return -1;
        }

        //  Generates a temporary file name. 
        TCHAR szTempFileName[MAX_PATH + 1];//path+NULL
        DWORD uRetVal = GetTempFileName(lpTempPathBuffer, // directory for tmp files
            TEXT("tmp"),     // temp file name prefix 
            0,                // create unique name 
            szTempFileName);  // buffer for name 
        if (uRetVal == 0)
        {
            return -1;
        }

        HANDLE tFile = CreateFile((LPTSTR)szTempFileName, // file name 
                GENERIC_READ | GENERIC_WRITE,      // first we write than we read 
                0,                    // do not share 
                NULL,                 // default security 
                CREATE_ALWAYS,        // overwrite existing
                FILE_ATTRIBUTE_TEMPORARY | FILE_FLAG_DELETE_ON_CLOSE, // "temporary" temporary file, see https://learn.microsoft.com/en-us/archive/blogs/larryosterman/its-only-temporary
                NULL);                // no template 

        if (tFile == INVALID_HANDLE_VALUE) {
            return -1;
        }

        return _open_osfhandle((intptr_t)tFile, _O_APPEND | _O_TEXT);
    }

    int replace_stdout(int temp_fileno)
    {
        fflush(stdout);
        int old;
        int cstdout = _fileno(stdout);

        old = _dup(cstdout);   // "old" now refers to "stdout"
        if (old == -1)
        {
            return -1;
        }
        if (-1 == _dup2(temp_fileno, cstdout))
        {
            return -1;
        }
        return old;
    }

    int restore_stdout(int old_stdout){
        fflush(stdout);

        // Restore original stdout
        int cstdout = _fileno(stdout);
        return _dup2(old_stdout, cstdout);
    }
    
    
    void rewind_fd(int fd) {
        _lseek(fd, 0L, SEEK_SET);
    }
    """
    int open_temp_file()
    int replace_stdout(int temp_fileno)
    int restore_stdout(int old_stdout)
    void rewind_fd(int fd)
    void close_fd "_close" (int fd)
    
cdef class CStdOutCapture():
    cdef int tmpfile_fd
    cdef int old_stdout_fd
    def start(self): #start capturing
        self.tmpfile_fd = open_temp_file()
        self.old_stdout_fd = replace_stdout(self.tmpfile_fd)
    
    def stop(self): # stops capturing, frees resources and returns the content
        restore_stdout(self.old_stdout_fd)
        rewind_fd(self.tmpfile_fd) # need to read from the beginning
        buffer = io.TextIOWrapper(os.fdopen(self.tmpfile_fd, 'rb'))
        result = buffer.read()
        close_fd(self.tmpfile_fd)
        return result
           

And now:

b = CStdOutCapture()
b.start()
helloworld.hello()
out = b.stop()
print("HERE WE GO:", out)
# HERE WE GO: example print from a C code
Ephemeron answered 9/8, 2020 at 16:13 Comment(0)
E
1

This is what I would do if I am free to edit the C code. Open a memory map in C and write to its file descriptor using fprintf(). Expose the file descriptor to Python either as the int and then use mmap module to open it or use os.openfd() to wrap it in a simpler file-like object, or wrap it in file-like object in C and let Python use that.

Then I would create a class that will enable me to write to sys.stdout through usual interface, i.e. its write() method (for Python's side usage) , and that would use select module to poll the file from C that acts as its stdout in a thread. Then I would switch sys.stdout with an object of this class. So, when Python does sys.stdout.write(...) the string will be redirected to sys.stdout.write(), and when the loop in a thread detects output on a file from C, it will write it using sys.stdout.write(). So, everything will be written to the screen and be available to loggers as well. In this model, the strictly C part will never actually be writing to the file descriptor connected to the terminal.

You can even do much of this in C itself and leave little for the Python's side, but its easier to influence the interpreter from the Python's side as the extension is the shared library which involves some kind of, lets call it, IPC and OS in the whole story. That's why the stdout is not shared between extension and Python in the first place.

If you want to continue printf() on C side, you can see how you can redirect it in C before programming this whole mess.

This answer is strictly theoretical because I have no time to test it; but it should be doable according to my knowledge. If you try it, please let me know in a comment how it went. Perhaps I missed something, but, I am certain the theory is sane. Beauty of this idea is that it will be OS independent, although the part with shared memory or connecting a file descriptor to allocated space in RAM can be sometimes PITA on Windows.

Exeat answered 11/8, 2020 at 14:37 Comment(0)
M
1

If you are not constrained to using the printf in C, it would be easier to use the print equivalent from python C API and pass where you want to redirect the message as an argument.

For example, your hello.c would be:

#include <Python.h>

static PyObject* hello(PyObject* self, PyObject *args)
{
    PyObject *file = NULL;
    if (!PyArg_ParseTuple(args, "O", &file))
        return NULL;
    PyObject *pystr = PyUnicode_FromString("example print from a C code\n");
    PyFile_WriteObject(pystr, file, Py_PRINT_RAW);
   return Py_BuildValue("");                                                                     
}

static char helloworld_docs[] =
   "helloworld(): Any message you want to put here!!\n";

static PyMethodDef helloworld_funcs[] = { 
   {"hello", (PyCFunction)hello,
   METH_VARARGS, helloworld_docs},
   {NULL}
};

static struct PyModuleDef cModPyDem =
{
    PyModuleDef_HEAD_INIT,
    "helloworld",
    "Extension module example!",
    -1, 
    helloworld_funcs
};

PyMODINIT_FUNC PyInit_helloworld(void)
{
    return PyModule_Create(&cModPyDem);
};

We can check if it is working with the program below:

import sys
import helloworld

helloworld.hello(sys.stdout)
helloworld.hello(sys.stdout)

helloworld.hello(sys.stderr)

In the command line we redirect each output separately:

python3 example.py 1> out.txt 2> err.txt

out.txt will have two print calls, while err.txt will have only one, as expected from our python script.

You can check python's print implementation to get some more ideas of what you can do.

cpython print source code

Meathead answered 11/8, 2020 at 16:1 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.