Make an executable file from multiple .pyx files using Cython

I am trying to make a single Unix executable from my Python source files.

I have two files, p1.py and p2.py.

p1.py:

from p2 import test_func 
print (test_func())

p2.py:

def test_func():
    return ('Test')

As we can see, p1.py depends on p2.py. I want to make one executable by combining the two files, and I am using Cython for this.

I changed the file names to p1.pyx and p2.pyx respectively.

Now I can produce an executable from a single file using Cython:

cython p1.pyx --embed

This generates a C source file called p1.c. Next, we can use gcc to build the executable:

gcc -Os -I /usr/include/python3.5m -o test p1.c -lpython3.5m -lpthread -lm -lutil -ldl 

But how can I combine the two files into one executable?

Publicist answered 24/10, 2018 at 1:28

There are some hoops you have to jump through to make it work.

First, you must be aware that the resulting executable is a very slim layer which just delegates the whole work to (i.e. calls functions from) libpythonX.Ym.so. You can see this dependency by calling

ldd test
...
libpythonX.Ym.so.1.0 => not found
...

So, to run the program you either need LD_LIBRARY_PATH to point to the location of libpythonX.Ym.so, or you need to build the executable with the -rpath linker option; otherwise, at start-up of test the dynamic loader will throw an error similar to

/test: error while loading shared libraries: libpythonX.Ym.so.1.0: cannot open shared object file: No such file or directory
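For example, a quick way to run the executable without rebuilding it is to point the loader at the right directory at invocation time (the path below is an assumption; use whichever directory actually contains your libpython):

LD_LIBRARY_PATH=/usr/local/lib ./test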

The generic build command would look like the following:

gcc -fPIC <other flags> -o test p1.c -I<path_python_include> -L<path_python_lib> -Wl,-rpath=<path_python_lib> -lpython3.6m <other_needed_libs>
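Filled in for a hypothetical CPython 3.6 installed under /usr/local (all paths here are assumptions; adjust them to your installation), that might look like:

gcc -fPIC -Os -o test p1.c \
    -I/usr/local/include/python3.6m \
    -L/usr/local/lib -Wl,-rpath=/usr/local/lib \
    -lpython3.6m -lpthread -lm -lutil -ldl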

It is also possible to build against the static version of the Python library, thus eliminating the runtime dependency on libpythonX.Ym; see for example this SO-post.
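A minimal sketch of such a static build, assuming your Python installation ships a libpython3.6m.a (the config-directory path below is an assumption and differs between installations):

gcc -Os -o test p1.c \
    -I/usr/local/include/python3.6m \
    /usr/local/lib/python3.6/config-3.6m-x86_64-linux-gnu/libpython3.6m.a \
    -lpthread -lm -lutil -ldl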


The resulting executable test behaves exactly as if it were a Python interpreter. This means that test will now fail because it cannot find the module p2.

One simple solution would be to cythonize the p2 module in place (cythonize -i p2.pyx): you would get the desired behavior, but you would have to distribute the resulting shared object p2.so along with test.
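As a sketch (the exact name of the produced shared object is platform-dependent):

cythonize -i p2.pyx    # builds e.g. p2.cpython-36m-x86_64-linux-gnu.so next to p2.pyx
./test                 # works, as long as the shared object can be found on the python-path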

It is easy to bundle both extensions into one executable - just pass both cythonized C files to gcc:

# creates p1.c:
cython --embed p1.pyx
# creates p2.c:  
cython p2.pyx
gcc ... -o test p1.c p2.c ...

But now a new (or old) problem arises: the resulting test executable once again cannot find the module p2, because there is no p2.py and no p2.so on the python-path.
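Running it at this stage fails with something like:

./test
Traceback (most recent call last):
  ...
ModuleNotFoundError: No module named 'p2'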

There are two similar SO questions about this problem, here and here. In your case the proposed solutions are overkill; here it is enough to initialize the p2 module before it gets imported in the p1.pyx file:

# make the init-function of the p2-module accessible:
cdef extern object PyInit_p2()

# init/load the p2-module manually
PyInit_p2()  # Cython handles errors, i.e. if NULL is returned

# actually uses the already cached, imported module -
# no search on the python-path is needed
from p2 import test_func
print(test_func())
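Putting it all together, a complete build and run could look like this (include/library paths are assumptions, as above):

cython --embed p1.pyx
cython p2.pyx
gcc -fPIC -o test p1.c p2.c \
    -I/usr/local/include/python3.6m \
    -L/usr/local/lib -Wl,-rpath=/usr/local/lib \
    -lpython3.6m -lpthread -lm -lutil -ldl
./test    # prints: Test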

Calling the init-function of a module prior to importing it (the module will not actually be imported a second time, only looked up in the cache) also works if there are cyclic dependencies between modules - for example, if module p2 imports module p3, which in turn imports p2.
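A sketch of how that could look for such a hypothetical cycle (the third module p3 is assumed here purely for illustration):

# p1.pyx - initialize all involved modules before the first import
cdef extern object PyInit_p2()
cdef extern object PyInit_p3()

PyInit_p2()
PyInit_p3()

from p2 import test_func
print(test_func())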


Warning: since Cython 0.29, Cython uses multi-phase initialization by default for Python >= 3.5, so calling PyInit_p2 is no longer enough (see e.g. this SO-post). To switch multi-phase initialization off, -DCYTHON_PEP489_MULTI_PHASE_INIT=0 should be passed to gcc (or the equivalent for other compilers).
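For example (with the same hypothetical paths as above):

gcc -fPIC -DCYTHON_PEP489_MULTI_PHASE_INIT=0 -o test p1.c p2.c \
    -I/usr/local/include/python3.6m \
    -L/usr/local/lib -Wl,-rpath=/usr/local/lib \
    -lpython3.6m -lpthread -lm -lutil -ldl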


Note: even after all of the above, the embedded interpreter will still need its standard library (see for example this SO-post) - there is much more work to do to make it truly standalone! So maybe one should heed @DavidW's advice:

"don't do this" is probably the best solution for the vast majority of people.


A word of warning: if we declare PyInit_p2() as

from cpython cimport PyObject

cdef extern PyObject *PyInit_p2()

PyInit_p2()  # TODO: error handling if NULL is returned

Cython will no longer handle the errors - it becomes our responsibility. Instead of

PyObject *__pyx_t_1 = NULL;
__pyx_t_1 = PyInit_p2(); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 4, __pyx_L1_error)
__Pyx_GOTREF(__pyx_t_1);
__Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;

which Cython produces for the object-version, the generated code becomes just:

(void)(PyInit_p2());

i.e. no error checking!
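If one goes down this route, the error handling has to be written by hand. A minimal sketch (the ImportError message is of course our own choice):

from cpython cimport PyObject
from cpython.ref cimport Py_XDECREF

cdef extern PyObject *PyInit_p2()

cdef PyObject *res = PyInit_p2()
if res == NULL:
    raise ImportError("initialization of p2 failed")
Py_XDECREF(res)  # drop the extra reference returned by the init-function

from p2 import test_func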

On the other hand, using

cdef extern from *:
    """
    PyObject *PyInit_p2(void);
    """
    object PyInit_p2()

will not work with g++ - one has to add extern "C" to the declaration.
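A portable variant could guard the declaration so that it compiles both as C and as C++ (a sketch):

cdef extern from *:
    """
    #ifdef __cplusplus
    extern "C"
    #endif
    PyObject *PyInit_p2(void);
    """
    object PyInit_p2()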

Caftan answered 24/10, 2018 at 20:16

People are tempted to do this because it's fairly easy to do for the simplest case (one module, no dependencies). @ead's answer is good but, honestly, pretty fiddly, and it handles only the next-simplest case (two modules that you have complete control of, no dependencies).

In general, a Python program will depend on a range of external modules. Python comes with a large standard library, which most programs use to some extent, and there is a wide range of third-party libraries for maths, GUIs, and web frameworks. Even tracing those dependencies through the libraries and working out what you need to build is complicated, and tools such as PyInstaller attempt it but aren't 100% reliable.

When you're compiling all these Python modules you're likely to come across a few Cython incompatibilities/bugs. Cython is generally pretty good, but it struggles with features like introspection, so it's unlikely that a large project will compile cleanly and entirely.

On top of that, many of those modules are compiled modules, written either in C or using tools such as SWIG, F2Py, Cython, or Boost.Python. These compiled modules may have their own unique idiosyncrasies that make them difficult to link together into one large blob.

In summary: it may be possible, but for non-trivial programs it is not a good idea, however appealing it seems. Tools like PyInstaller, Py2Exe, and PyOxidizer, which use a much simpler approach (bundling everything into a giant zip file or similar), are much more suitable for this task (and even then they struggle to be really robust).


Note: this answer is posted with the intention of making this question a canonical duplicate target for this problem. While an answer showing how it might be done is useful, "don't do this" is probably the best solution for the vast majority of people.

Gilstrap answered 18/12, 2019 at 10:10

Comments:
I think "don't do it" is a very solid advice, even if I understand the appeal of having a truly standalone embedded python interpreter and have from time to time to resist the urge to do exactly that. Not sure Cython is the best tool to handle the .py-files though: freezing the modules seems like a better/more robust strategy.Caftan
In addition to PyInstaller and Py2Exe, your answer might also mention PyOxidizer, which can embed everything into a statically linked Python executable. - Clown
@Clown thanks - that wasn't a tool I knew about, but I'll add it to the list of recommendations. - Gilstrap
