ImportError after cython embed

A compiled Python script cannot see modules that are otherwise available. How do I need to change the process below so that it picks up either venv-based or global modules?

Steps:

$ python3 -m venv sometest
$ cd sometest
$ . bin/activate
(sometest) $ pip3 install PyCrypto Cython

The basic script, using a non-standard module Crypto:

# hello.py
from Crypto.Cipher import AES
import base64
obj = AES.new('This is a key123', AES.MODE_CBC, 'This is an IV456')
msg = "The answer is no"
ciphertext = obj.encrypt(msg)
print(msg)
print(base64.b64encode(ciphertext))

(sometest) $ python3 hello.py
The answer is no
b'1oONZCFWVJKqYEEF4JuL8Q=='

Compiling it:

(sometest) $ cython -3 --embed hello.py
(sometest) $ gcc -Os -I /usr/include/python3.5m -o hello hello.c -lpython3.5m -lpthread -lm -lutil -ldl
(sometest) $ ./hello
Traceback (most recent call last):
  File "hello.py", line 1, in init hello
    from Crypto.Cipher import AES
ImportError: No module named 'Crypto'

I don't think it's a problem with using the venv from a cython-embedded-compiled script: the script works elsewhere in the system without the venv (that is, python3 -c 'from Crypto.Cipher import AES' does not fail).

The process works fine otherwise:

(sometest) $ echo 'print("hello world")' > hello2.py
(sometest) $ cython -3 --embed hello2.py
(sometest) $ gcc -Os -I /usr/include/python3.5m -o hello2 hello2.c -lpython3.5m -lpthread -lm -lutil -ldl
(sometest) $ ./hello2
hello world

System:

(sometest) $ python3 --version
Python 3.5.2
(sometest) $ pip3 freeze
Cython==0.29.11
pkg-resources==0.0.0
pycrypto==2.6.1

(sometest) $ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.6 LTS"
Hachmin answered 2/7, 2019 at 17:28 Comment(3)
The embedded Python probably does not have the same Python path. – Mendoza
That's it, thank you @ead. I had assumed that it would use the same source of modules that otherwise work. Now I see that PYTHONPATH=/usr/lib/python3/dist-packages/ ./hello works. While I feel a little embarrassed that it was that simple, I feel it's worth enough to not delete this. If you answer, I'll accept. Thank you again! – Hachmin
I hope it is clear to you that with your solution the embedded interpreter uses the system installation and not the one from the virtual environment. – Mendoza

Usually, a Python interpreter isn't "standalone": in order to work it needs its standard library (for example ctypes (compiled) or site.py (interpreted)), and the path to other site-packages (for example numpy) must be set as well.

Although it is possible to make a Python interpreter fully standalone by freezing the py-modules and merging all C extensions into the resulting executable (see for example this SO-post), it is easier to provide the needed installation to the embedded interpreter. One can download the files needed for a "standard" installation from the Python homepage (at least for Windows; see also this SO-question).

Sometimes finding standard modules/site-packages doesn't work out of the box: one has to help the interpreter by setting the Python path, i.e. by adding <..>/sometest/lib/python3.5/site-packages (sometest being the root folder of the virtual environment) to sys.path, either programmatically in the pyx-file or by setting the PYTHONPATH environment variable before starting the executable.
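
A minimal sketch of the programmatic variant, assuming the POSIX venv layout <root>/lib/pythonX.Y/site-packages. The helper name add_venv_site_packages and the example path are hypothetical and not part of the original script:

```python
import os
import site
import sys


def add_venv_site_packages(venv_root):
    """Append <venv_root>/lib/pythonX.Y/site-packages to sys.path.

    Returns the directory that was added. Assumes the POSIX venv layout.
    """
    version_dir = "python{}.{}".format(*sys.version_info[:2])
    site_packages = os.path.join(venv_root, "lib", version_dir, "site-packages")
    # addsitedir also processes any *.pth files found in the directory.
    site.addsitedir(site_packages)
    return site_packages


# Hypothetical usage at the very top of hello.py, before the Crypto import:
# add_venv_site_packages("/path/to/sometest")
```

The call has to run before the first import of a third-party module, otherwise the ImportError is raised before the path is extended.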

Read on for more gory details and alternative solutions.


This answer is for Linux and Python 3 (Python 3.7); the basic idea is the same for Windows/macOS, but some details might differ.

Because venv is used, we have the following alternatives to solve the issue:

  • adding <..>/sometest/lib/python3.5/site-packages (sometest being the root folder of the virtual environment) to sys.path, either programmatically in the pyx-file or by setting the PYTHONPATH environment variable before starting the executable.
  • placing the executable with the embedded Python in a subdirectory of sometest (e.g. bin, or a newly created one).
  • using virtualenv instead of venv.

Note: For the executable with the embedded Python, it doesn't matter whether a virtual environment (or which one) is activated.


Why does the above solve the issue in your scenario?

The problem is that the (embedded) Python interpreter needs to figure out where the following things are:

  • platform-independent directories/files, e.g. os.py, argparse.py (mostly everything *.py/*.pyc). Given sys.prefix, the interpreter can figure out where to find them (i.e. in prefix/lib/pythonX.Y).
  • platform-dependent directories/files, e.g. shared libraries. Given sys.exec_prefix, the interpreter can figure out where to find them (e.g. shared libraries can be found in exec_prefix/lib/pythonX.Y/lib-dynload).

The algorithm can be found here, and the search is performed when Py_Initialize is executed. Once these directories are found, sys.path can be constructed.
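
To see what the search actually produced, the values can be dumped from inside the embedded script. This is only a diagnostic sketch; the function name interpreter_locations is made up for illustration:

```python
import sys


def interpreter_locations():
    """Collect the values the interpreter settled on during start-up."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "exec_prefix": sys.exec_prefix,
        "base_prefix": sys.base_prefix,
        "base_exec_prefix": sys.base_exec_prefix,
        "path": list(sys.path),
    }


if __name__ == "__main__":
    for name, value in interpreter_locations().items():
        print(name, "=", value)
```

Running the same lines once with python3 and once from the compiled executable shows whether the two ended up with different prefixes and therefore a different sys.path.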

However, when using venv, there is a pyvenv.cfg file next to the exe or in its parent directory, which ensures that the right Python home is found; a good starting point is the home key in this file.

If Py_NoSiteFlag is not set, Py_Initialize will utilize site.py (it can be found by the interpreter, because sys.prefix is known), or more precisely site.main(), to add the site-packages of the virtual environment to sys.path. While doing so, site.py looks for pyvenv.cfg and parses it. However, local site-packages are added to the Python path only under this condition:

If a file named "pyvenv.cfg" exists one directory above sys.executable, sys.prefix and sys.exec_prefix are set to that directory and it is also checked for site-packages (sys.base_prefix and sys.base_exec_prefix will always be the "real" prefixes of the Python installation).

In your case pyvenv.cfg is not in the directory above the exe but in the same directory, so the local site-packages, where the libraries were installed via pip, aren't included. Global site-packages aren't included either, because pyvenv.cfg has the key include-system-site-packages = false. Thus no site-packages are on the path and the installed libraries cannot be found.
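
The rule can be modelled in a few lines. This is a simplified sketch of the behaviour described above, not the code from site.py; the function names read_pyvenv_cfg and venv_root_for are hypothetical:

```python
import os


def read_pyvenv_cfg(cfg_path):
    """Parse 'key = value' lines of a pyvenv.cfg file into a dict."""
    settings = {}
    with open(cfg_path, encoding="utf-8") as cfg:
        for line in cfg:
            if "=" in line:
                key, _, value = line.partition("=")
                settings[key.strip().lower()] = value.strip()
    return settings


def venv_root_for(executable):
    """Return the directory treated as the venv root, or None.

    Simplified model of the rule quoted above: pyvenv.cfg must sit one
    directory above the executable.
    """
    exe_dir = os.path.dirname(os.path.abspath(executable))
    candidate_root = os.path.dirname(exe_dir)
    if os.path.isfile(os.path.join(candidate_root, "pyvenv.cfg")):
        return candidate_root
    return None
```

Under this model, an executable at sometest/bin/hello yields sometest as the root, while sometest/hello yields None, which matches the failure in the question.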

However, moving the exe one directory down (e.g. into bin) would lead to the inclusion of the local site-packages in the path.


Other scenarios are possible; what counts is the location of the executable and not which environment is activated.

A: Executable is somewhere, but not inside a virtual environment

This search heuristic works more or less reliably for installed Python interpreters, but can fail for embedded interpreters or virtual environments (see this issue for much more information).

If Python was installed via the usual apt install or similar, then it will be found (due to step 4 in the search algorithm) and the system installation will be used by the embedded interpreter.

However, if files were moved around or Python was built from source but not installed, then the embedded interpreter cannot start up:

Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
Fatal Python error: initfsencoding: unable to load the file system codec
ModuleNotFoundError: No module named 'encodings'

In this case, calling Py_SetPythonHome or setting the environment variable $PYTHONHOME are possible solutions.
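
For the environment-variable route, a small launcher can build the value in the <prefix>[:<exec_prefix>] form shown in the error message. This is a sketch only; env_with_python_home and the "/usr" prefix in the usage comment are illustrative assumptions:

```python
import os


def env_with_python_home(prefix, exec_prefix=None, base_env=None):
    """Return a copy of the environment with PYTHONHOME set.

    PYTHONHOME takes the form <prefix>[:<exec_prefix>], as in the
    error message above (the separator is os.pathsep).
    """
    env = dict(os.environ if base_env is None else base_env)
    home = prefix if exec_prefix is None else prefix + os.pathsep + exec_prefix
    env["PYTHONHOME"] = home
    return env


# Hypothetical usage: start the embedding executable with that environment.
# import subprocess
# subprocess.run(["./hello"], env=env_with_python_home("/usr"), check=True)
```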

B: Executable inside a virtual environment, created with virtualenv

Assuming the virtual environment and the embedded Python use the same Python version (otherwise we have the above case), the embedded exe will use the local site-packages. The home search algorithm will always find the local home, due to this rule:

Step 3. Try to find prefix and exec_prefix relative to argv0_path, backtracking up the path until it is exhausted. This is the most common step to succeed. Note that if prefix and exec_prefix are different, exec_prefix is more likely to be found; however if exec_prefix is a subdirectory of prefix, both will be found.

In this case argv0_path is the path to the exe (there is no pyvenv.cfg file!), and the "landmarks" (lib/python$VERSION/os.py and lib/python$VERSION/lib-dynload) will be found, because they are present as symlinks in the local home above the exe.
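
The backtracking in step 3 can be approximated as follows. This is a simplified sketch that only looks for the os.py landmark; the real search also checks lib-dynload for exec_prefix and has further steps. The function name find_prefix is hypothetical:

```python
import os


def find_prefix(start_dir, version):
    """Walk up from start_dir until lib/python<version>/os.py is found.

    Returns the directory that contains the landmark, or None.
    """
    current = os.path.abspath(start_dir)
    while True:
        landmark = os.path.join(current, "lib", "python" + version, "os.py")
        # isfile follows symlinks, so symlinked landmarks count as well.
        if os.path.isfile(landmark):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return None
        current = parent
```

Calling find_prefix with the directory of the executable and "3.5" shows which directory, if any, such a search would settle on.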

C: Executable two folders deep inside a venv-environment

Going two folders down rather than one (where it works) in a venv environment results in case A: the pyvenv.cfg file isn't read while searching for home (it is too far above), and venv environments lack symlinks to the "landmarks" (locally only site-packages are present), so step 3 will fail, with step 4 being the only hope.


Corollary: Embedded Python will not work without a proper Python installation, unless, among other possibilities:

  • the needed files are packed into lib/pythonX.Y/* next to the embedding executable or somewhere above (and there is no pyvenv.cfg around to mess the search up).

  • or pyvenv.cfg is used to point the interpreter to the right location.

Mendoza answered 3/7, 2019 at 0:9 Comment(3)
This is a lot to take in. I did not know there was a (significant) difference between venv and virtualenv. When I try the steps outside of either (assuming modules available globally, no PYTHONPATH set), it works without error; this helps me to understand a few things. Your answer will take me a while to take in, thank you for taking the time for a thorough answer! – Hachmin
I understand that the embedded executable still needs a "complete" Python installation, but when I run ./hello it fails despite having lib/python3.5/site-packages/Crypto next to the executable. Is it a safer interpretation of "next to the embedding executable" to assume that the executable must be in a subdir next to ./lib/? (I hadn't realized that nuance of venvs, but now I see the behavior difference.) – Hachmin
@Hachmin If there is no pyvenv.cfg around to mess the search up, then "next to exe" is enough (at least for Python 3.7; I cannot test it with Python 3.5), but your proposal is safer (putting the exe in a subfolder), because it also works when pyvenv.cfg is around. – Mendoza
