I'm running a C++ application that tries to run Python through the embedding API described at https://docs.python.org/3.5/extending/embedding.html. This is the error that the application's error message pipes are giving me:
class 'ImportError': Importing the multiarray numpy extension module failed. Most likely you are trying to import a failed build of numpy. If you're working with a numpy git repo, try git clean -xdf (removes all files not under version control). Otherwise reinstall numpy.
Original error was: /usr/local/lib/python3.5/site-packages/numpy/core/multiarray.cpython-35m-x86_64-linux-gnu.so: undefined symbol: PyExc_UserWarning
I'm quite puzzled, as this only occurs when embedding Python in C++; the import works when I use it through the interpreter. I'm more interested in an answer that adds to my understanding than in a quick do-this-or-do-that fix. I list some system/problem information below, along with some other questions that I'm considering posting about the same topic. Any guidance is appreciated!
System/Problem information:
- Ubuntu 16.04, 64 bit
- Compiled Python 3.5.5 with --enable-shared
- numpy import works in the interpreter (both the python3 and python3.5 executables)
- I have made sure that PySys_SetPath() sets the same sys.path that the interpreter prints after running import sys and then sys.path
- I can import other modules like PIL, and datetimeutil; however, numpy and pandas are not importable (pandas uses numpy or seems to)
- The embedded Python uses the following calls: PyImport_Import(), Py_Initialize() (I made sure it is only called once), etc., but it does not acquire the global interpreter lock.
- The application is built with a CMake build system, which generates Makefiles for my system.
- Installed numpy-1.14.2 using pip 9.0.0 with the command pip3.5 install numpy
- The Python script that causes this error has one line: import numpy ...
- I do not have a .zip file that I'm importing files from.
- The executable used by the Python embedded in the C++ application is located at /usr/local/bin/python3 (I used Py_GetProgramName() to determine this). This executable is linked to libpython3.5m.so.1.0, and the missing symbol lives in libpython3.5m.so.1.0 (checked with nm).
ldd on multiarray.cpython-35m-x86_64-linux-gnu.so shows:
ldd multiarray.cpython-35m-x86_64-linux-gnu.so
linux-vdso.so.1 => (0x00007ffd9e36b000)
libopenblasp-r0-39a31c03.2.18.so => /usr/local/lib/python3.5/site-packages/numpy/core/./../.libs/libopenblasp-r0-39a31c03.2.18.so (0x00007fdbe149b000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fdbe1192000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fdbe0f75000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fdbe0bab000)
/lib64/ld-linux-x86-64.so.2 (0x00007fdbe3ed5000)
libgfortran-ed201abd.so.3.0.0 => /usr/local/lib/python3.5/site-packages/numpy/core/./../.libs/libgfortran-ed201abd.so.3.0.0 (0x00007fdbe08b1000)
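Note that ldd only reports the link-time dependencies recorded in the file, not what is actually loaded in the running process. A minimal diagnostic sketch, assuming Linux with glibc, that lists the shared objects mapped into the current process; calling it from inside the embedding application would show whether libpython3.5m.so.1.0 is resident at the time of the import. The helper names loaded_shared_objects and is_loaded are hypothetical, chosen for illustration.

```cpp
#include <link.h>   // dl_iterate_phdr, dl_phdr_info (glibc-specific)
#include <string>
#include <vector>

// Callback invoked once per loaded object; records its path.
static int collect_name(struct dl_phdr_info* info, size_t /*size*/, void* data) {
    auto* names = static_cast<std::vector<std::string>*>(data);
    // The main executable is usually reported with an empty name; skip it.
    if (info->dlpi_name != nullptr && info->dlpi_name[0] != '\0') {
        names->emplace_back(info->dlpi_name);
    }
    return 0;  // returning 0 tells dl_iterate_phdr to keep iterating
}

// Hypothetical helper: paths of all shared objects loaded in this process.
std::vector<std::string> loaded_shared_objects() {
    std::vector<std::string> names;
    dl_iterate_phdr(collect_name, &names);
    return names;
}

// Hypothetical helper: true if any loaded object's path contains `fragment`.
bool is_loaded(const std::string& fragment) {
    for (const std::string& name : loaded_shared_objects()) {
        if (name.find(fragment) != std::string::npos) {
            return true;
        }
    }
    return false;
}
```

For example, is_loaded("libpython3.5m") could be checked just before the import. A library being loaded does not by itself mean its symbols are visible to other modules, which depends on the flags it was loaded with.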
I could/might try reinstalling numpy through different means, but I'm having trouble tracking why that might work.
At this point, I'm assuming some hole in my knowledge exists. I have looked at a lot of similar posts about failing to import the multiarray component and numpy when embedding Python in C++; however, either none of them match my specific case or, as I stated, there is a hole in my understanding. Here is a list of sub-questions that I will probably ask if no one sees anything obviously concerning in this setup. I'll probably update the questions with links when/if I ask them (after I polish them).
- How does the numpy multiarray.so link to the pythonX.X.so for symbol resolution? The ldd does not seem to suggest that it ever does. Asked this question at this link
- CMake question: an unrelated issue, resolved in this question (asked on 4/12/18 and answered on 4/16/18).
- Setting PYTHONPATH in .bashrc does not seem to update what Py_GetPath() returns; I had to add the site-packages directory to sys.path by a different method. It may only update the environment variable for bash sessions, which doesn't affect the C++ application.
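On the first sub-question: extension modules on Linux commonly leave the Py* symbols undefined and rely on the dynamic loader to resolve them from libraries that are already loaded with global visibility, which would be consistent with ldd not listing libpython. A small diagnostic sketch, assuming glibc, that asks whether a symbol name can be resolved through the default lookup scope. The function name symbol_in_default_scope is hypothetical. One caveat: glibc performs an RTLD_DEFAULT lookup relative to the calling object's scope, so the check is most informative when compiled into the main executable rather than into a plugin that itself links libpython.

```cpp
#include <dlfcn.h>  // dlsym, dlerror, RTLD_DEFAULT (g++ defines _GNU_SOURCE)

// Hypothetical helper: true if `name` resolves through the default
// symbol lookup scope of the calling object.
bool symbol_in_default_scope(const char* name) {
    dlerror();  // clear any stale error state
    (void)dlsym(RTLD_DEFAULT, name);
    // dlsym may return NULL for a symbol whose value is NULL, so a
    // failed lookup is detected through dlerror() instead.
    return dlerror() == nullptr;
}
```

Calling symbol_in_default_scope("PyExc_UserWarning") right before the numpy import would be one way to see whether the interpreter's symbols are reachable at that point.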
I'm not asking for an answer for the above question list at this point, rather I'm giving more clues to where my gap in knowledge may be.
Thank you for taking time from your day to read this question. Any help is appreciated.
Edit: 4/17/18:
Well, I found a workaround, and I'm currently using it. Dunes' question made me think more closely about undefined symbols: either this is a linker/compiler error, or the numpy import always expects an environment with those symbols already loaded into memory. That got me to try installing different versions of numpy to see whether any of the older versions made a difference. They did not, but the error thrown was slightly different. When I googled that, this question appeared. The accepted answer gave me a workaround: adding these two lines to pythonInterface.cpp:
#include <dlfcn.h>
dlopen("libpython3.5m.so.1.0", RTLD_LAZY | RTLD_GLOBAL);
This call loads the shared library with global symbol visibility, so that its symbols are available to multiarray.cpython-35m-x86_64-linux-gnu.so.
This is not an ideal solution, as it points to a specific .so that may differ from machine to machine. It resolves the issue for now, but it could also lead to shared-library mismatches during the Python call process if the library linked to pythonInterface.so changes and this line does not get updated. I believe a better answer can be achieved if this sub-question is answered, so I'm holding off on submitting or accepting an answer until then. Thanks!
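One possible way to avoid hardcoding the library name is to ask the loader which shared object already provides a known symbol, then re-open that same object with RTLD_NOLOAD | RTLD_GLOBAL, which promotes an already-loaded library to global visibility without loading a second copy. This is only a sketch, assuming glibc, and it has not been tested against the setup described here. The function name promote_containing_library is hypothetical.

```cpp
#include <dlfcn.h>  // dladdr, dlopen, dlerror (g++ defines _GNU_SOURCE)
#include <cstdio>

// Hypothetical helper: find the shared object that contains `addr` and
// re-open it with RTLD_GLOBAL so later-loaded modules can see its symbols.
bool promote_containing_library(const void* addr) {
    if (addr == nullptr) {
        return false;
    }
    Dl_info info;
    if (dladdr(addr, &info) == 0 || info.dli_fname == nullptr) {
        std::fprintf(stderr, "dladdr could not map the address to an object\n");
        return false;
    }
    // RTLD_NOLOAD: never load a new copy, only update the flags of an
    // object that is already resident in the process.
    void* handle = dlopen(info.dli_fname,
                          RTLD_NOW | RTLD_NOLOAD | RTLD_GLOBAL);
    if (handle == nullptr) {
        const char* err = dlerror();
        std::fprintf(stderr, "dlopen failed: %s\n", err ? err : "unknown error");
        return false;
    }
    // The handle is intentionally kept open for the life of the process.
    return true;
}
```

In pythonInterface.cpp this might be called as promote_containing_library(reinterpret_cast<const void*>(&Py_Initialize)) before Py_Initialize(), so the library that gets promoted is whichever libpython the interface was actually linked against.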
PyRun_SimpleString to import numpy, create an array, do arithmetic with it and print out. I also tried directly import numpy.core.multiarray – Souza