Explain why numpy should not be imported from source directory
K

5

22

Disclaimer of research:

I have examined the following other StackOverflow questions:

Perhaps to some, those may answer my question, but according to my knowledge, I still do not understand the situation.

I am trying to import numpy so that matplotlib will work, but upon execution of the __init__.py file in the numpy folder, the following error message is displayed:

ImportError: Error importing numpy: you should not try to import numpy from
    its source directory; please exit the numpy source tree, and relaunch
    your python intepreter from there.

Please explain what it means to import something from its source directory, as opposed to some other way of importing it. Does it mean that it should not be source code when it is imported? Or does it mean that I am literally just importing from the wrong directory/folder? I know that one other StackOverflow answer says:

The message is fairly self-explanatory; your working directory should not be the numpy source directory when you invoke Python; numpy should be installed and your working directory should be anything but the directory where it lives.

However, I don't understand this. Aren't you supposed to import things that you want to work with? I'm assuming that the import command combines the source directory into your current working directory in this statement.

I also read the other answers such as:

  • Using distutils to install local directories

  • Using virtualenv to create a virtual system directory

  • Using Enthought's EPD to have numpy pre-installed in what I believe to be the system directory, and

  • Using a command like $ dpkg -i --force-not-root --root=$HOME mypackagename.deb to create what I believe is some kind of sub-system directory that is treated like a system directory.

So, correct me if I'm wrong, but does numpy strictly require being installed in the main system directory?

Machine status:

I am using Windows machines without administrative privileges. They have the Python 3.3 Shell as well as matplotlib installed. When running the command prompt, python and python3 are not recognized, so I have to run the Python shell from the applications menu. I can successfully begin importing matplotlib even from my own directory, different from theirs, but the import stops upon reaching the __init__.py of the numpy module, if it exists, and reports the error stated above.

Update:

Luckily, my administrators were able to directly install numpy correctly in the site-packages folder. Thank you for answering my question though. I understand the situation a lot more because of you.

Katabasis answered 28/1, 2013 at 19:51 Comment(1)
I have numpy in the site-packages folder. It still does not work from Eclipse (Eclipse IDE for Eclipse Committers, Version: Oxygen.3a Release (4.7.3a), Build id: 20180405-1200). - Dissipate
A
15

numpy includes extension modules written in C. You will need to build these extension modules before the numpy package is complete. The most robust way to do this is to build it and install it to site-packages like normal. You can also install it to another directory using the standard distutils options for this. However, once you have installed it, you should change your directory out of the source tree. Python starts looking for packages in your current directory, so the incomplete numpy package there (without the necessary built C extension modules) will be picked up first and lead to the error message that you quote. This happens a lot, so we give a long message explaining what to do.
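
To make "Python starts looking for packages in your current directory" concrete, here is a minimal sketch that asks the import system where a package would be loaded from, without actually importing it. It is not part of numpy: the helper names `package_location` and `shadowed_by_cwd` are made up for illustration, and it assumes a Python version that provides `importlib.util.find_spec` (3.4 or later).

```python
import importlib.util
import os


def package_location(name):
    """Return the directory the top-level module/package `name` would load from, or None."""
    spec = importlib.util.find_spec(name)  # locates the module without importing it
    if spec is None or not spec.has_location:
        return None  # not installed, or built-in / namespace package with no file
    return os.path.dirname(os.path.abspath(spec.origin))


def shadowed_by_cwd(name):
    """True if a folder called `name` directly inside the current directory wins the import."""
    location = package_location(name)
    if location is None:
        return False
    parent = os.path.dirname(location)
    return os.path.realpath(parent) == os.path.realpath(os.getcwd())


if __name__ == "__main__":
    print("numpy would be imported from:", package_location("numpy"))
    print("shadowed by current directory:", shadowed_by_cwd("numpy"))
```

If the first line prints a path inside the folder you launched Python from rather than one under site-packages, you are in the situation described above. Paste it into an interpreter started from the directory in question; when it is run as a script file, Python puts the script's folder, not the current directory, first on sys.path.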

Avellaneda answered 28/1, 2013 at 22:53 Comment(3)
Is the building of the extension modules of numpy necessary so that numpy can work with the OS? Do all modules have extensions or only the "more advanced" ones? Does pip do the building when you invoke pip install numpy? Is there a beginner tutorial where I can build these C extensions? - Kenakenaf
1. No, most of numpy is an extension module for speed. 2. The core of numpy is an extension module. You must build the extension modules. 3. If it can't find a prebuilt binary wheel, yes. 4. This is not the kind of thing that usually gets tutorial-style docs. Beginners should use the binaries on PyPI. scipy.org/install.html numpy.org/devdocs/user/building.html - Avellaneda
What does this mean: 'However, once you have installed it, you should change your directory out of the source tree.' I have been struggling with this a lot, and somehow it feels that changing my directory out of the source tree is key, but it's cryptic. What is 'my directory' and what is the source tree? - Oller
P
18

I ran into this error on a Raspberry Pi. To fix it, I had to install a dependency:

sudo apt-get install libopenblas-dev
Perspicacity answered 17/1 at 20:18 Comment(2)
This worked for me. It seems that every time you upgrade numpy externally, you need to reinstall these dependencies for the new version, especially on Raspberry Pi. - Alternate
Thank you! This fixed my pandas on a fresh install of the 32-bit Bullseye image (it was not required on the 64-bit image). - Rennes
D
1

This error typically arises when your script operates within the numpy installation tree due to the improper importing of C binaries in numpy, as highlighted by Robert. Specifically, there will be an original error message above the ImportError message. In my case, I encountered this issue while attempting to execute the OpenMPI mpi4py Python interface on an arm64 Mac. Strangely, mpirun necessitates all Python dependencies' binaries to be compiled as x86_64. Since I had installed numpy on my arm64 Mac, the version in my environment was compiled for arm64 by default. Consequently, openmpi attempted to utilize it as x86_64 architecture.

Here's the error log I encountered:

Original error was: dlopen(/.venv/lib/python3.12/site-packages/numpy/core/_multiarray_umath.cpython-312-darwin.so, 0x0002): tried: '/.venv/lib/python3.12/site-packages/numpy/core/_multiarray_umath.cpython-312-darwin.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64')), '/.venv/lib/python3.12/site-packages/numpy/core/_multiarray_umath.cpython-312-darwin.so' (no such file), '/.venv/lib/python3.12/site-packages/numpy/core/_multiarray_umath.cpython-312-darwin.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64'))

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/processing.py", line 2, in <module>
    import numpy as np
  File "/.venv/lib/python3.12/site-packages/numpy/__init__.py", line 135, in <module>
    raise ImportError(msg) from e
ImportError: Error importing numpy: you should not try to import numpy from
        its source directory; please exit the numpy source tree, and relaunch
        your Python interpreter from there.

To resolve this issue, I opted to enforce the installation of the x86_64 architecture when installing numpy using pip. This ensured that mpirun consistently imported the correct x86_64 architecture.

arch -x86_64 pip install numpy
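
If you suspect the same kind of mismatch, a small sketch like the one below may help confirm it before reinstalling anything. It prints which architecture the running interpreter reports and pulls the have/need pair out of a dlopen-style message. `parse_arch_mismatch` is a hypothetical helper written for this answer, and its regular expression only matches the wording seen in the log above.

```python
import platform
import re
import sys


def parse_arch_mismatch(message):
    """Extract (have, need) from a "have 'X', need 'Y'" loader message, or None."""
    match = re.search(r"have '([^']+)', need '([^']+)'", message)
    return match.groups() if match else None


if __name__ == "__main__":
    # Which interpreter is running, and which architecture it reports.
    print("interpreter:", sys.executable)
    print("reported architecture:", platform.machine())

    # Sample text copied from the wording of the error log above.
    sample = "mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64')"
    print("file has / process needs:", parse_arch_mismatch(sample))
```

Run it with the same launcher that fails (for example under mpirun), since the point is to see what that particular process reports.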

I understand this is a specific solution for a specific case, but I hope it helps someone in the future.

Daniel answered 26/4 at 17:24 Comment(0)
F
1

Open your .spec file and paste in the code below, adjusting USERNAME and ENVNAME to match your machine.

a = Analysis(
...
binaries=[
    ('c:\\users\\USERNAME\\anaconda3\\envs\\ENVNAME\\lib\\site-packages\\numpy.libs\\libopenblas64__v0.3.23-293-gc2f4bdbb-gcc_10_3_0-2bde3a66a51006b2b53eb373ff767a3f.dll', '.'),
    ('c:\\users\\USERNAME\\anaconda3\\envs\\ENVNAME\\lib\\site-packages\\pandas.libs\\msvcp140-fa0758dedafbbe194d3ee96e3dc2b9a3.dll', '.'),
],
...

)

Then, from the C:\Users\USER\anaconda3\Library\bin directory, copy the libcrypto-1_1-x64.dll and libssl-1_1-x64.dll files and paste them into C:\Users\USER\anaconda3\envs\ENVNAME\DLLs
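
The hashed DLL file names above differ between numpy and pandas builds, so the exact strings will probably not match your environment. Assuming the .spec file is a PyInstaller spec (which is ordinary Python), a sketch like the following could build the binaries list by globbing instead of hard-coding names. `collect_dlls` is a hypothetical helper, not a PyInstaller API, and the numpy.libs and pandas.libs folder names are simply taken from the paths above.

```python
import glob
import os
import sysconfig


def collect_dlls(libs_dir_name, site_packages=None):
    """Return (source_path, '.') tuples for every .dll in <site-packages>/<libs_dir_name>."""
    if site_packages is None:
        # Assumption: the active environment's site-packages is the one to scan.
        site_packages = sysconfig.get_paths()["platlib"]
    pattern = os.path.join(site_packages, libs_dir_name, "*.dll")
    return [(path, ".") for path in sorted(glob.glob(pattern))]


if __name__ == "__main__":
    binaries = collect_dlls("numpy.libs") + collect_dlls("pandas.libs")
    for source, destination in binaries:
        print(source, "->", destination)
```

The resulting list has the same (source, destination) shape as the binaries entries shown above, so it could be passed as binaries=... in the Analysis call. If a folder does not exist, the helper just returns an empty list.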

Flatt answered 2/6 at 13:19 Comment(0)
S
0

ENVIRONMENT:

OS: macOS Sonoma Version 14.5
Chip: Apple M1
Memory: 8GB
Virtual Environment: venv
Python: 3.10.4
Pycharm: Build #PC-241.17011.127, built on May 28, 2024
Runtime version: 17.0.11+1-b1207.24 aarch64
VM: OpenJDK 64-Bit Server VM by JetBrains s.r.o.

Packages installed:

numpy 1.26.4
pandas 2.2.2
pip 24.0
python-dateutil 2.9.0.post0
pytz 2024.1
setuptools 69.5.1
six 1.16.0
tzdata 2024.1
wheel 0.43.0


CODE:

from pyspark.sql import SparkSession
from datetime import datetime, date
from pyspark.sql import Row
import pandas as pd
from pyspark.sql.functions import pandas_udf

spark = SparkSession.builder.getOrCreate()

df = spark.createDataFrame([
    Row(a=1, b=2., c='string1', d=date(2000, 1, 1), e=datetime(2000, 1, 1, 12, 0)),
    Row(a=2, b=3., c='string2', d=date(2000, 2, 1), e=datetime(2000, 1, 2, 12, 0)),
    Row(a=4, b=5., c='string3', d=date(2000, 3, 1), e=datetime(2000, 1, 3, 12, 0))
])


@pandas_udf('long')
def pandas_plus_one(series: pd.Series) -> pd.Series:
    # Simply plus one by using pandas Series.
    return series + 1


df.show()
df.select(pandas_plus_one(df.a)).show()

ERROR:

  File "/Users/brunodaemon/PycharmProjects/spark-env/lib/python3.10/site-packages/pandas/__init__.py", line 19, in <module>
    raise ImportError(
ImportError: Unable to import required dependencies: numpy: Error importing numpy: you should not try to import numpy from its source directory; please exit the numpy source tree, and relaunch your python interpreter from there.

Process finished with exit code 1


FIX:

Using venv did not work and resulted in the error mentioned above. I then installed Anaconda, reinstalled the necessary packages, and reran the Python code. This resolved the issue and everything worked fine.

Satisfy answered 9/6 at 20:9 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.