ImportError: libcuda.so.1: cannot open shared object file

When I run my TensorFlow code directly, everything works fine.

However, when I run it in a screen window, I get the following error.

ImportError: libcuda.so.1: cannot open shared object file: No such file or directory

I have tried the command:

source /etc/profile

But it doesn't work.

Because I use SSH to connect to the servers, screen is necessary.

How can I fix it?

Illinois answered 18/1, 2019 at 7:39 Comment(2)
If you do not have CUDA installed, install it at: docs.nvidia.com/cuda/cuda-installation-guide-linux/index.htmlFilagree
This can also happen after simply trying to import cupyBluebottle

Try adding the path of libcuda.so.1 to the LD_LIBRARY_PATH environment variable.

Example:

export LD_LIBRARY_PATH=/path/of/libcuda.so.1:$LD_LIBRARY_PATH
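
A quick way to verify the fix, assuming Python is available: ask the dynamic loader to resolve the library directly.

python3 -c "import ctypes; ctypes.CDLL('libcuda.so.1')"   # raises OSError if the library still can't be found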
Hullda answered 18/1, 2019 at 8:4 Comment(3)
Thanks! But it doesn't work. I ran your command and then updated the profile with `source /etc/profile` and `source bash_profile`, but neither works. Are there any other methods?Illinois
Thanks! It worked. But the path to add to `$LD_LIBRARY_PATH` has to be found manually.Illinois
You can put the above line in ~/.bashrc so that every time you open a terminal it will execute and update LD_LIBRARY_PATH.Hullda

Steps to follow:

First, find libcuda.so.1:

echo $LD_LIBRARY_PATH                  # current value of the search path
sudo find /usr/ -name 'libcuda.so.*'   # locate the library (and its version)

Then add its directory to $LD_LIBRARY_PATH (in my case /usr/local/cuda-10.0/compat) with the following command in the terminal:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.0/compat
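
If you'd rather not copy the path by hand, the two steps can be combined. A sketch, assuming find returns at least one match:

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$(dirname "$(sudo find /usr/ -name 'libcuda.so.1' | head -n 1)")"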
Spoilage answered 25/8, 2021 at 15:8 Comment(0)

Background

libcuda.so.1 is the library for interacting with the CUDA driver (as opposed to CUDA's "Runtime API", for which you need libcudart.so.*).
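
To see which of the two APIs a given binary actually links against, you can list its dynamic dependencies (my_app is a hypothetical example binary):

ldd ./my_app | grep -E 'libcuda|libcudart'   # driver API vs. runtime API dependencies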

Now, it's quite possible to have the CUDA Toolkit properly installed, without the driver being properly installed. And this error could be the result of building a (non-statically-linked) CUDA application in this situation.

Alternatively, it could be the case that there's some misconfiguration of the library search path - because normally, libcuda.so.* are supposed to be installed in some directory on that path!

So, what's on that search path? As explained here, it is:

  1. directories from $LD_LIBRARY_PATH
  2. directories from /etc/ld.so.conf
  3. /lib
  4. /usr/lib

A typical scenario would be for /etc/ld.so.conf to add, say, /usr/lib/x86_64-linux-gnu; and for libcuda.so.* to be there.
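
You can inspect both the configuration and the resulting linker cache directly (the paths below are the usual ones on most Linux distributions):

cat /etc/ld.so.conf /etc/ld.so.conf.d/*.conf   # directories added to the search path
ldconfig -p | grep libcuda                     # is libcuda.so.* already in the cache?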

Bottom line

Here's what you should do:

  1. Make sure an up-to-date CUDA driver has been properly installed. If it hasn't, download and install it; problem solved.
  2. Locate the libcuda.so.1 file (e.g. using locate). If it's been placed somewhere weird that's not in the library search path, act as in step 1.
  3. If you wanted the driver library installed someplace unusual, add that path to your user's $LD_LIBRARY_PATH (a persistent, system-wide alternative is sketched below).
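
If you'd rather not export $LD_LIBRARY_PATH in every shell, the directory can be registered with the dynamic linker once. A sketch, where /opt/odd/location stands in for wherever the library actually landed:

echo /opt/odd/location | sudo tee /etc/ld.so.conf.d/cuda-driver.conf
sudo ldconfig   # rebuild the linker cache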
Teucer answered 30/7, 2021 at 8:3 Comment(2)
Appreciate the background!Bluebottle
One common scenario where this may happen is not passing the --gpus argument to various docker commands, as explained here. That will result in the driver library not being present in your container.Mcavoy

In my case, I develop inside a Docker container environment, and I did the following:

  1. Confirm your Docker container is running with an NVIDIA GPU (see the check below the list)
  2. Find libcuda.so.1: sudo find /usr/ -name 'libcuda.so.*'
  3. Then add its directory to $LD_LIBRARY_PATH (in my case /usr/local/cuda-11.5/compat) with the following command, in the terminal: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.5/compat
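
A quick way to perform the check in step 1, assuming the NVIDIA Container Toolkit is installed on the host (the image tag is only an example; pick one matching your setup):

docker run --rm --gpus all nvidia/cuda:11.5.2-base-ubuntu20.04 nvidia-smi   # should print the GPU table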
Haney answered 23/3, 2023 at 7:39 Comment(1)
Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.Afflict

If you're trying to run the job inside a container, try starting it with nvidia-docker run instead of docker run. Additional instructions can be found here: https://github.com/NVIDIA/nvidia-docker
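
On Docker 19.03 and later, the same effect is achieved with the built-in --gpus flag, so the legacy nvidia-docker wrapper is no longer needed (<image> and <command> are placeholders):

docker run --gpus all <image> <command>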

Bunin answered 17/6, 2023 at 0:35 Comment(0)

First, put this in your notebook (Colab):

import os
# Use .get() so this also works when LD_LIBRARY_PATH is not set yet
os.environ['LD_LIBRARY_PATH'] = os.environ.get('LD_LIBRARY_PATH', '') + ':/usr/local/cuda-12.2/compat'

/usr/local/cuda-12.2/compat <--- this is my path (the directory containing libcuda.so.1)

then:

!pip install llama.cpp 

from Git, and then append the path of the new directory:

!export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:./llama.cpp/quantize
Hollo answered 5/2 at 22:2 Comment(1)
As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.Afflict

I received the same error in Google Colab. I resolved it by setting the runtime type to T4 GPU. A GPU runtime is required; otherwise no CUDA driver is available and you get this error.

Benioff answered 13/3 at 12:58 Comment(0)

Some hints for Docker container users:

CUDA Driver files (libcuda.so*) are usually duplicated in GPU-enabled Docker containers running on GPU-capable hosts: one copy comes from the host (placed there during GPU driver installation) and the other from the CUDA Forward Compatibility Package (which some ML packages need even to run on CPU-only hosts). In the official nvidia/cuda containers, the CUDA Driver files are shipped inside the /usr/local/cuda-{11.x|12.x}/compat/ folder, and they match the container's CUDA major and minor versions and the cuda-{11.x|12.x} subfolder name.

These two copies and potentially also two different CUDA versions - one from the host and one from the container - do not have to match as long as:

  1. the driver on the host is capable of supporting newer CUDA versions than the ones installed in the container AND
  2. we prepare a custom GPU-enabled container where we remove the entire /usr/../compat/ folder (see e.g. latestml/ml-gpu-* containers on Docker Hub).

The workaround from point 2) restores forward compatibility for CUDA on GPU-enabled hosts (the official nvidia/cuda containers require the major and minor CUDA versions to match between the host driver and the container). However, it breaks some ML/AI packages on CPU-only systems, which are left without the CUDA Driver files from either the host or the container. One example is llama.cpp, which seems to have been compiled against one of those CUDA Driver files: on CPU-only machines, running a container without the /usr/../compat folder produces essentially the same error message as in the question ("libcuda.so.1: cannot open shared object file"), even though llama.cpp is perfectly capable of running on CPU-only hosts (but, as we now see, only with the /usr/../compat folder present, containing the required GPU-specific dependencies):

RuntimeError: Failed to load shared library '/opt/conda/lib/python3.11/site-packages/llama_cpp/libllama.so': libcuda.so.1: cannot open shared object file: No such file or directory
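
To see which copies of the driver library a container actually has, you can list the usual locations from inside it (adjust the CUDA version and the architecture triplet to your image):

ls /usr/local/cuda-*/compat/libcuda.so*      # forward-compat copy shipped in the image
ls /usr/lib/x86_64-linux-gnu/libcuda.so*     # copy injected from the host driver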
Bluebottle answered 7/4 at 9:13 Comment(0)
