Failed to compile cuda_ndarray.cu: libcublas.so.7.5: cannot open shared object file

4

6

I am trying to import the theano library on an AWS instance to use the GPU. I have written a Python script using boto to automate the AWS setup; it essentially SSHes into the instance from my local machine and then starts a bash script in which I run "python -c 'import theano'" to start the GPU. But I get the following error:

ERROR (theano.sandbox.cuda): Failed to compile cuda_ndarray.cu: libcublas.so.7.5: cannot open shared object file: No such file or directory

When I import the theano module directly in the instance's command shell, it automatically starts using the GPU:

Using gpu device 0: GRID K520 (CNMeM is disabled)

I guess I am missing some other import that has to be made when importing through my automation Python script. What could the solution be?

Erminiaerminie answered 25/2, 2016 at 6:30 Comment(3)
Maybe an environment problem. Try python -c "import os; print(os.environ['PATH'])" and see if "/usr/local/cuda/bin" is in PATH. Khorma
Also check LD_LIBRARY_PATH to make sure that the CUDA libraries can be found at runtime. Apperceive
In my case, this was a problem caused by my CUDA installation strategy. It was creating libcublas.so.11 when I was trying to install libcublas.so.10. Using DEBIAN_FRONTEND=noninteractive apt -y install cuda-10-1 as part of my install script fixed the issue. Personally
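
The checks suggested in these comments could be run from inside the same session the automation script uses, so that the output reflects that session's environment rather than an interactive login. A minimal Python sketch; the helper name dirs_containing is hypothetical, and the library name is taken from the error message in the question:

```python
import os


def dirs_containing(lib_name, search_path):
    """Return the directories in a colon-separated search path that hold lib_name."""
    hits = []
    for directory in search_path.split(os.pathsep):
        # Skip empty entries, which do not name a directory.
        if directory and os.path.isfile(os.path.join(directory, lib_name)):
            hits.append(directory)
    return hits


# Print what this process actually sees, then look for the library from the error.
print("PATH =", os.environ.get("PATH", ""))
print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH", ""))
print(dirs_containing("libcublas.so.7.5", os.environ.get("LD_LIBRARY_PATH", "")))
```

An empty list on the last line would suggest that none of the directories in LD_LIBRARY_PATH holds the library for that session.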
6

I will try to explain this problem clearly and concisely, since I could not find a good answer for people who are new to Unix or unfamiliar with compilation and linking.

The problem has to do with dynamic linking, and it can be solved in two ways. The first is to set the LD_LIBRARY_PATH environment variable. Assuming CUDA is installed in /usr/local/cuda/, add the following to your environment file, /etc/environment:

LD_LIBRARY_PATH=/usr/local/cuda/lib64/

Or simply in your bashrc:

export LD_LIBRARY_PATH=/usr/local/cuda/lib64/
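
The dynamic linker reads LD_LIBRARY_PATH when a process starts, so an automation script can also pass the variable explicitly to the child process instead of relying on .bashrc, which non-interactive SSH sessions often do not source. A minimal Python sketch, assuming the CUDA libraries live in /usr/local/cuda/lib64; the helper name env_with_lib_dir is hypothetical:

```python
import os


def env_with_lib_dir(lib_dir, base_env=None):
    """Return a copy of base_env with lib_dir prepended to LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("LD_LIBRARY_PATH", "")
    # Avoid a trailing ':' when the variable was previously unset or empty.
    env["LD_LIBRARY_PATH"] = lib_dir + (":" + current if current else "")
    return env
```

On the instance, the result could be handed to the standard library, for example subprocess.run(["python", "-c", "import theano"], env=env_with_lib_dir("/usr/local/cuda/lib64")).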

Setting LD_LIBRARY_PATH is not recommended by Unix gurus (I am not one; I have just read that on the internet, and I follow the Linux gurus). So the solution I found is simple: modify the list of directories that the Linux dynamic linker searches for libraries by default. To do that (you have to be root):

cd /etc/ld.so.conf.d/

Then pick a file to edit, for example:

vi libc.conf 

Inside this file, just add the path to the CUDA lib64 directory:

/usr/local/cuda/lib64/

You would get something like this in the file:

# libc default configuration

/usr/local/lib

/usr/local/cuda/lib64/

And then just run:

sudo ldconfig
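
To confirm that the linker cache now knows about the library, the output of ldconfig -p can be inspected. A minimal Python sketch of that check; the function names are hypothetical, and it assumes the usual "name (flags) => path" layout of ldconfig -p output:

```python
import subprocess


def cached_paths(ldconfig_output, lib_prefix):
    """Extract the resolved paths of cache entries whose name starts with lib_prefix."""
    paths = []
    for line in ldconfig_output.splitlines():
        name, sep, target = line.strip().partition(" => ")
        # Lines without ' => ' (such as the summary header) are ignored.
        if sep and name.startswith(lib_prefix):
            paths.append(target)
    return paths


def cublas_in_cache():
    """Query the real linker cache; requires the ldconfig binary to be on PATH."""
    result = subprocess.run(
        ["ldconfig", "-p"], capture_output=True, text=True, check=True
    )
    return cached_paths(result.stdout, "libcublas.so")
```

If cublas_in_cache() returns an empty list after running ldconfig, the new directory was probably not picked up.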

I hope this answer helps people who are just starting out in programming, or who use high-level languages such as Python that rely on C code underneath (as Theano does), and who are not familiar with compilation, linking...

Iritis answered 28/2, 2017 at 15:15 Comment(0)
4

I faced the same error on Ubuntu 16.04 with cuda 7.5 and found the solution here:

  1. CUDA 7.5 doesn't support the default g++ version. Install a supported version and make it the default:

    sudo apt-get install g++-4.9
    
    sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 20
    sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-5 10
    
    sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.9 20
    sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-5 10
    
    sudo update-alternatives --install /usr/bin/cc cc /usr/bin/gcc 30
    sudo update-alternatives --set cc /usr/bin/gcc
    
    sudo update-alternatives --install /usr/bin/c++ c++ /usr/bin/g++ 30
    sudo update-alternatives --set c++ /usr/bin/g++
    
  2. Work around a glibc bug - create .theanorc in the home directory with the following settings:

    [global]
    device=gpu
    floatX=float32
    
    [nvcc]
    flags=-D_FORCE_INLINES
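
When the instance is provisioned by a script, the .theanorc settings above could be generated rather than typed by hand. A minimal Python sketch using the standard configparser module; the function name render_theanorc is hypothetical, and writing the result to ~/.theanorc is left to the caller:

```python
import configparser
import io


def render_theanorc():
    """Build the .theanorc text for the settings listed above."""
    parser = configparser.ConfigParser()
    # Keep option names case-sensitive; the default would turn floatX into floatx.
    parser.optionxform = str
    parser["global"] = {"device": "gpu", "floatX": "float32"}
    parser["nvcc"] = {"flags": "-D_FORCE_INLINES"}
    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


print(render_theanorc())
```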
    

And don't forget to check the environment variables: PATH should contain your CUDA bin folder, and CUDA_HOME should point to the CUDA installation directory.

I've added them to my .bashrc file this way:

export PATH="/usr/local/cuda/bin:$PATH"
export CUDA_HOME="/usr/local/cuda"
Outfitter answered 18/8, 2016 at 18:58 Comment(0)
3

I had a similar problem recently and spent ages figuring out what was going wrong (to the point that I corrupted my Linux install and had to do a fresh one).

A potential solution for this error is to delete the .theano/ directory that is (possibly) located in your home directory:

sudo rm -rf ~/.theano

To prevent this error from happening again, do not run your scripts as the root user (i.e. run them without sudo).

Running a script as root will create the hidden directory with root permissions, making it inaccessible to other processes.
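
One way to spot this situation before deleting anything is to check who owns the directory. A minimal Python sketch for Unix systems; the function name cache_dir_status is hypothetical, and ~/.theano is the location mentioned above:

```python
import os


def cache_dir_status(path):
    """Classify path as 'missing', 'owned' (by this user), or 'other-owner'."""
    try:
        info = os.stat(path)
    except FileNotFoundError:
        return "missing"
    return "owned" if info.st_uid == os.getuid() else "other-owner"


print(cache_dir_status(os.path.expanduser("~/.theano")))
```

A result of other-owner when run as your normal user would be consistent with the directory having been created by root.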

Adaminah answered 9/3, 2016 at 16:20 Comment(0)
0

On the suggestion of Kumar here, I did

sudo ldconfig /usr/local/cuda/lib64

And it magically started working. Thanks Kumar!

Knudson answered 28/7, 2017 at 16:44 Comment(0)
