TensorFlow libdevice not found. Why is it not found in the searched path?

Win 10 64-bit 21H1; TF2.5, CUDA 11 installed in environment (Python 3.9.5 Xeus)

I am not the only one seeing this error; see also (unanswered) here and here. The issue is obscure and the proposed resolutions are unclear or don't seem to work (see e.g. here).

Issue: Using the TF Linear_Mixed_Effects_Models.ipynb example (downloaded from the TensorFlow GitHub here), execution reaches the point of performing the "warm up stage" and then throws the error:

InternalError: libdevice not found at ./libdevice.10.bc [Op:__inference_one_e_step_2806]

The console contains this output showing that TF finds the GPU, but XLA initialisation fails to find the (existing!) libdevice in the specified paths:

2021-08-01 22:04:36.691300: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9623 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2021-08-01 22:04:37.080007: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
2021-08-01 22:04:54.122528: I tensorflow/compiler/xla/service/service.cc:169] XLA service 0x1d724940130 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-08-01 22:04:54.127766: I tensorflow/compiler/xla/service/service.cc:177]   StreamExecutor device (0): NVIDIA GeForce GTX 1080 Ti, Compute Capability 6.1
2021-08-01 22:04:54.215072: W tensorflow/compiler/tf2xla/kernels/random_ops.cc:241] Warning: Using tf.random.uniform with XLA compilation will ignore seeds; consider using tf.random.stateless_uniform instead if reproducible behavior is desired.
2021-08-01 22:04:55.506464: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:73] Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice. This may result in compilation or runtime failures, if the program we try to run uses routines from libdevice.
2021-08-01 22:04:55.512876: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74] Searched for CUDA in the following directories:
2021-08-01 22:04:55.517387: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77]   C:/Users/Julian/anaconda3/envs/TF250_PY395_xeus/Library/bin
2021-08-01 22:04:55.520773: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77]   C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.2
2021-08-01 22:04:55.524125: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77]   .
2021-08-01 22:04:55.526349: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:79] You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions.  For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.

Now the interesting thing is that the paths searched include "C:/Users/Julian/anaconda3/envs/TF250_PY395_xeus/Library/bin".

The content of that folder includes all the DLLs successfully loaded at TF startup, including cudart64_110.dll, cudnn64_8.dll... and of course libdevice.10.bc.

Question: Since TF says it is searching this location for this file, and the file exists there, what is wrong and how do I fix it?

(NB C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.2 does not exist... CUDA is installed in the environment; this path must be a best guess for an OS installation.)

Info: I am setting the path with:

import os

aPath = '--xla_gpu_cuda_data_dir=C:/Users/Julian/anaconda3/envs/TF250_PY395_xeus/Library/bin'
print(aPath)
os.environ['XLA_FLAGS'] = aPath  # set before TensorFlow/XLA initialises

but I have also set an OS environment variable XLA_FLAGS to the same string value. I don't know which one is actually taking effect yet, but the fact that the console output says it searched the intended path is good enough.

Woodchuck answered 1/8, 2021 at 21:37 Comment(2)
Your eventual solution at discuss.tensorflow.org/t/… solved the issue for me, thanks. – Menstruum
I had forgotten about that... glad it helped. – Woodchuck

The diagnostic information is unclear and thus unhelpful; there is, however, a resolution.

The issue was resolved by providing the file (as a copy) at this path:

C:\Users\Julian\anaconda3\envs\TF250_PY395_xeus\Library\bin\nvvm\libdevice\

Note that C:\Users\Julian\anaconda3\envs\TF250_PY395_xeus\Library\bin was the path given to XLA_FLAGS, but it seems TF is not looking for the libdevice file there: it is looking for the \nvvm\libdevice\ subpath. This means that I can't just set a different value in XLA_FLAGS to point at the actual location of the libdevice file because, to coin a phrase, it's not (just) the file it's looking for.

The debug info earlier:

2021-08-05 08:38:52.889213: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:73] Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice. This may result in compilation or runtime failures, if the program we try to run uses routines from libdevice.
2021-08-05 08:38:52.896033: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74] Searched for CUDA in the following directories:
2021-08-05 08:38:52.899128: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77]   C:/Users/Julian/anaconda3/envs/TF250_PY395_xeus/Library/bin
2021-08-05 08:38:52.902510: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77]   C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.2
2021-08-05 08:38:52.905815: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77]   .

is incorrect insofar as there is no "CUDA" directory in the search path; and FWIW I think a different error should have been given for searching in C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.2, since there is no such folder (there's an old v10.0 folder there, but no OS install of CUDA 11).

Until/unless path handling is improved in TensorFlow, this file-structure manipulation is needed in every new (Anaconda) Python environment.
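To avoid doing the copy by hand in each new environment, the step can be scripted. A minimal sketch, assuming the layout described above (libdevice.10.bc sits directly in the env's Library\bin and XLA wants it under <dir>\nvvm\libdevice\); the function name and the example path are illustrative:

```python
import os
import shutil

def stage_libdevice(cuda_data_dir):
    """Copy libdevice.10.bc from cuda_data_dir into the
    cuda_data_dir/nvvm/libdevice/ layout that XLA searches for.
    Returns the path of the newly created copy."""
    src = os.path.join(cuda_data_dir, "libdevice.10.bc")
    dest_dir = os.path.join(cuda_data_dir, "nvvm", "libdevice")
    os.makedirs(dest_dir, exist_ok=True)  # create nvvm\libdevice if missing
    return shutil.copy2(src, dest_dir)

# e.g. (the path from this answer; adjust to your own environment):
# stage_libdevice(r"C:\Users\Julian\anaconda3\envs\TF250_PY395_xeus\Library\bin")
```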

Full thread in the TensorFlow forum here.

Woodchuck answered 6/8, 2021 at 7:13 Comment(1)

The following worked for me, given the error message:

error: Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice

First I searched for the nvvm directory and then verified that the libdevice directory existed:

$ find / -type d -name nvvm 2>/dev/null
/usr/lib/cuda/nvvm
$ cd /usr/lib/cuda/nvvm
/usr/lib/cuda/nvvm$ ls
libdevice
/usr/lib/cuda/nvvm$ cd libdevice
/usr/lib/cuda/nvvm/libdevice$ ls
libdevice.10.bc

Then I exported the environment variable:

export XLA_FLAGS=--xla_gpu_cuda_data_dir=/usr/lib/cuda

as shown by @Insectatorious above. This solved the error and I was able to run the code.
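The find/export sequence above can also be done from Python before importing TensorFlow. A sketch under the same assumptions (CUDA under /usr/lib/cuda or /usr/local/cuda*, with libdevice.10.bc at <dir>/nvvm/libdevice/); the function name is made up for illustration:

```python
import glob
import os

def find_cuda_data_dir(prefixes=("/usr/lib/cuda", "/usr/local/cuda*")):
    """Return the first directory containing nvvm/libdevice/libdevice.10.bc;
    that directory is what --xla_gpu_cuda_data_dir should point at."""
    for pattern in prefixes:
        for hit in glob.glob(os.path.join(pattern, "nvvm", "libdevice", "libdevice.10.bc")):
            # strip the trailing nvvm/libdevice/libdevice.10.bc back to the root
            return os.path.dirname(os.path.dirname(os.path.dirname(hit)))
    return None

cuda_dir = find_cuda_data_dir()
if cuda_dir:
    os.environ["XLA_FLAGS"] = "--xla_gpu_cuda_data_dir=" + cuda_dir
```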

Hudgins answered 6/5, 2022 at 18:1 Comment(2)
The above approach did not work for me when using a custom environment on a cluster. Probably TensorFlow expects to locate the nvvm folder in your custom environment, and somehow conda install cudatoolkit did not create one. What worked for me was to conda install -c nvidia cuda-nvcc and then export the path of the cuda-nvcc folder as done above. – Kingkingbird
I'm using Windows, so I had to add the path with quotes around it: os.environ['XLA_FLAGS'] = '--xla_gpu_cuda_data_dir="/mnt/c/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2"' – Transfigure

For Windows users:

Step-1

run (as administrator)

conda install -c anaconda cudatoolkit

you can specify the cudatoolkit version as per your installed cuDNN/supported CUDA version, e.g. conda install -c anaconda cudatoolkit=10.2.89

Step-2

go to the installed conda folder

C:\ProgramData\Anaconda3\Library\bin

Step-3

locate "libdevice.10.bc" and copy the file

Step-4

create a folder named "nvvm" inside bin

create another folder named "libdevice" inside nvvm

paste the "libdevice.10.bc" file inside "libdevice"

Step-5

go to environment variables

System variables >New

variable name:

XLA_FLAGS

variable value:

--xla_gpu_cuda_data_dir=C:\ProgramData\Anaconda3\Library\bin

(edit above as per your directory)

Step-6 restart the cmd/virtual env
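A small sanity check before launching TensorFlow can confirm Steps 2-5 left the layout XLA expects. A sketch (the C:\ProgramData\Anaconda3 path follows the steps above; the function name is illustrative, and the in-process environment assignment is an alternative to the Step-5 system variable):

```python
import os

def check_libdevice(cuda_dir):
    """Return True if cuda_dir/nvvm/libdevice/libdevice.10.bc is in place,
    i.e. Steps 2-4 above were completed for this directory."""
    expected = os.path.join(cuda_dir, "nvvm", "libdevice", "libdevice.10.bc")
    return os.path.isfile(expected)

cuda_dir = r"C:\ProgramData\Anaconda3\Library\bin"  # edit as per your directory
if check_libdevice(cuda_dir):
    # same effect as the Step-5 system variable, for the current process only
    os.environ["XLA_FLAGS"] = "--xla_gpu_cuda_data_dir=" + cuda_dir
```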

Amity answered 16/5, 2022 at 8:44 Comment(4)
This would be more useful if the reasons behind the specific steps were provided - understanding is reusable :) – Woodchuck
Follow this thread for more information: discuss.tensorflow.org/t/… – Amity
SE answers should seek to be self-contained. And you referred me to the thread that I started. My comments boil down to: what are you adding to the answer I gave previously and why does it matter? What is distinctive about your contribution? Answering that will help others decide which answer is most relevant to their needs. – Woodchuck
Yep. The program uses the XLA_FLAGS environment variable to get the path of CUDA_DIR, but it is unable to access the .bc file outside the Anaconda env. So I installed CUDA in Anaconda, then created /nvvm/libdevice (as the program says $CUDA_DIR/nvvm/libdevice) and placed the file there. As for why it is unable to access it outside of Anaconda: I tried to find which module is extracting the file, but couldn't, so that's all the explanation I have. – Amity

For those using Windows and PowerShell, assuming CUDA is in C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.7:

The environment can be set as:

$env:XLA_FLAGS="--xla_gpu_cuda_data_dir='C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.7'"

Note that the nested single quotes inside the double quotes are required!

I think this may be the lightest way to deal with this XLA bug.

Increasing answered 27/9, 2022 at 16:3 Comment(1)
Just in case someone wishes to manually enter the system variable in Windows 10, here is how: go to the "Type here to search" box, search for Environment variables -> System Variables -> New -> for Variable name put XLA_FLAGS, for Variable value put "--xla_gpu_cuda_data_dir='C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2'" without the double quotation marks on the outside (assuming your CUDA version is 12.2). – Dill

For those using miniconda, just copy the file libdevice.10.bc into the root folder of the Python application or notebook.

It works here using python=3.9, cudatoolkit=11.2, cudnn=8.1.0, and tensorflow==2.9
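A sketch of that copy, assuming the file lives somewhere under the active env's $CONDA_PREFIX (the exact subfolder differs between Linux and Windows conda layouts; the function name and the two candidate subfolders are assumptions):

```python
import os
import shutil

def copy_libdevice_to_cwd(conda_prefix=None, dest="."):
    """Find libdevice.10.bc under the conda env and copy it next to the
    application/notebook; returns the new path, or None if not found."""
    prefix = conda_prefix or os.environ.get("CONDA_PREFIX", "")
    for sub in ("lib", os.path.join("Library", "bin")):  # Linux vs Windows layout
        src = os.path.join(prefix, sub, "libdevice.10.bc")
        if os.path.isfile(src):
            return shutil.copy2(src, dest)
    return None
```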

Adulterous answered 12/12, 2022 at 16:44 Comment(1)
On Linux, you may also set the XLA_FLAGS with the following command: export XLA_FLAGS=--xla_gpu_cuda_data_dir=$CONDA_PREFIX. It worked with python3.10 and cudatoolkit=11.2. – Duky

I had the same problem on a fresh install of Ubuntu 24.04 with an Nvidia RTX 3090. I used the instructions from this page: https://www.tensorflow.org/install/pip, but I couldn't run model.fit because it gave me the error. In an attempt to resolve the issue I then installed this driver: NVIDIA-Linux-x86_64-525.105.17.run, but it didn't help.

I believe this actually solved the issue:

sudo apt-get install cuda-toolkit
Tamasha answered 12/4, 2023 at 21:10 Comment(0)

I met the same error with TensorFlow 2.11, CUDA 11.2, cuDNN 8.1.0. Because I used conda to build the env, there is no nvvm directory and no environment variable to export, and I can't use the command nvcc -V, so many of the suggestions I found were not suitable for my problem. I solved the error by downgrading TensorFlow to 2.10: use conda install tensorflow=2.10.0 cudatoolkit cudnn to downgrade your TensorFlow version and its dependencies. Reference: https://github.com/tensorflow/tensorflow/issues/58681

Castle answered 3/4, 2023 at 8:44 Comment(0)

In my case I noticed that there was an error regarding Adam at the final line:

libdevice not found at ./libdevice.10.bc
         [[{{node Adam/StatefulPartitionedCall_88}}]] [Op:__inference_train_function_10134]

I changed this line: from keras.optimizers import Adam

to this: from keras.optimizers.legacy import Adam

and it worked. It was suggested in this link: https://github.com/keras-team/tf-keras/issues/62

There are some other suggestions for this kind of error.

Roommate answered 19/10, 2023 at 21:22 Comment(0)

For Linux users with tensorflow==2.8, add the following environment variable:

XLA_FLAGS=--xla_gpu_cuda_data_dir=/usr/local/cuda-11.4
Mielke answered 31/3, 2022 at 12:23 Comment(1)
Are you installing CUDA toolkit in the base environment? The current tutorial asks to install within a tf environment, and I am still having this problem on Linux - tensorflow.org/install/pip#windows-wsl2 – Spohr

I had the same issue on Ubuntu 22.04 using tensorflow in jupyter-lab. The following steps solve the problem:

  1. Copy libdevice.10.bc to working folder (where jupyter lab is started)
  2. export XLA_FLAGS=--xla_gpu_cuda_data_dir=/usr/lib/cuda/

or

  3. In Python code (before training): os.environ["XLA_FLAGS"] = "--xla_gpu_cuda_data_dir=/usr/lib/cuda/"
Unctuous answered 29/2, 2024 at 21:52 Comment(2)
1 + 2, or 3 alone, seems to work. No need to do 2 and 3 at the same time. – Maronite
Thanks, I've incorporated that in my answer. – Unctuous

I was having a similar error:

2024-07-02 14:11:12.392126: W external/local_xla/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:510] 
Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice. This may result in 
compilation or runtime failures, if the program we try to run uses routines 
from libdevice.
Searched for CUDA in the following directories:
  ./cuda_sdk_lib
  /usr/local/cuda-12.3
  /usr/local/cuda
  /home/spotparking/.local/lib/python3.10/site-packages/tensorflow/python/platform/../../../nvidia/cuda_nvcc
  /home/spotparking/.local/lib/python3.10/site-packages/tensorflow/python/platform/../../../../nvidia/cuda_nvcc
  .

You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions.  

For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.

and following the instructions in the error message worked for me.

I first had to install nvidia-cuda-nvcc by running:

python3 -m pip install nvidia-pyindex
python3 -m pip install nvidia-cuda-nvcc

I then ran this command to find the path to cuda_nvcc:

find / -type d -name "cuda_nvcc" 2>/dev/null

I copied that path, and then I exported the environment variable with:

export XLA_FLAGS=--xla_gpu_cuda_data_dir=/copied/path/to/cuda_nvcc
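Instead of a filesystem-wide find, the pip-installed package can also be located from Python's site-packages. A sketch, assuming the nvidia-cuda-nvcc wheel lays its files out under <site-packages>/nvidia/cuda_nvcc (as the search list in the warning suggests); the function name is made up for illustration:

```python
import os
import site

def find_cuda_nvcc(candidates=None):
    """Return the nvidia/cuda_nvcc directory inside site-packages, or None."""
    if candidates is None:
        candidates = site.getsitepackages() + [site.getusersitepackages()]
    for sp in candidates:
        path = os.path.join(sp, "nvidia", "cuda_nvcc")
        if os.path.isdir(path):
            return path
    return None

path = find_cuda_nvcc()
if path:
    os.environ["XLA_FLAGS"] = "--xla_gpu_cuda_data_dir=" + path
```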
Sanders answered 2/7, 2024 at 21:4 Comment(0)
