- OS: Ubuntu 22.04.1
- Python 3.8.1 (Conda)
- GPU: RTX4090
- Nvidia driver: 530.30.02
While setting up a deep learning environment, I found that torch.cuda.is_available() always returns False in PyTorch. I tried changing the PyTorch version many times: the CPU version installs successfully, but the GPU version cannot be installed. CUDA may have been installed incorrectly on this server before (nvcc --version does not work, but I can see many files with names like CUDA-11.4), so I deleted the old files and tried to install CUDA 12.1. The CUDA installation still failed.
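To make the symptom easier to reproduce, here is a minimal sketch of the check I run inside the Conda environment. The helper name cuda_report is only for illustration; it uses the standard library plus PyTorch if it happens to be installed, and it only reads state.

```python
import importlib.util
import shutil


def cuda_report():
    """Collect basic facts about the PyTorch/CUDA setup as a dict."""
    report = {
        # True when the executable can be found through PATH.
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        "nvcc_on_path": shutil.which("nvcc") is not None,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_cuda_build": None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        import torch

        # torch.version.cuda is None for CPU-only builds of PyTorch.
        report["torch_cuda_build"] = torch.version.cuda
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If torch_cuda_build prints None, the installed PyTorch is a CPU-only build, and torch.cuda.is_available() will be False regardless of the driver.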
When I first checked nvidia-smi, the output was:
Mon Apr 24 11:16:34 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02              Driver Version: 530.30.02    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090         On | 00000000:05:00.0 Off |                  Off |
|  0%   42C    P8               12W / 450W|      1MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
It shows that the current NVIDIA driver version is 530.30.02 and that the maximum supported CUDA version is 12.1. I then downloaded CUDA 12.1 and tried to install it with the following commands:
wget https://developer.download.nvidia.com/compute/cuda/12.1.1/local_installers/cuda_12.1.1_530.30.02_linux.run
sudo sh cuda_12.1.1_530.30.02_linux.run
The installer then showed a screen like this: [screenshot: CUDA Installer]. I continued with the default options, without changing anything, and it reported:
Installation failed. See log at /var/log/cuda-installer.log for details.
Then I opened cuda-installer.log:
[screenshot: cuda-installer.log]
The first line says 'Driver not installed', but nvidia-smi shows that the driver is installed. Why?
Then I tried again with the driver deselected in the CUDA installer: [screenshot: Not installing Driver]. This time it printed the following warnings:
===========
= Summary =
===========
Driver: Not Selected
Toolkit: Installed in /usr/local/cuda-12.1/
Please make sure that
- PATH includes /usr/local/cuda-12.1/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-12.1/lib64, or, add /usr/local/cuda-12.1/lib64 to /etc/ld.so.conf and run ldconfig as root
To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-12.1/bin
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 530.00 is required for CUDA 12.1 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
sudo <CudaInstaller>.run --silent --driver
At this point nvidia-smi still works, but nvcc --version prints command not found.
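Since the installer summary asks for /usr/local/cuda-12.1/bin to be on PATH, a quick way to tell "nvcc is missing" apart from "nvcc is not on PATH" could look like the sketch below. The function name nvcc_status is hypothetical, and the directory is taken from the summary above; it would need adjusting if the toolkit lives elsewhere.

```python
import os
import shutil

# Toolkit bin directory as reported in the installer summary.
CUDA_BIN = "/usr/local/cuda-12.1/bin"


def nvcc_status(cuda_bin=CUDA_BIN, path=None):
    """Report whether nvcc exists on disk and whether PATH can reach it."""
    if path is None:
        path = os.environ.get("PATH", "")
    nvcc_file = os.path.join(cuda_bin, "nvcc")
    return {
        # The binary is present in the toolkit directory.
        "nvcc_exists": os.path.isfile(nvcc_file),
        # The toolkit directory is listed in the search path.
        "bin_dir_on_path": cuda_bin in path.split(os.pathsep),
        # A lookup through the search path finds some nvcc.
        "nvcc_resolves": shutil.which("nvcc", path=path) is not None,
    }


if __name__ == "__main__":
    for key, value in nvcc_status().items():
        print(f"{key}: {value}")
```

If nvcc_exists is True while nvcc_resolves is False, the toolkit is present and only PATH needs to be updated in the current shell.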
Then I tried another way of installing CUDA:
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.1.1/local_installers/cuda-repo-ubuntu2204-12-1-local_12.1.1-530.30.02-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-12-1-local_12.1.1-530.30.02-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-12-1-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda
This doesn't work either. The output is:
(base) root@6f0f4f1d5e21:~/zyx/test# sudo apt-get -y install cuda
Reading package lists... Done
Building dependency tree
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:
The following packages have unmet dependencies:
cuda : Depends: cuda-12-1 (>= 12.1.1) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
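Because apt mentions held packages, one thing worth checking is whether anything is actually on hold. Below is a small sketch, assuming apt-mark is available on the system; held_packages is a hypothetical helper name, and it returns None when apt-mark cannot be found.

```python
import shutil
import subprocess


def held_packages():
    """Return the package names that apt has on hold, or None without apt-mark."""
    if shutil.which("apt-mark") is None:
        return None
    # "apt-mark showhold" prints one held package name per line.
    result = subprocess.run(
        ["apt-mark", "showhold"],
        capture_output=True,
        text=True,
        check=False,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    print(held_packages())
```

An empty list would suggest that the unmet dependency on cuda-12-1 comes from a conflict between packages rather than from an explicit hold.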