...
NVIDIA tools are available in /usr/local/cuda-11.48/bin/. You can add them to PATH as follows:
Code Block |
---|
$ export PATH=$PATH:/usr/local/cuda-11.48/bin/ |
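To confirm that the export took effect, a small check along these lines may help. This is only a sketch: the helper name `dir_on_path` is hypothetical, and the directory is simply the one quoted on this page.

```python
import os
import shutil


def dir_on_path(directory, path_value=None):
    """Return True if `directory` is one of the entries of a PATH-style string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    wanted = os.path.normpath(directory)
    entries = [p for p in path_value.split(os.pathsep) if p]
    return any(os.path.normpath(p) == wanted for p in entries)


if __name__ == "__main__":
    cuda_bin = "/usr/local/cuda-11.48/bin/"  # directory quoted on this page
    print("on PATH:", dir_on_path(cuda_bin))
    # shutil.which returns None when the tool cannot be found on PATH
    print("nvcc resolves to:", shutil.which("nvcc"))
```

Paths are normalized before comparison, so a trailing slash in either the PATH entry or the argument does not affect the result.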
Libraries
The CUDA version is currently 11.4. It must match the installed drivers and therefore cannot be changed. TensorFlow compatibility with CUDA versions is listed at: https://www.tensorflow.org/install/source#gpu. We have tested that TensorFlow versions later than 2.6.1 work.
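The version constraints above can be checked in a script. The following is a minimal sketch, assuming plain numeric version strings (no `rc` or `dev` suffixes); the function names are hypothetical and not part of any NVIDIA or TensorFlow API.

```python
import re


def parse_version(text):
    """Turn a dotted version such as '2.13.1' into a tuple of ints: (2, 13, 1).

    Assumption: the string contains only digits and dots.
    """
    return tuple(int(part) for part in text.split("."))


def nvcc_release(nvcc_output):
    """Extract the 'release X.Y' number from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def tensorflow_is_tested(tf_version, minimum="2.6.1"):
    """True if tf_version is strictly newer than the tested lower bound."""
    return parse_version(tf_version) > parse_version(minimum)
```

Comparing integer tuples rather than raw strings matters here: as strings, "2.10.0" sorts before "2.6.1", while as tuples (2, 10, 0) is correctly newer than (2, 6, 1).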
...
Code Block |
---|
$ nvidia-smi
Mon Jan 8 10:24:59 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.161.03   Driver Version: 470.161.03   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTXA6000...  On   | 00000000:00:05.0 Off |                    0 |
| N/A   N/A    P8    N/A /  N/A |   3712MiB / 48895MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
$ python3 --version
Python 3.8.18
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Mon_Oct_11_21:27:02_PDT_2021
Cuda compilation tools, release 11.4, V11.4.152
Build cuda_11.4.r11.4/compiler.30521435_0
$ whereis cuda
cuda: /usr/local/cuda
$ cat /home/<USERNAME>/miniforge3/envs/ML/include/cudnn.h
. . .
/* cudnn : Neural Networks Library */
#if !defined(CUDNN_H_)
#define CUDNN_H_
#include <cuda_runtime.h>
#include <stdint.h>
#include "cudnn_version.h"
#include "cudnn_ops_infer.h"
#include "cudnn_ops_train.h"
#include "cudnn_adv_infer.h"
#include "cudnn_adv_train.h"
#include "cudnn_cnn_infer.h"
#include "cudnn_cnn_train.h"
#include "cudnn_backend.h"
#if defined(__cplusplus)
extern "C" {
#endif
#if defined(__cplusplus)
}
#endif
#endif /* CUDNN_H_ */
$ conda list | grep tensorflow
tensorflow                2.13.1          cuda118py38h409af0c_1    conda-forge
tensorflow-base           2.13.1          cuda118py38h52ca5c6_1    conda-forge
tensorflow-estimator      2.13.1          cuda118py38ha2f8a09_1    conda-forge
tensorflow-gpu            2.13.1          cuda118py38h0240f8b_1    conda-forge
$ conda list | grep keras
keras                     2.13.1          pyhd8ed1ab_0             conda-forge
$ python
>>> import tensorflow as tf
>>> tf.test.is_built_with_cuda()
True
>>> tf.config.list_physical_devices('GPU')
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
>>> print(tf.__version__)
2.13.1
# (OPTIONAL) If you installed PyTorch using the optional conda installation above, check it:
$ python
>>> import torch
>>> print(torch.__version__)  # Print PyTorch version
2.2.0
>>> print(torch.cuda.is_available())  # Check if CUDA is available
True
>>> print(torch.version.cuda)  # Print the CUDA version PyTorch is using
11.8
>>> if torch.cuda.is_available():
...     # Create a tensor and move it to GPU
...     x = torch.tensor([1.0, 2.0]).cuda()
...     print(x)  # Print the tensor to verify it's on the GPU
... else:
...     print("CUDA is not available. Check your PyTorch installation.")
...
tensor([1., 2.], device='cuda:0') |
Using Docker
If you want to use GPUs in Docker, you need to take a few extra steps after creating the VM.
Install Docker
In Ubuntu:
Code Block |
---|
sudo apt install -y docker.io
sudo usermod -aG docker $USER |
In CentOS:
Code Block |
---|
sudo yum-config-manager \
    --add-repo \
    https://download.docker.com/linux/centos/docker-ce.repo
sudo yum install docker-ce docker-ce-cli containerd.io
sudo systemctl --now enable docker
sudo usermod -aG docker $USER |
- Log out and log in again so that the new docker group membership takes effect.
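After logging back in, you can check whether the current session has picked up the group. The following is a sketch for Linux only, using the standard `grp` and `os` modules; the helper names are hypothetical.

```python
import grp
import os


def current_group_names():
    """Return the names of the groups the current process belongs to."""
    names = set()
    for gid in set(os.getgroups()) | {os.getegid()}:
        try:
            names.add(grp.getgrgid(gid).gr_name)
        except KeyError:
            # This gid has no entry in the group database; skip it
            pass
    return names


def in_docker_group(names=None):
    """True if 'docker' is among the given (or the current) group names."""
    if names is None:
        names = current_group_names()
    return "docker" in names


if __name__ == "__main__":
    print("docker group active in this session:", in_docker_group())
```

If this prints `False` right after running `usermod`, the session most likely predates the group change.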
Install nvidia-container toolkit
Ubuntu:
Code Block |
---|
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker |
CentOS:
Code Block |
---|
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
    && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.repo | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
sudo yum clean expire-cache && sudo yum install -y nvidia-docker2
sudo systemctl restart docker |
Run a GPU-compatible notebook. For example:
Code Block |
---|
# might need sudo
sudo docker run --gpus all --env NVIDIA_DISABLE_REQUIRE=1 -it --rm -v $(realpath ~/notebooks):/tf/notebooks -p 8888:8888 tensorflow/tensorflow:latest-gpu-jupyter |
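If you launch the notebook container from a script, the same command can be assembled as an argument list. This is a minimal sketch: `build_notebook_command` and its parameters are hypothetical, and it uses only the flags and image name from the example above. It builds the command without running it.

```python
import os
import shlex


def build_notebook_command(notebook_dir="~/notebooks", port=8888, use_sudo=True):
    """Build the `docker run` argument list for the GPU Jupyter image."""
    # Same effect as $(realpath ~/notebooks) in the shell example
    host_dir = os.path.realpath(os.path.expanduser(notebook_dir))
    cmd = [
        "docker", "run",
        "--gpus", "all",
        "--env", "NVIDIA_DISABLE_REQUIRE=1",
        "-it", "--rm",
        "-v", f"{host_dir}:/tf/notebooks",
        "-p", f"{port}:8888",  # host port : Jupyter port inside the container
        "tensorflow/tensorflow:latest-gpu-jupyter",
    ]
    return ["sudo"] + cmd if use_sudo else cmd


if __name__ == "__main__":
    # Print a copy-pasteable command line; pass the list to subprocess.run to execute it
    print(shlex.join(build_notebook_command()))
```

Keeping the command as a list avoids shell quoting problems when the notebook directory contains spaces.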