Setting Up CUDA 11.8 and PyTorch on Ubuntu 20.04 with Secure Boot Enabled

Preamble

I recently got a fresh daily-driver laptop which happens to have Secure Boot enabled. Predictably, this added yet another complication to getting set up with CUDA and PyTorch. Hopefully the following reference helps someone else out, or at least future me if I ever need it again.

How To

1. Purge the system of NVIDIA and CUDA packages in case of a previous failed install

sudo rm /etc/apt/sources.list.d/cuda*
sudo apt remove --autoremove nvidia-cuda-toolkit
sudo apt remove --autoremove nvidia-*
sudo rm -rf /usr/local/cuda*
sudo apt-get purge nvidia*
sudo apt-get update
sudo apt-get autoremove
sudo apt-get autoclean
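To confirm the purge actually took, listing installed packages should now come back empty (assuming nothing else on the system intentionally ships NVIDIA packages):

dpkg -l | grep -i nvidia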

2. Create keys for driver install

mkdir -p /home/$USER/cuda_install
cd /home/$USER/cuda_install
openssl req -new -x509 -newkey rsa:2048 -keyout /home/$USER/cuda_install/Nvidia.key -outform DER -out /home/$USER/cuda_install/Nvidia.der -nodes -days 36500 -subj "/CN=Graphics Drivers"
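If you want to double-check what was just generated, openssl can dump the certificate's subject and validity period back out of the DER file:

openssl x509 -inform DER -in /home/$USER/cuda_install/Nvidia.der -noout -subject -dates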

3. Enroll the public key you just created in the Machine Owner Key (MOK) database.

sudo mokutil --import /home/$USER/cuda_install/Nvidia.der

Here you will be prompted to create a password. Reboot your system, and on the next startup the MOK manager will prompt you for that password to complete the enrollment.

sudo reboot
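Once you're back in, you can verify the enrollment; mokutil should report that the key is already enrolled:

mokutil --test-key /home/$USER/cuda_install/Nvidia.der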

4. Get the driver that is packaged with the CUDA toolkit as a separate run file.

For me the run file, cuda_11.8.0_520.61.05_linux.run, is provided through NVIDIA's CUDA 11.8 download page. We can see from the file name that we are installing CUDA 11.8.0 with device driver 520.61.05.

Now we need to locate the NVIDIA driver's run file on its own and sign it with the keys created earlier, as for some reason I couldn't get this to work by signing through the CUDA toolkit installer directly. Googling "NVIDIA 520.61.05 download" should suffice. Once you've downloaded it, put it in /home/$USER/cuda_install.
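For what it's worth, NVIDIA's standalone Linux drivers are usually served from a predictable URL; the link below is an assumption based on that pattern, so verify it against the official download page before relying on it.

cd /home/$USER/cuda_install
wget https://us.download.nvidia.com/XFree86/Linux-x86_64/520.61.05/NVIDIA-Linux-x86_64-520.61.05.run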

5. Installation

Before we install the driver we need to disable nouveau, the default open-source kernel driver.

echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf; sudo update-initramfs -u

And reboot to have the changes take effect.

sudo reboot
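After this reboot, confirm nouveau is no longer loaded (the command should print nothing):

lsmod | grep nouveau

If it still shows up, the heavier-handed fallback is blacklisting the module outright, the same approach NVIDIA's installation guide documents; a sketch of it:

printf "blacklist nouveau\noptions nouveau modeset=0\n" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
sudo update-initramfs -u
sudo reboot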

Download and run the CUDA toolkit run file, making sure to deselect the driver in the installation options menu (we will install the driver separately from its signed run file).

cd /home/$USER/cuda_install
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
sudo sh cuda_11.8.0_520.61.05_linux.run
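One thing the run file won't do for you is set up your shell environment. Assuming the default install location of /usr/local/cuda-11.8, the usual post-install step is to put the toolkit's bin and lib64 directories on your paths, otherwise nvcc won't be found later:

echo 'export PATH=/usr/local/cuda-11.8/bin${PATH:+:${PATH}}' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' >> ~/.bashrc
source ~/.bashrc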

Install the device driver from its run file, passing in the keys you created earlier.

sudo sh ./NVIDIA-Linux-x86_64-520.61.05.run -s --module-signing-secret-key=/home/$USER/cuda_install/Nvidia.key --module-signing-public-key=/home/$USER/cuda_install/Nvidia.der
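If the install succeeds, the kernel module should report our certificate as its signer. As a sanity check, this should print the common name we baked into the certificate, "Graphics Drivers":

modinfo -F signer nvidia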

6. Validate the CUDA Install

Confirm the CUDA install using the deviceQuery sample from the CUDA samples provided by NVIDIA. The repository ships source only, so the sample has to be built first; checking out the release tag matching the toolkit keeps the sample compatible with CUDA 11.8 (note the build needs nvcc on your PATH, per the exports above).

git clone https://github.com/NVIDIA/cuda-samples.git
git -C cuda-samples checkout v11.8
make -C cuda-samples/Samples/1_Utilities/deviceQuery
./cuda-samples/bin/x86_64/linux/release/deviceQuery

Should return something similar to the following…

./cuda-samples/bin/x86_64/linux/release/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "NVIDIA RTX A2000 8GB Laptop GPU"
  CUDA Driver Version / Runtime Version          11.8 / 11.8
  CUDA Capability Major/Minor version number:    8.6
  Total amount of global memory:                 7985 MBytes (8372486144 bytes)
  (020) Multiprocessors, (128) CUDA Cores/MP:    2560 CUDA Cores
  GPU Max Clock rate:                            1177 MHz (1.18 GHz)
  Memory Clock rate:                             5501 Mhz
  Memory Bus Width:                              128-bit
  L2 Cache Size:                                 2097152 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total shared memory per multiprocessor:        102400 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1536
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Managed Memory:                Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.8, CUDA Runtime Version = 11.8, NumDevs = 1
Result = PASS

Confirm the CUDA compiler

nvcc -V

Should return something similar to the following…

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

Confirm the driver itself with nvidia-smi.

nvidia-smi

Should return something similar to the following…

Mon Jan 22 10:40:25 2024       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.61.05    Driver Version: 520.61.05    CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTX A200...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   44C    P3    N/A /  N/A |      9MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1075      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A      1841      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+

7. Installing PyTorch

Remove any existing PyTorch packages

python3 -m pip uninstall torch torchvision torchaudio

Install PyTorch for CUDA 11.8

python3 -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
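To double-check that pip pulled the CUDA 11.8 build of the wheel, torch.version.cuda should report 11.8:

python3 -c "import torch; print(torch.version.cuda)"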

Confirm the PyTorch install and its ability to use CUDA by creating and running a Python script with the following contents.

import torch

print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA using device: {torch.cuda.get_device_name(torch.cuda.current_device())}")

Should return something similar to the following…

CUDA available: True
CUDA using device: NVIDIA RTX A2000 8GB Laptop GPU
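As a final end-to-end smoke test, it's worth actually running a computation on the GPU rather than only querying availability. A minimal sketch:

import torch

# Allocate two matrices directly on the GPU and multiply them.
x = torch.rand(1024, 1024, device="cuda")
y = torch.rand(1024, 1024, device="cuda")
z = x @ y

# CUDA kernels launch asynchronously; synchronize before reading the result back.
torch.cuda.synchronize()
print(f"Result checksum: {z.sum().item():.2f}")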
