I was trying to run some LLM Docker containers on my Arch Linux machine when I encountered the following error message.
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
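Any container started with --gpus all hits this error when the NVIDIA container runtime is not installed, for example the same nvidia-smi test I use later in this post:
docker run --gpus all nvidia/cuda:12.1.1-runtime-ubuntu22.04 nvidia-smi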
I looked around and found that the packages nvidia-container-toolkit and libnvidia-container are required to make this work; they are what expose NVIDIA GPUs to Docker containers.
I tried to install them with pacman and yay, but both reported that the packages could not be found. When I googled them, I found their package pages, which say they live in Arch's extra repository.
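For reference, the failing attempts looked roughly like this; pacman reports something along the lines of "target not found", and yay similarly finds nothing:
sudo pacman -S libnvidia-container nvidia-container-toolkit
yay -S nvidia-container-toolkit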
I went through things like changing the pacman config and the pacman mirrors, but none of that helped me find the packages I needed.
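If you want to rule out the same things, that mostly means checking that the extra repository is enabled in /etc/pacman.conf, something like:
[extra]
Include = /etc/pacman.d/mirrorlist
and then forcing a full database refresh:
sudo pacman -Syyu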
Finally, I found a post on Reddit that led me to try downloading the package sources, building them, and installing them myself.
git clone https://gitlab.archlinux.org/archlinux/packaging/packages/libnvidia-container
cd libnvidia-container
makepkg -si
git clone https://gitlab.archlinux.org/archlinux/packaging/packages/nvidia-container-toolkit
cd nvidia-container-toolkit
makepkg -si
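Note that makepkg needs the base-devel group (and git for the clone), so install those first if you don't already have them:
sudo pacman -S --needed base-devel git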
When installing nvidia-container-toolkit, it might fail with the following output.
time="2024-05-11T14:30:59+08:00" level=info msg="Installed '/tmp/output-2742741441/input.real'"
time="2024-05-11T14:30:59+08:00" level=info msg="Installed wrapper '/tmp/output-2742741441/input'"
--- PASS: TestInstallExecutable (0.00s)
=== RUN TestNvidiaContainerRuntimeInstallerWrapper
--- PASS: TestNvidiaContainerRuntimeInstallerWrapper (0.00s)
PASS
ok github.com/NVIDIA/nvidia-container-toolkit/tools/container/toolkit (cached)
FAIL
==> ERROR: A failure occurred in check().
Aborting...
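The failure comes from the package's check() stage (the upstream test suite), not from building the toolkit itself. If you don't want to touch the PKGBUILD, makepkg's --nocheck flag skips check() and should work just as well:
makepkg -si --nocheck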
In my case I just opened PKGBUILD in vi and removed the check() function. Then it installed fine, and Docker was able to use the GPU. The following is how I tested whether the GPU is detected inside a Docker container.
Remember to restart the docker service after installing the above packages.
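On systemd that is:
sudo systemctl restart docker
If Docker still can't see the GPU after restarting, you may also need to register the NVIDIA runtime with Docker via the toolkit's nvidia-ctk helper, then restart once more:
sudo nvidia-ctk runtime configure --runtime=docker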
docker run --gpus all nvidia/cuda:12.1.1-runtime-ubuntu22.04 nvidia-smi
==========
== CUDA ==
==========
CUDA Version 12.1.1
Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
Sat May 11 06:45:03 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.67                 Driver Version: 550.67         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4080 ...    Off |   00000000:01:00.0  On |                  N/A |
|  0%   44C    P8             21W /  320W |     622MiB /  16376MiB |     19%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                               |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
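With that output showing the card, GPU-enabled containers work as expected. As one example of actually using it for an LLM workload (Ollama here is just an example image, adjust to whatever you run):
docker run -d --gpus all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama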