Nvidia GPU in Docker

September 29, 2023
November 3, 2017

Installing Docker and The Docker Utility Engine for NVIDIA GPUs — NVIDIA AI Enterprise documentation

NVIDIA Container Runtime | NVIDIA Developer
Enabling GPUs in the Container Runtime Ecosystem | NVIDIA Developer Blog

With the release of Docker 19.03, usage of nvidia-docker2 packages is deprecated since NVIDIA GPUs are now natively supported as devices in the Docker runtime.
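The switch described above can be sketched as a small helper that picks the GPU flag from the Docker version. This is a minimal sketch, not part of NVIDIA's tooling: `gpu_flag_for` is a hypothetical name, and `--gpus all` still assumes the NVIDIA container toolkit is installed on the host.

```shell
# gpu_flag_for VERSION: print the `docker run` flag that exposes GPUs.
# Docker 19.03+ has the native --gpus flag; older releases need the
# nvidia runtime registered by nvidia-docker2.
gpu_flag_for() {
  version="$1"              # e.g. "19.03.5" or "17.06.2-ce"
  major="${version%%.*}"    # text before the first dot
  rest="${version#*.}"
  minor="${rest%%.*}"       # text between the first and second dot
  minor="${minor#0}"        # "03" -> "3", so it is never read as octal
  minor="${minor:-0}"
  if [ "$major" -gt 19 ] || { [ "$major" -eq 19 ] && [ "$minor" -ge 3 ]; }; then
    echo "--gpus all"
  else
    echo "--runtime=nvidia"
  fi
}

# Usage (needs a running Docker daemon, so it is left commented out):
#   flag=$(gpu_flag_for "$(docker version --format '{{.Server.Version}}')")
#   docker run --rm $flag nvidia/cuda:8.0-runtime-ubuntu16.04 nvidia-smi
```

The rest of these notes target Docker 17.x, where the helper would select `--runtime=nvidia`.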

NVIDIA/nvidia-docker: Build and run Docker containers leveraging NVIDIA GPUs
Releases · NVIDIA/nvidia-docker
Motivation · NVIDIA/nvidia-docker Wiki
nvidia-docker 2.0 uses nvidia-container-runtime instead of runC.

I Heard You Like GPUs in Servers... GPU Passthrough on Linux and Docker | Techno Tim Documentation

Repository configuration | libnvidia-container
NVIDIA/nvidia-container-runtime: NVIDIA container runtime, a modified runC that invokes nvidia-container-cli from the libnvidia-container project when a container starts.
NVIDIA/libnvidia-container: NVIDIA container runtime library

NVIDIA GPU Operator: Simplifying GPU Management in Kubernetes | NVIDIA Developer Blog

Installation

Overview — NVIDIA Cloud Native Technologies documentation
Installation Guide — NVIDIA Cloud Native Technologies documentation

Installation (version 2.0) · NVIDIA/nvidia-docker Wiki
Migration from nvidia-docker 1.0 — NVIDIA Cloud Native Technologies documentation

Docker

sudo apt install libltdl7
sudo dpkg -i docker-ce_17.06.2-ce-0-ubuntu_amd64.deb
sudo systemctl enable --now docker
sudo usermod -a -G docker ${USER}
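The group change made by usermod only applies to new login sessions, so it can be worth checking before running docker without sudo. A minimal sketch; `in_group` is a hypothetical helper, not a Docker command.

```shell
# in_group USER GROUP: succeed if USER is a member of GROUP
# according to the system's group database.
in_group() {
  id -nG "$1" | tr ' ' '\n' | grep -qx -- "$2"
}

# Usage:
#   in_group "$USER" docker && echo "docker group is set"
#   # then log out and back in (or run `newgrp docker`) before trying:
#   docker run --rm hello-world
```

Passing the user name to `id` reads the group database, so the check succeeds right after usermod even though the current shell has not picked up the new group yet.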

NVIDIA driver

How do I install the NVIDIA driver? · NVIDIA/nvidia-docker Wiki
Installation Guide Linux :: CUDA Toolkit Documentation
Installing Nvidia CUDA 8.0 on Ubuntu 16.04 for Linux GPU Computing (New Troubleshooting Guide)

Install the NVIDIA driver (384.90+) and CUDA:

sudo apt-get install nvidia-384 nvidia-cuda-toolkit

# verify installation
lsmod | grep nvidia
ldconfig -p | grep -E 'nvidia|cuda'

nvidia-smi
nvidia-modprobe
nvcc --version
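
The 384.90+ driver requirement can be checked with a version comparison. A minimal sketch: `version_ge` is a hypothetical helper, and it relies on `sort -V` from GNU coreutils.

```shell
# version_ge ACTUAL REQUIRED: succeed if ACTUAL >= REQUIRED
# using version-aware ordering (GNU `sort -V`).
version_ge() {
  lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
  [ "$lowest" = "$2" ]
}

# Usage (needs the NVIDIA driver, so it is left commented out):
#   driver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
#   version_ge "$driver" 384.90 || echo "driver $driver is older than 384.90"
```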

Resolving "Driver/library version mismatch" | Comzyh's Blog
A reboot also fixes this mismatch.
Proprietary GPU Drivers : “Graphics Drivers” team
ImportError: /usr/local/cuda-8.0/lib64/libcudnn.so.5: file too short - xianglao1935's blog - CSDN Blog

nvidia-docker2

Note: installing this overwrites /etc/docker/daemon.json
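
Because the package replaces /etc/docker/daemon.json, keeping a copy of any existing file first is cheap insurance. A minimal sketch; `backup_file` and the `.bak` suffix are hypothetical choices.

```shell
# backup_file PATH: copy PATH to PATH.bak if PATH exists.
# Prints the backup path; does nothing when there is no file yet.
backup_file() {
  if [ -f "$1" ]; then
    cp -p -- "$1" "$1.bak" && echo "$1.bak"
  fi
}

# Usage (from a root shell, before installing nvidia-docker2):
#   backup_file /etc/docker/daemon.json
```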

Set the Docker daemon's default runtime with --default-runtime=nvidia.
Environment variables (OCI spec)

Use the nvidia-docker2 release that matches the installed Docker version:

curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
  sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
sudo apt-get install nvidia-docker2  # latest

# MUST match the Docker version installed
apt show nvidia-docker2 -a | grep Version
apt show nvidia-container-runtime -a | grep Version
# for Docker 17.06.2
sudo apt-get install \
  nvidia-docker2=2.0.3+docker17.06.2-1 \
  nvidia-container-runtime=2.0.0+docker17.06.2-1
# for Docker 17.09
sudo apt-get install \
  nvidia-docker2=2.0.3+docker17.09.0-1 \
  nvidia-container-runtime=2.0.0+docker17.09.0-1
sudo pkill -SIGHUP dockerd
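
The pinned package names above follow one pattern, so the install line can be derived from the Docker version. This is a sketch under the assumption that the 2.0.3 and 2.0.0 package versions shown above are the ones published for the target Docker release (check with apt show first); `pins_for_docker` is a hypothetical helper.

```shell
# pins_for_docker VERSION: print apt package pins for nvidia-docker2 and
# nvidia-container-runtime built against the given Docker version.
# The 2.0.3 / 2.0.0 prefixes are copied from these notes, not looked up.
pins_for_docker() {
  v="${1%-ce}"   # "17.06.2-ce" -> "17.06.2"
  printf 'nvidia-docker2=2.0.3+docker%s-1 nvidia-container-runtime=2.0.0+docker%s-1\n' "$v" "$v"
}

# Usage (left commented out because it installs packages):
#   sudo apt-get install $(pins_for_docker "$(docker version --format '{{.Server.Version}}')")
```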

Changing default runtime

Use /usr/bin/nvidia-container-runtime in /etc/docker/daemon.json:

{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}

# now use
docker run --rm nvidia/cuda:8.0-runtime-ubuntu16.04 nvidia-smi
# instead of
docker run --runtime=nvidia --rm nvidia/cuda:8.0-runtime-ubuntu16.04 nvidia-smi

# swarm services also work
docker swarm init
docker service create --name cuda-service --constraint node.labels.gpu==true nvidia/cuda:test-service
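
To confirm that the default runtime took effect, the configured value can be compared with what the daemon reports. A minimal sketch: `default_runtime_of` is a hypothetical helper that reads the key with sed, which assumes the key and its value sit on the same line (as in the daemon.json above) rather than handling arbitrary JSON.

```shell
# default_runtime_of FILE: print the "default-runtime" value from a
# daemon.json where the key and its value are on one line.
default_runtime_of() {
  sed -n 's/.*"default-runtime"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Usage (needs a running Docker daemon, so it is left commented out):
#   default_runtime_of /etc/docker/daemon.json      # configured value
#   docker info --format '{{.DefaultRuntime}}'      # value the daemon uses
#   # the swarm constraint above matches only nodes labelled gpu=true:
#   docker node update --label-add gpu=true "$(docker info --format '{{.Swarm.NodeID}}')"
```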

CUDA Base Image

nvidia/cuda - Docker Hub
8.0/runtime/Dockerfile · ubuntu16.04 · nvidia / cuda · GitLab
8.0/devel/Dockerfile · ubuntu16.04 · nvidia / cuda · GitLab
8.0/runtime/cudnn8/Dockerfile · ubuntu16.04 · nvidia / cuda · GitLab

Kubernetes

NVIDIA Container Runtime and Orchestrators | NVIDIA Developer
Kubernetes on NVIDIA GPUs Installation Guide :: Data Center Documentation

Schedule GPUs - Kubernetes
Using GPGPUs with Kubernetes | Ubuntu
Implementing GPU Type Scheduling on Kubernetes