How to containerize GPU-accelerated applications with Docker-Nvidia

The role of GPU accelerators in high-performing applications and a step-by-step guide to Docker-Nvidia set-up

DevOps. Updated on December 2, 2021

In this instalment of our DevOps consulting series, we look at how to build and run Docker containers on high-powered NVIDIA GPUs and provide a step-by-step tutorial. GPU-accelerated computing is the use of a graphics processing unit to accelerate deep learning, analytics, and engineering applications. First introduced by NVIDIA in 2007, GPU accelerators today power energy-efficient data centres worldwide and play a key role in accelerating applications.

Containerization of GPU applications with Docker-Nvidia

Containerizing GPU applications brings a number of benefits, including ease of deployment, streamlined collaboration and isolation of individual devices. However, Docker® containers are most commonly used to deploy CPU-based applications across several machines, where containers are both hardware- and platform-agnostic. The Docker engine doesn’t natively support NVIDIA GPUs, because they are specialized hardware that requires the NVIDIA driver to be installed.

This is our experience of using a graphics processing unit to build and run Docker containers and a step-by-step description of how this was achieved.

Step-by-step Nvidia-Docker set-up to unleash GPU-accelerated computing

To start, we’re going to need a server with an NVIDIA GPU. Hetzner offers a server with a GeForce® GTX 1080.

Requirements:

OS: CentOS 7.3
Docker: version 19.03.15
NVIDIA drivers: latest

Let’s download the necessary driver for this graphics card.

After downloading, run the installer and follow all of its steps:

./NVIDIA-Linux-x86_64-<major_version>.<minor_version>.run
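
Before moving on, it can help to confirm that the driver is actually loaded. The following is a minimal sketch, not part of the original set-up: it assumes the NVIDIA kernel module exposes its version in /proc/driver/nvidia/version in the usual "Kernel Module  <version>" form, and the function names are hypothetical.

```python
import re
from pathlib import Path

# File exposed by the NVIDIA kernel module once the driver is loaded
# (assumption: the classic "... Kernel Module  375.20  ..." layout).
VERSION_FILE = Path("/proc/driver/nvidia/version")


def parse_driver_version(text):
    """Return the driver version found in the version file text, or None."""
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", text)
    return match.group(1) if match else None


def installed_driver_version(path=VERSION_FILE):
    """Read the version file; None means the driver does not appear to be loaded."""
    try:
        return parse_driver_version(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    version = installed_driver_version()
    print(version or "NVIDIA driver not loaded")
```

The version string matters later on, because the driver volume that nvidia-docker creates carries it in its name (nvidia_driver_375.20 in this article).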

How Nvidia and Docker work together

We will need to install nvidia-docker and the nvidia-docker-plugin. You can learn more about how to do that on NVIDIA’s GitHub.

wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker-1.0.1-1.x86_64.rpm
sudo rpm -i /tmp/nvidia-docker*.rpm && rm /tmp/nvidia-docker*.rpm

Launching the service:

sudo systemctl start nvidia-docker

Testing:

nvidia-docker run --rm nvidia/cuda nvidia-smi

You should get the following result:

Thu Jul 27 13:44:07 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.20                 Driver Version: 375.20                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 0000:01:00.0     Off |                  N/A |
| 33%   36C    P8    11W / 180W |      0MiB /  8145MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Docker container with GPU support in orchestrator

Docker Swarm is not an option here, because a version 3 Compose file deployed to a swarm ignores the devices option, so the GPU device files cannot be passed into the container.

From the official website:

We can now use the resources of the graphics card. However, if we need orchestration tools, nvidia-docker cannot be used to start the containers, since it is an add-on on top of Docker rather than part of it.

We’ve just launched a container in the Rancher cluster.

What exactly is Nvidia-Docker?

Now let’s dive into the details of what nvidia-docker actually is. Basically, it is a service that creates a Docker volume and mounts the devices into a container.

To find out what was created and mounted, we need to run the following command:

curl -s http://localhost:3476/docker/cli

Here’s the result:

volume-driver=nvidia-docker
volume=nvidia_driver_375.20:/usr/local/nvidia:ro
device=/dev/nvidiactl
device=/dev/nvidia-uvm
device=/dev/nvidia-uvm-tools
device=/dev/nvidia0
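
To make the mapping concrete, here is a minimal sketch that groups the listing above by option name and renders it as flags for a plain docker run. It treats the result as one key=value pair per line, exactly as printed above, which is an assumption about the layout; the function names are hypothetical and are not part of nvidia-docker.

```python
def parse_cli_listing(text):
    """Group the key=value lines from the listing into a dict of lists."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        options.setdefault(key, []).append(value)
    return options


def to_docker_args(options):
    """Render the grouped options as docker run flags, keeping their order."""
    return [f"--{key}={value}" for key, values in options.items() for value in values]


# The listing exactly as shown in this article.
LISTING = """\
volume-driver=nvidia-docker
volume=nvidia_driver_375.20:/usr/local/nvidia:ro
device=/dev/nvidiactl
device=/dev/nvidia-uvm
device=/dev/nvidia-uvm-tools
device=/dev/nvidia0
"""

if __name__ == "__main__":
    print(" ".join(to_docker_args(parse_cli_listing(LISTING))))
```

The grouped form shows the two things a GPU container needs: one read-only driver volume and a handful of device files. Those same values can be handed to any tool that accepts volumes and devices.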

For mathematical calculations, we use a Python library: tensorflow-gpu (TensorFlow).

Let’s write a Dockerfile whose base image is taken from the nvidia/cuda repository on Docker Hub:

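FROM nvidia/cuda:8.0-cudnn5-runtime-centos7

# Assumes pip is available in the image; install it first if it is not
RUN pip install tensorflow-gpu

# Copy the script that ENTRYPOINT runs (assumed to sit next to the Dockerfile)
COPY math.py .

ENTRYPOINT ["python", "math.py"]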

Then write a docker-compose.yml file to build and run the compute container:

version: '2'
services:
  math:
    build: .
    volumes:
      - nvidia_driver_375.20:/usr/local/nvidia:ro
    devices:
      - /dev/nvidiactl
      - /dev/nvidia-uvm
      - /dev/nvidia-uvm-tools
      - /dev/nvidia0

volumes:
  nvidia_driver_375.20:
    # Volume provided by the nvidia-docker driver; it must already exist.
    # Compose rejects other keys (such as driver) on an external volume.
    external: true

Launching the Docker container:

docker-compose up -d

If everything is done correctly, then when you run the command:

nvidia-docker run --rm nvidia/cuda nvidia-smi

you get the following result:

Thu Jul 27 15:12:40 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.20                 Driver Version: 375.20                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 0000:01:00.0     Off |                  N/A |
| 39%   53C    P2    86W / 180W |   7813MiB /  8145MiB |     56%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0     27798    C   python                                        7803MiB |
+-----------------------------------------------------------------------------+

In the process list, you can see that the python process holds 7803MiB of GPU memory, and the table above it shows GPU utilization at 56%.
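
To read the same figures from a script instead of from the table, a minimal sketch could call nvidia-smi in its CSV query mode and parse the result. It assumes nvidia-smi is on the PATH and that every queried field is numeric (a field the card does not report would make the int conversion fail); the function names are hypothetical.

```python
import subprocess

# Fields requested from nvidia-smi, in the order they are parsed below.
QUERY = "utilization.gpu,memory.used,memory.total"


def parse_gpu_stats(csv_text):
    """Parse 'util, used, total' rows (one per GPU) into dicts of ints."""
    stats = []
    for row in csv_text.strip().splitlines():
        util, used, total = (int(field) for field in row.split(","))
        stats.append({"util_percent": util, "used_mib": used, "total_mib": total})
    return stats


def query_gpu_stats():
    """Run nvidia-smi on a host or in a container where it is available."""
    output = subprocess.check_output(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        text=True,
    )
    return parse_gpu_stats(output)
```

For the run shown above, parse_gpu_stats("56, 7813, 8145") yields one entry with 56 percent utilization and 7813 of 8145 MiB in use.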

We’ve just taught Docker, the leading container platform, to work with GeForce graphics cards, and it can now be used to containerize GPU-accelerated applications. This means you can easily containerize and isolate an accelerated application without any modifications and deploy it on any supported GPU-enabled infrastructure.
