Install TensorFlow on Docker Running on Creodias WAW3-1 vGPU Virtual Machine
TensorFlow is one of the most popular libraries for Machine Learning. Running TensorFlow on vGPU-enabled VMs can significantly speed up machine learning workflows. In this article we demonstrate how to install TensorFlow on the WAW3-1 cloud with vGPU support enabled.
This article describes how to install TensorFlow using Docker. If you prefer not to use Docker, follow the instructions in the article Install TensorFlow on WAW3-1 vGPU enabled VM on Creodias.
These instructions are based on the following sources:
A virtual machine with an NVIDIA GPU, created on the Creodias cloud. This machine must have a floating IP address, and you must be able to connect to it using an SSH key stored on your PC (in this article we assume that your local computer runs Ubuntu 20.04). The following article describes how to create such a machine: How To Create a New Linux VM With NVIDIA Virtual GPU in the OpenStack Dashboard Horizon on Creodias. If you did not add a floating IP during that process, this article describes how to do it: How to Add or Remove Floating IP’s to your VM on Creodias.
These instructions were tested on an Ubuntu 20.04 virtual machine with the default configuration for Creodias hosting. In particular, this means that the CLI commands are run from the eouser account.
Step 1: Initial operations
Connect to your virtual machine using SSH by invoking the following command (replace 18.104.22.168 with the floating IP address of your virtual machine).
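The connection command can be sketched as follows. This is a minimal sketch, assuming the default eouser account mentioned above and an SSH key in a default location; the variable names are illustrative, and the address is only the placeholder used in this article.

```shell
# The article's placeholder address; replace it with your VM's floating IP.
FLOATING_IP="18.104.22.168"
# eouser is the default account on Creodias images, per this article.
SSH_USER="eouser"

# Assemble the command so it can be reviewed before it is run.
SSH_CMD="ssh ${SSH_USER}@${FLOATING_IP}"
echo "Connect with: ${SSH_CMD}"
```

If your private key is not in a default location, the standard -i option of ssh lets you point to it explicitly.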
Update all the software on your virtual machine:
sudo apt update && sudo apt upgrade
Verify that the NVIDIA graphics card works:
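One common way to perform this check is the nvidia-smi utility, which ships with the NVIDIA driver. The sketch below assumes the driver is already present on the image; the GPU_CHECK variable is only illustrative.

```shell
# Report whether nvidia-smi is available and whether it can reach the GPU.
if ! command -v nvidia-smi >/dev/null 2>&1; then
    GPU_CHECK="missing"     # driver tools are not installed
elif nvidia-smi; then
    GPU_CHECK="ok"          # driver loaded and GPU visible
else
    GPU_CHECK="failed"      # tool present, but it could not talk to the GPU
fi
echo "GPU check: ${GPU_CHECK}"
```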
The result of your command should look like this:
Step 2: Install Docker
Install Docker using the official script and enable its service:
curl https://get.docker.com | sh && sudo systemctl --now enable docker
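Before continuing, you may want to confirm that Docker is installed and that its service is running. The following is a minimal sketch, assuming a systemd-based system such as Ubuntu 20.04; the DOCKER_STATE variable is only illustrative.

```shell
# Print the Docker version and the state of the docker service.
if command -v docker >/dev/null 2>&1; then
    docker --version || true
    # is-active prints "active" when the service is running.
    DOCKER_STATE="$(systemctl is-active docker 2>/dev/null || true)"
else
    DOCKER_STATE="not-installed"
fi
echo "Docker service state: ${DOCKER_STATE:-unknown}"
```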
Step 3: Install and verify the NVIDIA Container Toolkit (nvidia-docker2)
The NVIDIA Container Toolkit is a tool for building and running GPU-accelerated Docker containers. More information about it is available at the following link: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/k8s/containers/container-toolkit
We need it because we will be running a GPU-accelerated workflow in a container.
Add the appropriate repository and GPG key:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
    && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
    && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
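The distribution variable in the command above selects which repository list is downloaded. The sketch below only prints the value and the resulting URL, without changing anything on the system; on Ubuntu 20.04 the value should be ubuntu20.04. The LIST_URL variable is introduced here for illustration.

```shell
# Derive the distribution string the same way the repository command does.
if [ -r /etc/os-release ]; then
    distribution=$(. /etc/os-release; echo "${ID:-}${VERSION_ID:-}")
else
    distribution="unknown"   # not a system that provides /etc/os-release
fi

# The repository list that would be fetched for this distribution.
LIST_URL="https://nvidia.github.io/libnvidia-container/${distribution}/libnvidia-container.list"
echo "Detected distribution: ${distribution}"
echo "Repository list URL:   ${LIST_URL}"
```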
Now install the package nvidia-docker2:
sudo apt update && sudo apt install -y nvidia-docker2
sudo systemctl restart docker
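After the restart, you can check whether the NVIDIA runtime is registered with Docker. This sketch assumes that the nvidia-docker2 package registers the runtime in /etc/docker/daemon.json, which is its usual behavior; if your setup stores the configuration elsewhere, adjust the path. The RUNTIME_STATUS variable is only illustrative.

```shell
# Look for an "nvidia" runtime entry in the Docker daemon configuration.
DAEMON_JSON="/etc/docker/daemon.json"
if [ -r "$DAEMON_JSON" ] && grep -q '"nvidia"' "$DAEMON_JSON"; then
    RUNTIME_STATUS="registered"
else
    RUNTIME_STATUS="not-found"
fi
echo "NVIDIA runtime: ${RUNTIME_STATUS}"
```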
Verify that the NVIDIA Container Toolkit works:
sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
You should see the output of the nvidia-smi command (this time, however, it is running from inside the container):
Step 4: Install TensorFlow with vGPU support
Pull the TensorFlow image:
sudo docker pull tensorflow/tensorflow:latest-gpu-jupyter
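Run a test in a TensorFlow GPU container. The command below uses the latest-gpu image, which Docker pulls automatically if it is not yet available locally: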
sudo docker run --gpus all -it --rm tensorflow/tensorflow:latest-gpu python -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
After the last command, the result of a sample TensorFlow operation on random data is shown (you can ignore the warnings).
Your output should include information similar to this:
What To Do Next
Now that you have successfully installed TensorFlow on a Creodias WAW3-1 virtual machine with an enabled vGPU, you can try to use it for practical purposes. One of the ways to test it is described in the article Sample Deep Learning Workflow Using TensorFlow Running on Docker on Creodias WAW3-1 vGPU Virtual Machine. There you will see how quick a deep learning operation can be when a vGPU is present.
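Before moving on to a larger workload, it can be useful to confirm that TensorFlow actually sees the vGPU. The sketch below lists the GPUs visible to TensorFlow inside the container, using tf.config.list_physical_devices. RUN_GPU_CHECK is a hypothetical opt-in variable introduced here so that the container starts only when you ask for it; otherwise the command is just printed.

```shell
# Image used for the test in Step 4.
TF_IMAGE="tensorflow/tensorflow:latest-gpu"
# Python one-liner that lists the GPUs TensorFlow can see.
TF_CHECK='import tensorflow as tf; print(tf.config.list_physical_devices("GPU"))'

# RUN_GPU_CHECK is a hypothetical switch: set it to 1 to run the container.
if command -v docker >/dev/null 2>&1 && [ "${RUN_GPU_CHECK:-0}" = "1" ]; then
    sudo docker run --gpus all --rm "$TF_IMAGE" python -c "$TF_CHECK"
else
    echo "Dry run: sudo docker run --gpus all --rm ${TF_IMAGE} python -c '${TF_CHECK}'"
fi
```

To run the check for real, set RUN_GPU_CHECK=1 in your shell before executing the block. An empty list in the output would suggest that the container cannot see the GPU.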