GeForce GTX 1070 - Ubuntu 22.04 setup for TensorFlow GPU model training

GeForce GTX 1070 - Ubuntu 22.04 setup

This repository contains the needed instructions to install NVIDIA drivers and libraries:

  • CUDA
  • cuDNN

With these in place, a machine learning model can be developed and trained with TensorFlow on the GPU.

Prerequisites

Setup & configuration

NVIDIA setup

Verify that the NVIDIA drivers are installed:

$ nvidia-smi

Output example:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.41.03              Driver Version: 530.41.03    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 1070         Off| 00000000:0B:00.0  On |                  N/A |
| 11%   49C    P0               34W / 180W|   1080MiB /  8192MiB |      4%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      2123      G   /usr/lib/xorg/Xorg                          647MiB |
|    0   N/A  N/A      2294      G   /usr/bin/gnome-shell                        116MiB |
|    0   N/A  N/A     21442      G   ...sion,SpareRendererForSitePerProcess      160MiB |
|    0   N/A  N/A    124925      G   ...8698312,18086028496798649942,262144      148MiB |
|    0   N/A  N/A    157723      G   gnome-control-center                          2MiB |
+---------------------------------------------------------------------------------------+
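The two values that matter for the following steps are the driver version and the CUDA version in the banner line. As a minimal sketch, they can be pulled out of the nvidia-smi output with a regular expression. The helper name parse_banner is hypothetical and not part of this repository.

```python
import re
import shutil
import subprocess

# Matches the banner line of nvidia-smi, e.g.
# "| NVIDIA-SMI 530.41.03   Driver Version: 530.41.03   CUDA Version: 12.1 |"
BANNER = re.compile(r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)")


def parse_banner(text):
    """Return (driver_version, cuda_version) from nvidia-smi output, or None."""
    match = BANNER.search(text)
    return (match.group(1), match.group(2)) if match else None


if __name__ == "__main__":
    if shutil.which("nvidia-smi") is None:
        print("nvidia-smi not found: the NVIDIA driver is probably not installed")
    else:
        result = subprocess.run(["nvidia-smi"], capture_output=True, text=True)
        print(parse_banner(result.stdout))
```

For the example output above this would give the pair 530.41.03 and 12.1.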

Check for driver updates

The example output above shows the versions installed at the time of writing:

  • Driver version: 530.41.03

https://www.nvidia.com/download/driverResults.aspx/200481/en-us/

  • CUDA Version: 12.1

Look up latest drivers

https://www.nvidia.com/download/driverResults.aspx/204837/en-us/

At the time of writing, the driver listed there is:

Version: 525.116.04
Release Date: 2023.5.9

The installed driver has a higher version number but an earlier release date:

Version: 530.41.03
Release Date: 2023.3.23

At the time of writing I chose not to switch drivers and continued with the next steps.
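Because version numbers and release dates can disagree, it may help to compare both. The sketch below uses only the values quoted above; the helper name version_tuple is hypothetical.

```python
from datetime import date


def version_tuple(version):
    """Turn '530.41.03' into (530, 41, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# Values copied from the text above.
installed = {"version": "530.41.03", "released": date(2023, 3, 23)}
listed = {"version": "525.116.04", "released": date(2023, 5, 9)}

newer_number = version_tuple(installed["version"]) > version_tuple(listed["version"])
newer_release = installed["released"] > listed["released"]

print("installed has higher version number:", newer_number)   # True
print("installed was released later:", newer_release)         # False
```

Comparing the raw strings instead of integer tuples would be unreliable, since "525.116.04" and "525.60.13" sort differently as text than as numbers.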

Compute Unified Device Architecture (CUDA) setup

CUDA is a parallel computing platform and application programming interface (API) that allows software to use certain types of graphics processing units (GPUs) for general purpose processing. CUDA is a software layer that gives direct access to the GPU's virtual instruction set and parallel computational elements, for the execution of compute kernels.

Install the CUDA toolkit and select a version that matches the CUDA version reported by nvidia-smi in the previous step. That value is the highest CUDA version the installed driver supports, so the toolkit version must not exceed it. The commands below install CUDA 12.0.0:

https://developer.nvidia.com/cuda-12-0-0-download-archive?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=deb_local

$ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin && \
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600 && \
wget https://developer.download.nvidia.com/compute/cuda/12.0.0/local_installers/cuda-repo-ubuntu2204-12-0-local_12.0.0-525.60.13-1_amd64.deb && \
sudo dpkg -i cuda-repo-ubuntu2204-12-0-local_12.0.0-525.60.13-1_amd64.deb && \
sudo cp /var/cuda-repo-ubuntu2204-12-0-local/cuda-*-keyring.gpg /usr/share/keyrings/ && \
sudo apt-get update && sudo apt-get -y install cuda

Verify that CUDA is installed

$ nvcc --version

Output example:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:30_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0

Take note of the release version; we will need it for the cuDNN installation in the next steps. At the time of writing this is:

11.5.119
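As a rough sketch, the release can be read from the nvcc output programmatically and compared with the CUDA version the driver reports. The function names are hypothetical, and the "toolkit must not exceed the driver's CUDA version" rule is a simplification of NVIDIA's compatibility rules.

```python
import re

# Matches e.g. "Cuda compilation tools, release 11.5, V11.5.119"
RELEASE = re.compile(r"release (\d+)\.(\d+), V([\d.]+)")


def parse_nvcc(text):
    """Return ((major, minor), full_version) from `nvcc --version` output, or None."""
    match = RELEASE.search(text)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2))), match.group(3)


def toolkit_supported(toolkit, driver_max):
    """Simplified rule: the toolkit release must not exceed the driver's CUDA version."""
    return toolkit <= driver_max


sample = "Cuda compilation tools, release 11.5, V11.5.119"
release, full = parse_nvcc(sample)
print(release, full)                         # (11, 5) 11.5.119
print(toolkit_supported(release, (12, 1)))   # True
```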

Compile and run the CUDA C code example

This repo contains a CUDA script (source) to test and verify that the graphics card drivers and CUDA are properly installed:

Use the CUDA C compiler to compile the hello.cu example:

$ nvcc -o cuda/hello cuda/hello.cu

It should compile without any errors or warnings.

Now we can run the hello world code example:

$ ./cuda/hello

Expected output:

Max error: 0.000000

If the output matches the expected output, the CUDA libraries are working properly and we are good to go.
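The source of hello.cu is not reproduced here, so the following is only an assumption about the kind of check that prints "Max error": add two arrays element-wise and report the largest deviation from the known result. It is a CPU-only Python stand-in, not the repository's CUDA code.

```python
# Hypothetical stand-in for the check in hello.cu, whose source is not shown here.
n = 1 << 16  # number of elements (assumed size)
x = [1.0] * n
y = [2.0] * n

# A GPU kernel would perform this element-wise addition in parallel.
y = [xi + yi for xi, yi in zip(x, y)]

# Every element should now equal 3.0; report the largest deviation.
max_error = max(abs(value - 3.0) for value in y)
print(f"Max error: {max_error:f}")
```

A non-zero value from a check like this would indicate that the computation produced wrong results, which on a GPU usually points at a driver or toolkit problem.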

CUDA Deep Neural Network (cuDNN)

NVIDIA cuDNN is a GPU-accelerated library of primitives for deep neural networks

Download cuDNN library

To download the cuDNN library you need to register for a free account with the NVIDIA Developer programme. Choose the cuDNN version that is compatible with the CUDA version installed in the previous step.

Once logged in, follow this link to download cuDNN

Select the correct library version (one that matches the CUDA version installed in the previous step):

Here we download the deb file:

cuDNN deb download

The tar file can be used as well, so download either the deb or the Local Installer for Linux x86_64 (Tar).

Install cuDNN Library

DEB

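Install the downloaded deb file, which registers a local apt repository containing the cuDNN packages:

$ sudo dpkg -i cudnn-local-repo-ubuntu2204-8.9.2.26_1.0-1_amd64.deb

The cuDNN packages themselves are then installed from that repository with apt; see the NVIDIA cuDNN installation guide for the exact package names.

Tarball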

If you downloaded the tar file instead of the deb file, unpack the tarball:

$ tar xvf cudnn-linux-x86_64-8.9.2.26_cuda11-archive.tar.xz

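Once the tarball has been unpacked, copy the headers and libraries from the extracted cudnn-linux-x86_64-8.9.2.26_cuda11-archive directory:

$ sudo cp cudnn-linux-x86_64-8.9.2.26_cuda11-archive/include/cudnn*.h /usr/lib/cuda/include/ && \
sudo cp -P cudnn-linux-x86_64-8.9.2.26_cuda11-archive/lib/libcudnn* /usr/lib/cuda/lib64/ && \
sudo chmod a+r /usr/lib/cuda/include/cudnn*.h /usr/lib/cuda/lib64/libcudnn*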
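To confirm which cuDNN version ended up in the include directory, the version macros can be read from the header. This is a sketch that assumes cudnn_version.h was copied alongside cudnn.h (cuDNN 8 keeps CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL in that file); the function name cudnn_version is hypothetical.

```python
import re
from pathlib import Path

# Matches lines such as "#define CUDNN_MAJOR 8"
DEFINE = re.compile(r"^\s*#define\s+CUDNN_(MAJOR|MINOR|PATCHLEVEL)\s+(\d+)", re.MULTILINE)


def cudnn_version(header_text):
    """Return 'major.minor.patch' parsed from cudnn_version.h contents, or None."""
    found = {name: value for name, value in DEFINE.findall(header_text)}
    if set(found) != {"MAJOR", "MINOR", "PATCHLEVEL"}:
        return None
    return "{MAJOR}.{MINOR}.{PATCHLEVEL}".format(**found)


if __name__ == "__main__":
    # Destination directory used by the copy commands above.
    header = Path("/usr/lib/cuda/include/cudnn_version.h")
    if header.is_file():
        print("cuDNN", cudnn_version(header.read_text()))
    else:
        print(header, "not found")
```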

Post-installation configurations

We need to set up the PATH and LD_LIBRARY_PATH variables before the CUDA Toolkit and driver can be used. See also the Post-installation actions guide for more information.

For bash:

$ echo 'export LD_LIBRARY_PATH=/usr/lib/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc && \
echo 'export LD_LIBRARY_PATH=/usr/lib/cuda/include:$LD_LIBRARY_PATH' >> ~/.bashrc && \
source ~/.bashrc

For zsh:

$ echo 'export LD_LIBRARY_PATH=/usr/lib/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.zshrc && \
echo 'export LD_LIBRARY_PATH=/usr/lib/cuda/include:$LD_LIBRARY_PATH' >> ~/.zshrc && \
source ~/.zshrc
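After opening a new shell, a quick way to confirm the variable took effect is to check that the library directory is one of its entries. The helper below is a hypothetical sketch; it only inspects the environment and does not prove that the libraries can be loaded.

```python
import os


def on_library_path(directory, value=None):
    """Check whether `directory` is an entry of LD_LIBRARY_PATH (or of `value`)."""
    if value is None:
        value = os.environ.get("LD_LIBRARY_PATH", "")
    # Normalize entries so that a trailing slash does not cause a mismatch.
    entries = [os.path.normpath(entry) for entry in value.split(":") if entry]
    return os.path.normpath(directory) in entries


if __name__ == "__main__":
    print(on_library_path("/usr/lib/cuda/lib64"))
```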

Create and activate virtual environment

Create and activate a virtual environment from the root of the project.

Venv

Create a virtual environment to install the needed packages:

$ python -m venv .venv && source .venv/bin/activate
$ pip install --upgrade pip

Install required packages

$ pip install -r requirements.txt
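If it is unclear whether the packages went into the virtual environment or into the system Python, the interpreter itself can tell. A small sketch using only the standard library; the function name in_virtualenv is hypothetical.

```python
import sys


def in_virtualenv():
    """True when the interpreter runs inside an environment created by venv."""
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

With the environment activated, the interpreter path is expected to point into the .venv directory.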

Run GPU detection verification

Notebook verification

Start Jupyter notebook:

$ jupyter notebook

Navigate to the notebooks directory, open hello_world_tensorflow_gpu.ipynb and run all cells.

TensorFlow should now find the GPU. TensorFlow refers to devices by names such as:

  • /device:CPU:0: The CPU of your machine.
  • /GPU:0: Short-hand notation for the first GPU of your machine that is visible to TensorFlow.

import tensorflow as tf
tf.config.list_physical_devices('GPU')

Expected output:

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
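Listing the device only shows that TensorFlow can see the GPU. A slightly stronger check is to place a small computation on it. The sketch below is not the content of the repository's notebook; the helper names summarize and main are hypothetical, and it assumes TensorFlow was installed from requirements.txt.

```python
def summarize(devices):
    """Map each device to a short 'TYPE:index' label, e.g. 'GPU:0'."""
    return [device.name.replace("/physical_device:", "") for device in devices]


def main():
    try:
        import tensorflow as tf  # assumed to be installed from requirements.txt
    except ImportError:
        print("TensorFlow is not installed in this environment")
        return
    gpus = tf.config.list_physical_devices("GPU")
    print("GPUs:", summarize(gpus))
    if gpus:
        # Place a small matrix multiplication on the first GPU explicitly.
        with tf.device("/GPU:0"):
            a = tf.random.normal((256, 256))
            product = tf.matmul(a, a)
        print("result computed on:", product.device)


if __name__ == "__main__":
    main()
```

If the list of GPUs is empty, the usual suspects are a missing cuDNN library or an LD_LIBRARY_PATH that does not include the CUDA library directory.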

Python module verification

From the root of the project run:

$ python src/main.py

Expected output:

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]