Setup

Set up a Jupyter kernel in ManeFrame M2’s JupyterLab

On our HPC systems, M2 and SuperPOD, TensorFlow and PyTorch must be installed with the CUDA and cuDNN libraries enabled in order to use GPU support.

The following instructions walk through the installation step by step using the CLI (Command Line Interface). You can use any terminal on your Windows/macOS/Linux system.

Installing Tensorflow-GPU on ManeFrame III

Note that some earlier versions of TensorFlow (2.2, 2.4) do not work well with our preexisting CUDA library. Therefore I encourage everyone to use version 2.9.1 (the most current at the time of writing, Sep 2022) to avoid missing-library errors.

Here we use one node with a P100 GPU for the installation.

From the login node, request a compute node with a GPU:

$ srun -N1 -G1 -c10 -p gpu-dev --pty $SHELL

Load the necessary libraries:

$ module load spack conda
$ module load cuda/gcc-11.2.0/cuda/11.8.0-vnha6cm cuda-11.8.0-gcc-11.2.0/cudnn/8.7.0.84-11.8-aydlfs6

Create a Conda environment in your home directory and install TensorFlow into it:

$ conda create --prefix ~/tensorflow_2.9 python=3.8 pip
$ conda activate ~/tensorflow_2.9/  
$ pip install tensorflow==2.9.1 --no-cache-dir
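
Before going further, it can help to confirm that pip placed the expected version in the active environment. The following is a minimal sketch using only the Python standard library; the helper name `installed_version` is hypothetical and chosen for illustration.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Run this with the ~/tensorflow_2.9 environment active; it should print
    # the version pinned in the pip command, or None if the install failed.
    print("tensorflow:", installed_version("tensorflow"))
```

If the script prints `None`, the most likely cause is that the environment was not activated before running pip.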

Create a Jupyter kernel for use in HPC Open OnDemand:

$ pip install ipykernel
$ python3 -m ipykernel install --user --name tensorflow_2.9 --display-name TensorflowGPU29
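
To check that the kernel was registered, one option is to read the `kernel.json` file that `ipykernel install --user` writes. The sketch below assumes the default per-user Jupyter data directory on Linux (`~/.local/share/jupyter/kernels`); if `JUPYTER_DATA_DIR` is set on your system, the location will differ. The helper name `read_kernelspec` is hypothetical.

```python
import json
from pathlib import Path


def read_kernelspec(name, kernels_dir=None):
    """Load kernel.json for the kernel `name`, or return None if it is missing."""
    if kernels_dir is None:
        # Assumption: default per-user Jupyter data directory on Linux.
        kernels_dir = Path.home() / ".local" / "share" / "jupyter" / "kernels"
    spec_file = Path(kernels_dir) / name / "kernel.json"
    if not spec_file.is_file():
        return None
    with spec_file.open() as fh:
        return json.load(fh)


if __name__ == "__main__":
    spec = read_kernelspec("tensorflow_2.9")
    if spec is None:
        print("kernel 'tensorflow_2.9' not found in the default location")
    else:
        # display_name is what appears in the JupyterLab launcher;
        # argv[0] is the Python interpreter the kernel will start.
        print(spec["display_name"])
        print(spec["argv"][0])
```

The interpreter path printed last should point inside `~/tensorflow_2.9`; if it points elsewhere, the kernel was probably installed from a different environment.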

Once the installation is done, check whether TensorFlow can find a GPU:

$ python
>>> import tensorflow as tf
>>> tf.config.list_physical_devices('GPU')
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
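
The same check can be run non-interactively, for example from a batch job. The following is a sketch only; the function name `list_gpus` is hypothetical, and the script deliberately returns `None` instead of failing when TensorFlow is not importable.

```python
import importlib.util


def list_gpus():
    """Return the names of GPUs visible to TensorFlow, or None if TensorFlow is not installed."""
    if importlib.util.find_spec("tensorflow") is None:
        return None
    import tensorflow as tf

    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = list_gpus()
    if gpus is None:
        print("TensorFlow is not installed in this environment")
    elif not gpus:
        print("TensorFlow found no GPU; check that the cuda and cudnn modules are loaded")
    else:
        print("GPUs:", gpus)
```

An empty list on a GPU node usually means the CUDA or cuDNN libraries could not be found, not that the GPU is absent.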

Note: to enable GPU runs for TensorFlow in a Jupyter Notebook, add the following lines to "Custom environment settings" when requesting a node in Open OnDemand:

module load spack conda
module load cuda/gcc-11.2.0/cuda/11.8.0-vnha6cm cuda-11.8.0-gcc-11.2.0/cudnn/8.7.0.84-11.8-aydlfs6
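
From inside a notebook, one way to see whether these settings took effect is to look for CUDA-related directories on the library search path. This sketch assumes that the cuda and cudnn modules add their library directories to `LD_LIBRARY_PATH`, which is common but not confirmed for this system; the helper name `cuda_library_dirs` is hypothetical.

```python
import os


def cuda_library_dirs(ld_library_path):
    """Return entries of an LD_LIBRARY_PATH-style string that mention cuda or cudnn."""
    entries = [p for p in ld_library_path.split(os.pathsep) if p]
    return [p for p in entries if "cuda" in p.lower() or "cudnn" in p.lower()]


if __name__ == "__main__":
    # Assumption: the loaded modules export their library paths via LD_LIBRARY_PATH.
    found = cuda_library_dirs(os.environ.get("LD_LIBRARY_PATH", ""))
    if found:
        for path in found:
            print(path)
    else:
        print("no cuda/cudnn directories on LD_LIBRARY_PATH")
```

If nothing is listed, check that the two `module load` lines were entered in the custom environment settings before the session was launched.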

Install TensorFlow into the existing skln Conda environment (ML_SKLN kernel) used in the previous Machine Learning workshop

Once JupyterLab is open, open a new terminal:

$ conda activate ~/tensorflow_2.9
$ pip install tensorflow==2.9.1 tensorboard

Check if keras is correctly installed:

>>> import tensorflow as tf
>>> from tensorflow import keras
>>> print(keras.__version__)
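
Beyond printing the version, a short smoke test can confirm that Keras actually builds and runs a model. The following is a minimal sketch; the function name `keras_smoke_test` and the tiny model are hypothetical, chosen only for illustration.

```python
import importlib.util


def keras_smoke_test():
    """Run one forward pass through a tiny Keras model and return the output shape.

    Returns None when TensorFlow is not installed.
    """
    if importlib.util.find_spec("tensorflow") is None:
        return None
    import tensorflow as tf

    # One dense layer mapping 3 input features to 1 output.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(3,)),
        tf.keras.layers.Dense(1),
    ])
    # A batch of 4 all-zero samples is enough to exercise the layer.
    outputs = model(tf.zeros((4, 3)))
    return tuple(outputs.shape)


if __name__ == "__main__":
    print("output shape:", keras_smoke_test())
```

A shape of `(4, 1)` means the model was built and evaluated; an exception here points to a broken TensorFlow or Keras installation.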