# Run PyTorch and TensorFlow Containers with NVIDIA Enroot

## Install Enroot

For Debian-based distributions:

```
arch=$(dpkg --print-architecture)
curl -fSsL -O https://github.com/NVIDIA/enroot/releases/download/v3.4.0/enroot_3.4.0-1_${arch}.deb
curl -fSsL -O https://github.com/NVIDIA/enroot/releases/download/v3.4.0/enroot+caps_3.4.0-1_${arch}.deb # optional
sudo apt install -y ./*.deb
```

For other distributions, see the [installation guide](https://github.com/NVIDIA/enroot/blob/master/doc/installation.md).

Please note that enroot is already installed on Berzelius. You can skip this installation step if you plan to use it on Berzelius.

## Set up NVIDIA credentials

Complete steps [4.1](https://docs.nvidia.com/ngc/ngc-overview/index.html#account-signup) and [4.3](https://docs.nvidia.com/ngc/ngc-overview/index.html#generating-api-key), and save the API key.

Add the API key to the config file at `~/.config/enroot/.credentials`:

```
machine nvcr.io login $oauthtoken password your_api_key
machine authn.nvidia.com login $oauthtoken password your_api_key
```

Set the config path by adding this line to `~/.bashrc` (replace the path with your own home directory):

```
export ENROOT_CONFIG_PATH=/home/xuagu37/.config/enroot
```

To make the change take effect:

```
source ~/.bashrc
```

This step is necessary for importing container images from NVIDIA NGC.

## Import container images

You can import a container image either from NVIDIA NGC or from the official PyTorch/TensorFlow Docker Hub repositories.

From NVIDIA NGC:

```
enroot import 'docker://nvcr.io#nvidia/pytorch:22.09-py3'
enroot import 'docker://nvcr.io#nvidia/tensorflow:22.11-tf2-py3'
```

For other versions, please see the release notes for [PyTorch](https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/index.html) and [TensorFlow](https://docs.nvidia.com/deeplearning/frameworks/tensorflow-release-notes/index.html).
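The import writes a squashfs file to the current directory, named after the image tag: the registry prefix is stripped and `/` and `:` become `+` (for example, `nvidia/pytorch:22.09-py3` becomes `nvidia+pytorch+22.09-py3.sqsh`, as used in the `enroot create` step below). As a small sketch (this helper is not part of enroot; the naming rule is inferred from the filenames in this guide), you can predict the filename from the URI:

```shell
# Predict the .sqsh filename that `enroot import` produces for a docker:// URI:
# strip 'docker://' and the registry part up to '#', turn '/' and ':' into '+',
# and append '.sqsh'.
sqsh_name() {
    printf '%s.sqsh\n' "$1" \
        | sed -e 's|^docker://||' -e 's|^[^#]*#||' -e 's|[/:]|+|g'
}

sqsh_name 'docker://nvcr.io#nvidia/pytorch:22.09-py3'
# prints: nvidia+pytorch+22.09-py3.sqsh
```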
From the official PyTorch/TensorFlow Docker Hub repositories:

```
enroot import 'docker://pytorch/pytorch:1.12.1-cuda11.3-cudnn8-devel'
enroot import 'docker://tensorflow/tensorflow:2.11.0-gpu'
```

For other versions, please see the Docker tags for [PyTorch](https://hub.docker.com/r/pytorch/pytorch/tags) and [TensorFlow](https://hub.docker.com/r/tensorflow/tensorflow/tags).

## Create a container

I will use the PyTorch image from NVIDIA NGC as the example.

```
enroot create --name nvidia_pytorch_22.09 nvidia+pytorch+22.09-py3.sqsh
```

## Start a container

As the root user:

```
enroot start --root --rw --mount /proj/nsc_testing/xuan:/proj/nsc_testing/xuan nvidia_pytorch_22.09
```

As a non-root user:

```
enroot start --rw --mount /proj/nsc_testing/xuan:/proj/nsc_testing/xuan nvidia_pytorch_22.09
```

The `--mount` flag mounts a local directory into the container.

You can also start a container and run a command at the same time:

```
enroot start --rw --mount /proj/nsc_testing/xuan:/proj/nsc_testing/xuan nvidia_pytorch_22.09 sh -c 'python path_to_your_script.py'
```

## Access the GUI within Enroot

```
enroot start --rw --env DISPLAY --mount /tmp/.X11-unix:/tmp/.X11-unix --mount /proj/nsc_testing/xuan:/proj/nsc_testing/xuan nvidia_pytorch_22.09
```

Please note that you need to use the `-X` flag (X11 forwarding) when connecting to Berzelius.

## Path to user container storage

By default, your enroot containers are saved in your home directory. On Berzelius, the home directory has only 20 GB of disk space, so it is better practice to keep enroot containers in your proj directory. Add this line to your `~/.bashrc`:

```
export ENROOT_DATA_PATH=/proj/nsc_testing/xuan/enroot/data
```
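Putting the steps above together, a minimal end-to-end run might look like the sketch below. The image tag, container name, and CUDA check are the examples used in this guide (adjust paths and tags to your own project); the script skips gracefully on machines where enroot is not installed.

```shell
#!/bin/sh
# End-to-end sketch: import an image, create a container, and run a quick
# CUDA visibility check inside it.

run_demo() {
    # Skip gracefully on machines without enroot installed
    if ! command -v enroot >/dev/null 2>&1; then
        echo "enroot not found; run this on an enroot host such as Berzelius"
        return 0
    fi

    enroot import 'docker://nvcr.io#nvidia/pytorch:22.09-py3'
    enroot create --name nvidia_pytorch_22.09 nvidia+pytorch+22.09-py3.sqsh

    # Prints True inside the container if a GPU is visible
    enroot start --rw nvidia_pytorch_22.09 sh -c \
        'python -c "import torch; print(torch.cuda.is_available())"'
}

run_demo
```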