# Run PyTorch and TensorFlow Containers with NVIDIA Enroot

Enroot is a simple yet powerful tool that turns traditional container/OS images into unprivileged sandboxes. Read [more](https://github.com/NVIDIA/enroot).

## Install enroot

- For Debian-based distributions:

```
arch=$(dpkg --print-architecture)
curl -fSsL -O https://github.com/NVIDIA/enroot/releases/download/v3.4.0/enroot_3.4.0-1_${arch}.deb
curl -fSsL -O https://github.com/NVIDIA/enroot/releases/download/v3.4.0/enroot+caps_3.4.0-1_${arch}.deb # optional
sudo apt install -y ./*.deb
```

- For other distributions, see the [installation guide](https://github.com/NVIDIA/enroot/blob/master/doc/installation.md).

Please note that enroot is already installed on Berzelius. You can skip this installation step if you plan to use it on Berzelius.

## Set up NVIDIA credentials

This step is necessary for importing container images from NVIDIA NGC.

- Complete steps [4.1](https://docs.nvidia.com/ngc/ngc-overview/index.html#account-signup) and [4.3](https://docs.nvidia.com/ngc/ngc-overview/index.html#generating-api-key). Save the API key.
- Add the API key to the config file at ```~/.config/enroot/.credentials```:

```
machine nvcr.io login $oauthtoken password your_api_key
machine authn.nvidia.com login $oauthtoken password your_api_key
```

- Set the config path by adding this line to ```~/.bashrc``` (replace the path with your own):

```
export ENROOT_CONFIG_PATH=/home/xuagu37/.config/enroot
```

- Reload the file to make the path take effect:

```
source ~/.bashrc
```

## Set the path to user container storage

By default, your enroot containers are saved in your home directory. On Berzelius, home directories are limited to 20 GB of disk space, so it is better practice to keep enroot containers in your proj directory. Add these lines to ```~/.bashrc``` (replace the paths with your own):

```
export ENROOT_CACHE_PATH=/proj/nsc_testing/xuan/enroot/cache
export ENROOT_DATA_PATH=/proj/nsc_testing/xuan/enroot/data
```

## Import container images

You can import a container image either from NVIDIA NGC or from the PyTorch/TensorFlow official Docker Hub repositories.
- From NVIDIA NGC:

```
enroot import 'docker://nvcr.io#nvidia/pytorch:22.09-py3'
enroot import 'docker://nvcr.io#nvidia/tensorflow:22.11-tf2-py3'
```

For other versions, please see the release notes for [PyTorch](https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/index.html) and [TensorFlow](https://docs.nvidia.com/deeplearning/frameworks/tensorflow-release-notes/index.html).

- From the PyTorch/TensorFlow official Docker Hub repositories:

```
enroot import 'docker://pytorch/pytorch:1.12.1-cuda11.3-cudnn8-devel'
enroot import 'docker://tensorflow/tensorflow:2.11.0-gpu'
```

For other versions, please see the Docker tags for [PyTorch](https://hub.docker.com/r/pytorch/pytorch/tags) and [TensorFlow](https://hub.docker.com/r/tensorflow/tensorflow/tags).

## Create a container

I will use the PyTorch image from NVIDIA NGC as an example. The import step above produces a ```.sqsh``` file in the current directory:

```
enroot create --name nvidia_pytorch_22.09 nvidia+pytorch+22.09-py3.sqsh
```

## Start a container

- As the root user:

```
enroot start --root --rw --mount /proj/nsc_testing/xuan:/proj/nsc_testing/xuan nvidia_pytorch_22.09
```

- As a non-root user:

```
enroot start --rw --mount /proj/nsc_testing/xuan:/proj/nsc_testing/xuan nvidia_pytorch_22.09
```

The ```--mount``` flag mounts a local directory into the container.

- You can also start a container and run a command in it at the same time:

```
enroot start --rw --mount /proj/nsc_testing/xuan:/proj/nsc_testing/xuan nvidia_pytorch_22.09 sh -c 'python path_to_your_script.py'
```

## Access a GUI within enroot

```
enroot start --rw --env DISPLAY --mount /tmp/.X11-unix:/tmp/.X11-unix --mount /proj/nsc_testing/xuan:/proj/nsc_testing/xuan nvidia_pytorch_22.09
```

Please note that you need to use the ```-X``` flag (X11 forwarding) when connecting to Berzelius over SSH.
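The whole workflow (import, create, start with a command) can be collected into one script. Below is a minimal sketch using the example image and paths from this guide; when enroot is not installed it only prints the commands instead of running them, so the invocations can be reviewed on any machine:

```
#!/bin/sh
# Sketch of the end-to-end workflow: import an NGC image, create a container
# from the resulting .sqsh file, then start it and run a script.
# Image tag, mount path, and script path are the examples used above.
set -eu

IMAGE='docker://nvcr.io#nvidia/pytorch:22.09-py3'
SQSH='nvidia+pytorch+22.09-py3.sqsh'   # file written by 'enroot import'
NAME='nvidia_pytorch_22.09'
MOUNT='/proj/nsc_testing/xuan'

# Run commands for real only when enroot is available; otherwise echo them.
if command -v enroot >/dev/null 2>&1; then
    run() { "$@"; }
else
    run() { echo "$@"; }
fi

run enroot import "$IMAGE"
run enroot create --name "$NAME" "$SQSH"
run enroot start --rw --mount "$MOUNT:$MOUNT" "$NAME" \
    sh -c 'python path_to_your_script.py'
```

On Berzelius you would typically run the ```enroot import``` step once, then reuse the created container for subsequent jobs.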