Skip to content
Snippets Groups Projects
Commit 78a63ee3 authored by Xuan Gu's avatar Xuan Gu
Browse files

Update 4 files

- /NVIDIA/DeepLearningExamples/PyTorch/Segmentation/nnUNet/build.txt
- /NVIDIA/DeepLearningExamples/PyTorch/README.md
- /MLPerf/training/image_segmentation/pytorch/build.txt
- /MLPerf/training/image_segmentation/pytorch/README.md
parent 93d6cdc8
No related branches found
No related tags found
No related merge requests found
### Seting paths
The U-Net3D from MLPerf has no version control.
```
MODEL_NAME=nnunet_for_pytorch
MODEL_BASE=/proj/nsc_testing/xuan/containers/pytorch_1.7.1-cuda11.0-cudnn8-runtime.sif
CONTAINER_DIR=/proj/nsc_testing/xuan/containers/${MODEL_NAME}.sif
DEF_DIR=/proj/nsc_testing/xuan/berzelius-benchmarks/MLPerf/training/image_segmentation/pytorch//${MODEL_NAME}.def
WORK_DIR=/proj/nsc_testing/xuan/berzelius-benchmarks/MLPerf/training/image_segmentation/pytorch
```
### Building the container
```
apptainer build $MODEL_BASE docker://pytorch/pytorch:1.7.1-cuda11.0-cudnn8-runtime
apptainer build $CONTAINER_DIR $DEF_DIR
```
### Downloading and preprocessing the data
```
cd $WORK_DIR
git clone https://github.com/neheller/kits19
apptainer exec $CONTAINER_DIR bash -c "cd kits19 && python3 -m starter_code.get_imaging"
mv kits19/data/* raw-data/
rm -rf kits19
apptainer exec --nv -B ${WORK_DIR}/raw-data:/raw-data -B ${WORK_DIR}/data:/data -B ${WORK_DIR}/results:/results $CONTAINER_DIR bash -c "cd /workspace/unet3d && python3 preprocess_dataset.py --data_dir /raw-data --results_dir /data"
```
### Running benchmarking
```
apptainer exec --nv -B ${WORK_DIR}/raw-data:/raw-data -B ${WORK_DIR}/data:/data -B ${WORK_DIR}/results:/results $CONTAINER_DIR bash -c "cd /workspace/unet3d && bash run_and_time.sh 1"
```
### Running benchmarking using batch jobs
```
bash submit_benchmark_jobs.sh
```
apptainer build containers/pytorch_1.7.1-cuda11.0-cudnn8-devel.sif docker://pytorch/pytorch:1.7.1-cuda11.0-cudnn8-devel
apptainer build containers/pytorch_1.7.1-cuda11.0-cudnn8-runtime.sif docker://pytorch/pytorch:1.7.1-cuda11.0-cudnn8-runtime
MODULE_NAME=U-Net3D
......
### Seting paths
```
MODEL_NAME=nnunet_for_pytorch
MODEL_VERSION=21.11.0
MODEL_BASE=/proj/nsc_testing/xuan/containers/nvidia_pytorch_21.11-py3.sif
CONTAINER_DIR=/proj/nsc_testing/xuan/containers/${MODEL_NAME}_${MODEL_VERSION}.sif
DEF_DIR=/proj/nsc_testing/xuan/berzelius-benchmarks/NVIDIA/DeepLearningExamples/PyTorch/Segmentation/nnUNet/${MODEL_NAME}_${MODEL_VERSION}.def
WORK_DIR=/proj/nsc_testing/xuan/berzelius-benchmarks/NVIDIA/DeepLearningExamples/PyTorch/Segmentation/nnUNet
```
### Building the container
```
apptainer build $MODEL_BASE docker://nvcr.io/nvidia/pytorch:21.11-py3
apptainer build $CONTAINER_DIR $DEF_DIR
```
### Downloading and preprocessing the data
```
apptainer exec --nv -B ${WORK_DIR}/data:/data -B ${WORK_DIR}/results:/results --pwd /workspace/nnunet_pyt $CONTAINER_DIR python download.py --task 01
apptainer exec --nv -B ${WORK_DIR}/data:/data -B ${WORK_DIR}/results:/results --pwd /workspace/nnunet_pyt $CONTAINER_DIR python /workspace/nnunet_pyt/preprocess.py --task 01 --dim 2
```
### Running benchmarking
```
apptainer exec --nv -B ${WORK_DIR}/data:/data -B ${WORK_DIR}/results:/results --pwd /workspace/nnunet_pyt $CONTAINER_DIR python scripts/benchmark.py --mode train --gpus 1 --dim 2 --batch_size 256 --amp
apptainer exec --nv -B ${WORK_DIR}/data:/data -B ${WORK_DIR}/results:/results --pwd /workspace/nnunet_pyt $CONTAINER_DIR python scripts/benchmark.py --mode predict --gpus 1 --dim 2 --batch_size 256 --amp
```
### Running benchmarking using batch jobs
```
bash submit_benchmark_jobs.sh
```
......@@ -2,8 +2,6 @@ https://catalog.ngc.nvidia.com/orgs/nvidia/resources/nnunet_for_pytorch/quick-st
apptainer build containers/nvidia_pytorch_21.11-py3.sif docker://nvcr.io/nvidia/pytorch:21.11-py3
export APPTAINER_BINDPATH=
MODULE_NAME=nnunet_for_pytorch
MODULE_VERSION=21.11.0
CONTAINER_DIR=/proj/nsc_testing/xuan/containers/${MODULE_NAME}_${MODULE_VERSION}.sif
......
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment