diff --git a/README.md b/README.md
index 3024f4dd98fba56c3f47e9a08104d24f7279af19..29149e5407f202e35099de3bd19ed88dcea8d00e 100644
--- a/README.md
+++ b/README.md
@@ -115,3 +115,4 @@ Ref: https://pytorch-lightning.readthedocs.io/en/stable/common/trainer.html
 - Use as large batch_size as possible for a more stable benchmark result. For single node, use 256; for multi-node, use 128.  
 - Benchmarking with dim = 2, nodes = 1, gpus = 8, batch_size = 128, 256 takes ~2mins.  
 - Specify the paths for enroot cache and data, see this [page](https://gitlab.liu.se/xuagu37/run-pytorch-and-tensorflow-containers-with-nvidia-enroot#set-path-to-user-container-storage).
+- For the single-node case, launching with ```srun singularity/enroot``` gives slightly lower throughput than running ```singularity/enroot``` directly.