diff --git a/README.md b/README.md
index 006abad7db24b1b411ad1daf736dc6f92a123964..182002b48eedcbea9138c6923029126ad1bd014c 100644
--- a/README.md
+++ b/README.md
@@ -91,5 +91,5 @@ Coefficient of variation is calculated as the ratio of the standard deviation to
 - It seems running directly via singularity shell will give worse performance (when I WFH). We should run it via sbatch script instead.
 - It took around a week to finish 100 iterations of benchmarking for all sets of parameters.
 - For multi-node benchmarking, we need to use "srun" command; also, the line "#SBATCH --ntasks-per-node=8" has to be added. Otherwise the process will hang.
-- Use as large batch_size as possible for a more stable benchmark result.
+- Use as large a batch_size as possible for a more stable benchmark result. For single-node runs, use 256; for multi-node runs, use 128.
 - Benchmarking with dim = 2, nodes = 1, gpus = 8, batch_size = 128, 256 takes ~2mins.