- Observation 1: when batch_size is large (16, 32, 64, 128), throughput_amp > throughput_tf32.
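For reference, the sketch below shows how the two precision modes are typically selected in PyTorch. The model and input here are placeholders, not the repository's benchmark script; only the precision flags and autocast usage are the point.

```python
import torch

# TF32 mode: allow TensorFloat-32 math for matmuls and cuDNN convolutions
# (this is how TF32 is typically enabled on Ampere-class GPUs).
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True

# AMP mode: run the forward pass under autocast so most ops use FP16.
# Placeholder model and input, for illustration only.
model = torch.nn.Conv2d(1, 8, 3)
x = torch.randn(16, 1, 64, 64)
if torch.cuda.is_available():
    model, x = model.cuda(), x.cuda()
with torch.cuda.amp.autocast(enabled=torch.cuda.is_available()):
    y = model(x)  # mixed-precision forward pass when CUDA is available
```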
- Observation 2: The coefficient of variation of throughput for 100 iterations is smallest when batch_size = 128.
<img src="https://github.com/xuagu37/Benchmark_nnU-Net_for_PyTorch/blob/main/figures/benchmark_throughput_cv.png" width="400">
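The coefficient of variation above can be computed from the per-iteration throughput samples as in the following sketch; the sample values are placeholders, not measured results.

```python
import numpy as np

# Per-iteration throughput samples in images/s (placeholder values).
throughput = np.array([4650.0, 4710.0, 4695.0, 4720.0, 4680.0])

# Coefficient of variation = sample standard deviation / mean.
cv = throughput.std(ddof=1) / throughput.mean()
print(f"CV = {cv:.4f}")
```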
**Benchmarking with dim = 2, nodes = 1 or 2, gpus = 8 per node, and batch_size = 128 can be used as a node health check.**
- The expected throughput for dim = 2, nodes = 1, gpus = 8, batch_size = 128 would be 4700 ± 500 (TF32).
- The expected throughput for dim = 2, nodes = 2, gpus = 16, batch_size = 128 would be 9250 ± 150 (TF32).
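A node health check can then be a simple range test against these expected values. The sketch below uses hypothetical variable names; only the mean ± tolerance figures come from the bullets above.

```python
# Expected TF32 throughput (dim = 2, batch_size = 128), as (mean, tolerance).
EXPECTED = {
    1: (4700.0, 500.0),   # nodes = 1, gpus = 8
    2: (9250.0, 150.0),   # nodes = 2, gpus = 16
}

def node_is_healthy(nodes: int, measured_throughput: float) -> bool:
    """Return True if the measured throughput lies within mean ± tolerance."""
    mean, tol = EXPECTED[nodes]
    return abs(measured_throughput - mean) <= tol

print(node_is_healthy(1, 4300.0))  # True: 4300 lies within 4700 ± 500
```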
- Observation 3: Ideally, throughput would scale linearly with batch_size. In practice, throughput falls below the ideal curve when batch_size > 16.
<img src="https://github.com/xuagu37/Benchmark_nnU-Net_for_PyTorch/blob/main/figures/benchmark_throughput_batch_size_ideal.png" width="400">
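One way to reproduce the comparison in Observation 3 is to extrapolate an ideal linear-scaling curve from the smallest batch size and report the efficiency at each batch size. The measurements below are placeholders; substitute the benchmark's actual throughput values.

```python
import numpy as np

batch_sizes = np.array([1, 2, 4, 8, 16, 32, 64, 128])
# Placeholder throughput measurements in images/s (not real results).
measured = np.array([150.0, 300.0, 590.0, 1150.0, 2200.0, 3600.0, 4300.0, 4700.0])

# Ideal curve: perfect linear scaling extrapolated from the smallest batch size.
ideal = measured[0] * (batch_sizes / batch_sizes[0])

for bs, m, i in zip(batch_sizes, measured, ideal):
    print(f"batch_size={bs:4d}  measured={m:7.1f}  ideal={i:8.1f}  efficiency={m / i:.2f}")
```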