3. Number of iterations for each parameter setting.
We will average the benchmark performance over the iterations. We use a batch size of 128, which is the maximum usable batch size without an OOM error.
We will average the benchmark performance over the iterations. The maximum usable batch size (without an OOM error) is 256 for single-node and 128 for multi-node runs.
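
The averaging step could be sketched roughly as follows. This is a minimal illustration, not the benchmark's actual code: the helper name `summarize_runs`, the use of throughput as the metric, and the sample values are all assumptions made for the example.

```python
from statistics import mean, stdev


def summarize_runs(throughputs):
    """Return (mean, sample standard deviation) of per-iteration results.

    `throughputs` is a hypothetical list with one measurement per iteration
    for a single parameter setting.
    """
    if not throughputs:
        raise ValueError("need at least one iteration")
    avg = mean(throughputs)
    # The sample standard deviation is undefined for a single iteration.
    spread = stdev(throughputs) if len(throughputs) > 1 else 0.0
    return avg, spread


# Placeholder values for illustration only, not real measurements.
example = [100.0, 110.0, 90.0]
avg, spread = summarize_runs(example)
print(f"mean={avg:.1f} stdev={spread:.1f}")
```

Reporting the spread next to the mean makes it easier to judge whether the chosen number of iterations is large enough for a stable average.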