DRL-boys Week 22 Nov
Tasks:
- Batchify selfplay
- Learning rate schedule (do manually because we will not train in one long big session)
- Test distributed on Olympen (Oskar)
- Measure replay rate
- Investigate flipping of actions (Juppy)
- ~~Paper on optimizing self play~~
- Automatically save checkpoints for training and don't update model on each batch (Oskar)
- Profile selfplay to see if other things can be sped up (Oskar)
- ~~Add more game end conditions to position and make sure that they are used in node and selfplay~~
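A possible sketch for the learning rate and checkpoint tasks, since training will be split over several sessions: key the learning rate on a global step counter that is saved to disk, so a restarted session picks up the right rate. The milestone values, the JSON state file, and the names `LR_SCHEDULE`, `learning_rate`, `save_state` and `load_state` are all assumptions for illustration, not something we have decided on.

```python
import json
import os
import tempfile

# Hypothetical schedule: (first global step at which the rate applies, learning rate).
LR_SCHEDULE = [(0, 1e-2), (100_000, 1e-3), (300_000, 1e-4)]


def learning_rate(global_step, schedule=LR_SCHEDULE):
    """Return the rate of the last milestone that global_step has reached."""
    lr = schedule[0][1]
    for start_step, rate in schedule:
        if global_step >= start_step:
            lr = rate
    return lr


def save_state(path, global_step):
    """Write the step counter atomically so a crash cannot leave a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"global_step": global_step}, f)
    os.replace(tmp_path, path)


def load_state(path):
    """Return the saved global step, or 0 when no earlier session exists."""
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["global_step"]
```

At the start of a session we would call `load_state`, pick the rate with `learning_rate(global_step)`, and call `save_state` together with each model checkpoint.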
Goal:
- Try "real" distributed training for an extended period of time