RL: Save intermediate model files and delete them after training #138

Merged 1 commit on Jun 21, 2021

Conversation

magehrke
Contributor

Changes:

  • Fixes a bug where the GPU crashed if a spike occurred during training
  • After a spike, the latest model could not be loaded because no intermediate models were saved during training
  • Set export_weights in rl_training.py to true
  • Added a function in fileio that removes all model files in the weight directory
  • The function is called at the end of training
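
For readers who want a picture of the cleanup step, here is a minimal sketch of a helper that deletes model files from a weight directory. It is not the code from this PR: the name `remove_model_files`, the `MODEL_SUFFIXES` tuple, and the file extensions in it are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

# Assumed extensions for exported model files; adjust to the actual export format.
MODEL_SUFFIXES = (".params", ".json", ".onnx")


def remove_model_files(weight_dir, suffixes=MODEL_SUFFIXES):
    """Delete regular files in weight_dir with a matching suffix.

    Returns the names of the deleted files in sorted order.
    Hypothetical helper, not the fileio function added in this PR.
    """
    removed = []
    for path in sorted(Path(weight_dir).iterdir()):
        # Skip subdirectories and unrelated files such as logs.
        if path.is_file() and path.suffix in suffixes:
            path.unlink()
            removed.append(path.name)
    return removed


# Usage: run the cleanup once training has finished.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "model-0001.params").write_text("weights")
    (Path(tmp) / "train.log").write_text("log")
    remove_model_files(tmp)
```

Filtering by suffix rather than clearing the whole directory keeps unrelated files in place, which seems the safer default when the directory is shared.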

TODO:

  • If a spike occurs before the first intermediate model is saved, the GPU will still crash. Fixing this is a little more challenging, because the rl_loop saves the models in a different format and we cannot reconstruct the file name that the trainer class needs.
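
The remaining failure case can be illustrated with a small guard that looks for an intermediate model before trying to reload one. This is only a sketch under assumptions: `latest_checkpoint` and the `.params` suffix are hypothetical, and it does not address the file name mismatch between the rl_loop and the trainer class described above.

```python
import os
import tempfile
from pathlib import Path


def latest_checkpoint(weight_dir, suffix=".params"):
    """Return the most recently modified checkpoint file, or None.

    None means no intermediate model has been saved yet, which is the
    case the TODO describes. Hypothetical helper for illustration only.
    """
    candidates = [p for p in Path(weight_dir).glob(f"*{suffix}") if p.is_file()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


# Usage: a spike handler could check for a checkpoint before reloading.
with tempfile.TemporaryDirectory() as tmp:
    checkpoint = latest_checkpoint(tmp)
    if checkpoint is None:
        # Nothing has been saved yet, so there is no model to fall back to.
        pass
```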

@QueensGambit
Owner

Thanks. Looks good to me!

@QueensGambit QueensGambit merged commit 8d53870 into QueensGambit:master Jun 21, 2021
@magehrke magehrke deleted the rl-spike-handling branch June 22, 2021 19:31