This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Downloading checkpoints from AML if not found on disk #614

Merged
ant0nsc merged 9 commits into main from antonsc/checkpointhotfix
Dec 9, 2021

Conversation

ant0nsc
Contributor

@ant0nsc ant0nsc commented Dec 9, 2021

Workaround for an issue with low-priority preemption: checkpoint files are not available.

Please follow the guidelines for PRs contained here. Checklist:

  • Ensure that your PR is small, and implements one change.
  • Add unit tests for all functions that you introduced or modified.
  • Run PyCharm's code cleanup tools on your Python files.
  • Link the correct GitHub issue for tracking.
  • Update the Changelog file: Describe your change in terms of
    Added/Changed/Removed/... in the "Upcoming" section.
  • When merging your PR, replace the default merge message with a description of your PR,
    and if needed a motivation why that change was required.

javier-alvarez
javier-alvarez previously approved these changes Dec 9, 2021
Contributor

@javier-alvarez javier-alvarez left a comment


Looks good but a lot of things have been moved, so it is a bit difficult to review the real changes.

@ant0nsc ant0nsc enabled auto-merge (squash) December 9, 2021 17:08
temp_folder = download_checkpoints_to_temp_folder()
available_checkpoints = find_all_recovery_checkpoints(temp_folder)
if available_checkpoints is not None:
    return extract_latest_checkpoint_and_epoch(available_checkpoints)
Member


Is there a potential scenario in which we are not currently running in AML but still want to call download_checkpoints_to_temp_folder, e.g. for a previous Run?
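
For illustration, the fallback discussed in this thread (look on disk first, download only if nothing is found) could be sketched roughly as below. This is not the code merged in this PR. The file-name pattern, the helper names, and the idea of passing the download step in as an optional callable are all assumptions made for the example; passing `download=None` stands in for the "not running in AML" case raised above.

```python
import re
from pathlib import Path
from typing import Callable, List, Optional, Tuple

# Hypothetical naming scheme for recovery checkpoints; the real project may use another.
RECOVERY_PATTERN = re.compile(r"^recovery_epoch=(\d+)\.ckpt$")


def find_recovery_checkpoints(folder: Path) -> Optional[List[Path]]:
    """Return all recovery checkpoint files in `folder`, or None if there are none."""
    if not folder.is_dir():
        return None
    files = [p for p in folder.iterdir()
             if p.is_file() and RECOVERY_PATTERN.match(p.name)]
    return files or None


def latest_checkpoint_and_epoch(files: List[Path]) -> Tuple[Path, int]:
    """Pick the checkpoint with the highest epoch number encoded in its file name."""
    def epoch_of(path: Path) -> int:
        return int(RECOVERY_PATTERN.match(path.name).group(1))

    latest = max(files, key=epoch_of)
    return latest, epoch_of(latest)


def find_latest_recovery_checkpoint(
        local_folder: Path,
        download: Optional[Callable[[], Path]] = None) -> Optional[Tuple[Path, int]]:
    """
    Look for recovery checkpoints on disk first. If none are found and a
    `download` callable was supplied (hypothetical stand-in for fetching the
    files from the run's outputs), search the folder it returns instead.
    """
    available = find_recovery_checkpoints(local_folder)
    if available is None and download is not None:
        available = find_recovery_checkpoints(download())
    if available is None:
        return None
    return latest_checkpoint_and_epoch(available)
```

Injecting the download step keeps the lookup testable without any cloud access, and makes the behaviour outside AML explicit: with no callable supplied, the function only ever inspects the local folder.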

@ant0nsc ant0nsc merged commit c7eef5e into main Dec 9, 2021
@ant0nsc ant0nsc deleted the antonsc/checkpointhotfix branch December 9, 2021 20:18

4 participants