Remove `local_files_only` and use `codebase_version` instead of branches #734
What this does
Simplifies the use of the dataset by looking for files locally first, then pulling from the hub if needed. Setting the argument `force_cache_sync=True` overrides this behavior and forces syncing local files from the hub first.

Also eliminates the hub branch convention we used so far to determine the version of a dataset. The value of `codebase_version` in the `info.json` is now the single source of truth for which version of LeRobot a dataset was created with.

We will only use the `main` branch by default, but one can specify a branch or commit by using the new `revision` arg. If this `revision` is not available on the hub, the latest available version will be downloaded.

How it was tested
- A `v2.0` dataset on this branch (e.g. lerobot/pusht) correctly displays a warning.
- A `v2.1` dataset on the main branch still works with no issue (e.g. aliberts/koch_tutorial).
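
For reviewers who want a mental model of the behavior described under "What this does", here is a minimal sketch of the local-first lookup and the `codebase_version` check. It is not the code from this PR: `ensure_local`, `read_codebase_version`, `check_version`, the injected `download` callable, the `CURRENT_VERSION` constant, the location of `info.json` directly under the dataset root, and the warning text are all hypothetical and chosen only for illustration.

```python
import json
import warnings
from pathlib import Path
from typing import Callable

# Assumption: the codebase version this (hypothetical) loader expects.
CURRENT_VERSION = "v2.1"


def ensure_local(
    root: Path,
    download: Callable[[Path], None],
    force_cache_sync: bool = False,
) -> Path:
    """Return the path to info.json, syncing from the hub only when needed.

    Local files are used when present. `download` (a stand-in for the real
    hub call) runs only if info.json is missing or force_cache_sync is True.
    """
    info_path = root / "info.json"  # assumed location, for illustration only
    if force_cache_sync or not info_path.is_file():
        download(root)
    return info_path


def read_codebase_version(info_path: Path) -> str:
    """Read codebase_version from info.json, the single source of truth."""
    with info_path.open() as f:
        return json.load(f)["codebase_version"]


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a string such as 'v2.1' into a comparable tuple such as (2, 1)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def check_version(version: str, current: str = CURRENT_VERSION) -> None:
    """Warn when the dataset was created with an older codebase version."""
    if parse_version(version) < parse_version(current):
        warnings.warn(
            f"Dataset was created with {version}, which is older than {current}.",
            stacklevel=2,
        )
```

Passing the download step in as a callable keeps the sketch independent of any particular hub client, so the lookup order can be exercised without network access.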