Properly raise FileNotFound even if the dataset is private #4536
@@ -783,7 +783,7 @@ def get_module(self) -> DatasetModule:
     hfh_dataset_info = HfApi(config.HF_ENDPOINT).dataset_info(
         self.name,
         revision=self.revision,
-        token=token,
+        token=token if token else "no-token",
Would passing token=False work instead?
Also, don't hesitate to ping @SBrandeis / @coyotte508 / @Pierrci on these kinds of PRs =)
I checked the source code of dataset_info, and it would use token="False" and pass Bearer False for the authentication x) so yes, it would work.
Though the type hint requires token to be a string, not a boolean. So unless we're OK with saying that the type hint can be ignored, I'll keep "no-token".
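To illustrate the point above, here is a minimal sketch of how a boolean could end up in the header as Bearer False. It assumes the client builds the header with plain string interpolation; build_auth_header is a hypothetical helper, not the actual huggingface_hub code.

```python
from typing import Dict


def build_auth_header(token: str) -> Dict[str, str]:
    # Hypothetical helper: assumes the client interpolates the token
    # straight into the header without checking its type.
    return {"authorization": f"Bearer {token}"}


# A boolean slips through at runtime, although a type checker would flag it.
header_from_bool = build_auth_header(False)  # type: ignore[arg-type]

# The placeholder string satisfies the type hint and behaves the same way.
header_from_placeholder = build_auth_header("no-token")
```

Under that assumption both values produce a header carrying an invalid token, which is why either one avoids falling back to the local token; only the string form keeps the type checker quiet.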
"no-token" won't trigger the type checker, so I think it's better.
If HFH maintainers prefer "no-token", then this is OK! ;)
Thank you. Good catch! ;)
tests/test_load.py::test_load_streaming_private_dataset was failing because the Hub now returns 401 when getting the HfApi.dataset_info of a dataset without authentication. load_dataset was raising ConnectionError, while it should raise FileNotFoundError, since it first checks for local files before checking the Hub.

Moreover, when use_auth_token is not set (default is False), we should not pass token=None to HfApi.dataset_info, or it will use the local token by default; instead it should use no token. It's currently not possible to ask for no token to be used, so as a workaround I simply set token="no-token".
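The workaround can be sketched as a small standalone helper. This is only an illustration of the token selection described above, not the code in the datasets library: resolve_hub_token, its local_token parameter, and NO_TOKEN_PLACEHOLDER are hypothetical names introduced here.

```python
from typing import Optional, Union

# Placeholder used by the workaround: any non-empty string keeps the client
# from silently falling back to the locally stored token.
NO_TOKEN_PLACEHOLDER = "no-token"


def resolve_hub_token(
    use_auth_token: Union[bool, str, None],
    local_token: Optional[str] = None,
) -> str:
    """Pick the token to send, never returning None (hypothetical helper)."""
    if isinstance(use_auth_token, str) and use_auth_token:
        # An explicit token string was provided by the caller.
        token: Optional[str] = use_auth_token
    elif use_auth_token is True:
        # The caller opted in to the locally stored token, if there is one.
        token = local_token
    else:
        # False or None: the caller did not ask for authentication.
        token = None
    # Same expression as in the diff: fall back to the placeholder.
    return token if token else NO_TOKEN_PLACEHOLDER
```

With this shape, resolve_hub_token(False) yields the placeholder, so an unauthenticated request stays unauthenticated even when a token is saved on the machine.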