A pretty common behavior of an interaction between the Hub and datasets is the following.
An organization adds a dataset in private mode and wants to load it afterward.
FileNotFoundError: Couldn't find a dataset script at /home/patrick/NewT5/dummy_data/dummy_data.py or any data file in the same directory. Couldn't find 'NewT5/dummy_data' on the Hugging Face Hub either: FileNotFoundError: Dataset 'NewT5/dummy_data' doesn't exist on the Hub
even though the user has access to the website NewT5/dummy_data since she/he is part of the org.
We need to improve the error message here similar to how @sgugger, @LysandreJik and @julien-c have done it for transformers IMO.
Steps to reproduce the bug
E.g. execute the following code to see the different error messages between transformers and datasets.
OSError: patrickvonplaten/gpt2-xl is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token` or log in with `huggingface-cli login` and pass `use_auth_token=True`.
We raise the error "FileNotFoundError: can't find the dataset" mainly to follow security best practice (otherwise users could guess which private repositories users/orgs have).
We can indeed reformulate this and add the "If this is a private repository, ..." part!
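One way the reformulation could look is sketched below. This is only an illustration, not the actual datasets implementation: `PRIVATE_REPO_HINT`, `with_private_repo_hint` and `load_or_explain` are hypothetical names, and the hint text is copied from the transformers message quoted above. The original "not found" message is kept unchanged, so the error still does not reveal whether a private repository exists.

```python
# Hypothetical sketch: append the private-repo hint to a "not found" error.
# None of these names come from the datasets library.

PRIVATE_REPO_HINT = (
    "If this is a private repository, make sure to pass a token having "
    "permission to this repo with `use_auth_token` or log in with "
    "`huggingface-cli login` and pass `use_auth_token=True`."
)


def with_private_repo_hint(error: FileNotFoundError) -> FileNotFoundError:
    """Return a new FileNotFoundError whose message ends with the hint.

    The original message is preserved as-is, so the error says nothing
    new about whether the repository exists.
    """
    return FileNotFoundError(f"{error}\n{PRIVATE_REPO_HINT}")


def load_or_explain(loader, path: str):
    """Call loader(path); on FileNotFoundError, re-raise with the hint."""
    try:
        return loader(path)
    except FileNotFoundError as err:
        # Chain the original exception so the full traceback is kept.
        raise with_private_repo_hint(err) from err
```

Here `loader` stands in for whatever function resolves the dataset; wrapping it this way would leave successful loads untouched and only change the message on failure.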
Describe the bug
A pretty common behavior of an interaction between the Hub and datasets is the following.
An organization adds a dataset in private mode and wants to load it afterward.
This command then fails with:
even though the user has access to the website NewT5/dummy_data since she/he is part of the org.
We need to improve the error message here similar to how @sgugger, @LysandreJik and @julien-c have done it for transformers IMO.
Steps to reproduce the bug
E.g. execute the following code to see the different error messages between transformers and datasets.
The error message is clearer here - it gives:
Let's maybe do the same for datasets? The PR was introduced to transformers here: huggingface/transformers#15261
Expected results
Better error message
Actual results
Specify the actual results or traceback.
Environment info
datasets version: 1.18.4.dev0