Use a finer exception when local_files_only=True and a file is missing in cache #979
Comments
i saw this being discussed with a community member in transformers, mind linking the convo here? |
I haven't seen it discussed anywhere personally, this stems from this PR (bug reported by Stas on slack) and I was pointing at it on this comment. |
yes that's what i meant, thanks! |
cc @Wauplin if you have an opinion on the error handling currently done in |
Request seems legitimate :) By quickly looking at the code, error handling already looks good with the refined HTTPError ( I see 2 cases where we should keep the base

    raise ValueError("We have no connection or you passed local_files_only, so force_download is not an accepted option.")
    raise ValueError(f"Invalid repo type: {repo_type}. Accepted repo types are: {str(REPO_TYPES)}")

For the 2 other cases, it's definitely the case you described where the entry is not found locally.

    raise ValueError("Cannot find the requested files in the disk cache and outgoing traffic has been disabled. To enable hf.co look-ups and downloads online, set 'local_files_only' to False.")
    raise ValueError("Connection error, and we cannot find the requested files in the disk cache. Please try again or make sure your Internet connection is on.")

From a clean state, my first instinct would be to create 3 exceptions:

    class EntryNotFoundError(Exception):
        (...)
    class LocalEntryNotFoundError(EntryNotFoundError):
        (...)
    class DistantEntryNotFoundError(EntryNotFoundError, HTTPError):
        (...)

But since @sgugger how would you like it ? |
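For illustration, the three-exception idea above could look roughly like the sketch below. This is only a sketch under assumptions: `HTTPError` is a local stand-in for the HTTP error class (so the snippet has no dependencies), the docstrings are invented, and making `LocalEntryNotFoundError` also inherit from `ValueError` is an assumption added here so that existing `except ValueError` handlers would keep working.

```python
class HTTPError(OSError):
    """Stand-in for an HTTP error class (assumption, keeps the sketch dependency-free)."""


class EntryNotFoundError(Exception):
    """A requested file could not be found, either locally or remotely."""


class LocalEntryNotFoundError(EntryNotFoundError, ValueError):
    """File is missing from the disk cache and the network cannot be used.

    Inheriting from ValueError is an assumption made for backward
    compatibility with callers that currently catch ValueError.
    """


class DistantEntryNotFoundError(EntryNotFoundError, HTTPError):
    """File is missing from the remote repo."""


# Callers can catch the shared base class or the specific subclass.
try:
    raise LocalEntryNotFoundError("Cannot find the requested files in the disk cache.")
except EntryNotFoundError as err:
    caught = type(err).__name__
```

With a hierarchy like this, code that only cares about "file not found" catches `EntryNotFoundError`, while the two remaining `ValueError` cases (bad `force_download` combination, invalid repo type) stay distinguishable.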
@julien-c @LysandreJik If you are strongly opinionated against the |
no opinion on my side! |
In Transformers we sometimes try to access files that are not in a repo, catch the error (EntryNotFoundError for distant repos) and try a different file. For instance, most models have a single weight file named pytorch_model.bin, but some models have several checkpoint files and an index since they are extremely big. For those, there is no pytorch_model.bin but a pytorch_model.bin.index.json.

Therefore, the logic in from_pretrained is to look at pytorch_model.bin first, and if it's not there, at pytorch_model.bin.index.json. Now when we have an internet connection, we can catch the EntryNotFoundError and all is fine. When there is no internet or the user decided to activate the offline mode however, hf_hub_download raises a ValueError, but it also raises a ValueError in many different situations, so we need to catch it and match the error message in Transformers, which is not very clean and very prone to breaking in the future.

It would be much nicer if the error raised was more specific, like a FileNotFoundError or any subclass of ValueError that would only be raised in this specific situation.
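To make the request concrete, here is a minimal sketch of how the fallback described above could look once a dedicated exception exists. Everything except the two file names is hypothetical: `fetch_from_cache` is a stand-in for an offline `hf_hub_download` call, `resolve_weights` is an invented helper, and the exception classes are assumptions rather than the library's actual API.

```python
class EntryNotFoundError(Exception):
    """Hypothetical base error: the requested file does not exist."""


class LocalEntryNotFoundError(EntryNotFoundError, ValueError):
    """Hypothetical error: file missing from the cache while offline."""


def fetch_from_cache(filename, cached_files):
    """Stand-in for an offline download call: look the file up in a dict cache."""
    if filename not in cached_files:
        raise LocalEntryNotFoundError(f"{filename} not found in the disk cache")
    return cached_files[filename]


def resolve_weights(cached_files):
    """Try the single weight file first, then fall back to the sharded index."""
    try:
        return fetch_from_cache("pytorch_model.bin", cached_files)
    except EntryNotFoundError:
        # No string matching on the error message is needed here.
        return fetch_from_cache("pytorch_model.bin.index.json", cached_files)
```

The same `except EntryNotFoundError` clause would then cover both the online and the offline case, and unrelated `ValueError`s would no longer be swallowed by the fallback.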