Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Add defensive check for config num_labels and id2label #16709

Merged
merged 3 commits into from
Apr 13, 2022
Merged

Conversation

sgugger
Copy link
Collaborator

@sgugger sgugger commented Apr 11, 2022

What does this PR do?

As seen in #16600, there can be some unclear errors when the user tries to pass together an inconsistent num_labels and id2label. This PR addresses that with a clear error message.

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Apr 11, 2022

The documentation is not available anymore as the PR was closed or merged.

@BramVanroy
Copy link
Collaborator

BramVanroy commented Apr 12, 2022

Might I suggest to add an example in the error message? Especially for regression where num_labels=1, it might not be obvious to users what the id2label map has to look like. Suggestion:

f"You passed along `num_labels={kwargs['num_labels']}` with an incompatible ID to label map:
{kwargs['id2label']}. The id2label map should be a dictionary of ID (int) to label (str). E.g., for
num_labels=1 (regression), it should look like this: {0: "LABEL_0"},
even if you do not have any explicitly labelled data."

@sgugger
Copy link
Collaborator Author

sgugger commented Apr 12, 2022

I really don't understand why you want to pass both, as setting num_labels=1 will do exactly that.

Copy link
Member

@LysandreJik LysandreJik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, looks good to me! Thanks for working on the implementation.

@BramVanroy
Copy link
Collaborator

I really don't understand why you want to pass both, as setting num_labels=1 will do exactly that.

Exactly, but this comes back to the original issue that I posted. Passing num_labels=1, id2label=None causes issues, because id2label=None overwrites the generated id2label map. So the only thing that works is:

config = BertConfig.from_pretrained(model_name_or_path, num_labels=1, id2label=None)
config.num_labels = num_labels

or

config = BertConfig.from_pretrained(model_name_or_path, num_labels=1, id2label={0: "LABEL_0"})

Yet it is not obvious why

config = BertConfig.from_pretrained(model_name_or_path, num_labels=1, id2label=None)

doesn't work even though you don't strictly need labels for a regression problem, and even though id2label=None is the default argument.

And yes, I am aware that most users will not encounter this issue because they will not explicitly pass id2label=None, but that does not mean that it cannot happen. And if it does it should be made obvious to the user why something goes wrong. I often write code for different use-cases, and as my issue showed, you will encounter this issue if you need to write code for different num_labels/tasks dynamically. If users are not expected to write code like that, it doesn't hurt to tell them in the error message how they should write their code instead.

You are right though that my message seemed to imply that they have to provide an id2label map. Suggestion:

f"You passed along num_labels={kwargs['num_labels']} with an incompatible ID to label map:
{kwargs['id2label']}. If given (not required), the id2label map should be a dictionary of ID (int)
to label (str). Note that explicitly setting id2label to None may lead to unexpected errors.
Instead, do not pass the id2label argument at all or pass a dummy id2label with the same len
as num_labels."

@sgugger
Copy link
Collaborator Author

sgugger commented Apr 13, 2022

I adapted the error message slightly to insist on removing one of the incompatible kwarg.

@sgugger sgugger merged commit 8e0d3b4 into main Apr 13, 2022
@sgugger sgugger deleted the labels_in_config branch April 13, 2022 15:28
elusenji pushed a commit to elusenji/transformers that referenced this pull request Jun 12, 2022
…6709)

* Add defensive check for config num_labels and id2label

* Actually check value...

* Only warning inside init plus better error message
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants