Tokenizer started throwing this warning: "Truncation was not explicitely activated but max_length is provided a specific value, please use truncation=True to explicitely truncate examples to max length. Defaulting to 'only_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you may want to check this is the right behavior."
#5397
Comments
This is because we recently upgraded the library to version v3.0.0, which has an improved tokenizers API. You can either disable the warnings or pass truncation=True when encoding.

How do you disable the warnings for this? I'm encountering the same issue, but I don't want to set truncation=True.
You can disable the warnings with:

```python
import logging
logging.basicConfig(level=logging.ERROR)
```
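If raising the root logger level is not enough (a later comment reports it is not), one option is to target the transformers loggers directly. A minimal sketch, assuming transformers v3.x emits the tokenizer warning through the standard Python logging module under the "transformers" namespace (as the `WARNING:transformers.tokenization_utils_base:` prefix in the log below suggests):

```python
import logging

# Lower the verbosity of every logger under the "transformers" namespace,
# including transformers.tokenization_utils_base, which emits the
# truncation warning. Only ERROR and above will be shown.
logging.getLogger("transformers").setLevel(logging.ERROR)
```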
I've changed the logging level and removed max_length but am still getting this warning: WARNING:transformers.tokenization_utils_base:Truncation was not explicitely activated but ...

Which version are you running? Can you try installing v3.0.2 to see if it fixes this issue?
I've tried with v3.0.2 and I'm getting the same warning messages even when I changed the logging level with the code snippet above.

@tutmoses @wise-east can you give us a self-contained code example reproducing the behavior?
I have the same question.

Update the transformers library to v3 and explicitly pass truncation=True when encoding text with the tokenizer.
Could reproduce the error with this code:
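(The commenter's snippet is not preserved in this thread. Below is a hypothetical minimal reproduction, assuming a BERT tokenizer; on v3.0.0 passing max_length without truncation triggers the same warning.)

```python
# Hypothetical reproduction, not the commenter's original code.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# max_length is given but truncation is not, which emits:
# "Truncation was not explicitely activated but max_length is provided ..."
encoded = tokenizer.encode_plus(
    "A sentence long enough to exceed the chosen maximum number of tokens",
    max_length=8,
)
```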
Fixes an issue similar to the one mentioned in huggingface/transformers#5397, starting from transformers v3.0.0.
Hello, using the following command had solved the problem:
However, since today at 15:40 (Paris time), it no longer works, and the following warning keeps popping up until Google Colab crashes:
Could you please tell me how to solve it? I also tried to deactivate truncation in the encode_plus call:
But it did not work. Thanks for your help/replies.
----------EDIT---------------
I modified my code by setting truncation=True, as suggested in this thread. It worked perfectly! From what I understood, this respects the max_length I'm passing and stops the warning from coming up.
J.
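(The modified code is not shown in the comment. A minimal sketch of what such a fix might look like, assuming encode_plus with an explicit max_length; the model name and length are placeholders:)

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Explicitly enabling truncation silences the warning and cuts the input
# down to max_length tokens (including special tokens).
encoded = tokenizer.encode_plus(
    "A sentence long enough to exceed the chosen maximum number of tokens",
    max_length=32,
    truncation=True,
    padding="max_length",  # optional: pad shorter inputs up to max_length
)
```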
truncation=True solves the problem.
Not an elegant solution, though.
Add truncation=True to the tokenizer call, e.g. tokenizer.encode_plus(..., truncation=True).

That works.
Recently, while experimenting, BertTokenizer started throwing this warning.
I know the warning asks me to provide a truncation value.
I'm asking here because the warning only started appearing this morning.