
Fix Max New Tokens in HF's Generation Config #257


Open

mostafaelhoushi wants to merge 1 commit into main

Conversation

mostafaelhoushi

HuggingFace's `max_length` configuration corresponds to the total length of the prompt and the generated output, while `max_new_tokens` corresponds to the length of the generated output only.

Using `args.max_length_generation` to set `max_new_tokens` fixed runtime errors for me. Using `args.max_length_generation` to set `max_length` led to runtime errors because the total length of prompt + generation would exceed the intended value.

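To make the distinction concrete, here is a minimal sketch; `gpt2` is used purely as a stand-in model (the harness evaluates whichever model is passed on the command line):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("def fibonacci(n):", return_tensors="pt")
prompt_len = inputs["input_ids"].shape[1]

# `max_length` is a *total* budget: the prompt counts against it, so at
# most (512 - prompt_len) new tokens can be produced.
out_total = model.generate(**inputs, max_length=512)
assert out_total.shape[1] <= 512

# `max_new_tokens` budgets the generation only: up to 512 new tokens,
# no matter how long the prompt is.
out_new = model.generate(**inputs, max_new_tokens=512)
assert out_new.shape[1] <= prompt_len + 512
```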

kbmlcoding commented Jul 23, 2024

Thanks for fixing it! This is the message I am seeing as well in the logs when running HumanEval against the llama2-7b-chat-hf model:

```
bigcode-evaluation-harness/bigcode_eval/utils.py:361: UserWarning: An error with the following message was thrown: Input length of input_ids is 1000, but max_length is set to 1000. This can lead to unexpected behavior. You should consider increasing max_length or, better yet, setting max_new_tokens.. Returning the input as the generation, for higher scores consider using a larger max_length
2024-07-23 11:50:32 EDT code_eval line: 74: [INFO] warnings.warn(f"An error with the following message was thrown: {e}. Returning the input as the generation, for higher scores consider using a larger max_length")
```
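For context, the warning fires because the tokenized prompt already consumes the entire `max_length` budget. A hypothetical reproduction (`gpt2` as a stand-in; the dummy prompt is fabricated only to reach 1000 tokens):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Truncate an over-long dummy prompt to exactly 1000 tokens, as in the log.
inputs = tokenizer("pass\n" * 2000, return_tensors="pt",
                   truncation=True, max_length=1000)

# Depending on the transformers version this emits the UserWarning quoted
# above, and generation returns the prompt essentially unchanged because
# the token budget is already spent.
model.generate(**inputs, max_length=1000)
```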

Adding more details for clarity, per the official API doc from HF: https://huggingface.co/docs/transformers/en/main_classes/text_generation

> max_length (int, optional, defaults to 20) — The maximum length the generated tokens can have. Corresponds to the length of the input prompt + max_new_tokens. Its effect is overridden by max_new_tokens, if also set.
>
> max_new_tokens (int, optional) — The maximum numbers of tokens to generate, ignoring the number of tokens in the prompt.
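Given those semantics, the PR's change amounts to routing `args.max_length_generation` into `max_new_tokens` instead of `max_length`. A sketch under the assumption that the harness assembles its generation kwargs in a helper like this (`build_gen_kwargs` is illustrative, not the file's literal code):

```python
def build_gen_kwargs(max_length_generation: int) -> dict:
    """Translate the CLI token budget into HF generate() kwargs."""
    # Before the fix -- long prompts exhausted the shared budget:
    #     return {"max_length": max_length_generation}
    # After the fix -- the budget counts generated tokens only:
    return {"max_new_tokens": max_length_generation}
```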

@mostafaelhoushi (Author)

Thanks @kbmlcoding for approving. I'm still unable to merge the PR. Do we need another approval?

Cc @loubnabnl
