
Why change prompt_token_ids depending on encoder_decoder #851

Open
meanwo opened this issue Jan 18, 2024 · 0 comments

meanwo commented Jan 18, 2024

https://github.com/bentoml/OpenLLM/blob/6eb2ed5028dcaa7e6c7ba60e2ec8dc3377c353be/openllm-python/src/openllm/_runners.py#L181C1-L185

    if self.model.config.is_encoder_decoder:
      max_src_len = context_length
    else:
      max_src_len = context_length - max_new_tokens - 1
    prompt_token_ids = prompt_token_ids[-max_src_len:]

When using a decoder-only model (Llama 2) directly through Hugging Face, the length of prompt_token_ids (the input token IDs) is never changed because of max_new_tokens.

Is there a reason why you truncate prompt_token_ids based on max_new_tokens in the else branch?
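
For reference, here is a minimal standalone sketch of the truncation arithmetic as I read it (the function name truncate_prompt and the example numbers are mine; only the variable names mirror the snippet above):

    def truncate_prompt(prompt_token_ids, context_length, max_new_tokens, is_encoder_decoder):
      if is_encoder_decoder:
        # Encoder-decoder: the prompt may fill the whole context window.
        max_src_len = context_length
      else:
        # Decoder-only: the prompt budget is reduced by max_new_tokens (plus one).
        max_src_len = context_length - max_new_tokens - 1
      # Keep only the most recent max_src_len tokens of the prompt.
      return prompt_token_ids[-max_src_len:]

    prompt = list(range(5000))
    # 4096-token window, 512 new tokens: 4096 - 512 - 1 = 3583 prompt slots.
    print(len(truncate_prompt(prompt, 4096, 512, is_encoder_decoder=False)))  # 3583
    print(len(truncate_prompt(prompt, 4096, 512, is_encoder_decoder=True)))   # 4096

So with the same settings, an encoder-decoder model keeps up to the full 4096 prompt tokens, while a decoder-only model is cut to 3583.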
