
[Usage]: Sampling several sequences from OpenAI compatible server. #10852

Ignoramus0817 opened this issue Dec 3, 2024 · 5 comments

Comments

@Ignoramus0817

Ignoramus0817 commented Dec 3, 2024

Your current environment

I got an error when running collect_env.py:
ImportError: cannot import name '__version_tuple__' from 'vllm'

Anyway, I'm using vllm 0.5.3.post1.

How would you like to use vllm

I want to sample n independent completions in a single chat completion API call from LLaMA-3-70B-Instruct (served with the OpenAI-compatible server).

I added the sampling parameter n to the API call, but got n identical responses whether or not a seed was set. However, if I manually call the API several times, each time with a different seed, I do get different outputs. Is this the expected behavior? If so, what does the sampling parameter n actually do? A sketch of the call is below.
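Roughly, the call I'm making looks like this (a sketch; the server URL, API key, and prompt are placeholders for my local setup):

```python
from openai import OpenAI

# vLLM's OpenAI-compatible server, started locally (placeholder URL/key)
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-70B-Instruct",
    messages=[{"role": "user", "content": "Name three uses of sampling."}],
    n=4,      # expect 4 independent completions, but all come back identical
    seed=42,  # same behavior with or without a seed
)
for i, choice in enumerate(response.choices):
    print(f"--- choice {i} ---")
    print(choice.message.content)
```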

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
@Ignoramus0817 Ignoramus0817 added the usage How to use vllm label Dec 3, 2024
@jikunshang
Contributor

#10503 — this model should be supported as of last week. Please install the latest vllm and try again.

@Ignoramus0817
Author

Ignoramus0817 commented Dec 4, 2024

> #10503 — this model should be supported as of last week. Please install the latest vllm and try again.

That PR is about the OLMo model, which is irrelevant to this issue. I guess you replied to the wrong one?

@jikunshang
Contributor

> > #10503 — this model should be supported as of last week. Please install the latest vllm and try again.
>
> That PR is about the OLMo model, which is irrelevant to this issue. I guess you replied to the wrong one?

Sorry, please ignore this; I replied to the wrong thread.

@jikunshang
Contributor

n means "How many chat completion choices to generate for each input message"; see https://platform.openai.com/docs/api-reference/chat and https://github.com/vllm-project/vllm/blob/main/vllm/sampling_params.py#L99.
How are you sampling n independent samples? Can you try with examples/openai_completion_client.py or examples/openai_chat_completion_client.py (you need to add the n parameter)? I tested both; when n is set to 2, the two outputs are different. A sketch of the test is below.
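For reference, a minimal sketch of that test, based on examples/openai_chat_completion_client.py with n=2 added (the prompt here is a placeholder; the shipped example's wording may differ):

```python
from openai import OpenAI

openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(api_key=openai_api_key, base_url=openai_api_base)

# Use whatever model the server is serving
models = client.models.list()
model = models.data[0].id

chat_completion = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a joke."},
    ],
    model=model,
    n=2,  # two independent completions; with temperature > 0 they should differ
)
for i, choice in enumerate(chat_completion.choices):
    print(f"--- choice {i} ---")
    print(choice.message.content)
```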


github-actions bot commented Mar 5, 2025

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!

@github-actions github-actions bot added the stale Over 90 days of inactivity label Mar 5, 2025