
[Usage]: Sampling several sequences from OpenAI compatible server. #10852

Ignoramus0817 opened this issue Dec 3, 2024 · 5 comments

Comments

@Ignoramus0817

Ignoramus0817 commented Dec 3, 2024

Your current environment

I got an error when running collect_env.py:
ImportError: cannot import name '__version_tuple__' from 'vllm'

Anyway, I'm using vllm 0.5.3.post1.

How would you like to use vllm

I want to sample n independent completions in a single chat completion API call from LLaMA-3-70B-Instruct (served with the OpenAI-compatible server).

I added the sampling parameter n to the API call, but got n identical responses whether or not a seed was set. However, if I manually call the API several times, each time with a different seed, I do get different outputs. Is this the expected behavior? If so, what does the sampling parameter n actually do? A sketch of the call is below.
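Roughly, the call I'm making looks like this (a sketch; the server URL, API key, and prompt are placeholders for my local setup):

```python
from openai import OpenAI

# vLLM's OpenAI-compatible server, started locally (placeholder URL/key)
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-70B-Instruct",
    messages=[{"role": "user", "content": "Name three uses of sampling."}],
    n=4,      # expect 4 independent completions, but all come back identical
    seed=42,  # same behavior with or without a seed
)
for i, choice in enumerate(response.choices):
    print(f"--- choice {i} ---")
    print(choice.message.content)
```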

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
@Ignoramus0817 Ignoramus0817 added the usage How to use vllm label Dec 3, 2024
@jikunshang
Contributor

#10503 — this model should be supported as of last week. Please install the latest vllm and try again.

@Ignoramus0817
Author

Ignoramus0817 commented Dec 4, 2024

> #10503 — this model should be supported as of last week. Please install the latest vllm and try again.

That PR is about the OLMo model, which is irrelevant to this issue. I guess you replied to the wrong one?

@jikunshang
Contributor

> > #10503 — this model should be supported as of last week. Please install the latest vllm and try again.
>
> That PR is about the OLMo model, which is irrelevant to this issue. I guess you replied to the wrong one?

Sorry, please ignore this; I replied to the wrong thread.

@jikunshang
Contributor

n means "How many chat completion choices to generate for each input message"; see https://platform.openai.com/docs/api-reference/chat and https://github.com/vllm-project/vllm/blob/main/vllm/sampling_params.py#L99.
How are you sampling n independent samples? Can you try with examples/openai_completion_client.py or examples/openai_chat_completion_client.py (you need to add the n parameter)? I tested both; when n is set to 2, the two outputs are different. A sketch of the test is below.
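For reference, a minimal sketch of that test, based on examples/openai_chat_completion_client.py with n=2 added (the prompt here is a placeholder; the shipped example's wording may differ):

```python
from openai import OpenAI

openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(api_key=openai_api_key, base_url=openai_api_base)

# Use whatever model the server is serving
models = client.models.list()
model = models.data[0].id

chat_completion = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a joke."},
    ],
    model=model,
    n=2,  # two independent completions; with temperature > 0 they should differ
)
for i, choice in enumerate(chat_completion.choices):
    print(f"--- choice {i} ---")
    print(choice.message.content)
```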


github-actions bot commented Mar 5, 2025

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!

@github-actions github-actions bot added the stale Over 90 days of inactivity label Mar 5, 2025