Commit ac15518

Isotr0py authored and tlrmchlsmth committed
[Bugfix][CPU] Fix CPU embedding runner with tensor parallel (vllm-project#10394)
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
1 parent f5478b1 commit ac15518

1 file changed: 4 additions, 0 deletions

vllm/worker/cpu_embedding_model_runner.py (+4)
@@ -66,6 +66,10 @@ def execute_model(
 
         hidden_states = model_executable(**execute_model_kwargs)
 
+        # Only perform pooling in the driver worker.
+        if not self.is_driver_worker:
+            return []
+
         return [
             self.model.pooler(hidden_states=hidden_states,
                               pooling_metadata=model_input.pooling_metadata)
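
The guard added in this hunk can be illustrated in isolation. The following is a minimal sketch, not vLLM code: ToyEmbeddingRunner, forward, and pool are hypothetical names, and the doubling "model" and mean pooling are stand-ins chosen only to make the example runnable. It assumes that every rank still runs the forward pass (as the diff does, since the guard comes after model_executable is called) and that only the driver worker pools and returns a result.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyEmbeddingRunner:
    """Hypothetical stand-in for a tensor-parallel embedding runner."""

    is_driver_worker: bool

    def forward(self, tokens: List[float]) -> List[float]:
        # Stand-in for the model forward pass; every rank executes this.
        return [t * 2.0 for t in tokens]

    def pool(self, hidden_states: List[float]) -> float:
        # Toy mean pooling; a real pooler is model-specific.
        return sum(hidden_states) / len(hidden_states)

    def execute_model(self, tokens: List[float]) -> List[float]:
        hidden_states = self.forward(tokens)
        # Mirror of the guard in the diff: non-driver ranks skip pooling
        # and return an empty list.
        if not self.is_driver_worker:
            return []
        return [self.pool(hidden_states)]


driver = ToyEmbeddingRunner(is_driver_worker=True)
worker = ToyEmbeddingRunner(is_driver_worker=False)
driver_out = driver.execute_model([1.0, 2.0, 3.0])
worker_out = worker.execute_model([1.0, 2.0, 3.0])
```

In this sketch the non-driver rank does the same forward work as the driver but returns an empty list, so only one rank produces pooled output.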
