Commit 33ba8f3

simon-mo authored and mfournioux committed
[Benchmark] Add new H100 machine (vllm-project#10547)
Signed-off-by: Maxime Fournioux <55544262+mfournioux@users.noreply.github.com>
1 parent 3496e9b commit 33ba8f3

2 files changed: +31 -21 lines changed


.buildkite/nightly-benchmarks/benchmark-pipeline.yaml

+21 -18
@@ -13,6 +13,7 @@ steps:
   - wait

   - label: "A100"
+    # skip: "use this flag to conditionally skip the benchmark step, useful for PR testing"
     agents:
       queue: A100
     plugins:
@@ -45,6 +46,7 @@ steps:
               medium: Memory

   - label: "H200"
+    # skip: "use this flag to conditionally skip the benchmark step, useful for PR testing"
     agents:
       queue: H200
     plugins:
@@ -63,21 +65,22 @@ steps:
         - VLLM_USAGE_SOURCE
         - HF_TOKEN

-
-  # - label: "H100"
-  #   agents:
-  #     queue: H100
-  #   plugins:
-  #   - docker#v5.11.0:
-  #       image: public.ecr.aws/q9t5s3a7/vllm-ci-test-repo:$BUILDKITE_COMMIT
-  #       command:
-  #       - bash
-  #       - .buildkite/nightly-benchmarks/run-benchmarks-suite.sh
-  #       mount-buildkite-agent: true
-  #       propagate-environment: true
-  #       ipc: host
-  #       gpus: all
-  #       environment:
-  #       - VLLM_USAGE_SOURCE
-  #       - HF_TOKEN
-
+  - label: "H100"
+    # skip: "use this flag to conditionally skip the benchmark step, useful for PR testing"
+    agents:
+      queue: H100
+    plugins:
+    - docker#v5.12.0:
+        image: public.ecr.aws/q9t5s3a7/vllm-ci-test-repo:$BUILDKITE_COMMIT
+        command:
+        - bash
+        - .buildkite/nightly-benchmarks/scripts/run-performance-benchmarks.sh
+        mount-buildkite-agent: true
+        propagate-environment: true
+        ipc: host
+        gpus: all # see CUDA_VISIBLE_DEVICES for actual GPUs used
+        volumes:
+        - /data/benchmark-hf-cache:/root/.cache/huggingface
+        environment:
+        - VLLM_USAGE_SOURCE
+        - HF_TOKEN
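
The `gpus: all` comment in the H100 step points at `CUDA_VISIBLE_DEVICES` as the place to look for the GPUs a run actually uses. As a rough illustration only, a script running inside the container could list the visible devices along these lines. The helper name `visible_gpus` is hypothetical and not part of this commit, and the sketch assumes the variable holds a comma-separated list of device indices or UUIDs.

```python
import os


def visible_gpus(env=None):
    """Return the device identifiers listed in CUDA_VISIBLE_DEVICES."""
    # Accept an explicit mapping so the helper can be exercised without
    # touching the real process environment.
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        # Variable unset: it expresses no restriction, so report None.
        return None
    # An empty string yields an empty list (no devices listed).
    return [d.strip() for d in raw.split(",") if d.strip()]


# Hypothetical value for illustration.
print(visible_gpus({"CUDA_VISIBLE_DEVICES": "0,1,2,3"}))  # prints ['0', '1', '2', '3']
```

Returning `None` for an unset variable keeps "no restriction" distinguishable from "no devices listed".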

.buildkite/nightly-benchmarks/scripts/convert-results-json-to-markdown.py

+10 -3
@@ -157,10 +157,17 @@ def results_to_json(latency, throughput, serving):
                                              throughput_results,
                                              serving_results)

-    # Sort all dataframes by their respective "Test name" columns
     for df in [latency_results, serving_results, throughput_results]:
-        if not df.empty:
-            df.sort_values(by="Test name", inplace=True)
+        if df.empty:
+            continue
+
+        # Sort all dataframes by their respective "Test name" columns
+        df.sort_values(by="Test name", inplace=True)
+
+        # The GPUs sometimes come in format of "GPUTYPE\nGPUTYPE\n...",
+        # we want to turn it into "8xGPUTYPE"
+        df["GPU"] = df["GPU"].apply(
+            lambda x: f"{len(x.split('\n'))}x{x.split('\n')[0]}")

     # get markdown tables
     latency_md_table = tabulate(latency_results,
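
The new `df["GPU"]` transformation collapses a newline-separated list of GPU names into a count-prefixed label. One caveat: the lambda in the diff puts `'\n'` inside an f-string replacement field, which Python versions before 3.12 reject with a `SyntaxError`. The sketch below shows an equivalent helper that does the split outside the f-string. The name `collapse_gpu_label` is hypothetical and not part of this commit.

```python
def collapse_gpu_label(raw: str) -> str:
    """Collapse a newline-separated GPU list into a count-prefixed label."""
    # Split outside the f-string so no backslash appears in a replacement
    # field; this keeps the code valid on Python versions before 3.12.
    names = raw.split("\n")
    # Like the lambda in the diff, use the first entry as the GPU type and
    # the number of entries as the count.
    return f"{len(names)}x{names[0]}"


# Hypothetical input: eight identical GPU names, one per line.
print(collapse_gpu_label("\n".join(["H100"] * 8)))  # prints 8xH100
# A single name with no newline becomes a count of one.
print(collapse_gpu_label("A100"))  # prints 1xA100
```

Assuming the column holds strings, such a helper could be passed directly to `df["GPU"].apply(...)`. As with the original lambda, a trailing newline in the input would be counted as an extra entry.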
