Flaw in setting a per-request timeout #90

Open
sjmonson opened this issue Feb 1, 2025 · 0 comments · May be fixed by #91

sjmonson commented Feb 1, 2025

In llm-load-test, each request is given a response deadline. Requests that do not emit all of their output tokens before that deadline are filtered out of the final summary statistics.

llm-load-test/utils.py

Lines 175 to 178 in 23db634

# Only consider requests that were completed within the duration of the test for
# calculating the summary statistics on tpot, ttft, itl, tt_ack
df_test_duration = df[df["output_tokens"] == df["output_tokens_before_timeout"]]
req_completed_within_test_duration = len(df_test_duration)

If every request finishes after its deadline, df_test_duration is empty and every summary statistic computed from it is NaN:

"summary": {
    "tpot": {
      "min": NaN,
      "max": NaN,
      "median": NaN,
      "mean": NaN,
      "percentile_80": NaN,
      "percentile_90": NaN,
      "percentile_95": NaN,
      "percentile_99": NaN
    },
...
}
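
The failure can be reproduced in isolation with a few lines of pandas. This is a minimal sketch, not code from the repository: the sample values and the summary dictionary are made up for illustration, and only the filter expression is copied from utils.py.

```python
import pandas as pd

# Hypothetical data: every request was cut off before emitting all of its tokens,
# so output_tokens_before_timeout is always smaller than output_tokens.
df = pd.DataFrame(
    {
        "output_tokens": [128, 256, 64],
        "output_tokens_before_timeout": [100, 200, 10],
        "tpot": [0.021, 0.034, 0.050],
    }
)

# Same filter as in utils.py: keep only requests that finished before the deadline.
df_test_duration = df[df["output_tokens"] == df["output_tokens_before_timeout"]]
req_completed_within_test_duration = len(df_test_duration)

# Illustrative summary (not the project's actual summary code).
tpot = df_test_duration["tpot"]
summary = {
    "min": tpot.min(),
    "max": tpot.max(),
    "median": tpot.median(),
    "mean": tpot.mean(),
    "percentile_99": tpot.quantile(0.99),
}

print(req_completed_within_test_duration)
print(summary)
```

Because no row passes the filter, the count is zero and each aggregate is taken over an empty Series, which pandas reports as NaN rather than raising an error.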

We should rethink how per-request deadlines work.

@sjmonson sjmonson linked a pull request Feb 4, 2025 that will close this issue