
Add support for custom intervals #814

Open
wants to merge 6 commits into main from tgerdes-fix-custom-intervals
Conversation

@tgerdesnv (Collaborator) commented Jan 17, 2024

Fix support for Perf Analyzer's request-intervals

Fixes #808

if self._cli_config.is_llm_model():
# The possible inference loads are concurrency, request rate, periodic concurrency, or custom (request-intervals)
# - If custom is specified, it is used
# - For LLM models, periodic concurrency is used
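The priority order described in that comment can be sketched as a small selection function. This is a hypothetical standalone sketch, not the actual model_analyzer code; the function name and the flat-argument signature are assumptions made for illustration.

```python
def select_inference_load(perf_analyzer_flags: dict, is_llm_model: bool) -> str:
    """Pick the inference load mode, mirroring the priority in the comment above."""
    # A user-supplied custom interval file always takes precedence.
    if "request-intervals" in perf_analyzer_flags:
        return "request-intervals"
    # LLM models default to periodic concurrency.
    if is_llm_model:
        return "periodic-concurrency"
    # Otherwise fall back to a plain concurrency sweep.
    return "concurrency-range"
```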
Contributor:

Rework this comment, since the LLM-specific handling has gone away.

@@ -153,3 +157,9 @@ def _set_concurrency(self, run_config: RunConfig, concurrency: int) -> RunConfig
perf_config.update_config({"concurrency-range": concurrency})

return run_config
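The hunk above follows the update_config pattern used throughout the generators. The following is a minimal dict-backed stand-in for the real PerfAnalyzerConfig (which lives in model_analyzer and has more behavior) to show what _set_concurrency does in isolation:

```python
class PerfConfigSketch:
    """Toy stand-in for PerfAnalyzerConfig; the real class does much more."""

    def __init__(self):
        self._args = {}

    def update_config(self, params: dict):
        # Merge the supplied CLI parameters into the stored arguments.
        self._args.update(params)


def set_concurrency(perf_config: PerfConfigSketch, concurrency: int) -> PerfConfigSketch:
    # Mirrors the hunk above: write the concurrency into the perf config.
    perf_config.update_config({"concurrency-range": concurrency})
    return perf_config
```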

Contributor:

This method is duplicated (it also appears in brute search). Maybe it should be a static method on ModelProfileSpec?

Collaborator (author):

Yes, this is still duplicated; I haven't cleaned it up yet. Both classes implement ConfigGeneratorInterface, so a base class with the common code could work.
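The base-class idea floated in this exchange could look roughly like the sketch below. Class and method names other than ConfigGeneratorInterface-style naming are assumptions; the real generators take RunConfig objects rather than plain dicts.

```python
from abc import ABC, abstractmethod


class BaseConfigGenerator(ABC):
    """Hypothetical shared base so both search generators reuse one helper."""

    def _set_concurrency(self, perf_config_params: dict, concurrency: int) -> dict:
        # Previously duplicated in both the brute and quick search classes.
        perf_config_params["concurrency-range"] = concurrency
        return perf_config_params

    @abstractmethod
    def get_configs(self) -> list:
        ...


class BruteSearchGenerator(BaseConfigGenerator):
    def get_configs(self) -> list:
        # Sweep a few concurrency values, reusing the inherited helper.
        return [self._set_concurrency({"batch-size": 1}, c) for c in (1, 2, 4)]
```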

@@ -511,9 +511,12 @@ def _get_next_perf_analyzer_config(

perf_analyzer_config.update_config_from_profile_config(model_name, self._config)

concurrency = self._calculate_concurrency(dimension_values)
perf_config_params = {"batch-size": 1}
Contributor:

I feel like it'd be cleaner if the PerfAnalyzerConfig() constructor initialized batch-size: 1. It seems like we always need to add this in the code.

Collaborator (author):

This is done
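The reviewer's suggestion, moving the batch-size: 1 default into the constructor so call sites stop repeating it, can be sketched as follows. This is an illustrative toy, not the real PerfAnalyzerConfig implementation:

```python
class PerfAnalyzerConfigSketch:
    """Sketch of the suggested change: default batch-size applied once here."""

    def __init__(self):
        # Callers no longer need to pass {"batch-size": 1} at every site.
        self._args = {"batch-size": 1}

    def update_config(self, params: dict):
        # Explicit values still override the default.
        self._args.update(params)
```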

(Outdated comment on model_analyzer/result/inference_load_search.py, resolved)
@tgerdesnv tgerdesnv force-pushed the tgerdes-fix-custom-intervals branch from 1b24043 to 38abbe8 Compare January 26, 2024 16:48
@tgerdesnv tgerdesnv force-pushed the tgerdes-fix-custom-intervals branch from fae1195 to 6b3a199 Compare March 30, 2024 13:40
"concurrency-range": default_concurrency,
}
default_perf_analyzer_config.update_config(perf_config_params)
if "request-intervals" not in model.perf_analyzer_flags():
Collaborator (author):
Doing a self-review: I'm wondering if this should be if not model.is_load_specified(), just like line 515.
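The self-review suggestion is to test "any load flag is set" rather than only request-intervals. A hedged sketch of such a helper; the exact set of load-controlling flags beyond request-intervals is an assumption here:

```python
def is_load_specified(perf_analyzer_flags: dict) -> bool:
    """True if the user already specified any inference-load flag (assumed set)."""
    load_flags = ("request-intervals", "concurrency-range", "request-rate-range")
    return any(flag in perf_analyzer_flags for flag in load_flags)
```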

@tgerdesnv tgerdesnv marked this pull request as ready for review March 30, 2024 21:32
Successfully merging this pull request may close these issues.

Feature idea: perf analyzer behavior control