Add support for custom intervals #814
base: main
Conversation
model_analyzer/config/generate/brute_plus_binary_search_run_config_generator.py
model_analyzer/config/generate/quick_plus_concurrency_sweep_run_config_generator.py
if self._cli_config.is_llm_model():
    # The possible inference loads are concurrency, request rate, periodic concurrency, or custom (request-intervals)
    # - If custom is specified, it is used
    # - For LLM models, periodic concurrency is used
Rework this comment, as the LLM path has gone away.
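A minimal sketch of the load-selection precedence the quoted comment describes, with the LLM-specific branch dropped per the review note. The function name and flag-dictionary shape are assumptions for illustration; the flag names follow Perf Analyzer's CLI options.

```python
def select_inference_load(flags: dict) -> str:
    """Pick which inference-load option to use, by precedence (sketch)."""
    # Custom request intervals take priority when specified.
    if "request-intervals" in flags:
        return "request-intervals"
    # Otherwise fall back to an explicit request rate, if given.
    if "request-rate-range" in flags:
        return "request-rate-range"
    # Default to a concurrency sweep.
    return "concurrency-range"
```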
model_analyzer/config/generate/perf_analyzer_config_generator.py
@@ -153,3 +157,9 @@ def _set_concurrency(self, run_config: RunConfig, concurrency: int) -> RunConfig
        perf_config.update_config({"concurrency-range": concurrency})

        return run_config
This method is duplicated (also in brute search). Maybe this should be a static method in ModelProfileSpec?
Yes, this is still duplicated; I haven't cleaned it up yet. Both classes implement ConfigGeneratorInterface, so you could create a base class with the common code if you want.
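A sketch of the base-class idea discussed above: hoist the duplicated _set_concurrency helper into a shared parent of both generators. PerfConfig, RunConfig, and the generator class names here are simplified stand-ins for the Model Analyzer types, not the library's actual API.

```python
class PerfConfig:
    """Stand-in for the per-run perf analyzer config (assumption)."""

    def __init__(self):
        self.params = {}

    def update_config(self, params):
        self.params.update(params)


class RunConfig:
    """Stand-in holding one or more perf configs (assumption)."""

    def __init__(self, perf_configs):
        self._perf_configs = perf_configs

    def perf_configs(self):
        return self._perf_configs


class BaseRunConfigGenerator:
    """Shared home for the helper currently duplicated in both generators."""

    def _set_concurrency(self, run_config, concurrency):
        for perf_config in run_config.perf_configs():
            perf_config.update_config({"concurrency-range": concurrency})
        return run_config


class BruteRunConfigGenerator(BaseRunConfigGenerator):
    pass


class QuickRunConfigGenerator(BaseRunConfigGenerator):
    pass
```

Both subclasses then inherit a single implementation, so a fix in one place applies to both search strategies.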
@@ -511,9 +511,12 @@ def _get_next_perf_analyzer_config(

        perf_analyzer_config.update_config_from_profile_config(model_name, self._config)

        concurrency = self._calculate_concurrency(dimension_values)
        perf_config_params = {"batch-size": 1}
I feel like it'd be cleaner if the PerfAnalyzerConfig() constructor initialized batch-size: 1. It seems like we always need to add this in the code.
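A sketch of that suggestion: seed the default in the constructor so call sites no longer repeat {"batch-size": 1}. This is a simplified stand-in for PerfAnalyzerConfig, not the class's real internals.

```python
class PerfAnalyzerConfig:
    """Simplified stand-in: defaults batch-size to 1 at construction."""

    def __init__(self):
        # Default applied once here instead of at every call site.
        self._args = {"batch-size": 1}

    def update_config(self, params):
        self._args.update(params)

    def __getitem__(self, key):
        return self._args[key]
```

Callers can still override batch-size via update_config when a sweep needs a different value; the constructor only sets the common default.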
This is done in 1b24043.
            "concurrency-range": default_concurrency,
        }
        default_perf_analyzer_config.update_config(perf_config_params)
        if not "request-intervals" in model.perf_analyzer_flags():
Doing a self-review: I'm wondering if this should be if not model.is_load_specified(), just like line 515.
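A sketch of what that self-review suggestion would look like: a single is_load_specified() predicate that covers every load flag, instead of checking only request-intervals. ModelProfileSpec here is a simplified stand-in, and the exact set of load flags is an assumption for illustration.

```python
class ModelProfileSpec:
    """Simplified stand-in for the model spec (assumption)."""

    def __init__(self, perf_analyzer_flags):
        self._flags = perf_analyzer_flags

    def perf_analyzer_flags(self):
        return self._flags

    def is_load_specified(self):
        # True when any inference-load option was already set by the user
        # (assumed flag set; the real predicate may differ).
        load_flags = ("request-intervals", "concurrency-range", "request-rate-range")
        return any(f in self._flags for f in load_flags)


def needs_default_concurrency(model):
    # Broader than `if not "request-intervals" in model.perf_analyzer_flags()`:
    # skips the default sweep whenever any load flag is present.
    return not model.is_load_specified()
```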
Fix support for Perf Analyzer's request-intervals
Fixes #808