
Fix genai-perf command line for LLM model type #959

Open — wants to merge 2 commits into main
Conversation

antonaleks

Fix for issue #935.
If we run model-analyzer from the nvcr.io/nvidia/tritonserver:24.08-py3-sdk Docker container against a model with the LLM model type,
it fails with the following error:

Command:
genai-perf -m my_model -- -b 1 -u server:8001 -i grpc -f my_model-results.csv --verbose-csv --concurrency-range 64 --measurement-mode count_windows --collect-metrics --metrics-url http://server:8002 --metrics-interval 1000

Error:
2024-10-01 10:42 [INFO] genai_perf.parser:803 - Detected passthrough args: ['-b', '1', '-u', 'server:8001', '-i', 'grpc', '-f', 'my_model-results.csv', '--verbose-csv', '--concurrency-range', '64', '--measurement-mode', 'count_windows', '--collect-metrics', '--metrics-url', 'http://server:8002', '--metrics-interval', '1000']
usage: genai-perf [-h] [--version] {compare,profile} ...
genai-perf: error: argument subcommand: invalid choice: 'my_model' (choose from 'compare', 'profile')
It looks like the genai-perf command line created by model_analyzer is missing the required subcommand (genai-perf profile ...).

It seems that genai-perf has changed its CLI and now requires the profile subcommand. We could fix this by adding profile as a separate argument on line 328 of perf_analyzer.py, so it reads:

cmd = ["genai-perf", "profile", "-m", self._config.models_name()]

@nv-braf
Contributor

nv-braf commented Feb 6, 2025

Model Analyzer no longer supports LLMs (as you have noted the interface has changed). I would encourage you to use GenAI-Perf directly as the ability to both checkpoint and sweep through stimulus parameters has recently been added.
https://github.com/triton-inference-server/perf_analyzer/blob/main/genai-perf/docs/analyze.md

@antonaleks
Author

> Model Analyzer no longer supports LLMs (as you have noted the interface has changed). I would encourage you to use GenAI-Perf directly [...]

Thank you for your response! I have a few follow-up questions:

  1. Does the suggested method allow for automatic Triton configuration tuning similar to Model Analyzer? If so, is there any documentation or example on how to achieve that?

  2. If not, is there a plan to introduce a tool or service that can assist with LLM configuration tuning in the future?

  3. What was the reason for discontinuing LLM support in Model Analyzer? Was it due to a shift in focus, technical limitations, or another factor?

  4. If LLM support is no longer available, perhaps it would make sense to remove mentions of this mode from the repository to avoid confusion?

Looking forward to your insights!

@nv-braf
Contributor

nv-braf commented Feb 7, 2025

> 1. Does the suggested method allow for automatic Triton configuration tuning similar to Model Analyzer? [...]

No, GenAI-Perf does not support automatic Triton configuration tuning. Can you share what parameters you are interested in tuning?
