-
-
Notifications
You must be signed in to change notification settings - Fork 6.2k
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
[Feature] Pluggable platform-specific scheduler #13161
[Feature] Pluggable platform-specific scheduler #13161
Conversation
Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
While we need more discussions about this feature in v1, I think it's ok for v0 to have it. Could you add a unit test with a dummy scheduler to 1) test the functionality, and 2) be an example? |
Signed-off-by: Yannick Schnider <Yannick.Schnider1@ibm.com>
e0beefe
to
835e97b
Compare
Thanks for your feedback @comaniac ! I have added a test which validates the functionality. |
tests/plugins/vllm_add_dummy_scheduler/vllm_add_dummy_scheduler/dummy_platform.py
Outdated
Show resolved
Hide resolved
vllm/config.py
Outdated
@@ -1338,6 +1338,7 @@ class ParallelConfig: | |||
# will be determined based on the platform. | |||
worker_cls: str = "auto" | |||
sd_worker_cls: str = "auto" | |||
scheduler_cls: str = "vllm.core.scheduler.Scheduler" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can make the type either a string (qualname) or a type directly, similar to distributed_executor_backend
that can directly be a class object. then you can easily test it, similar to https://github.com/vllm-project/vllm/blob/main/tests/engine/test_executor.py
vllm/engine/llm_engine.py
Outdated
@@ -346,6 +346,8 @@ def get_tokenizer_for_seq(sequence: Sequence) -> AnyTokenizer: | |||
# Create the scheduler. | |||
# NOTE: the cache_config here have been updated with the numbers of | |||
# GPU and CPU blocks, which are profiled in the distributed executor. | |||
Scheduler = resolve_obj_by_qualname( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it is a string, use resolve_obj_by_qualname
.
if not, it should be directly used as a class (and maybe assert it inherits from a base class?)
Signed-off-by: Yannick Schnider <Yannick.Schnider1@ibm.com>
…actoring allow scheduler_cls to be string or class directly, update testing
Thanks for your feedback @youkaichao and @comaniac . I addressed it, please review again |
enforce_eager=True, # reduce test time | ||
) | ||
vllm_config = engine_args.create_engine_config() | ||
vllm_config.parallel_config.scheduler_cls = DummyScheduler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should have a top-level argument for it, e.g. from --scheduler_cls "mod.name"
or EngineArgs(scheduler_cls="mod.name")
vllm/config.py
Outdated
@@ -1338,6 +1338,7 @@ class ParallelConfig: | |||
# will be determined based on the platform. | |||
worker_cls: str = "auto" | |||
sd_worker_cls: str = "auto" | |||
scheduler_cls: Union[str, Type[object]] = "vllm.core.scheduler.Scheduler" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add it in the SchedulerConfig
?
Signed-off-by: Yannick Schnider <Yannick.Schnider1@ibm.com>
Signed-off-by: Yannick Schnider <Yannick.Schnider1@ibm.com>
Signed-off-by: Yannick Schnider <Yannick.Schnider1@ibm.com>
…-argument move scheduler_cls to SchedulerConfig and add it to EngineArgs
This pull request has merge conflicts that must be resolved before it can be |
thanks for reviewing again @youkaichao , @comaniac . I have incorporated your feedback. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for the contribution!
Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com> Signed-off-by: Yannick Schnider <Yannick.Schnider1@ibm.com>
Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com> Signed-off-by: Yannick Schnider <Yannick.Schnider1@ibm.com>
Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com> Signed-off-by: Yannick Schnider <Yannick.Schnider1@ibm.com>
Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com> Signed-off-by: Yannick Schnider <Yannick.Schnider1@ibm.com> Signed-off-by: Linkun Chen <github@lkchen.net>
Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com> Signed-off-by: Yannick Schnider <Yannick.Schnider1@ibm.com> Signed-off-by: saeediy <saidakbarp@gmail.com>
This PR enables different platforms to plugin their (hardware) specific scheduler class in vLLM version V0 and therefore addresses the scheduler part of this RFC #11162. A pluggable scheduler is needed to add support for the IBM Spyre accelerator #9652.
Note that this feature is under development for V1 but missing in the current V0.
Changes
scheduler_cls
to classSchedulerConfig
invllm/config.py
. Attribute can be a string containing the path to the scheduler class or the class itself.scheduler_cls
toEngineArgs
andadd_cli_args()
argparserscheduler_cls
invllm/engine/llm_engine.py
tests/plugins_tests/test_scheduler_plugins.py
and appended it to the test pipeline (.buildkite/test-pipeline.yaml
)Remarks
scheduler_cls
is initialized with "vllm.core.scheduler.Scheduler", the default behaviour does not change. However, the change allows to point to a (hardware) specific scheduler class in platforms.worker_cls
to have minimal changes (Otherwise I would have to overwrite "auto" for all platforms using the default scheduler).