[V1][PP] Fix & Pin Ray version in requirements-cuda.txt #13436
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
I think it's fine, as long as vLLM does not directly use cupy.
@youkaichao It seems to use cupy-cu12. However, IIUC, it doesn't break anything on our cu11.8 build unless the user explicitly chooses Ray?
Sounds good. Then it's a Ray-related issue whether they want to support CUDA 11.8. We can go ahead with
ray[adag] uses cupy-cuda12x. BTW, there is an issue in Ray 2.42 that is being fixed. After that, we can upgrade to the latest version with a small API change.
…#13436) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
The issue with pinning to a specific Ray version is that anyone with a long-running cluster will not be able to upgrade vLLM services unless they upgrade the entire Ray cluster. Could we instead look at a range specifier (i.e.
Hi @darthhexx, that makes sense. This is a short-term fix, and the plan is to support a Ray version range.
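To make the difference between a pin and a range concrete, here is a minimal sketch in plain Python. The range bounds (`>=2.40.0,<2.42.0`) are an assumption derived from the versions mentioned in the PR description, not a range the maintainers have committed to, and the helper names (`parse_version`, `satisfies_pin`, `satisfies_range`) are hypothetical.

```python
# Hypothetical helpers: compare an exact pin with a version range.
# The bounds below are an assumption based on the PR description
# (2.40.0 and 2.41.0 work; 2.42.0 changed the API).

def parse_version(text: str) -> tuple:
    """Turn a plain 'X.Y.Z' string into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split("."))


LOWER_INCLUSIVE = parse_version("2.40.0")
UPPER_EXCLUSIVE = parse_version("2.42.0")


def satisfies_pin(version: str, pinned: str = "2.41.0") -> bool:
    """An exact pin accepts only one version."""
    return parse_version(version) == parse_version(pinned)


def satisfies_range(version: str) -> bool:
    """A range accepts every version between the two bounds."""
    return LOWER_INCLUSIVE <= parse_version(version) < UPPER_EXCLUSIVE


if __name__ == "__main__":
    for candidate in ("2.40.0", "2.41.0", "2.42.0"):
        print(candidate, satisfies_pin(candidate), satisfies_range(candidate))
```

With a pin, a cluster already running 2.40.0 would have to move to 2.41.0 before upgrading; with a range like the assumed one, it would not.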
…#13436) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> Signed-off-by: Linkun Chen <github@lkchen.net>
…#13436) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> Signed-off-by: saeediy <saidakbarp@gmail.com>
Pipeline parallelism in V1 requires `ray[adag]` instead of `ray[default]`. Also, because of the API changes in 2.42.0, we have to pin the version to `2.41.0` (or 2.40.0).

NOTE: Importantly, having `ray[adag]` will add CuPy (cu12) as a dependency. Since PP is not used for all models, we can consider keeping `ray[adag]` as an optional dependency if it's not acceptable.
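If `ray[adag]` were kept optional, one possible approach is to check for the extra packages only when pipeline parallelism is actually requested. The sketch below is an illustration, not vLLM's code: `check_pp_dependencies`, `has_module`, and the `pipeline_parallel_size` parameter are hypothetical names, and the only library call used is the standard `importlib.util.find_spec`.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def check_pp_dependencies(pipeline_parallel_size: int, available=has_module) -> None:
    """Hypothetical guard: require Ray and CuPy only when PP is enabled."""
    if pipeline_parallel_size <= 1:
        # No pipeline parallelism, so the optional packages are not needed.
        return
    missing = [name for name in ("ray", "cupy") if not available(name)]
    if missing:
        raise ImportError(
            "Pipeline parallelism needs " + ", ".join(missing)
            + "; try: pip install 'ray[adag]==2.41.0'"
        )


if __name__ == "__main__":
    check_pp_dependencies(1)  # always passes: PP is off
    try:
        check_pp_dependencies(2)
        print("PP dependencies found")
    except ImportError as exc:
        print(exc)
```

The `available` parameter exists only so the check can be exercised without installing or removing packages.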