[Bug]: Could not run '_C::rms_norm' with arguments from the 'CUDA' backend. #12441
Comments
Duplicate of #12440
NotImplementedError: Could not run '_C::rms_norm' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. '_C::rms_norm' is only available for these backends: [HIP, Meta, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradMPS, AutogradXPU, AutogradHPU, AutogradLazy, AutogradMeta, Tracer, AutocastCPU, AutocastXPU, AutocastCUDA, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].
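The backend list is the telling part: the op is registered only for HIP, not CUDA, which usually points to a mismatch between the compiled vLLM extension and the installed torch (for example a ROCm build of vLLM running against a CUDA torch, or vLLM built against a different torch than the one now installed). A minimal diagnostic sketch, assuming `torch` and `vllm` import cleanly; it only prints standard version attributes:

```python
# Quick environment check: the error above means torch dispatched to the CUDA
# backend, but the compiled vLLM extension only registered a HIP (ROCm) kernel.
# Nothing here is vLLM-specific beyond importing the package.
import platform

import torch
import vllm

print("vLLM version :", vllm.__version__)
print("torch version:", torch.__version__)
print("CUDA build   :", torch.version.cuda)   # None on a ROCm/CPU-only torch build
print("HIP build    :", torch.version.hip)    # None on a CUDA torch build
print("CUDA runtime :", torch.cuda.is_available())
print("machine      :", platform.machine())

# If torch.version.cuda is set but the installed vLLM binary was built for ROCm,
# or was compiled against a different torch, '_C::rms_norm' will only be
# registered for the HIP dispatch key and the CUDA call fails as shown above.
```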
Same problem here.
It is because of the …
As the error message indicates, a Meta employee may have the solution.
Am I right in saying that this is an environment issue caused by installing incompatible versions of vLLM and torch?
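If that is the suspicion, one quick check is to compare the installed torch against the torch requirement declared by the installed vLLM wheel. A sketch using only package metadata; it assumes both packages were installed with pip and that the `packaging` library is available (it usually ships alongside pip), and it only catches declared version mismatches, not ABI breakage from rebuilding torch after vLLM was compiled:

```python
# Compare the torch version vLLM's metadata requires with what is installed.
from importlib.metadata import requires, version

from packaging.requirements import Requirement

installed_torch = version("torch")
torch_reqs = [
    Requirement(r) for r in (requires("vllm") or [])
    if Requirement(r).name == "torch"
]

print("installed torch:", installed_torch)
for req in torch_reqs:
    ok = req.specifier.contains(installed_torch, prereleases=True)
    print(f"vllm requires {req}: {'OK' if ok else 'MISMATCH'}")
```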
Your email has been received!
I compiled vLLM on an ARM architecture using torch 2.7, and later reverted to torch 2.6 due to compatibility issues. Will this problem also occur there?
Yes, at some point we will be updating the torch version. See #12721 for the upgrade to 2.6.
Thank you, I successfully ran it with torch 2.6, but I have two questions and hope you can help me answer them:
I believe so, yes: xFormers installs its own version of FlashAttention.
Anything that changes the order of floating-point operations will affect the inference results in some way. Sometimes it's not noticeable, sometimes it is.
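For illustration, a small self-contained example (not vLLM-specific) showing that merely reordering floating-point additions changes the result:

```python
import random

# Summing the same numbers in a different order gives a slightly different
# result because floating-point addition is not associative. Different
# attention backends (e.g. the xFormers FlashAttention build vs. another)
# reorder reductions like this internally, which is enough to perturb logits.
random.seed(0)
values = [random.uniform(-1.0, 1.0) for _ in range(100_000)]

forward_sum = sum(values)
reverse_sum = sum(reversed(values))

print(forward_sum)
print(reverse_sum)
print("difference:", forward_sum - reverse_sum)  # typically small but non-zero
```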
Thank you very much.
Your current environment
env (collapsed output omitted)
Model Input Dumps
No response
🐛 Describe the bug
error log (collapsed shell output omitted)