[issue tracker] make quantization compatible with dynamo dynamic shape #9234
Comments
cc @bnellnm
There's a related issue in PyTorch, pytorch/pytorch#112883, and the comments there suggest that PyTorch will not fix it in the near future. I tested with a PyTorch nightly (2.6.0.dev20241004), and it still has this problem.
I was able to work around the problem by modifying the schemas to take `SymInt` instead of `int`; a sketch of the change follows.
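For illustration, a minimal sketch of that kind of schema change, emulated from Python with a made-up op (`demo::scale_prefix`); in a real C++ `TORCH_LIBRARY` registration the equivalent edit would be changing `int` to `SymInt` in the schema string:

```python
import torch

# Hypothetical op whose schema declares `SymInt` rather than `int`,
# so an argument derived from a dynamic dimension can stay symbolic.
torch.library.define("demo::scale_prefix", "(Tensor x, SymInt n) -> Tensor")

@torch.library.impl("demo::scale_prefix", "CompositeExplicitAutograd")
def scale_prefix(x, n):
    # scale the first n rows; narrow() handles symbolic lengths
    return torch.cat([x.narrow(0, 0, n) * 2.0,
                      x.narrow(0, n, x.shape[0] - n)])

x = torch.randn(16, 4)
torch._dynamo.mark_dynamic(x, 0)
# n comes from a dynamic dim; with a SymInt schema it is not specialized
torch.compile(lambda t: torch.ops.demo.scale_prefix(t, t.shape[0] // 2),
              fullgraph=True)(x)
```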
This also works.
I think the `ScalarType` object is a separate problem.
Yes, they are two separate problems. For dynamo dynamic shapes to understand the quantization ops, both problems need to be solved.
Issue description

Anything you want to discuss about vllm.
Here is a simple demo:
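(A minimal sketch of such a demo, with made-up op names `demo::first_n_cpp_style` and `demo::first_n_py`. The C++-side registration is emulated with `torch.library.define` and a plain `int` in the schema, which is what a `TORCH_LIBRARY` registration typically declares; the Python-side op uses `torch.library.custom_op`, which infers `SymInt` for `int` annotations.)

```python
import torch

# C++-style registration, emulated: the schema declares the integer
# argument as a plain `int`, as TORCH_LIBRARY schemas do.
torch.library.define("demo::first_n_cpp_style", "(Tensor x, int n) -> Tensor")

@torch.library.impl("demo::first_n_cpp_style", "CompositeExplicitAutograd")
def first_n_cpp_style(x, n):
    return x[:n].clone()

# Python-side registration: torch.library.custom_op infers `SymInt`
# for the `int` annotation, so the argument can stay symbolic.
@torch.library.custom_op("demo::first_n_py", mutates_args=())
def first_n_py(x: torch.Tensor, n: int) -> torch.Tensor:
    return x[:n].clone()

@first_n_py.register_fake
def _(x, n):
    return x.new_empty(n, *x.shape[1:])

x = torch.randn(16, 4)
torch._dynamo.mark_dynamic(x, 0)

# works: n = s0 // 2 stays symbolic through the SymInt schema
torch.compile(lambda t: torch.ops.demo.first_n_py(t, t.shape[0] // 2),
              fullgraph=True)(x)

# fails: the `int` schema forces dynamo to specialize n to a constant,
# guarding down the dimension that was marked dynamic
try:
    torch.compile(lambda t: torch.ops.demo.first_n_cpp_style(t, t.shape[0] // 2),
                  fullgraph=True)(x)
except Exception as e:
    print(type(e).__name__, e)
```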
When we register the custom op from the C++ side, the dynamic shape is specialized directly to an integer, and compilation fails.
When we register the custom op from the Python side, dynamic shapes work as expected.
We should change the way we register the quantization custom ops, from the C++ side to the Python side.
There's also one complicated object, `ScalarType`, which gets passed into the C++ functions:

vllm/vllm/scalar_type.py, line 15 (at commit f3a507f)
vllm/vllm/_custom_ops.py, lines 315 to 321 (at commit f3a507f)
We can use strings to represent the type, and look up the actual object to pass into the C++ function.
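A minimal sketch of the string-lookup idea, with hypothetical names (`demo::quant_gemm`, the `SCALAR_TYPES` registry, and the stand-in `ScalarType` class are all made up for illustration): the op's schema only sees a `str`, which dynamo handles fine, and the real object is resolved inside the implementation.

```python
import torch

class ScalarType:  # stand-in for vllm.scalar_type.ScalarType
    def __init__(self, size_bits: int, bias: int):
        self.size_bits, self.bias = size_bits, bias

# registry keyed by string id, so the op schema never sees the object
SCALAR_TYPES = {
    "uint4b8": ScalarType(4, 8),
    "uint8b128": ScalarType(8, 128),
}

@torch.library.custom_op("demo::quant_gemm", mutates_args=())
def quant_gemm(a: torch.Tensor, b: torch.Tensor, type_name: str) -> torch.Tensor:
    scalar_type = SCALAR_TYPES[type_name]  # look up the actual object
    # a real implementation would pass `scalar_type` into the C++ kernel;
    # a plain matmul stands in for it here
    return a @ b

@quant_gemm.register_fake
def _(a, b, type_name):
    return a.new_empty(a.shape[0], b.shape[1])

x = torch.randn(8, 16)
w = torch.randn(16, 32)
torch._dynamo.mark_dynamic(x, 0)
torch.compile(lambda a, b: torch.ops.demo.quant_gemm(a, b, "uint4b8"),
              fullgraph=True)(x, w)
```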