
[Hardware][Ascend] Add silu_and_mul/rope; Add mix ops into attention layer #18

Merged: 3 commits merged into vllm-project:develop on Feb 8, 2025

Conversation

whx-sjtu (Contributor) commented on Feb 7, 2025

Add silu_and_mul and rope ops.
Replace the original ops in the attention implementation with three mixed ops for better performance: reshape_and_cache, pagedattention, and selfattention.
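For context, silu_and_mul fuses the SiLU activation with an elementwise multiply over the two halves of the gate/up projection output. Below is a minimal reference sketch of the expected semantics in plain PyTorch; the function name and layout are assumptions for illustration, not the Ascend kernel added by this PR.

```python
import torch
import torch.nn.functional as F


def silu_and_mul_reference(x: torch.Tensor) -> torch.Tensor:
    # Assumed layout: the last dimension holds [gate | up], each of size d.
    # Returns SiLU(gate) * up with shape [..., d].
    d = x.shape[-1] // 2
    gate, up = x[..., :d], x[..., d:]
    return F.silu(gate) * up
```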

hw_whx added 2 commits February 7, 2025 12:40
Signed-off-by: hw_whx <wanghexiang7@huawei.com>
…ease

Signed-off-by: hw_whx <wanghexiang7@huawei.com>
from vllm.model_executor.layers.rotary_embedding import RotaryEmbedding


def rope_forward_oot(
what does oot mean?

whx-sjtu (Contributor, Author) replied:

Out of tree, meaning the op is implemented by a plugin backend for vLLM (such as this Ascend plugin) rather than in the vLLM core repository.
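For illustration, vLLM dispatches custom ops to a platform-specific forward method, and an out-of-tree backend can attach its own rope implementation to RotaryEmbedding. A minimal sketch of that pattern follows; the forward_oot hook name and the torch_npu kernel call are assumptions for this sketch, not the exact code merged in this PR.

```python
from typing import Optional, Tuple

import torch

from vllm.model_executor.layers.rotary_embedding import RotaryEmbedding


def rope_forward_oot(
    self,
    positions: torch.Tensor,
    query: torch.Tensor,
    key: torch.Tensor,
    offsets: Optional[torch.Tensor] = None,
) -> Tuple[torch.Tensor, torch.Tensor]:
    # Sketch: route rotary embedding to an NPU kernel instead of the in-tree
    # CUDA/native path. `torch_npu._npu_rotary_embedding` is an assumed name
    # used here for illustration only.
    import torch_npu

    query = query.contiguous()
    key = key.contiguous()
    torch_npu._npu_rotary_embedding(
        positions,
        query,
        key,
        self.head_size,
        self.cos_sin_cache,
        self.is_neox_style,
    )
    return query, key


# Attach the out-of-tree implementation so vLLM's custom-op dispatch can pick
# it up on the plugin platform (hook name assumed for this sketch).
RotaryEmbedding.forward_oot = rope_forward_oot
```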

Signed-off-by: hw_whx <wanghexiang7@huawei.com>
wuhuikx commented on Feb 8, 2025

lgtm

@ganyi1996ppo ganyi1996ppo merged commit 49e5baf into vllm-project:develop Feb 8, 2025
1 check passed
@whx-sjtu whx-sjtu deleted the add_four_mixops branch February 11, 2025 02:51