Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

[CK_TILE] Support moe with up gemm #1793

Draft
wants to merge 8 commits into
base: develop
Choose a base branch
from
Draft

Conversation

huaiguxu
Copy link

@huaiguxu huaiguxu commented Jan 4, 2025

Proposed changes

Support fused MoE with up gemm.

Checklist

Please put an x into the boxes that apply. You can also fill these out after creating the PR. If you're not sure, please don't hesitate to ask.

  • I have added tests relevant to the introduced functionality, and the unit tests are passing locally
  • I have added inline documentation which enables the maintainers with understanding the motivation
  • I have removed the stale documentation which is no longer relevant after this pull request
  • (If this change is user-facing) I have added release notes which provide the end users with a brief summary of the improvement from this pull request
  • I have run clang-format on all changed files
  • Any dependent changes have been merged

Discussion

If this is a relatively large or complex change, feel free to start a discussion by explaining why you chose the solution you did and what alternatives you considered

@carlushuang
Copy link
Contributor

carlushuang commented Jan 5, 2025

Please note that 2 function call to the flatmm will result in duplicated call to buffer load matrix A, which will reduce the overall performance. This will be a temporary solution to unblock gate+up, but as a production I think need carefully consider change the flatmm inline asm block to support both gate+up and gate before merge into the mainline

typename Ts_::IndexDataType,
// ck_tile::element_wise::FastGeluAsm, //
// TODO: hardcoded
ck_tile::element_wise::Silu,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@huaiguxu
hardcode original activation to silu. If you need add new function into it, you really need to modify the template interface
cc @coderfeli

@@ -73,7 +73,7 @@ struct FusedMoeGemmPipeline_FlatmmUk
constexpr index_t smem_0 = Policy::template GetUK_0<Problem>().GetSmemSize();
constexpr index_t smem_1 = Policy::template GetUK_1<Problem>().GetSmemSize();
constexpr index_t smem_bridge =
BlockShape::Block_M0 * BlockShape::Block_N0 * sizeof(YDataType);
BlockShape::Block_M0 * BlockShape::Block_N0 * (IsGateOnly ? 1 : 2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not correct, not multiply sizeop(dtype)

@carlushuang
Copy link
Contributor

covered by this PR: #1808

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants