Integrate SonicMoE kernels #1001

@hemildesai

Description

https://github.com/Dao-AILab/sonic-moe/

  1. The current SonicMoE API can only be used without expert parallelism (EP), so integrate it for the non-EP paths.

  2. Investigate whether a grouped GEMM from SonicMoE could be used with EP. For best performance, the grouped GEMM function would need to implement the following:

  • gemm_gated, followed by router-weight multiplication in the epilogue of the down-projection GEMM, similar to gemm_dgated
  • A custom backward pass for the two GEMMs

  Constraints: tokens are already permuted for EP, so only the grouped GEMM portion of SonicMoE is needed, but that API is not yet available.
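
For reference, here is a minimal, dependency-free sketch of the forward semantics that such a fused grouped GEMM would need to reproduce. It is not SonicMoE code and does not use its API. The function name `grouped_gated_mlp_reference`, the argument layout, and the choice of SwiGLU (`silu(gate) * up`) as the gating function are all assumptions made for illustration. Tokens are assumed to be already permuted so that each expert's rows are contiguous, with `group_sizes` giving the row count per expert. The custom backward pass is not covered.

```python
import math


def matmul(a, b):
    """Plain row-major matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def silu(v):
    """SiLU activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def grouped_gated_mlp_reference(tokens, group_sizes, w_gate, w_up, w_down, router_weights):
    """Hypothetical reference for the fused path (assumes SwiGLU gating).

    tokens:         [num_tokens][hidden], already sorted by expert
    group_sizes:    number of token rows owned by each expert
    w_gate, w_up:   per-expert [hidden][intermediate] matrices
    w_down:         per-expert [intermediate][hidden] matrices
    router_weights: one scalar per token row, in the same permuted order
    """
    assert sum(group_sizes) == len(tokens) == len(router_weights)
    out = []
    start = 0
    for expert, count in enumerate(group_sizes):
        rows = tokens[start:start + count]
        # Gated up projection (what a gated GEMM would fuse into one kernel).
        gate = matmul(rows, w_gate[expert])
        up = matmul(rows, w_up[expert])
        hidden = [[silu(g) * u for g, u in zip(g_row, u_row)]
                  for g_row, u_row in zip(gate, up)]
        # Down projection, with the router weight applied as an "epilogue".
        down = matmul(hidden, w_down[expert])
        for offset, row in enumerate(down):
            scale = router_weights[start + offset]
            out.append([scale * v for v in row])
        start += count
    return out
```

A fused kernel could be checked against a reference like this on small inputs, for example three tokens split 2/1 across two experts, before comparing performance.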

Labels

enhancement (New feature or request)
