Support CPU mCuSeqlensM in Grouped Gemm? #34

@zfan2356

Hello Tri Dao! I sincerely apologize for taking up your time, but I am wondering whether there are any plans to support a CPU-resident mCuSeqlensM in Grouped Gemm. The reason is that the tensors DeepEP returns are always in CPU memory, and copying them to the GPU takes a significant amount of time. Thank you very much for your consideration!
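
For context, here is a minimal sketch of the kind of host-side data in question. It assumes that mCuSeqlensM holds cumulative row offsets along M (a prefix sum of per-group token counts) and that the per-expert counts arrive as a plain Python list on the CPU. The helper name `cu_seqlens_from_counts` and the sample counts are hypothetical and not part of either library.

```python
from itertools import accumulate


def cu_seqlens_from_counts(tokens_per_group):
    """Return cumulative offsets [0, c0, c0 + c1, ...] for per-group row counts.

    Assumption: this mirrors the layout expected for mCuSeqlensM, where
    group i owns rows offsets[i]:offsets[i + 1] of the packed M dimension.
    """
    if any(c < 0 for c in tokens_per_group):
        raise ValueError("token counts must be non-negative")
    # initial=0 prepends the starting offset, so the result has len + 1 entries.
    return list(accumulate(tokens_per_group, initial=0))


# Hypothetical per-expert token counts, already resident in host memory.
counts = [3, 0, 5]
offsets = cu_seqlens_from_counts(counts)
```

With CPU support, offsets like these could be passed directly; today they would first have to be copied into a GPU tensor.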
