associative ssm op order #5

@kpich

I might be wrong, but in the colab/blog, I think the $\oplus$ op used for the associative scan in the selective state space model should have $a_2 a_1$ (rather than $a_1 a_2$) as the first component of its output. The leftmost element of the sequence, $a_1$, is the $A$ transform that gets applied first, so it belongs on the right of the product.
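A short derivation of the ordering, as a sketch under the assumption that the scan implements a first-order recurrence of the form $h_t = a_t h_{t-1} + b_t$ (the names $h_t$ and $b_t$ are notation introduced here for illustration and may not match the blog's symbols):

$$
\begin{aligned}
h_1 &= a_1 h_0 + b_1 \\
h_2 &= a_2 h_1 + b_2 = a_2 a_1 h_0 + a_2 b_1 + b_2 \\
(a_1, b_1) \oplus (a_2, b_2) &= (a_2 a_1,\; a_2 b_1 + b_2)
\end{aligned}
$$

Under that assumption, the product in the first component is $a_2 a_1$ whenever the $a_t$ are general matrices; the two orders coincide only when $a_1$ and $a_2$ commute.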

(Since the Mamba matrices are diagonal and therefore commute, it doesn't actually matter here, I guess; I just found this initially confusing in the presentation.)
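To make the difference concrete, here is a minimal pure-Python sketch. It is not the blog's PyTorch or Triton code: the helper names (`combine`, `combine_reversed`, `sequential`, `apply_pair`) and the recurrence $h_t = A_t h_{t-1} + b_t$ are assumptions chosen for illustration. It compares both orderings against a plain sequential loop, once with non-commuting 2x2 matrices and once with diagonal ones.

```python
# Illustrative sketch only: hypothetical helpers, assuming the recurrence
# h_t = A_t h_{t-1} + b_t with 2x2 matrices stored as nested tuples.

def matmul(X, Y):
    """2x2 matrix product X @ Y."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def matvec(X, v):
    """2x2 matrix times length-2 vector."""
    return tuple(sum(X[i][k] * v[k] for k in range(2)) for i in range(2))

def vadd(u, v):
    return tuple(x + y for x, y in zip(u, v))

def combine(left, right):
    """Compose two steps, `left` applied first: first output is A2 @ A1."""
    A1, b1 = left
    A2, b2 = right
    return matmul(A2, A1), vadd(matvec(A2, b1), b2)

def combine_reversed(left, right):
    """Same as `combine`, but with the matrix product in the order A1 @ A2."""
    A1, b1 = left
    A2, b2 = right
    return matmul(A1, A2), vadd(matvec(A2, b1), b2)

def sequential(steps, h0):
    """Reference: run the recurrence one step at a time."""
    h = h0
    for A, b in steps:
        h = vadd(matvec(A, h), b)
    return h

def apply_pair(pair, h0):
    """Apply a combined (A, b) pair to the initial state."""
    A, b = pair
    return vadd(matvec(A, h0), b)

h0 = (1, 2)
b1, b2 = (1, 0), (0, 1)

# Non-commuting matrices: only the A2 @ A1 ordering matches the loop.
steps = [(((1, 1), (0, 1)), b1), (((1, 0), (1, 1)), b2)]
expected = sequential(steps, h0)
good = apply_pair(combine(*steps), h0)
bad = apply_pair(combine_reversed(*steps), h0)

# Diagonal matrices commute, so both orderings agree.
diag_steps = [(((2, 0), (0, 3)), b1), (((5, 0), (0, 7)), b2)]
diag_expected = sequential(diag_steps, h0)
diag_good = apply_pair(combine(*diag_steps), h0)
diag_reversed = apply_pair(combine_reversed(*diag_steps), h0)

print(expected, good, bad)
print(diag_expected, diag_good, diag_reversed)
```

In this sketch the reversed ordering disagrees with the sequential loop for the non-commuting pair and agrees with it for the diagonal pair, which is consistent with the ordering being harmless for Mamba's diagonal matrices.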

It looks correct in the Triton `first_order_op`, but I think it is reversed in the reference PyTorch op and in the LaTeX above it.

Thanks for this terrific writeup! It really clarified some things for me.
