Selective-Update RNNs (suRNN/suRNNCell) #167

Description

@MartinuzziFrancesco

Paper

https://arxiv.org/abs/2603.02226

Equations

$$\begin{align}
\mathbf{D}_t &\triangleq \mathrm{diag}(\mathbf{g}_t), \qquad \mathbf{g}_t \in \{0,1\}^H, \qquad \mathbf{h}_t \in \mathbb{R}^H \\[2pt]
\Delta \mathbf{h}_t &\triangleq f_\theta(\mathbf{h}_{t-1}, \mathbf{x}_t) - \mathbf{h}_{t-1} \\[2pt]
\mathbf{h}_t &= \mathbf{h}_{t-1} + \mathbf{D}_t \, \Delta \mathbf{h}_t \\[-2pt]
&= \mathbf{h}_{t-1} + \mathbf{D}_t \big(f_\theta(\mathbf{h}_{t-1}, \mathbf{x}_t) - \mathbf{h}_{t-1}\big) \\[-2pt]
&= (\mathbf{I}-\mathbf{D}_t)\mathbf{h}_{t-1} + \mathbf{D}_t f_\theta(\mathbf{h}_{t-1}, \mathbf{x}_t) \\[6pt]
\mathbf{J}_t &\triangleq \frac{\partial \mathbf{h}_t}{\partial \mathbf{h}_{t-1}} = \mathbf{I} + \mathbf{D}_t\big(\mathbf{J}^{(f)}_t - \mathbf{I}\big), \qquad \mathbf{J}^{(f)}_t \triangleq \left.\frac{\partial f_\theta}{\partial \mathbf{h}_{t-1}}\right|_{(\mathbf{h}_{t-1},\mathbf{x}_t)} \\[6pt]
a_{t,i} &= b_i + \sum_{k=1}^{K} \alpha_{i,k}\,\sin(\omega_k t + \phi_{i,k}), \qquad g_{t,i} = H(a_{t,i}) \quad (i=1,\dots,H)
\end{align}$$
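
To make the update rule concrete, here is a minimal pure-Python sketch of one selective-update step. It is not the paper's implementation (none is available yet) and rests on several assumptions: `f_theta` is taken to be a plain Elman-style tanh cell, `H(·)` in the gate equation is read as the Heaviside step with `H(0) = 0`, the time index `t` starts at 1, and all function and parameter names (`gate`, `tanh_cell`, `su_step`, `W`, `U`, `bias`) are hypothetical.

```python
import math


def heaviside(a):
    # Step function H(a); returning 0 at a == 0 is an assumption.
    return 1.0 if a > 0 else 0.0


def gate(t, b, alpha, omega, phi):
    """Binary gate g_{t,i} = H(b_i + sum_k alpha[i][k] * sin(omega[k] * t + phi[i][k]))."""
    g = []
    for i in range(len(b)):
        a = b[i] + sum(
            alpha[i][k] * math.sin(omega[k] * t + phi[i][k])
            for k in range(len(omega))
        )
        g.append(heaviside(a))
    return g


def tanh_cell(h_prev, x, W, U, bias):
    """Hypothetical stand-in for f_theta: tanh(W x + U h_prev + bias)."""
    out = []
    for i in range(len(h_prev)):
        s = bias[i]
        s += sum(W[i][j] * x[j] for j in range(len(x)))
        s += sum(U[i][j] * h_prev[j] for j in range(len(h_prev)))
        out.append(math.tanh(s))
    return out


def su_step(h_prev, x, t, f, gate_params):
    """One step of h_t = (I - D_t) h_{t-1} + D_t f(h_{t-1}, x_t), with D_t = diag(g_t)."""
    g = gate(t, *gate_params)
    cand = f(h_prev, x)
    h = [(1.0 - gi) * hp + gi * ci for gi, hp, ci in zip(g, h_prev, cand)]
    return h, g


# Toy setup: hidden size 2, one sinusoid (K = 1) with zero amplitude,
# so unit 0 (b = 1) always updates and unit 1 (b = -1) never does.
b = [1.0, -1.0]
alpha = [[0.0], [0.0]]
omega = [1.0]
phi = [[0.0], [0.0]]
W = [[0.5], [0.5]]
U = [[0.0, 0.0], [0.0, 0.0]]
bias = [0.0, 0.0]


def f(h_prev, x):
    return tanh_cell(h_prev, x, W, U, bias)


h = [0.25, 0.25]
for t, x in enumerate([[1.0], [2.0], [3.0]], start=1):
    h, g = su_step(h, x, t, f, (b, alpha, omega, phi))

print(g, h)
```

In this toy run the gated-off unit keeps its initial value exactly, which matches the Jacobian above: wherever `g_{t,i} = 0`, the corresponding row of `J_t` reduces to the identity row, so that unit's state (and its gradient) passes through the step unchanged.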

Official implementation

The paper links the repository below, but it does not contain any code yet:
https://anonymous.4open.science/r/suGRU-EB5C/README.md


Labels

cell: A new recurrent cell found in the literature
