
LSTM with recurrent projection layer (LSTMPCell) #129

Description

@MartinuzziFrancesco

Paper

https://www.isca-archive.org/interspeech_2014/sak14_interspeech.pdf

Equations

Extends the PeepholeLSTM by adding a recurrent projection $\mathbf{r}_t$ and an output layer $\mathbf{y}_t$ on top of the hidden state:

$$\begin{align}
\mathbf{i}_t &= \sigma\left( \mathbf{W}_{ix} \mathbf{x}_t + \mathbf{W}_{ih} \mathbf{h}_{t-1} + \mathbf{W}_{ic} \mathbf{c}_{t-1} + \mathbf{b}_i \right), \\
\mathbf{f}_t &= \sigma\left( \mathbf{W}_{fx} \mathbf{x}_t + \mathbf{W}_{fh} \mathbf{h}_{t-1} + \mathbf{W}_{fc} \mathbf{c}_{t-1} + \mathbf{b}_f \right), \\
\mathbf{c}_t &= \mathbf{f}_t \circ \mathbf{c}_{t-1} + \mathbf{i}_t \circ g\left( \mathbf{W}_{cx} \mathbf{x}_t + \mathbf{W}_{ch} \mathbf{h}_{t-1} + \mathbf{b}_c \right), \\
\mathbf{o}_t &= \sigma\left( \mathbf{W}_{ox} \mathbf{x}_t + \mathbf{W}_{oh} \mathbf{h}_{t-1} + \mathbf{W}_{oc} \mathbf{c}_t + \mathbf{b}_o \right), \\
\mathbf{h}_t &= \mathbf{o}_t \circ h(\mathbf{c}_t), \\
\mathbf{r}_t &= \mathbf{W}_{rh} \mathbf{h}_t, \\
\mathbf{y}_t &= \phi(\mathbf{W}_{yr} \mathbf{r}_t + \mathbf{b}_y).
\end{align}$$
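To make the equations above concrete, here is a minimal NumPy sketch of a single time step. It is only an illustration, not a proposed implementation for the package: the names `init_params` and `lstmp_step` are hypothetical, the choice of $g = h = \tanh$ and $\phi = \mathrm{softmax}$ is an assumption, and the peephole weights $\mathbf{W}_{ic}$, $\mathbf{W}_{fc}$, $\mathbf{W}_{oc}$ are assumed to be diagonal, so they are stored as vectors.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - np.max(z))  # shift for numerical stability
    return e / e.sum()

def init_params(n_in, n_cell, n_proj, n_out, seed=0):
    """Hypothetical helper: small random weights, zero biases."""
    rng = np.random.default_rng(seed)
    w = lambda *shape: 0.1 * rng.standard_normal(shape)
    p = {}
    for gate in "ifco":
        p[f"W_{gate}x"] = w(n_cell, n_in)    # input weights
        p[f"W_{gate}h"] = w(n_cell, n_cell)  # recurrent weights
        p[f"b_{gate}"] = np.zeros(n_cell)
    for gate in "ifo":
        p[f"w_{gate}c"] = w(n_cell)          # assumed-diagonal peephole weights
    p["W_rh"] = w(n_proj, n_cell)            # projection
    p["W_yr"] = w(n_out, n_proj)             # output layer
    p["b_y"] = np.zeros(n_out)
    return p

def lstmp_step(p, x, h_prev, c_prev):
    """One step of the equations as written in this issue."""
    i = sigmoid(p["W_ix"] @ x + p["W_ih"] @ h_prev + p["w_ic"] * c_prev + p["b_i"])
    f = sigmoid(p["W_fx"] @ x + p["W_fh"] @ h_prev + p["w_fc"] * c_prev + p["b_f"])
    c = f * c_prev + i * np.tanh(p["W_cx"] @ x + p["W_ch"] @ h_prev + p["b_c"])
    o = sigmoid(p["W_ox"] @ x + p["W_oh"] @ h_prev + p["w_oc"] * c + p["b_o"])
    h = o * np.tanh(c)
    r = p["W_rh"] @ h                        # recurrent projection
    y = softmax(p["W_yr"] @ r + p["b_y"])    # output layer
    return h, c, r, y
```

Note that the output gate peeks at the updated cell state `c`, while the input and forget gates use `c_prev`, matching the indices in the equations.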

Since the projection is separate from the cell equations themselves, it can also be implemented as a generic wrapper layer. We would have to provide:

  • The cell as implemented in the paper, LSTMPCell
  • An additional wrapper, RecurrentProjection(), that takes a cell and returns
$$\begin{align}
\mathbf{r}_t &= \mathbf{W}_{rh} \mathbf{h}_t, \\
\mathbf{y}_t &= \phi(\mathbf{W}_{yr} \mathbf{r}_t + \mathbf{b}_y).
\end{align}$$
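The wrapper idea could look roughly like the sketch below, written in NumPy purely for illustration. Everything here is an assumption rather than an existing API: the `cell(x, state) -> (h, state)` calling convention, the constructor arguments, the default $\phi = \tanh$, and the toy `make_tanh_cell` stand-in used in place of a real cell. One caveat: as far as I can tell, the paper feeds the projected $\mathbf{r}_{t-1}$ back into the gates, whereas a wrapper like this leaves the wrapped cell's own recurrence untouched, as in the equations written in this issue.

```python
import numpy as np

class RecurrentProjection:
    """Hypothetical wrapper applying r_t = W_rh h_t and
    y_t = phi(W_yr r_t + b_y) to the output of any cell."""

    def __init__(self, cell, n_hidden, n_proj, n_out, phi=np.tanh, seed=0):
        rng = np.random.default_rng(seed)
        self.cell = cell
        self.W_rh = 0.1 * rng.standard_normal((n_proj, n_hidden))
        self.W_yr = 0.1 * rng.standard_normal((n_out, n_proj))
        self.b_y = np.zeros(n_out)
        self.phi = phi

    def __call__(self, x, state):
        h, new_state = self.cell(x, state)  # wrapped cell is unchanged
        r = self.W_rh @ h
        y = self.phi(self.W_yr @ r + self.b_y)
        return (r, y), new_state

def make_tanh_cell(n_in, n_hidden, seed=1):
    """Toy stand-in for a real cell, only to exercise the wrapper."""
    rng = np.random.default_rng(seed)
    W_x = 0.1 * rng.standard_normal((n_hidden, n_in))
    W_h = 0.1 * rng.standard_normal((n_hidden, n_hidden))

    def cell(x, h_prev):
        h = np.tanh(W_x @ x + W_h @ h_prev)
        return h, h  # output and next state coincide for this toy cell

    return cell

# Usage: wrap the toy cell and run one step.
wrapped = RecurrentProjection(make_tanh_cell(3, 5), n_hidden=5, n_proj=2, n_out=4)
(r, y), state = wrapped(np.ones(3), np.zeros(5))
```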

We would also have to find a way to stack it, I suppose.
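One possible way to stack, sketched here in NumPy under stated assumptions: each layer exposes its projected output $\mathbf{r}_t$, and that output becomes the input of the next layer. The helpers `make_layer` and `run_stack` are hypothetical, and the layer is a toy tanh recurrence followed by a projection rather than the full cell.

```python
import numpy as np

def make_layer(n_in, n_hidden, n_proj, seed):
    """Toy stand-in for a projected cell: tanh recurrence, then r = W_rh h."""
    rng = np.random.default_rng(seed)
    W_x = 0.1 * rng.standard_normal((n_hidden, n_in))
    W_h = 0.1 * rng.standard_normal((n_hidden, n_hidden))
    W_rh = 0.1 * rng.standard_normal((n_proj, n_hidden))

    def layer(x, h_prev):
        h = np.tanh(W_x @ x + W_h @ h_prev)
        return W_rh @ h, h  # (projected output, next state)

    return layer

def run_stack(layers, states, xs):
    """Run a sequence through stacked layers; layer k+1 consumes r_t of layer k."""
    outputs = []
    for x in xs:
        inp, new_states = x, []
        for layer, state in zip(layers, states):
            inp, state = layer(inp, state)
            new_states.append(state)
        states = new_states
        outputs.append(inp)
    return np.stack(outputs), states

# Usage: two layers; the second takes the 2-dimensional projection as input.
layers = [make_layer(3, 6, 2, seed=0), make_layer(2, 6, 2, seed=1)]
states = [np.zeros(6), np.zeros(6)]
out, states = run_stack(layers, states, np.ones((4, 3)))
```

With this convention the input size of each layer after the first equals the projection size of the layer below it, which is the main shape constraint a stacking API would need to check.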

Official implementation

none

Metadata

    Labels

    cell: A new recurrent cell found in the literature
    wrapper: A new wrapper found in the literature
