From 05098a98dc894942e5dd794c5bb7f6b69f6484f1 Mon Sep 17 00:00:00 2001
From: MartinuzziFrancesco
Date: Sat, 20 Jun 2026 11:00:22 +0200
Subject: [PATCH 1/2] feat: add agents.md

---
 AGENTS.md | 172 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 172 insertions(+)
 create mode 100644 AGENTS.md

diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000..f35c351
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,172 @@
+# AGENTS.md
+
+Guidance for AI agents working in the **torchrecurrent** repository.
+
+## What this project is
+
+`torchrecurrent` is a PyTorch-compatible collection of recurrent neural network
+**cells** and **layers** drawn from the research literature. Every model exposes
+a native-PyTorch-style interface (mirroring `torch.nn.RNN`/`RNNCell`) while adding
+extra knobs for initialization and customization. It is published on
+[PyPI](https://pypi.org/project/torchrecurrent/) and conda-forge, and is intended
+primarily for academic research.
+
+- Package: `torchrecurrent` (version in `pyproject.toml`)
+- Python: `>=3.9` (CI runs 3.9–3.14 on Linux/Windows/macOS)
+- Single runtime dependency: `torch`
+- Companion projects: [RecurrentLayers.jl](https://github.com/MartinuzziFrancesco/RecurrentLayers.jl)
+  (Flux), [LuxRecurrentLayers.jl](https://github.com/MartinuzziFrancesco/LuxRecurrentLayers.jl) (Lux)
+
+## Project structure
+
+Generated with `tree -I '__pycache__|*.egg-info|.venv|.git|.pytest_cache|runs|generated'`
+(the `runs/` experiment artifacts and `generated/` autosummary stubs are collapsed):
+
+```
+.
+├── benchmarks  # standalone training scripts + saved runs/ (not packaged)
+│   ├── adding_problem
+│   │   └── adding_problem.py
+│   └── copy_memory
+│       └── copy_memory.py
+├── docs  # Sphinx documentation
+│   ├── api
+│   │   ├── benchmarks.rst
+│   │   ├── cells.rst
+│   │   ├── index.rst
+│   │   └── layers.rst
+│   ├── _static
+│   │   ├── favicon.ico
+│   │   ├── logo-long2.png
+│   │   └── logo.png
+│   ├── conf.py
+│   ├── index.rst
+│   ├── make.bat
+│   ├── Makefile
+│   ├── models.rst  # catalog of published models
+│   └── requirements.txt
+├── tests
+│   ├── test_cells.py  # per-cell shape/dtype/state checks
+│   └── test_layers.py  # per-layer stacking/batch_first checks
+├── torchrecurrent  # the package
+│   ├── benchmarks
+│   │   ├── adding.py  # adding_problem task generator
+│   │   ├── copymemory.py  # copy_memory task generator
+│   │   └── __init__.py
+│   ├── cells  # each file defines BOTH a Cell and its layer
+│   │   ├── antisymmetricrnn_cell.py
+│   │   ├── atr_cell.py
+│   │   ├── br_cell.py
+│   │   ├── cfn_cell.py
+│   │   ├── cornn_cell.py
+│   │   ├── fastrnn_cell.py
+│   │   ├── indrnn_cell.py
+│   │   ├── __init__.py  # re-exports every cell + its layer
+│   │   ├── janet_cell.py
+│   │   ├── lem_cell.py
+│   │   ├── lightru_cell.py
+│   │   ├── ligru_cell.py
+│   │   ├── mgu_cell.py
+│   │   ├── multiplicativelstm_cell.py
+│   │   ├── mut_cell.py  # MUT1 / MUT2 / MUT3
+│   │   ├── nas_cell.py
+│   │   ├── originallstm_cell.py
+│   │   ├── peepholelstm_cell.py
+│   │   ├── ran_cell.py
+│   │   ├── rhn_cell.py  # present but NOT exported (commented out)
+│   │   ├── scrn_cell.py
+│   │   ├── sgrn_cell.py
+│   │   ├── star_cell.py
+│   │   ├── ugrnn_cell.py
+│   │   ├── unicornn_cell.py
+│   │   └── wmclstm_cell.py
+│   ├── base.py  # abstract base classes (see "Architecture")
+│   └── __init__.py  # top-level public API (alphabetized re-exports)
+├── AGENTS.md
+├── LICENSE  # MIT (NASCell re-impl carries Apache-2.0)
+├── MANIFEST.in
+├── pyproject.toml  # build, deps, black config, test extras
+└── README.md
+```
+
+Two things worth internalizing:
+
+- **There is no `layers/` directory.** Each `*_cell.py` file defines *both* the
+  cell (e.g. `MGUCell`) and its multi-layer wrapper (e.g. `MGU`). The
+  `cells/__init__.py` and top-level `__init__.py` re-export both.
+- `torchrecurrent/benchmarks/` (packaged task generators) is distinct from the
+  top-level `benchmarks/` (standalone training scripts and saved run artifacts).
+
+## Architecture
+
+All models inherit from base classes in `torchrecurrent/base.py`:
+
+- `BaseRecurrentCell` — common cell machinery: input/state validation, zero-state
+  init, parameter/buffer registration (`_register_tensors`,
+  `_default_register_tensors`), and `init_weights()` which dispatches on parameter
+  name (`weight_ih`, `weight_hh`, `bias_ih`, `bias_hh`).
+  - `BaseSingleRecurrentCell` — single hidden state `h`; `uses_double_state()` → `False`.
+  - `BaseDoubleRecurrentCell` — LSTM-style `(h, c)`; `uses_double_state()` → `True`.
+- `BaseRecurrentLayer` — stacking, dropout between layers, `batch_first`,
+  `initialize_cells(CellClass, **kwargs)`.
+  - `BaseSingleRecurrentLayer` / `BaseDoubleRecurrentLayer` — iterate the cell
+    stack over the time dimension.
+
+### Conventions every cell follows
+
+- Weights are concatenated per-gate into `weight_ih` / `weight_hh` with shape
+  `(n_gates * hidden_size, ...)` and split with `.chunk(n, 0)` in `forward`.
+- Separate input-side (`bias`) and recurrent-side (`recurrent_bias`) bias flags.
+- Configurable `nonlinearity` / `gate_nonlinearity` and four init callables
+  (`kernel_init`, `recurrent_kernel_init`, `bias_init`, `recurrent_bias_init`),
+  defaulting to `xavier_uniform_` for weights and `zeros_` for biases.
+- A cell `forward` accepts `(input_size,)` or `(N, input_size)` and handles the
+  unbatched case internally via the `_preprocess_*` helpers.
+- Extensive Google/NumPy-style docstrings with a math block and an arXiv link —
+  these feed the Sphinx `generated/` autosummary pages.
+
+## Adding a new model
+
+1. Create `torchrecurrent/cells/_cell.py` defining `Cell` (subclass a
+   `BaseSingle*`/`BaseDouble*` cell) and `` (subclass the matching layer,
+   calling `self.initialize_cells(Cell, **kwargs)`). Use `mgu_cell.py` as
+   the reference template, including the docstring style.
+2. Re-export both classes from `torchrecurrent/cells/__init__.py` (import +
+   `__all__`) and from `torchrecurrent/__init__.py` (both import lists + `__all__`).
+3. Add the cell to `CELL_CASES` in `tests/test_cells.py` and the layer to
+   `tests/test_layers.py`.
+4. Add docs: an entry under `docs/api/` and an autosummary stub under
+   `docs/generated/`, plus the model catalog in `docs/models.rst`.
+
+## Development workflow
+
+```bash
+pip install -e .[test]  # editable install with pytest + coverage
+
+pytest  # run the test suite
+coverage run -m pytest  # how CI runs it
+
+pre-commit run --all-files  # black + ruff --fix
+black .  # line length 92
+flake8  # excludes docs/, benchmarks/, tests/
+```
+
+- Code style: **black**, line length **92** (configured in both `pyproject.toml`
+  and `.flake8`). Run black/ruff before committing — pre-commit enforces it.
+- Tests are parametrized tables of model classes; keep them in sync when you add
+  or rename a model.
+
+## Conventions for agents
+
+- **Keep cell and layer in the same file**, and keep the three export sites
+  (`cells/__init__.py`, top-level `__init__.py`, and each `__all__`) consistent —
+  a model missing from any of them won't be importable.
+- Match the existing docstring format (math block + arXiv link + Args/Inputs/
+  Outputs/Variables); docs generation depends on it.
+- Don't commit into `benchmarks/.../runs/` — those are saved experiment artifacts.
+- `rhn_cell.py` exists but is intentionally not exported; don't wire it up unless
+  asked.
+- Only `torch` may be added as a runtime dependency without discussion; keep the
+  package dependency-light.
+- Respect third-party licenses: `NASCell` is an Apache-2.0 re-implementation.
+```
From 080574c01065481b457d1ac46ec6d3a067a6db14 Mon Sep 17 00:00:00 2001
From: MartinuzziFrancesco
Date: Mon, 22 Jun 2026 20:16:54 +0200
Subject: [PATCH 2/2] feat: make agents.md more specific

---
 AGENTS.md | 278 +++++++++++++++++++++++------------------------------
 1 file changed, 117 insertions(+), 161 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index f35c351..e016141 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -2,171 +2,127 @@
 
 Guidance for AI agents working in the **torchrecurrent** repository.
 
-## What this project is
+## Project Snapshot
 
-`torchrecurrent` is a PyTorch-compatible collection of recurrent neural network
-**cells** and **layers** drawn from the research literature. Every model exposes
-a native-PyTorch-style interface (mirroring `torch.nn.RNN`/`RNNCell`) while adding
-extra knobs for initialization and customization. It is published on
-[PyPI](https://pypi.org/project/torchrecurrent/) and conda-forge, and is intended
-primarily for academic research.
+- Package: `torchrecurrent`
+- Purpose: PyTorch-compatible recurrent neural network cells and layers from
+  research literature, primarily for academic research.
+- Python: `>=3.9`; CI covers Python 3.9-3.14 on Linux, Windows, and macOS.
+- Runtime dependency policy: `torch` is the only runtime dependency.
+- Style: Black, line length 92.
 
-- Package: `torchrecurrent` (version in `pyproject.toml`)
-- Python: `>=3.9` (CI runs 3.9–3.14 on Linux/Windows/macOS)
-- Single runtime dependency: `torch`
-- Companion projects: [RecurrentLayers.jl](https://github.com/MartinuzziFrancesco/RecurrentLayers.jl)
-  (Flux), [LuxRecurrentLayers.jl](https://github.com/MartinuzziFrancesco/LuxRecurrentLayers.jl) (Lux)
+## Commands
 
-## Project structure
-
-Generated with `tree -I '__pycache__|*.egg-info|.venv|.git|.pytest_cache|runs|generated'`
-(the `runs/` experiment artifacts and `generated/` autosummary stubs are collapsed):
-
-```
-.
-├── benchmarks  # standalone training scripts + saved runs/ (not packaged)
-│   ├── adding_problem
-│   │   └── adding_problem.py
-│   └── copy_memory
-│       └── copy_memory.py
-├── docs  # Sphinx documentation
-│   ├── api
-│   │   ├── benchmarks.rst
-│   │   ├── cells.rst
-│   │   ├── index.rst
-│   │   └── layers.rst
-│   ├── _static
-│   │   ├── favicon.ico
-│   │   ├── logo-long2.png
-│   │   └── logo.png
-│   ├── conf.py
-│   ├── index.rst
-│   ├── make.bat
-│   ├── Makefile
-│   ├── models.rst  # catalog of published models
-│   └── requirements.txt
-├── tests
-│   ├── test_cells.py  # per-cell shape/dtype/state checks
-│   └── test_layers.py  # per-layer stacking/batch_first checks
-├── torchrecurrent  # the package
-│   ├── benchmarks
-│   │   ├── adding.py  # adding_problem task generator
-│   │   ├── copymemory.py  # copy_memory task generator
-│   │   └── __init__.py
-│   ├── cells  # each file defines BOTH a Cell and its layer
-│   │   ├── antisymmetricrnn_cell.py
-│   │   ├── atr_cell.py
-│   │   ├── br_cell.py
-│   │   ├── cfn_cell.py
-│   │   ├── cornn_cell.py
-│   │   ├── fastrnn_cell.py
-│   │   ├── indrnn_cell.py
-│   │   ├── __init__.py  # re-exports every cell + its layer
-│   │   ├── janet_cell.py
-│   │   ├── lem_cell.py
-│   │   ├── lightru_cell.py
-│   │   ├── ligru_cell.py
-│   │   ├── mgu_cell.py
-│   │   ├── multiplicativelstm_cell.py
-│   │   ├── mut_cell.py  # MUT1 / MUT2 / MUT3
-│   │   ├── nas_cell.py
-│   │   ├── originallstm_cell.py
-│   │   ├── peepholelstm_cell.py
-│   │   ├── ran_cell.py
-│   │   ├── rhn_cell.py  # present but NOT exported (commented out)
-│   │   ├── scrn_cell.py
-│   │   ├── sgrn_cell.py
-│   │   ├── star_cell.py
-│   │   ├── ugrnn_cell.py
-│   │   ├── unicornn_cell.py
-│   │   └── wmclstm_cell.py
-│   ├── base.py  # abstract base classes (see "Architecture")
-│   └── __init__.py  # top-level public API (alphabetized re-exports)
-├── AGENTS.md
-├── LICENSE  # MIT (NASCell re-impl carries Apache-2.0)
-├── MANIFEST.in
-├── pyproject.toml  # build, deps, black config, test extras
-└── README.md
-```
-
-Two things worth internalizing:
-
-- **There is no `layers/` directory.** Each `*_cell.py` file defines *both* the
-  cell (e.g. `MGUCell`) and its multi-layer wrapper (e.g. `MGU`). The
-  `cells/__init__.py` and top-level `__init__.py` re-export both.
-- `torchrecurrent/benchmarks/` (packaged task generators) is distinct from the
-  top-level `benchmarks/` (standalone training scripts and saved run artifacts).
-
-## Architecture
-
-All models inherit from base classes in `torchrecurrent/base.py`:
-
-- `BaseRecurrentCell` — common cell machinery: input/state validation, zero-state
-  init, parameter/buffer registration (`_register_tensors`,
-  `_default_register_tensors`), and `init_weights()` which dispatches on parameter
-  name (`weight_ih`, `weight_hh`, `bias_ih`, `bias_hh`).
-  - `BaseSingleRecurrentCell` — single hidden state `h`; `uses_double_state()` → `False`.
-  - `BaseDoubleRecurrentCell` — LSTM-style `(h, c)`; `uses_double_state()` → `True`.
-- `BaseRecurrentLayer` — stacking, dropout between layers, `batch_first`,
-  `initialize_cells(CellClass, **kwargs)`.
-  - `BaseSingleRecurrentLayer` / `BaseDoubleRecurrentLayer` — iterate the cell
-    stack over the time dimension.
-
-### Conventions every cell follows
-
-- Weights are concatenated per-gate into `weight_ih` / `weight_hh` with shape
-  `(n_gates * hidden_size, ...)` and split with `.chunk(n, 0)` in `forward`.
-- Separate input-side (`bias`) and recurrent-side (`recurrent_bias`) bias flags.
-- Configurable `nonlinearity` / `gate_nonlinearity` and four init callables
-  (`kernel_init`, `recurrent_kernel_init`, `bias_init`, `recurrent_bias_init`),
-  defaulting to `xavier_uniform_` for weights and `zeros_` for biases.
-- A cell `forward` accepts `(input_size,)` or `(N, input_size)` and handles the
-  unbatched case internally via the `_preprocess_*` helpers.
-- Extensive Google/NumPy-style docstrings with a math block and an arXiv link —
-  these feed the Sphinx `generated/` autosummary pages.
-
-## Adding a new model
-
-1. Create `torchrecurrent/cells/_cell.py` defining `Cell` (subclass a
-   `BaseSingle*`/`BaseDouble*` cell) and `` (subclass the matching layer,
-   calling `self.initialize_cells(Cell, **kwargs)`). Use `mgu_cell.py` as
-   the reference template, including the docstring style.
-2. Re-export both classes from `torchrecurrent/cells/__init__.py` (import +
-   `__all__`) and from `torchrecurrent/__init__.py` (both import lists + `__all__`).
-3. Add the cell to `CELL_CASES` in `tests/test_cells.py` and the layer to
-   `tests/test_layers.py`.
-4. Add docs: an entry under `docs/api/` and an autosummary stub under
-   `docs/generated/`, plus the model catalog in `docs/models.rst`.
-
-## Development workflow
+Run the narrowest useful command first, then broaden when the change touches
+shared behavior.
 
 ```bash
-pip install -e .[test]  # editable install with pytest + coverage
-
-pytest  # run the test suite
-coverage run -m pytest  # how CI runs it
-
-pre-commit run --all-files  # black + ruff --fix
-black .  # line length 92
-flake8  # excludes docs/, benchmarks/, tests/
+pip install -e .[test]
+pytest
+coverage run -m pytest
+black .
+flake8
+pre-commit run --all-files
 ```
 
-- Code style: **black**, line length **92** (configured in both `pyproject.toml`
-  and `.flake8`). Run black/ruff before committing — pre-commit enforces it.
-- Tests are parametrized tables of model classes; keep them in sync when you add
-  or rename a model.
-
-## Conventions for agents
-
-- **Keep cell and layer in the same file**, and keep the three export sites
-  (`cells/__init__.py`, top-level `__init__.py`, and each `__all__`) consistent —
-  a model missing from any of them won't be importable.
-- Match the existing docstring format (math block + arXiv link + Args/Inputs/
-  Outputs/Variables); docs generation depends on it.
-- Don't commit into `benchmarks/.../runs/` — those are saved experiment artifacts.
-- `rhn_cell.py` exists but is intentionally not exported; don't wire it up unless
-  asked.
-- Only `torch` may be added as a runtime dependency without discussion; keep the
-  package dependency-light.
-- Respect third-party licenses: `NASCell` is an Apache-2.0 re-implementation.
-```
+- `pytest` runs the test suite.
+- `coverage run -m pytest` matches the CI test command.
+- `black .` formats with the configured 92-character line length.
+- `flake8` excludes `docs/`, `benchmarks/`, and `tests/`.
+- `pre-commit run --all-files` runs Black and Ruff fixes before committing.
+
+## Repository Map
+
+- `torchrecurrent/base.py`: abstract base classes for cells and layers.
+- `torchrecurrent/cells/`: each `*_cell.py` defines both a cell and its layer.
+- `torchrecurrent/benchmarks/`: packaged task generators.
+- `benchmarks/`: standalone training scripts and saved runs, not packaged.
+- `tests/test_cells.py`: per-cell shape, dtype, and state checks.
+- `tests/test_layers.py`: per-layer stacking and `batch_first` checks.
+- `docs/`: Sphinx docs and the model catalog in `docs/models.rst`.
+
+There is no `layers/` directory. Keep cell and layer implementations together in
+the relevant `torchrecurrent/cells/_cell.py` file.
+
+## Architecture Conventions
+
+- `BaseSingleRecurrentCell` uses one hidden state `h`.
+- `BaseDoubleRecurrentCell` uses LSTM-style `(h, c)` state.
+- `BaseSingleRecurrentLayer` and `BaseDoubleRecurrentLayer` iterate cell stacks
+  over the time dimension.
+- Weights are concatenated per gate into `weight_ih` and `weight_hh`, then split
+  with `.chunk(n, 0)` in `forward`.
+- Cells support input shaped `(input_size,)` or `(N, input_size)` via the base
+  `_preprocess_*` helpers.
+- Bias controls are separate: `bias` for input-side terms and `recurrent_bias`
+  for recurrent-side terms.
+- Initializers are configurable through `kernel_init`, `recurrent_kernel_init`,
+  `bias_init`, and `recurrent_bias_init`; defaults are `xavier_uniform_` for
+  weights and `zeros_` for biases.
+
+## Adding A Model
+
+1. Create `torchrecurrent/cells/_cell.py`.
+2. Define `Cell` from the matching single-state or double-state base cell.
+3. Define `` from the matching layer base and call
+   `self.initialize_cells(Cell, **kwargs)`.
+4. Use `torchrecurrent/cells/mgu_cell.py` as the implementation and docstring
+   template.
+5. Re-export both classes from `torchrecurrent/cells/__init__.py` and
+   `torchrecurrent/__init__.py`, including each `__all__`.
+6. Add the cell to `CELL_CASES` in `tests/test_cells.py`.
+7. Add the layer to `tests/test_layers.py`.
+8. Add docs under `docs/api/`, generated autosummary coverage, and
+   `docs/models.rst`.
+
+## Code Style
+
+- Format Python with Black before finishing changes that touch code.
+- Keep comments sparse. Add comments only when they explain non-obvious math,
+  paper-specific behavior, numerical stability choices, or API compatibility.
+- Do not add comments that merely restate the code.
+- Match the existing Google/NumPy-style docstrings with a math block, arXiv link,
+  Args, Inputs, Outputs, and Variables sections.
+- Keep tests table-driven and update the relevant parametrized cases when adding
+  or renaming public models.
+
+## Boundaries
+
+### Always Do
+
+- Preserve native PyTorch-style interfaces that mirror `torch.nn.RNN` and
+  `torch.nn.RNNCell` where applicable.
+- Keep the three export sites synchronized:
+  `torchrecurrent/cells/__init__.py`, `torchrecurrent/__init__.py`, and each
+  `__all__`.
+- Respect third-party licenses. `NASCell` is an Apache-2.0 reimplementation in an
+  MIT-licensed project.
+
+### Ask First
+
+- Adding, removing, or changing runtime dependencies. Do not add dependencies
+  just to simplify an implementation.
+- Exporting or otherwise wiring up `rhn_cell.py`; it exists but is intentionally
+  not part of the public API.
+- Large rewrites, API breaks, renamed public classes, or changes to package
+  metadata and release configuration.
+- Broad documentation regeneration if it would create large generated diffs.
+
+### Never Do
+
+- Do not create a separate `layers/` package.
+- Do not commit or edit saved artifacts under `benchmarks/.../runs/`.
+- Do not add unnecessary comments.
+- Do not skip tests silently; report any tests that could not be run.
+- Do not introduce non-`torch` runtime dependencies without explicit approval.
+
+## Done Criteria
+
+Before finishing, check the work against the scope of the change:
+
+- Code is formatted with Black when Python files changed.
+- Relevant tests were run, or the reason they were not run is stated.
+- New or renamed public models are exported from both package entry points.
+- Tests and docs are updated when public behavior changes.
+- The final response summarizes changed files, verification, and any remaining
+  risk or follow-up.
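
The "exported from both package entry points" criterion above could be checked mechanically. The following is a minimal sketch only, assuming each `__init__.py` assigns `__all__` as a plain list literal. The helper names `exported_names` and `missing_exports` and the inline sample sources are hypothetical and are not part of `torchrecurrent`.

```python
import ast


def exported_names(source):
    """Return the names listed in a module-level ``__all__`` list literal."""
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, ast.Assign):
            targets = [t.id for t in node.targets if isinstance(t, ast.Name)]
            if "__all__" in targets:
                # literal_eval accepts an AST node and only evaluates literals
                return set(ast.literal_eval(node.value))
    return set()


def missing_exports(cells_init_source, top_init_source):
    """Names in the cells-level __all__ that the top-level __all__ lacks."""
    cells = exported_names(cells_init_source)
    top = exported_names(top_init_source)
    return sorted(cells - top)


# Hypothetical file contents for illustration; against the repository one
# would read the two real __init__.py files instead.
cells_init = '__all__ = ["MGU", "MGUCell", "LEM", "LEMCell"]\n'
top_init = '__all__ = ["MGU", "MGUCell", "LEM"]\n'

print(missing_exports(cells_init, top_init))  # prints ['LEMCell']
```

The comparison is one-directional because the top-level package may legitimately export more than the cells package, for example benchmark helpers. An `__all__` built with `+=` or computed at import time would not be detected by this sketch.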