From d47fcd466d376e0a6f1b161dca59c01806763701 Mon Sep 17 00:00:00 2001
From: Marius Lange <marius.lange@bsse.ethz.ch>
Date: Fri, 24 Apr 2026 14:39:28 -0600
Subject: [PATCH 1/3] docs: adopt agent-neutral AGENTS.md + REVIEW_GUIDE.md
 pattern
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Mirror the agent-neutral documentation structure from avanti:

- AGENTS.md — canonical source of truth for agents: trust order,
  "Where To Find What" table delegating to existing docs, 12 critical
  invariants sourced directly from the current source code, and
  development commands.
- REVIEW_GUIDE.md — standalone PR review guide for GitHub-launched
  agents. No references to local user settings or machine-specific
  paths. Covers review-first workflow, high-risk areas, a changed-path
  test-lookup table, testing/docs-impact checklists, and a review
  checklist.
- CLAUDE.md — thin entry point that includes AGENTS.md and points to
  REVIEW_GUIDE.md.
- .github/PULL_REQUEST_TEMPLATE.md — structured template matching
  avanti's: summary, behavior/invariants changed, tests run, reviewer
  focus, context, open questions.
- .github/copilot-instructions.md — shrunk to a 5-line shim pointing
  to AGENTS.md and REVIEW_GUIDE.md. The previous long-form content is
  either captured in AGENTS.md (as verified invariants) or delegated
  to existing docs (README.md, docs/api.md, docs/notebooks/tutorials/,
  docs/contributing.md) — no duplication.
---
 .github/PULL_REQUEST_TEMPLATE.md |  24 ++++
 .github/copilot-instructions.md  | 207 +------------------------------
 AGENTS.md                        |  71 +++++++++++
 CLAUDE.md                        |   5 +
 REVIEW_GUIDE.md                  | 138 +++++++++++++++++++++
 5 files changed, 242 insertions(+), 203 deletions(-)
 create mode 100644 .github/PULL_REQUEST_TEMPLATE.md
 create mode 100644 AGENTS.md
 create mode 100644 CLAUDE.md
 create mode 100644 REVIEW_GUIDE.md

diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
new file mode 100644
index 0000000..4d095fd
--- /dev/null
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1,24 @@
+## Summary
+
+Describe the change in 2-5 sentences.
+
+## Behavior Or Invariants Changed
+
+Call out any behavior changes and any repo invariants touched.
+If none, say "None".
+
+## Tests Run
+
+List the commands you ran and anything you intentionally did not run.
+
+## Reviewer Focus
+
+Point reviewers to the highest-risk files, code paths, or assumptions.
+
+## Context
+
+Add any domain context, dataset assumptions, or implementation background that is not obvious from the diff.
+
+## Open Questions Or Follow-Ups
+
+List anything intentionally deferred, uncertain, or worth extra scrutiny.
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
index 4f2fef8..4148fc4 100644
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -1,206 +1,7 @@
 # Copilot Instructions for CellMapper
 
-## Important Notes
-- Avoid drafting summary documents or endless markdown files. Just summarize in chat what you did, why, and any open questions.
-- Don't update Jupyter notebooks - those are managed manually.
-- When running terminal commands, use `uv run` to execute commands within the project's virtual environment (e.g., `uv run python script.py`).
-- **Testing: ALWAYS use `hatch test`, NEVER `uv run pytest` or standalone pytest.** Hatch manages the test matrix (Python versions, dependencies) that CI uses. See "Testing Strategy" section for details.
-- Rather than making assumptions, ask for clarification when uncertain.
-- **GitHub workflows**: Use GitHub CLI (`gh`) when possible. For GitHub MCP server tools, ensure Docker Desktop is running first (`open -a "Docker Desktop"`).
+Canonical repo guidance lives in:
+- `AGENTS.md` — architecture, invariants, and validation commands
+- `REVIEW_GUIDE.md` — PR review workflow, risk areas, and changed-path test lookup
 
-## Project Overview
-
-**CellMapper** is a k-NN-based tool for mapping cells across representations to transfer labels, embeddings, and expression values. It works for millions of cells, on CPU and GPU, across molecular modalities, between spatial and non-spatial data. The core idea is to separate the method (k-NN graph with kernels) from the application (mapping across arbitrary representations).
-
-### Domain Context (Brief)
-- **AnnData**: Standard single-cell data structure. Contains `.X`, `.obs`, `.var`, `.obsm` (embeddings), `.layers`.
-- **k-NN mapping**: Compute k-nearest neighbors between query and reference datasets, apply graph kernel to create mapping matrix, use it to transfer labels/embeddings/expression.
-- **Joint embeddings**: CellMapper expects pre-computed joint embeddings in `.obsm` from tools like scVI, scANVI, GimVI, ENVI, GLUE, or implements baseline methods (PCA, CCA).
-- **Use cases**: Transfer labels from dissociated to spatial data, map embeddings between datasets, compute presence scores in atlases, identify spatial niches, evaluate mapping quality.
-
-### Key Dependencies
-- **Core**: anndata, scanpy, numpy, pandas, scipy, scikit-learn
-- **k-NN backends**: pynndescent, sklearn, faiss (CPU/GPU), rapids (GPU)
-- **Optional**: squidpy (for spatial), scvi-tools, harmony-pytorch (for tutorials)
-
-## Architecture & Code Organization
-
-### Module Structure (follows scverse conventions)
-- Use `AnnData` objects as primary data structure
-- Type annotations use modern syntax: `str | None` instead of `Optional[str]`
-- Supports Python 3.11, 3.12, 3.13 (see `pyproject.toml`)
-- Avoid local imports unless necessary for circular import resolution
-
-### Core Components
-1. **`src/cellmapper/model/cellmapper.py`**: Main `CellMapper` class with `map()` method
-   - Inherits from `EvaluationMixin` and `EmbeddingMixin`
-   - Handles both query-to-reference and self-mapping modes
-   - Core methods: `map()`, `map_obs()`, `map_obsm()`, `map_layers()`
-2. **`src/cellmapper/model/neighbors.py`**: k-NN graph computation with multiple backends
-3. **`src/cellmapper/model/kernel.py`**: Graph kernels for creating mapping matrices
-4. **`src/cellmapper/model/mapping_operator.py`**: Encapsulates mapping matrix with matrix powers for diffusion
-5. **`src/cellmapper/model/evaluate.py`**: Metrics for evaluating label/expression transfer quality
-6. **`src/cellmapper/model/embedding.py`**: Baseline joint embedding methods (PCA, CCA)
-7. **`src/cellmapper/utils.py`**: Utilities (library size adjustment, imputed data creation)
-
-## Development Workflow
-
-### Environment Management (uv-based)
-```bash
-# Create/sync virtual environment
-uv sync                        # install project with default dependencies
-uv sync --extra test           # include test dependencies
-uv sync --extra doc            # include documentation dependencies
-uv sync --all-extras           # include all optional dependencies
-
-# Run commands in virtual environment
-uv run python script.py        # run any Python script
-uv run pytest tests/           # run tests directly (alternative to hatch)
-
-# Testing via hatch (recommended, runs test matrix, uses uv internally)
-hatch test                     # test with highest Python version
-hatch test --all               # test all Python 3.11, 3.13, pre-release deps
-
-# Documentation
-hatch run docs:build           # build Sphinx docs
-hatch run docs:open            # open in browser
-hatch run docs:clean           # clean build artifacts
-
-# Environment inspection
-hatch env show                 # list environments
-```
-
-### Testing Strategy
-- Test matrix defined in `[[tool.hatch.envs.hatch-test.matrix]]` in `pyproject.toml`
-- Tests Python 3.11 & 3.13 with stable deps, 3.13 with pre-release deps
-- CI extracts test config from pyproject.toml (`.github/workflows/test.yaml`)
-- Tests live in `tests/`, fixtures in `tests/conftest.py`
-- **Always run tests via `hatch test`**, NOT standalone pytest
-
-### Code Quality Tools
-- **Ruff**: Linting and formatting (120 char line length)
-- **Biome**: JSON/JSONC formatting with trailing commas
-- **Pre-commit**: Auto-runs ruff, biome. Install with `pre-commit install`
-- Use `git pull --rebase` if pre-commit.ci commits to your branch
-
-## Documentation Conventions
-
-### Docstring Style (NumPy format via Napoleon)
-```python
-def map_obs(
-    self,
-    obs_keys: str | list[str],
-    *,  # keyword-only marker
-    prediction_postfix: str = "_predicted",
-    confidence_postfix: str = "_confidence",
-) -> pd.DataFrame:
-    """Short one-line description.
-
-    Extended description if needed.
-
-    Parameters
-    ----------
-    obs_keys
-        Keys in reference.obs to transfer to query.
-    prediction_postfix
-        Suffix for predicted column names.
-    confidence_postfix
-        Suffix for confidence score column names.
-
-    Returns
-    -------
-    DataFrame with transferred labels and confidence scores.
-    """
-```
-
-### Sphinx & Documentation
-- API docs auto-generated from `docs/api.md` using `autosummary`
-- Tutorials in `docs/notebooks/tutorials/` rendered via myst-nb (`.ipynb` only)
-- Add external packages to `intersphinx_mapping` in `docs/conf.py`
-- See `docs/contributing.md` for detailed documentation guidelines
-
-## Key Configuration Files
-
-### `pyproject.toml`
-- **Build**: `hatchling` with `hatch-vcs` for git-based versioning
-- **Dependencies**: Minimal runtime deps; optional extras for `[test]`, `[doc]`, `[tutorials]`
-- **Ruff**: 120 char line length, NumPy docstring convention
-- **Test matrix**: Python 3.11 & 3.13 (stable), 3.13 (pre-release)
-
-### Version Management
-- Version from git tags via `hatch-vcs`
-- Release: Create GitHub release with tag `vX.X.X`
-- Follows **Semantic Versioning**
-
-## Project-Specific Patterns
-
-### Basic Usage Pattern
-```python
-from cellmapper import CellMapper
-
-# Assume query and reference have joint embedding in .obsm["X_joint"]
-cmap = CellMapper(query, reference).map(
-    use_rep="X_joint",
-    obs_keys="celltype",           # transfer labels
-    obsm_keys="X_umap",             # transfer UMAP
-    layer_key="counts",             # transfer expression
-)
-
-# Self-mapping (for spatial contextualization, denoising)
-cmap_self = CellMapper(query).map(
-    use_rep="X_pca",
-    layer_key="counts",
-)
-```
-
-### k-NN Backends
-- **pynndescent**: Fast approximate k-NN, CPU-only
-- **sklearn**: Exact k-NN, CPU-only, slower for large datasets
-- **faiss**: Exact/approximate k-NN, supports CPU and GPU (via faiss-gpu)
-- **rapids**: GPU-accelerated k-NN using cuML
-
-### Mapping Workflow
-1. Compute k-NN graph between query and reference (or self)
-2. Apply kernel to k-NN graph to create mapping matrix M
-3. Transfer data: `query_data = M @ reference_data`
-4. Optionally apply matrix powers `M^t` for diffusion
-5. Evaluate transfer quality with metrics
-
-### AnnData Conventions
-- Check matrix format: `adata.X` may be sparse or dense
-- Use `adata.layers[key]` for alternative representations (e.g., counts, log-normalized)
-- Joint embeddings stored in `adata.obsm["X_<method>"]`
-- Transferred data goes back into query's `.obs`, `.obsm`, `.layers`
-
-### Testing with AnnData
-```python
-# From conftest.py - example fixture pattern
-@pytest.fixture
-def adata_spatial():
-    """Small spatial AnnData object with spatial coordinates."""
-    adata = ad.AnnData(
-        X=np.random.randn(100, 50).astype(np.float32),
-        obs=pd.DataFrame({"celltype": ["A", "B"] * 50}),
-        obsm={"spatial": np.random.rand(100, 2)},
-    )
-    sc.pp.pca(adata)
-    return adata
-```
-
-## Common Gotchas
-
-1. **Hatch for testing**: Always use `hatch test`, never standalone `pytest`. CI matches hatch test matrix.
-2. **Joint embeddings required**: Most use cases require pre-computed joint embedding in `.obsm`. Don't assume PCA is sufficient for complex mappings.
-3. **Sparse matrices**: Check `scipy.sparse.issparse(adata.X)` before operations. Mapping matrices are typically dense.
-4. **Self-mapping mode**: If `reference` is `None` or same as `query`, automatically enters self-mapping mode.
-5. **k-NN backends**: faiss requires `faiss-cpu` or `faiss-gpu`, rapids requires CUDA environment. Handle gracefully with fallbacks.
-6. **Pre-commit conflicts**: Use `git pull --rebase` to integrate pre-commit.ci fixes.
-7. **Line length**: Ruff set to 120 chars, but keep docstrings readable (~80 chars per line).
-
-## Related Resources
-
-- **Contributing guide**: `docs/contributing.md`
-- **Tutorials**: `docs/notebooks/tutorials/`
-- **scanpy docs**: https://scanpy.readthedocs.io/
-- **faiss docs**: https://github.com/facebookresearch/faiss
-- **squidpy docs**: https://squidpy.readthedocs.io/ (for spatial analysis)
+If this file conflicts with those guides, the guides win.
diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000..b41ac97
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,71 @@
+# AGENTS.md — CellMapper
+
+CellMapper is a Python package for k-NN-based mapping of cells across
+representations to transfer labels, embeddings, and expression values. It works
+for millions of cells, on CPU and GPU, across molecular modalities, and between
+spatial and non-spatial data. The core idea is to separate the method (k-NN
+graph + kernel → mapping matrix) from the application (transfer across arbitrary
+representations).
+Key frameworks: AnnData/Scanpy, numpy/scipy, scikit-learn, pynndescent, faiss,
+RAPIDS cuML.
+
+## Trust Order
+
+When sources disagree:
+1. PR description and changed code
+2. This file (`AGENTS.md`)
+3. `REVIEW_GUIDE.md`
+4. Tests and fixtures
+5. Public docs in `docs/` and `README.md`
+
+Every fact should have one owner. This file owns invariants and the reference
+table below — everything else is delegated.
+
+## Where To Find What
+
+| Topic | Source of truth |
+|-------|----------------|
+| User-facing overview, use cases, quickstart | `README.md` |
+| Public API reference | `docs/api.md` and autosummary under `docs/generated/` |
+| Tutorials (query→reference, spatial mapping, spatial smoothing, data denoising) | `docs/notebooks/tutorials/` |
+| Contributor setup, environments, docs build | `docs/contributing.md` |
+| Release notes | `docs/changelog.md` |
+| PR review workflow and risk areas | `REVIEW_GUIDE.md` |
+| Test fixtures | `tests/conftest.py`, `tests/data/` |
+| Kernel taxonomy (which kernels, which are self-mapping-only) | `src/cellmapper/constants.py` |
+| Optional-dependency gating | `src/cellmapper/check.py` |
+| Method-level behavior (parameters, return shapes) | docstrings in `src/cellmapper/model/` |
+
+## Critical Invariants
+
+- **Self-mapping mode** activates when `reference is None` **or** `reference is query` (object identity). The second case logs a warning. See `CellMapper.__init__` in `src/cellmapper/model/cellmapper.py`.
+- **Reference is read-only.** `.map()` never mutates the reference AnnData. `query` is mutated in place for `map_obs` / `map_obsm`. Expression transfer produces a separate `query_imputed` AnnData object, not a view.
+- **Output key naming** follows `{key}{prediction_postfix}` and `{key}{confidence_postfix}` in `query.obs` / `query.obsm`. Postfixes are user-controllable via `.map()` kwargs.
+- **`.map()` auto-chains** `compute_neighbors` → `compute_mapping_matrix` → `map_obs/obsm/layers` based on missing state. Callers that use these methods directly must respect that ordering.
+- **Mapping matrix is row-stochastic CSR** (`scipy.sparse.csr_matrix`, float32). Rows are normalized to sum to 1; zero-neighbor rows are left as-is. Changes to this contract affect every downstream transfer.
+- **Matrix powers `t > 1` are self-mapping-only** (`MappingOperator._validate_power` raises otherwise). Iterative mode preserves input sparsity; spectral mode always returns dense. A warning suggests spectral when `t > 10`.
+- **Optional k-NN backends fail fast.** `check.check_deps()` is called at backend construction with clear install hints — no silent fallback. Supported backends: `sklearn`, `pynndescent`, `faiss-cpu`, `faiss-gpu`, `rapids`. `sklearn` warns above 50k cells.
+- **Kernel taxonomy lives in `constants.py`** (`JACCARD_BASED_KERNELS`, `CONNECTIVITY_BASED_KERNELS`, `SELF_MAPPING_ONLY_KERNELS`). The `umap` kernel requires a square neighbor matrix.
+- **`Neighbors` strips self-edges** from storage for square matrices; `n_neighbors` counts non-self neighbors.
+- **`query_imputed` is always assembled via `utils.create_imputed_anndata`**. Setter accepts `AnnData | ndarray | csr_matrix | DataFrame | None`; result has `obs`/`obsm` from query and `var`/`varm` from reference.
+- **Public API surface = `__init__.py` `__all__`**: `CellMapper`, `Kernel`, `Neighbors`, `logger`. `EvaluationMixin` and `EmbeddingMixin` are internal. Do not re-export helpers from the top-level package.
+- **Tests mirror source layout.** `src/cellmapper/X.py` → `tests/test_X.py`; `src/cellmapper/model/X.py` → `tests/model/test_X.py`. Integration paths: `tests/model/test_query_to_reference_mapping.py` and `tests/model/test_self_mapping.py`.
+
+## Development Commands
+
+Python 3.11 and newer.
+
+```bash
+hatch test                        # run tests (highest Python)
+hatch test --all                  # full matrix
+hatch run docs:build              # build Sphinx docs
+pre-commit run --all-files        # lint and format
+```
+
+Focused runs (with `uv`):
+
+```bash
+uv run pytest tests/model
+uv run pytest tests/model/test_mapping_operator.py
+uv run pytest tests/test_utils.py
+```
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000..d1292a6
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,5 @@
+# CellMapper Agent Entry Point
+
+@AGENTS.md
+
+Use `REVIEW_GUIDE.md` for automated PR review workflow, risk areas, and changed-path test lookup.
diff --git a/REVIEW_GUIDE.md b/REVIEW_GUIDE.md
new file mode 100644
index 0000000..f41c1eb
--- /dev/null
+++ b/REVIEW_GUIDE.md
@@ -0,0 +1,138 @@
+# CellMapper Review Guide
+
+This file is the canonical, agent-neutral source of truth for automated PR
+review in this repo. It is written for **agents performing PR reviews on
+GitHub** — use the imperative voice and be concrete.
+
+**Scope: review only.** Your job is to produce review comments and suggestions
+on the PR. Do **not** push commits, modify files, or apply fixes yourself. Any
+changes are the author's call. Flag issues, ask questions, and suggest concrete
+diffs in comments when helpful — but leave the decision and the edits to the
+user.
+
+Use `AGENTS.md` for architecture, invariants, and commands. Use this guide for
+review workflow, risk areas, testing checks, documentation-impact checks, and
+test lookup.
+
+## Review-First Workflow
+
+1. Read the PR body first when it is present.
+2. Check CI status (`gh pr checks <num>`, `gh run view <run-id> --log-failed`) and investigate any test or lint failures before commenting.
+3. Identify changed modules and map them to matching tests (see [Changed-Path Test Lookup](#changed-path-test-lookup)).
+4. Check whether the change touches a repo invariant from `AGENTS.md`.
+5. Prioritize behavioral regressions (mapping-matrix semantics, key-naming contract, optional-dep gating) over style feedback.
+6. Verify that docs (human- and agent-facing) did not become stale — see [Documentation Impact](#documentation-impact).
+
+## High-Risk Areas
+
+- **Mapping matrix semantics:**
+  the matrix is a row-stochastic CSR float32. Changes to normalization, dtype,
+  or sparsity handling silently affect every transfer.
+- **Matrix powers and diffusion:**
+  `t > 1` is gated to self-mapping mode via `MappingOperator._validate_power`.
+  Iterative preserves sparsity; spectral always returns dense. Changes here can
+  break the denoising / spatial-smoothing workflows.
+- **Self-mapping detection:**
+  activates on `reference is None` or `reference is query`. Changes to the
+  identity check or to the warning path can regress self-mapping workflows.
+- **Kernel taxonomy:**
+  new kernels must be registered in `src/cellmapper/constants.py`
+  (`JACCARD_BASED_KERNELS`, `CONNECTIVITY_BASED_KERNELS`,
+  `SELF_MAPPING_ONLY_KERNELS`). `umap` requires square neighbor matrices.
+- **k-NN backend gating:**
+  all optional backends (`faiss-cpu`, `faiss-gpu`, `rapids`, `pynndescent`)
+  must route through `check.check_deps()` and fail with a clear install hint.
+  No silent fallback.
+- **Output contract on AnnData:**
+  results land on `query` only (never `reference`), using
+  `{key}{prediction_postfix}` / `{key}{confidence_postfix}`. Renaming these or
+  changing postfix defaults breaks downstream user code and tutorials.
+- **`query_imputed` construction:**
+  all paths must go through `utils.create_imputed_anndata`. Bypassing it
+  breaks the `obs`/`obsm` from query + `var`/`varm` from reference contract.
+- **Public API surface:**
+  only symbols in `src/cellmapper/__init__.py` `__all__` are public. Flag new
+  re-exports and prefer keeping internals (e.g. `EvaluationMixin`,
+  `EmbeddingMixin`) private.
+
+## Changed-Path Test Lookup
+
+Tests mirror the source tree.
+
+| Changed path | Matching tests |
+|--------------|----------------|
+| `src/cellmapper/model/cellmapper.py` | `tests/model/test_query_to_reference_mapping.py`, `tests/model/test_self_mapping.py` |
+| `src/cellmapper/model/kernel.py` | `tests/model/test_kernel.py` |
+| `src/cellmapper/model/mapping_operator.py` | `tests/model/test_mapping_operator.py` |
+| `src/cellmapper/model/neighbors.py` | `tests/model/test_neighbors.py` |
+| `src/cellmapper/model/embedding.py` | `tests/model/test_embedding.py` |
+| `src/cellmapper/model/evaluate.py` | `tests/model/test_evaluate.py` |
+| `src/cellmapper/model/_knn_backend.py` | `tests/model/test_neighbors.py`, `tests/model/test_kernel.py` |
+| `src/cellmapper/check.py` | `tests/test_check.py` |
+| `src/cellmapper/utils.py` | `tests/test_utils.py` |
+| Any end-to-end behavioral change | also check `tests/test_basic.py` |
+
+Cross-cutting fixture changes: inspect `tests/conftest.py` and `tests/data/`.
+
+## Testing
+
+Apply these checks whenever the PR touches code or tests.
+
+**New code.** Confirm that new behavior is covered by tests.
+- Reuse fixtures from `tests/conftest.py` rather than creating parallel ones.
+- Prefer `pytest.mark.parametrize` over many near-identical tests.
+- Favor few meaningful tests over many redundant ones; flag low-value tests that only duplicate existing coverage.
+
+**Failing tests.** If CI is red, do not wave it through.
+- Inspect which tests fail and why (`gh pr checks`, `gh run view --log-failed`).
+- Distinguish critical regressions (mapping-matrix semantics, key-naming contract, backend gating) from trivial or flaky failures.
+- Surface critical failures back to the author and ask them to fix before merge.
+
+**Modified tests.** Scrutinize *how* existing tests were changed.
+- PRs that only relax thresholds, remove assertions, delete cases, or loosen `parametrize` matrices are a red flag — tests-working-around-tests defeats the purpose.
+- Require an explicit justification in the PR body for any weakened assertion; do not accept silently.
+
+## Documentation Impact
+
+A single behavioral or API change often touches docs in multiple places. Check
+both audiences and ask the author to update what is stale. Point to the
+**owning file** for each topic (see the `AGENTS.md` "Where To Find What" table)
+rather than duplicating content in your review.
+
+**Human-facing docs (`docs/`, Sphinx/RTD).**
+- API signature or public symbol changes → `docs/api.md` and autosummary entries; any prose referencing the symbol.
+- Contributor workflow, environment, or build changes → `docs/contributing.md`.
+- Tutorials under `docs/notebooks/tutorials/` → flag stale imports, outputs, or API usage.
+- Method-level behavior → docstrings in `src/cellmapper/`.
+
+**Agent-facing docs (repo root and `.github/`).**
+- Invariants or development commands changed → `AGENTS.md` (Critical Invariants, Development Commands).
+- Review workflow, risk areas, or testing conventions changed → `REVIEW_GUIDE.md` (this file).
+- Repo structure, new top-level docs, or moved pointers → `AGENTS.md` "Where To Find What" table, `CLAUDE.md`, `.github/copilot-instructions.md`.
+
+If behavior changes but the relevant docs do not, call it out explicitly in the
+review and request the update.
+
+## Review Checklist
+
+- Does the change preserve the invariants in `AGENTS.md`?
+- Does CI pass, and were any failures investigated? (See [Testing](#testing).)
+- Is test coverage adequate and non-redundant, and are modified tests not simply weakened? (See [Testing](#testing).)
+- Does it alter the mapping-matrix contract (row-stochastic, CSR, float32) or the matrix-power self-mapping gate?
+- Does it change the output key-naming contract or the `query_imputed` construction path?
+- Are optional-dependency imports properly gated via `check.check_deps()`?
+- Are all affected human- and agent-facing docs updated? (See [Documentation Impact](#documentation-impact).)
+- Is the PR scope tight — no unrelated changes bundled in — and is the public API surface kept minimal?
+
+## PR Metadata
+
+This repo uses a structured PR template
+(`.github/PULL_REQUEST_TEMPLATE.md`).
+
+Reviewers and agents should treat these sections as the preferred summary surface:
+- summary
+- behavior or invariants changed
+- tests run
+- reviewer focus
+- context
+- open questions or follow-ups

From bd51e64b9c4588c0c44e97f177aef06875425b9b Mon Sep 17 00:00:00 2001
From: Marius Lange <marius.lange@bsse.ethz.ch>
Date: Fri, 24 Apr 2026 14:53:58 -0600
Subject: [PATCH 2/3] docs: verify AGENTS.md against code and deduplicate
 REVIEW_GUIDE.md
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Audit pass after the initial port. Two concerns:

1. AGENTS.md had hardcoded values ("sklearn warns above 50k cells",
   "warning suggests spectral when t > 10") that actually live in
   `PackageConstants.SKLEARN_WARNING_CUTOFF` and
   `SPECTRAL_METHOD_THRESHOLD`. Encoding them here means two places
   to update and silent drift. Replaced with a pointer to
   `src/cellmapper/constants.py` in the "Where To Find What" table;
   dropped the hardcoded numbers.
2. REVIEW_GUIDE.md's "High-Risk Areas" restated 7 of 12 invariants
   from AGENTS.md almost verbatim. Same unmaintainability problem as
   sinkhorn_embeddings: every invariant change needs two edits.

Changes:

- AGENTS.md: collapse "Iterative mode preserves sparsity; spectral
  returns dense" (implementation detail — stays in the
  mapping_operator docstring). Drop specific integration test file
  names from the tests invariant (they live in REVIEW_GUIDE's
  changed-path table). Add constants.py to the "Where To Find What"
  row so the SKLEARN_WARNING_CUTOFF / SPECTRAL_METHOD_THRESHOLD
  thresholds have a discoverable home.
- REVIEW_GUIDE.md: rewrite "High-Risk Areas" as file-path pointers
  with a short hazard, not restatements. Add explicit "Do not
  restate them here — link" note at the top. Trim the checklist and
  docs-impact sections to match. Net: 140 -> 83 lines.

Every surviving claim in AGENTS.md was checked against the actual
source:
- Self-mapping detection: `cellmapper.py` lines 41-48.
- Row-stochastic CSR float32: `mapping_operator.py`
  `_validate_and_normalize_mapping_matrix`, lines 206-211.
- `t > 1` self-mapping gate: `_validate_power`, lines 215-221.
- `check_deps` fail-fast: `check.py::check_deps`, line 96.
- Kernel taxonomy: `constants.py` lines 12-17.
- `Neighbors` self-edge stripping: `neighbors.py::__post_init__`
  line 72.
- `create_imputed_anndata` signature: `utils.py` line 17.
- Public API: `src/cellmapper/__init__.py` `__all__`.
---
 AGENTS.md       |  14 ++---
 REVIEW_GUIDE.md | 140 +++++++++++++-----------------------------------
 2 files changed, 45 insertions(+), 109 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index b41ac97..8aec807 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -32,24 +32,24 @@ table below — everything else is delegated.
 | Release notes | `docs/changelog.md` |
 | PR review workflow and risk areas | `REVIEW_GUIDE.md` |
 | Test fixtures | `tests/conftest.py`, `tests/data/` |
-| Kernel taxonomy (which kernels, which are self-mapping-only) | `src/cellmapper/constants.py` |
+| Kernel taxonomy and tunable thresholds (sklearn warning cutoff, spectral threshold) | `src/cellmapper/constants.py` |
 | Optional-dependency gating | `src/cellmapper/check.py` |
 | Method-level behavior (parameters, return shapes) | docstrings in `src/cellmapper/model/` |
 
 ## Critical Invariants
 
-- **Self-mapping mode** activates when `reference is None` **or** `reference is query` (object identity). The second case logs a warning. See `CellMapper.__init__` in `src/cellmapper/model/cellmapper.py`.
+- **Self-mapping mode** activates when `reference is None` **or** `reference is query` (object identity). See `CellMapper.__init__` in `src/cellmapper/model/cellmapper.py`.
 - **Reference is read-only.** `.map()` never mutates the reference AnnData. `query` is mutated in place for `map_obs` / `map_obsm`. Expression transfer produces a separate `query_imputed` AnnData object, not a view.
 - **Output key naming** follows `{key}{prediction_postfix}` and `{key}{confidence_postfix}` in `query.obs` / `query.obsm`. Postfixes are user-controllable via `.map()` kwargs.
 - **`.map()` auto-chains** `compute_neighbors` → `compute_mapping_matrix` → `map_obs/obsm/layers` based on missing state. Callers that use these methods directly must respect that ordering.
-- **Mapping matrix is row-stochastic CSR** (`scipy.sparse.csr_matrix`, float32). Rows are normalized to sum to 1; zero-neighbor rows are left as-is. Changes to this contract affect every downstream transfer.
-- **Matrix powers `t > 1` are self-mapping-only** (`MappingOperator._validate_power` raises otherwise). Iterative mode preserves input sparsity; spectral mode always returns dense. A warning suggests spectral when `t > 10`.
-- **Optional k-NN backends fail fast.** `check.check_deps()` is called at backend construction with clear install hints — no silent fallback. Supported backends: `sklearn`, `pynndescent`, `faiss-cpu`, `faiss-gpu`, `rapids`. `sklearn` warns above 50k cells.
-- **Kernel taxonomy lives in `constants.py`** (`JACCARD_BASED_KERNELS`, `CONNECTIVITY_BASED_KERNELS`, `SELF_MAPPING_ONLY_KERNELS`). The `umap` kernel requires a square neighbor matrix.
+- **Mapping matrix is row-stochastic CSR** (`scipy.sparse.csr_matrix`, float32). Rows are normalized to sum to 1; zero-neighbor rows are left as-is. See `MappingOperator._validate_and_normalize_mapping_matrix`.
+- **Matrix powers `t > 1` are self-mapping-only** (`MappingOperator._validate_power` raises otherwise).
+- **Optional k-NN backends fail fast.** `check.check_deps()` is called at backend construction with clear install hints — no silent fallback. Supported backends: `sklearn`, `pynndescent`, `faiss-cpu`, `faiss-gpu`, `rapids`.
+- **Kernel taxonomy lives in `constants.py`** (`JACCARD_BASED_KERNELS`, `CONNECTIVITY_BASED_KERNELS`, `SELF_MAPPING_ONLY_KERNELS`). Kernels in `SELF_MAPPING_ONLY_KERNELS` require a square neighbor matrix.
 - **`Neighbors` strips self-edges** from storage for square matrices; `n_neighbors` counts non-self neighbors.
 - **`query_imputed` is always assembled via `utils.create_imputed_anndata`**. Setter accepts `AnnData | ndarray | csr_matrix | DataFrame | None`; result has `obs`/`obsm` from query and `var`/`varm` from reference.
 - **Public API surface = `__init__.py` `__all__`**: `CellMapper`, `Kernel`, `Neighbors`, `logger`. `EvaluationMixin` and `EmbeddingMixin` are internal. Do not re-export helpers from the top-level package.
-- **Tests mirror source layout.** `src/cellmapper/X.py` → `tests/test_X.py`; `src/cellmapper/model/X.py` → `tests/model/test_X.py`. Integration paths: `tests/model/test_query_to_reference_mapping.py` and `tests/model/test_self_mapping.py`.
+- **Tests mirror source layout.** `src/cellmapper/X.py` → `tests/test_X.py`; `src/cellmapper/model/X.py` → `tests/model/test_X.py`.
 
 ## Development Commands
 
diff --git a/REVIEW_GUIDE.md b/REVIEW_GUIDE.md
index f41c1eb..c9267f2 100644
--- a/REVIEW_GUIDE.md
+++ b/REVIEW_GUIDE.md
@@ -1,66 +1,36 @@
 # CellMapper Review Guide
 
-This file is the canonical, agent-neutral source of truth for automated PR
-review in this repo. It is written for **agents performing PR reviews on
-GitHub** — use the imperative voice and be concrete.
+Agent-neutral PR review playbook. Written for **review agents running on GitHub** — use the imperative voice.
 
-**Scope: review only.** Your job is to produce review comments and suggestions
-on the PR. Do **not** push commits, modify files, or apply fixes yourself. Any
-changes are the author's call. Flag issues, ask questions, and suggest concrete
-diffs in comments when helpful — but leave the decision and the edits to the
-user.
+**Scope: review only.** Produce comments and suggestions. Do **not** push commits, modify files, or apply fixes. Flag issues and suggest diffs in comments; leave the edits to the author.
 
-Use `AGENTS.md` for architecture, invariants, and commands. Use this guide for
-review workflow, risk areas, testing checks, documentation-impact checks, and
-test lookup.
+Architecture, invariants, and commands live in `AGENTS.md`. Do not restate them here — link.
 
-## Review-First Workflow
+## Workflow
 
-1. Read the PR body first when it is present.
-2. Check CI status (`gh pr checks <num>`, `gh run view <run-id> --log-failed`) and investigate any test or lint failures before commenting.
-3. Identify changed modules and map them to matching tests (see [Changed-Path Test Lookup](#changed-path-test-lookup)).
-4. Check whether the change touches a repo invariant from `AGENTS.md`.
-5. Prioritize behavioral regressions (mapping-matrix semantics, key-naming contract, optional-dep gating) over style feedback.
-6. Verify that docs (human- and agent-facing) did not become stale — see [Documentation Impact](#documentation-impact).
+1. Read the PR body.
+2. Check CI (`gh pr checks <num>`, `gh run view <run-id> --log-failed`) and investigate failures before commenting.
+3. Map changed paths to tests (see below) and check whether the change touches an invariant from `AGENTS.md`.
+4. Prioritize behavioral regressions, numerical correctness, and public-contract changes over style.
 
 ## High-Risk Areas
 
-- **Mapping matrix semantics:**
-  the matrix is a row-stochastic CSR float32. Changes to normalization, dtype,
-  or sparsity handling silently affect every transfer.
-- **Matrix powers and diffusion:**
-  `t > 1` is gated to self-mapping mode via `MappingOperator._validate_power`.
-  Iterative preserves sparsity; spectral always returns dense. Changes here can
-  break the denoising / spatial-smoothing workflows.
-- **Self-mapping detection:**
-  activates on `reference is None` or `reference is query`. Changes to the
-  identity check or to the warning path can regress self-mapping workflows.
-- **Kernel taxonomy:**
-  new kernels must be registered in `src/cellmapper/constants.py`
-  (`JACCARD_BASED_KERNELS`, `CONNECTIVITY_BASED_KERNELS`,
-  `SELF_MAPPING_ONLY_KERNELS`). `umap` requires square neighbor matrices.
-- **k-NN backend gating:**
-  all optional backends (`faiss-cpu`, `faiss-gpu`, `rapids`, `pynndescent`)
-  must route through `check.check_deps()` and fail with a clear install hint.
-  No silent fallback.
-- **Output contract on AnnData:**
-  results land on `query` only (never `reference`), using
-  `{key}{prediction_postfix}` / `{key}{confidence_postfix}`. Renaming these or
-  changing postfix defaults breaks downstream user code and tutorials.
-- **`query_imputed` construction:**
-  all paths must go through `utils.create_imputed_anndata`. Bypassing it
-  breaks the `obs`/`obsm` from query + `var`/`varm` from reference contract.
-- **Public API surface:**
-  only symbols in `src/cellmapper/__init__.py` `__all__` are public. Flag new
-  re-exports and prefer keeping internals (e.g. `EvaluationMixin`,
-  `EmbeddingMixin`) private.
+Pointers only — see `AGENTS.md` for the actual invariants.
+
+- **Mapping-matrix construction** (`model/mapping_operator.py`): normalization, dtype, sparsity handling. Silent shifts possible without test failures.
+- **Matrix powers / diffusion** (`model/mapping_operator.py`, `_validate_power`, `_apply_iterative`, `_apply_spectral`): self-mapping gate, iterative-vs-spectral behavior.
+- **Self-mapping detection** (`model/cellmapper.py::CellMapper.__init__`): identity check drives all downstream mode-dependent logic.
+- **Kernel taxonomy** (`constants.py`, `model/kernel.py`, `model/neighbors.py`): new kernels must land in the right set in `constants.py`.
+- **k-NN backend gating** (`model/_knn_backend.py`, `check.py`): optional deps must route through `check.check_deps()`.
+- **AnnData output contract** (`model/cellmapper.py::map_obs / map_obsm / map_layers`): key naming, what gets written where, `query_imputed` construction via `utils.create_imputed_anndata`.
+- **Public API surface** (`src/cellmapper/__init__.py`): new re-exports commit the project to an API.
 
 ## Changed-Path Test Lookup
 
 Tests mirror the source tree.
 
-| Changed path | Matching tests |
-|--------------|----------------|
+| Changed path | Primary tests |
+|--------------|---------------|
 | `src/cellmapper/model/cellmapper.py` | `tests/model/test_query_to_reference_mapping.py`, `tests/model/test_self_mapping.py` |
 | `src/cellmapper/model/kernel.py` | `tests/model/test_kernel.py` |
 | `src/cellmapper/model/mapping_operator.py` | `tests/model/test_mapping_operator.py` |
@@ -70,69 +40,35 @@ Tests mirror the source tree.
 | `src/cellmapper/model/_knn_backend.py` | `tests/model/test_neighbors.py`, `tests/model/test_kernel.py` |
 | `src/cellmapper/check.py` | `tests/test_check.py` |
 | `src/cellmapper/utils.py` | `tests/test_utils.py` |
-| Any end-to-end behavioral change | also check `tests/test_basic.py` |
-
-Cross-cutting fixture changes: inspect `tests/conftest.py` and `tests/data/`.
+| End-to-end behavioral change | also `tests/test_basic.py` |
+| Fixture changes | `tests/conftest.py`, `tests/data/` |
 
 ## Testing
 
-Apply these checks whenever the PR touches code or tests.
-
-**New code.** Confirm that new behavior is covered by tests.
-- Reuse fixtures from `tests/conftest.py` rather than creating parallel ones.
-- Prefer `pytest.mark.parametrize` over many near-identical tests.
-- Favor few meaningful tests over many redundant ones; flag low-value tests that only duplicate existing coverage.
-
-**Failing tests.** If CI is red, do not wave it through.
-- Inspect which tests fail and why (`gh pr checks`, `gh run view --log-failed`).
-- Distinguish critical regressions (mapping-matrix semantics, key-naming contract, backend gating) from trivial or flaky failures.
-- Surface critical failures back to the author and ask them to fix before merge.
-
-**Modified tests.** Scrutinize *how* existing tests were changed.
-- PRs that only relax thresholds, remove assertions, delete cases, or loosen `parametrize` matrices are a red flag — tests-working-around-tests defeats the purpose.
-- Require an explicit justification in the PR body for any weakened assertion; do not accept silently.
+- **New code** should be covered. Reuse fixtures from `tests/conftest.py`; prefer `pytest.mark.parametrize`; favor few meaningful tests over many redundant ones.
+- **Failing CI** is not to be waved through. Distinguish critical regressions from flakes; escalate critical ones.
+- **Modified tests** — scrutinize *how*. Relaxed tolerances, removed assertions, deleted cases, or loosened matrices are red flags. Require explicit justification in the PR body.
 
 ## Documentation Impact
 
-A single behavioral or API change often touches docs in multiple places. Check
-both audiences and ask the author to update what is stale. Point to the
-**owning file** for each topic (see the `AGENTS.md` "Where To Find What" table)
-rather than duplicating content in your review.
+Behavior or API changes often touch docs in multiple places. Point at the **owning file** (see the `AGENTS.md` "Where To Find What" table) — don't duplicate content in the review.
 
-**Human-facing docs (`docs/`, Sphinx/RTD).**
-- API signature or public symbol changes → `docs/api.md` and autosummary entries; any prose referencing the symbol.
-- Contributor workflow, environment, or build changes → `docs/contributing.md`.
+- Public symbol / API changes → `docs/api.md`, autosummary, `README.md` quickstart, source docstrings.
+- Contributor workflow or env changes → `docs/contributing.md`.
 - Tutorials under `docs/notebooks/tutorials/` → flag stale imports, outputs, or API usage.
-- Method-level behavior → docstrings in `src/cellmapper/`.
-
-**Agent-facing docs (repo root and `.github/`).**
-- Invariants or development commands changed → `AGENTS.md` (Critical Invariants, Development Commands).
-- Review workflow, risk areas, or testing conventions changed → `REVIEW_GUIDE.md` (this file).
-- Repo structure, new top-level docs, or moved pointers → `AGENTS.md` "Where To Find What" table, `CLAUDE.md`, `.github/copilot-instructions.md`.
+- Invariants / commands → `AGENTS.md`.
+- Review workflow / risk areas / test lookup → this file.
+- `CLAUDE.md` and `.github/copilot-instructions.md` should stay thin pointers — flag any PR that re-adds content here.
 
-If behavior changes but the relevant docs do not, call it out explicitly in the
-review and request the update.
+## Checklist
 
-## Review Checklist
-
-- Does the change preserve the invariants in `AGENTS.md`?
-- Does CI pass, and were any failures investigated? (See [Testing](#testing).)
-- Is test coverage adequate and non-redundant, and are modified tests not simply weakened? (See [Testing](#testing).)
-- Does it alter the mapping-matrix contract (row-stochastic, CSR, float32) or the matrix-power self-mapping gate?
-- Does it change the output key-naming contract or the `query_imputed` construction path?
-- Are optional-dependency imports properly gated via `check.check_deps()`?
-- Are all affected human- and agent-facing docs updated? (See [Documentation Impact](#documentation-impact).)
-- Is the PR scope tight — no unrelated changes bundled in — and is the public API surface kept minimal?
+- Invariants in `AGENTS.md` preserved?
+- CI green (or failures investigated)?
+- Test coverage adequate and not silently weakened?
+- Public contracts (AnnData output, mapping matrix format, public API surface) unchanged — or explicitly called out in the PR body?
+- Affected human- and agent-facing docs updated?
+- PR scope tight, no unrelated bundling?
 
 ## PR Metadata
 
-This repo uses a structured PR template
-(`.github/PULL_REQUEST_TEMPLATE.md`).
-
-Reviewers and agents should treat these sections as the preferred summary surface:
-- summary
-- behavior or invariants changed
-- tests run
-- reviewer focus
-- context
-- open questions or follow-ups
+This repo uses `.github/PULL_REQUEST_TEMPLATE.md`. Treat its sections as the preferred summary surface.

From 0cd9555bec50437e543ca87c5ad2b16a025a0265 Mon Sep 17 00:00:00 2001
From: Marius Lange <marius.lange@bsse.ethz.ch>
Date: Fri, 24 Apr 2026 15:01:56 -0600
Subject: [PATCH 3/3] docs: fix three AGENTS.md inaccuracies after final code
 audit
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Mapping matrix is not always CSR. `_validate_and_normalize_mapping_matrix`
  returns CSR for sparse inputs and ndarray for dense inputs; both are
  float32, both row-stochastic. Replace "row-stochastic CSR" with the
  accurate "row-stochastic and float32. Sparse inputs are stored as CSR;
  dense inputs stay dense."
- Postfix claim was inaccurate. Only `prediction_postfix` is exposed on
  `.map()`; `confidence_postfix` is only on the per-method entrypoints
  (`map_obs`, `map_obsm`). Rephrase accordingly.
- Python version line was vague ("3.11 and newer"). Match the actual
  `hatch-test` matrix (3.11 and 3.14) and point readers to pyproject.toml.

No information lost from the original `.github/copilot-instructions.md`
that isn't already covered by delegation:
- Domain context / joint-embedding requirement → README.md.
- Ruff / docstring conventions → pyproject.toml + docs/contributing.md.
- `hatch-vcs` / semver release → docs/contributing.md.
- Outdated bits (mapping matrices "typically dense", `_predicted`/
  `_confidence` postfixes) are correctly dropped — the current code says
  sparse CSR / `_pred` / `_conf`.
---
 AGENTS.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index 8aec807..22452c4 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -40,9 +40,9 @@ table below — everything else is delegated.
 
 - **Self-mapping mode** activates when `reference is None` **or** `reference is query` (object identity). See `CellMapper.__init__` in `src/cellmapper/model/cellmapper.py`.
 - **Reference is read-only.** `.map()` never mutates the reference AnnData. `query` is mutated in place for `map_obs` / `map_obsm`. Expression transfer produces a separate `query_imputed` AnnData object, not a view.
-- **Output key naming** follows `{key}{prediction_postfix}` and `{key}{confidence_postfix}` in `query.obs` / `query.obsm`. Postfixes are user-controllable via `.map()` kwargs.
+- **Output key naming** follows `{key}{prediction_postfix}` and `{key}{confidence_postfix}` in `query.obs` / `query.obsm`. Postfixes are user-controllable on the per-method entrypoints (`map_obs`, `map_obsm`); `.map()` also exposes `prediction_postfix`.
 - **`.map()` auto-chains** `compute_neighbors` → `compute_mapping_matrix` → `map_obs/obsm/layers` based on missing state. Callers that use these methods directly must respect that ordering.
-- **Mapping matrix is row-stochastic CSR** (`scipy.sparse.csr_matrix`, float32). Rows are normalized to sum to 1; zero-neighbor rows are left as-is. See `MappingOperator._validate_and_normalize_mapping_matrix`.
+- **Mapping matrix is row-stochastic and float32.** Sparse inputs are stored as `scipy.sparse.csr_matrix`; dense inputs stay dense. Zero-neighbor rows are left as-is. See `MappingOperator._validate_and_normalize_mapping_matrix`.
 - **Matrix powers `t > 1` are self-mapping-only** (`MappingOperator._validate_power` raises otherwise).
 - **Optional k-NN backends fail fast.** `check.check_deps()` is called at backend construction with clear install hints — no silent fallback. Supported backends: `sklearn`, `pynndescent`, `faiss-cpu`, `faiss-gpu`, `rapids`.
 - **Kernel taxonomy lives in `constants.py`** (`JACCARD_BASED_KERNELS`, `CONNECTIVITY_BASED_KERNELS`, `SELF_MAPPING_ONLY_KERNELS`). Kernels in `SELF_MAPPING_ONLY_KERNELS` require a square neighbor matrix.
@@ -53,7 +53,7 @@ table below — everything else is delegated.
 
 ## Development Commands
 
-Python 3.11 and newer.
+Python 3.11 and 3.14 (see the `hatch-test` matrix in `pyproject.toml`).
 
 ```bash
 hatch test                        # run tests (highest Python)