Add incremental sparse RREF C-API (libsparse_rref_inc) by thibautbar · Pull Request #1 · thibautbar/SparseRREF

thibautbar · 2026-05-04T15:15:28Z

Summary

Adds a stateful, row-by-row sparse RREF API alongside the existing one-shot
sparse_mat_rref path, exposed as a separate translation unit
sprref_incremental.{cpp,h} that builds into a shared library
libsparse_rref_inc.{so,dylib}.

The motivation comes from project-numina/project-feynman:
SparseRREFBatchSolver there used to drive the sparserref CLI binary for
end-of-run RREF. That works great when is_solved is queried only once at
the end (check_at_end=True), but for flows that need per-step is_solved
(early termination, FIRE6-style dynamic seed expansion, intermediate logs,
pivot tracking — check_at_end=False) re-running the batch binary on every
step is O(N²). This branch lets project-feynman keep a persistent in-memory
basis that's updated row-by-row.

The legacy CLI (main.cpp) is unchanged. This is purely additive.

Public C API

sprref_inc_t* sprref_inc_init(uint64_t field_order, int n_threads);
void          sprref_inc_free(sprref_inc_t*);

void sprref_inc_resize_nvars(sprref_inc_t*, uint32_t);
void sprref_inc_set_masters(sprref_inc_t*, const uint32_t* cols, size_t n);
void sprref_inc_set_pivot_order(sprref_inc_t*, const uint32_t* cols, const uint64_t* keys, size_t n);
void sprref_inc_add_pivot_keys(sprref_inc_t*, const uint32_t* cols, const uint64_t* keys, size_t n);
void sprref_inc_add_pivot_keys2(sprref_inc_t*, const uint32_t* cols, const uint64_t* k0, const uint64_t* k1, size_t n);

/* Single-call insert (uses the C-side pivot key map). */
int  sprref_inc_insert(sprref_inc_t*, const uint32_t* cols, const uint64_t* vals,
                       size_t nnz, uint64_t rhs, uint32_t preferred_pivot);

/* Two-phase insert: forward elim in C, pivot choice in caller, then commit.
   Designed for callers whose natural pivot ordering is too rich to encode
   in fixed-width C-side keys (e.g. Python tuple Laporta keys). */
int  sprref_inc_insert_forward(sprref_inc_t*, const uint32_t* cols, const uint64_t* vals,
                               size_t nnz, uint64_t rhs,
                               uint32_t** out_candidate_cols, size_t* out_n);
void sprref_inc_commit_pivot(sprref_inc_t*, uint32_t pivot_col);
void sprref_inc_abort_pending(sprref_inc_t*);
void sprref_inc_buffer_free_u32(uint32_t* ptr);

/* Query: returns NOT_PIVOT / HAS_FREE / SOLVED + the stored row's expression. */
int  sprref_inc_is_solved(sprref_inc_t*, uint32_t var_idx,
                          const uint32_t* acceptable_free, size_t n_free,
                          uint32_t** out_cols, uint64_t** out_vals,
                          size_t* out_nnz, uint64_t* out_rhs);
void sprref_inc_buffer_free(uint32_t* cols, uint64_t* vals);

size_t   sprref_inc_rank(const sprref_inc_t*);
uint32_t sprref_inc_nvars(const sprref_inc_t*);
const char* sprref_inc_version(void);

State is held entirely in the opaque sprref_inc_t* handle (no globals), so
multiple handles can coexist for multiprocessing.Pool workers.

How it works

Forward elimination loops over pivot columns present in the incoming
row, saxpy-ing against each basis row until none remain. Modular arithmetic
uses 128-bit intermediate to support primes up to 2^63; modular inverse via
FLINT's n_invmod.
Pivot choice:
- The single-call insert uses the C-side pivot key map (smallest
  lex (key0, key1) wins, masters excluded, column index breaks ties).
- The two-phase insert_forward returns the candidate columns and lets
  the caller (e.g. Python) pick the pivot using its native ordering.
Eager backsub: after a new pivot is chosen and normalised, every
existing basis row touching that column is saxpy-ed to zero out the entry.
Result: the basis is in true RREF after every commit, making is_solved
a near-O(1) lookup.
Reverse index col_to_pivots: col → set<pivot_col> lets eager backsub
find affected basis rows in O(degree) instead of scanning the full basis.

Build

build_incremental.sh produces the shared library:

bash build_incremental.sh
# Output: libsparse_rref_inc.{so,dylib} in the current directory.

Honors CXX, FLINT_PREFIX, GMP_PREFIX env vars. Defaults to:

g++-15 on macOS (Apple Clang lacks <execution> parallel STL — same
caveat as main.cpp)
g++ on Linux
FLINT_PREFIX=/opt/homebrew on macOS, auto-detect on Linux
(/usr/include/flint → empty / system search paths; else /usr/local)

Tested on macOS arm64 (Homebrew) and intended for Ubuntu (apt
libflint-dev libgmp-dev libtbb-dev); the Linux side is exercised by
project-feynman's CI.

Performance (project-feynman play_tri2lx 7D, 5696 steps, mod 2017)

backend	wall time	step count	notes
`sparse_rref_batch` (CLI binary, batch)	13.75s	5696	reference; check_at_end=True
`sparse_rref_batch` + `mode=incremental`	15.55s	5696	this PR; check_at_end=False
`spotlight_solver_sparse` (pure Python)	20.73s	5696	per-step Python sparse RREF

→ Within ~13% of the highly-optimised batch path while supporting per-step
queries; ~25% faster than the pure-Python equivalent.

Pivot decisions match SpotlightSolverSparseGF bit-for-bit (verified by
project-feynman's BL2EM end-to-end test, which compares the resolved
target's coefficients on each master).

Commits in this branch

84ec9a3 — Scaffolding (M1): API surface, stub implementation, build script
5028015 — Forward elimination + insert + eager backsub (M2)
119adcc — is_solved + expression extraction (M3)
fb100a7 — 3-way is_solved return (NOT_PIVOT / HAS_FREE / SOLVED)
077f23c — add_pivot_keys for delta upserts (avoid full-replace traffic)
fb8160d — build_incremental.sh Linux/Ubuntu auto-detect
5f501f2 — Two-phase insert_forward + commit_pivot (caller-side pivot choice)

Notes for future work

Numerator-powers tuple in C-side ordering: the optional 2-uint64 key
pair (add_pivot_keys2) packs scalars + denom_powers but drops
numerator_powers. Workloads that need that field for tie-breaking should
prefer the two-phase API. A byte-buffer key API could be added if needed.
Multi-threading: n_threads is accepted but unused (M2 is
single-threaded); FLINT's parallel kernels could be wired into saxpy /
backsub later.
Upstream PR: this is a self-PR on the fork to track the work for
project-feynman; opening an upstream PR against munuxi/sparse_rref is
a separate follow-up if you want this contributed back.

Introduces a new translation unit `sprref_incremental.cpp` (with public header `sprref_incremental.h`) exposing a C-callable API for stateful, row-by-row sparse RREF over GF(p): sprref_inc_init / free sprref_inc_resize_nvars / set_masters / set_pivot_order sprref_inc_insert sprref_inc_is_solved / buffer_free sprref_inc_rank / nvars / version State lives in an opaque sprref_inc_t* handle (no globals; multiple handles can coexist for mp.Pool workers). Designed to be loaded from Python via ctypes by project-feynman's SparseRREFBatchSolver in "incremental" mode, complementing the existing batch CLI flow. This commit is the M1 scaffolding only: the API surface compiles into a shared library and round-trips through ctypes, but the algorithmic core (forward elimination, lazy backward substitution, expression extraction) is stubbed and will land in M2/M3. Build: bash build_incremental.sh (uses g++-15 + flint + gmp + tbb on macOS arm64; honors CXX/FLINT_PREFIX/GMP_PREFIX env vars). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Fills in the algorithmic core left stubbed by M1: forward_eliminate loops over pivot columns present in the incoming row, saxpy'ing against each basis row until none remain. Robust to any RREF state (true RREF or row-echelon). choose_pivot honors the new preferred_pivot parameter (matches SpotlightSolverSparseGF semantics — used for sector operator relations); otherwise picks the lowest pivot-key non-master column (column index breaks ties for determinism). normalize_around_pivot scales the row by inverse(row[pivot]) mod p using FLINT's n_invmod, then drops the pivot entry (its coefficient is implicitly 1). eager backsub for each existing basis row touching the new pivot column (found via col_to_pivots reverse index), saxpy to zero out that entry. Result: basis is left in true RREF after every insert, making is_solved a near-O(1) lookup in M3. API change: sprref_inc_insert gains a `preferred_pivot` parameter (SPRREF_INC_NO_PREF sentinel = no preference). Bumps version to v0.1.0. Modular arithmetic uses 128-bit intermediate for primes up to 2^63. Verified end-to-end via ctypes smoke tests: - independent / dependent / inconsistent classification - rank monotonicity - master columns excluded from pivot selection - eager backsub correctness (manual trace confirmed for x0+x1=1, x0-x1=3 reducing redundant 2x0=4 to dependent and x0=5 to inconsistent against the resolved basis x0=2, x1=16 mod 17). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Now that the basis is maintained in true RREF after every insert (M2's eager backsub), is_solved is a near-O(1) lookup: - var_idx not in basis -> NOT_SOLVED, no buffers populated - var_idx is a pivot: * count off-pivot entries that are neither masters nor in acceptable_free; if any, return NOT_SOLVED (but still emit the buffers, which is useful for intermediate logs / pretty-printing in fire6 mode) * otherwise return SOLVED with full expression Output buffers are heap-allocated via std::malloc and must be released through sprref_inc_buffer_free. nnz==0 returns NULL pointers (no allocation). Coefficients are returned as STORED (positive mod p); the Python wrapper applies (-coeff) % p in M4 to match the existing solver-result convention. Smoke-tested via ctypes: - 2x2 closed-form system (x0+x1=1, x0-x1=3 -> x0=2, x1=16 mod 17) - master columns acceptable as free - acceptable_free overrides - eager backsub correctness on a 3-var system with chained inserts - preferred_pivot honored (forces pivot=1 when normally 0 would win) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The previous 2-state return (SOLVED vs NOT_SOLVED) couldn't distinguish "var_idx is not a pivot at all" from "var_idx is a pivot but has free columns left". Both look the same to a Python caller that needs to mimic SpotlightSolverSparseGF.is_solved (which returns a string sentinel for non-pivots and a dict for pivots-with-free). Replace with three values: SPRREF_INC_NOT_PIVOT (0) var has no basis row SPRREF_INC_HAS_FREE (1) is a pivot, has unaccounted free columns SPRREF_INC_SOLVED (2) is a pivot, fully reduced Output buffers are still populated whenever var_idx is a pivot (HAS_FREE or SOLVED), enabling intermediate-log use cases. NOT_PIVOT leaves them untouched. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Companion to sprref_inc_set_pivot_order, this upserts (col -> key) entries without clearing the existing map. Designed for monotonically growing pivot orderings (Laporta key inserted as each new integral is added), where the previous full-replace API forced O(n) traffic per step. The Python wrapper drives this via add_pivot_keys + a packed uint64 encoding of the Laporta key — together this fixes the dominant hot path on tri2lx 7D (set_pivot_order was 76% of total runtime, profiled with cProfile; eliminating the full-replace + Python sort gives a 2.6x speedup end-to-end). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

On Ubuntu/Debian where libflint-dev is installed via apt, headers live in /usr/include/flint and libs in a multiarch path that g++ already searches by default. Forcing -I/usr/local/include / -L/usr/local/lib breaks those installs. The script now: - macOS: keep the existing /opt/homebrew default - Linux with /usr/include/flint present: leave include/lib flags empty and rely on the compiler's default search paths - Otherwise: fall back to /usr/local for source-built FLINT installs This unblocks CI on ubuntu-latest where libflint-dev is the simplest install path. macOS arm64 Homebrew users see no change (verified locally — same .dylib output). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replaces the C-side `choose_pivot` (which compared opaque uint64 keys that couldn't fully encode a Python tuple-keyed Laporta ordering) with a two-phase split: sprref_inc_insert_forward Phase 1: forward eliminate the row against existing pivots, stash the reduced row in the handle's pending slot, and return DEPENDENT / INCONSISTENT / NEEDS_PIVOT. On NEEDS_PIVOT the caller receives a heap-allocated array of the reduced row's candidate column indices. sprref_inc_commit_pivot Phase 2 (success): given the caller's pivot choice, normalise around it, do eager backward substitution against existing basis rows, and insert into the basis. sprref_inc_abort_pending Phase 2 (cancel): discard the pending row (used when all candidates are masters and so dependent is the right answer). sprref_inc_buffer_free_u32 Free the candidate array allocated by insert_forward. This keeps the actual pivot ordering decision in the calling language (Python on our side), where a full Laporta tuple key (denom_count, binary_sector, total_power, numerator_sum, denom_powers, numerator_powers) can be compared natively without trying to pack it into a fixed-width C-side type. End result: bit-for-bit pivot decisions matching SpotlightSolverSparseGF on real workloads. The legacy single-call sprref_inc_insert (which uses the C-side pivot key map) is kept for direct C-API users and the existing equivalence tests; it is not used by the project-feynman wrapper anymore. Internal refactor: hoisted parse_input_row + commit_pivot_internal as shared helpers between the legacy single-call insert and the new two-phase path; no behavior change there. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Exposes the col_to_pivots reverse index for the RL observation feature solver_freq, which is called per-step over many columns and was the missing API on the incremental backend (vs SpotlightSolverSparseGF's get_variable_freq). The C-side basis stores a pivot row with its own pivot column erased (implicit coefficient 1), so col_to_pivots never contains the self-edge; we add it back when the queried column is itself a pivot to match Spotlight semantics bit-for-bit. Batched form (n columns at a time) amortises the FFI call overhead — single-column queries are rare; the hot path is the per-step graph view.

Returns the set of pivot columns whose stored basis row contains a given column as a non-zero entry — the C-side col_to_pivots map already maintained for eager backsub, exposed for callers that need it (notably project-feynman's master_pruner, which walks basis rows containing a candidate variable to enumerate its Laporta blockers). Matches SpotlightSolverSparseGF.col_rows[col] semantics: when col is itself a pivot, the result includes col (the C-side col_to_pivots drops the self-edge because the pivot coefficient is implicit; this accessor restores it). Bumps version to v0.7.0.

thibautbar and others added 7 commits May 4, 2026 14:42

thibautbar changed the title ~~Incremental api~~ Add incremental sparse RREF C-API (libsparse_rref_inc) May 4, 2026

thibautbar added 2 commits May 11, 2026 10:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add incremental sparse RREF C-API (libsparse_rref_inc)#1

Add incremental sparse RREF C-API (libsparse_rref_inc)#1
thibautbar wants to merge 9 commits into
masterfrom
incremental-api

thibautbar commented May 4, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

thibautbar commented May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Public C API

How it works

Build

Performance (project-feynman play_tri2lx 7D, 5696 steps, mod 2017)

Commits in this branch

Notes for future work

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

thibautbar commented May 4, 2026 •

edited

Loading