
v2: search_properties kwarg + get_idx via Auto + QuadraticSpline O(n) + FastInterpolations bench #531

Open
ChrisRackauckas-Claude wants to merge 24 commits into SciML:master from ChrisRackauckas-Claude:fff-v2-cleanup-quadraticspline

@ChrisRackauckas-Claude ChrisRackauckas-Claude commented May 21, 2026

Contributor

Summary

Coordinated update with FindFirstFunctions.jl PR #73 (v3.0 release with enum-tagged dispatch + parametric SearchProperties{T} + props-aware UniformStep). This PR follows up on the merged #529.

The original #531 had six pieces:

  1. Drop the legacy linear_lookup / seems_linear / looks_linear machinery. Every interpolation cache had a linear_lookup::Bool field set from a hand-rolled looks_linear(t; threshold) probe. That's exactly what t_props.is_uniform already gives us. Breaking change: assume_linear_t kwarg removed from all constructors. For explicit override, use SearchProperties(t; is_uniform = true).

  2. Refactor get_idx to dispatch through Auto(A.t_props). (Now updated in this revision — see point 6 below.)

  3. Fix O(n²) QuadraticSpline construction. quadratic_spline_params was calling spline_coefficients! which had a findfirst(x -> x > u, k) linear scan inside a per-knot loop. Replaced with a running pointer that advances monotonically → O(n) total. ~870× speedup at n=100k.

  4. search_properties constructor kwarg on every interpolation type. Optional search_properties::Union{Nothing, FindFirstFunctions.SearchProperties} = nothing. When omitted, behaviour is identical to before; when supplied, the caller's SearchProperties is used as-is.

  5. FastInterpolations.jl added to the cross-library bench. bench/cross_library_comparison.jl now benchmarks FastInterpolations.jl across Linear/Cubic/Quadratic/Akima/PCHIP; bench/fast_interpolations_bench.jl ports their advertised bench.

  6. Route _resolve_strategy through Auto(t) (updated in this revision). The earlier draft pinned _resolve_strategy(t) to BracketGallop() to avoid Union-typed strategy fields. With FFF v3's parametric Auto{T}, the strategy field is now Auto{T} where T is the data ratio type — single concrete per t, type-stable. Auto(t) resolves to:

    • KIND_UNIFORM_STEP for uniformly-spaced data, taking the props-aware closed-form kernel (one subtract, one multiply, one truncate per query).
    • KIND_LINEAR_SCAN for length(t) ≤ 16 non-uniform.
    • KIND_BRACKET_GALLOP otherwise (the previous default).

    Mooncake ext gains a @zero_adjoint declaration for searchsortedlast(::Auto, ...) — the integer index isn't differentiable and Mooncake's recursion through Auto{Float64} (which contains Float64 first_val / inv_step fields) would otherwise hit llvmcall SIMD intrinsics in FFF's strategy kernels.
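
The resolution rule in point 6 can be sketched with Base Julia alone. This is a minimal illustration, not the FindFirstFunctions implementation: `SketchKind`, `SketchAuto`, and `sketch_is_uniform` are hypothetical names, and the uniformity check here is an exact O(n) scan rather than the probe FFF actually runs.

```julia
# Hypothetical stand-ins; the real FindFirstFunctions v3 types may differ.
@enum SketchKind SKETCH_UNIFORM_STEP SKETCH_LINEAR_SCAN SKETCH_BRACKET_GALLOP

# One concrete struct per element type T. The kind is a runtime field,
# so the struct type does not depend on which branch was taken.
struct SketchAuto{T}
    kind::SketchKind
    first_val::T
    inv_step::T
end

# Exact uniformity check (assumption: a full O(n) scan, for clarity only).
function sketch_is_uniform(t::AbstractVector{<:Real})
    length(t) < 2 && return false
    step = t[2] - t[1]
    step > 0 || return false
    return all(i -> t[i + 1] - t[i] == step, 1:(length(t) - 1))
end

function SketchAuto(t::AbstractVector{T}) where {T <: AbstractFloat}
    isempty(t) && throw(ArgumentError("need at least one knot"))
    if sketch_is_uniform(t)
        # Precompute the reciprocal so a query needs no division.
        return SketchAuto{T}(SKETCH_UNIFORM_STEP, t[1], inv(t[2] - t[1]))
    elseif length(t) <= 16
        return SketchAuto{T}(SKETCH_LINEAR_SCAN, t[1], zero(T))
    else
        return SketchAuto{T}(SKETCH_BRACKET_GALLOP, t[1], zero(T))
    end
end
```

Because all three branches return the same `SketchAuto{T}`, a cache field holding the result stays concretely typed regardless of the data.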

Per-query latency (n = 10k, m = 1k, ns/query, BenchmarkTools median)

| Workload | DI before (BG pinned) | DI after (Auto + props) | FastInterpolations |
|----------|-----------------------|-------------------------|--------------------|
| Range, sorted queries | 89 | 75 (-16%) | 3.2 |
| Range, random queries | 89 | 76 (-15%) | 3.3 |
| Range, chained (monotone) | 89 | 75 (-16%) | 3.3 |
| Uniform Vec, sorted queries | 47 | 32 (-32%) | 70 |
| Uniform Vec, random queries | 50 | 35 (-30%) | 92 |
| Uniform Vec, chained | 48 | 32 (-33%) | n/a |
| Non-uniform Vec, sorted queries | 68 | 85 (+25%) | 75 |
| Non-uniform Vec, random queries | 80 | 95 (+19%) | 87 |
| Non-uniform Vec, chained | 67 | 86 (+28%) | n/a |

Wins: 15-33% on uniform data (Range and uniform Vector). Cost: ~5-20 ns/q on non-uniform Vector (Auto's per-query s.kind === KIND_UNIFORM_STEP branch adds register pressure on the BracketGallop path; FFF's enum-dispatch function body inlines both KIND paths into the call site).

The DI ↔ FastInterp gap on Range narrowed from 26× (89 ns/q ÷ 3.5 ns/q) to 22× (75 ÷ 3.3). The remaining gap is DI's per-query overhead (Guesser hint, extrapolation check, linear-interp arithmetic), not the strategy. Closing further would require fusing search + interp + extrapolation into a single kernel.

Reproducer: bench/di_perq_bench.jl (added this revision).

Tests

All 5 groups pass on Julia 1.12:

  • Core: 17555 pass (5 pre-existing broken)
  • Methods: 41417 pass
  • Extensions: pass (SCT + Zygote + Mooncake; Mooncake required the @zero_adjoint fix above)
  • Misc: 11 pass
  • QA (Aqua + AllocCheck): 18 pass

FFF v3 dependency

[compat] requires FindFirstFunctions = "3" (FFF #73 must merge and register first; CI here cannot resolve until then). v3 brings:

  • Enum-tagged dispatch for singleton search strategies (zero-overhead vs v2 multimethod path).
  • Parametric SearchProperties{T} with precomputed first_val::T / inv_step::T.
  • Parametric Auto{T} carrying SearchProperties{T}.
  • Props-aware UniformStep kernel folding in the closed-form O(1) lookup from #74 (which is now closed as superseded).

FFF v3 removed the v2 Base.searchsortedlast(::S, ...) extensions entirely; DI's call sites (get_idx, Mooncake @zero_adjoint, test helpers) are migrated to FindFirstFunctions.search_last / search_first.

Notes

Draft. Please ignore until reviewed by @ChrisRackauckas.

🤖 Generated with Claude Code

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>

@ChrisRackauckas ChrisRackauckas marked this pull request as ready for review May 21, 2026 19:58
@ChrisRackauckas-Claude ChrisRackauckas-Claude changed the title from "v2: drop linear_lookup, dispatch get_idx through Auto, fix QuadraticSpline O(n²)" to "v2: search_properties kwarg + get_idx via Auto + QuadraticSpline O(n) + FastInterpolations bench" May 21, 2026
@ChrisRackauckas-Claude ChrisRackauckas-Claude marked this pull request as draft May 23, 2026 10:50
@ChrisRackauckas-Claude ChrisRackauckas-Claude force-pushed the fff-v2-cleanup-quadraticspline branch from d34e867 to 14bda8b Compare June 10, 2026 03:36
@ChrisRackauckas-Claude

Contributor Author

Update: rebased + correctness fixes + verification (14bda8b)

Rebased onto current master (Mooncake 0.5 bump, RegularizationTools removal, canonical CI). Conflicts were confined to Project.toml compat; upstream's tightened floors were kept and FindFirstFunctions = "3" applied.

Fixes in this push

  • Uniform fast path hardening (ff4eb140). Two issues in the new statically-dispatched LinearInterpolation kernel: (1) the closed-form float position was unsafe_trunc'd before clamping, which is UB for queries far outside the knot span and is reachable with ExtrapolationType.Extension (it returned garbage for t = 1e300); (2) push!/append! mutate A.t while t_props and the IsUniform type tag keep their construction-time values, so a uniformity-breaking push silently corrupted values. The kernel now clamps in the float domain and verifies its guessed cell + spacing against the live knots, falling back to the (extracted) slope-form path on mismatch. AllocCheck still passes. Regression tests added for both, plus a knot vector crafted to fool FFF's sampled uniformity probe (rejected since FindFirstFunctions.jl#73's exact-validation fix).
  • Strategy/props consistency (0e8d57cd). Constructors computed t_props and then called Auto(t), which re-probed and ignored the cached/user-supplied props — redundant O(n) scan and A.strategy.props could disagree with A.t_props. Now resolved via Auto(t, t_props).
  • Docs (980c1bcb, 14bda8ba). Removed the stale looks_linear @docs entry (would fail the docs build), documented the breaking assume_linear_t removal and the new features in NEWS, added the search_properties kwarg bullet to all 16 constructor docstrings, and restored Vector-knot constructor @inferred coverage for the types that don't carry the IsUniform tag.
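
The clamp-then-verify idea from the first bullet can be sketched as follows. This is an illustration under stated assumptions, not the kernel in this PR: `guarded_uniform_cell` and `fallback_cell` are hypothetical names, the fallback is a plain binary search, and NaN queries are not handled.

```julia
# Hypothetical helper names; a sketch of the clamp-then-verify idea only.

# Binary-search fallback on the live knots, clamped to a valid cell index.
fallback_cell(t, x) = clamp(searchsortedlast(t, x), 1, length(t) - 1)

function guarded_uniform_cell(t::Vector{Float64}, first_val::Float64,
        inv_step::Float64, x::Float64)
    n = length(t)
    n >= 2 || throw(ArgumentError("need at least two knots"))
    # Clamp while still in the float domain, so the truncation below only
    # ever sees a value in [0, n - 2], even for x = 1e300.
    # (Assumption: x is not NaN.)
    pos = clamp((x - first_val) * inv_step, 0.0, Float64(n - 2))
    idx = unsafe_trunc(Int, pos) + 1
    # Verify the guess against the live knots. If t was mutated after
    # first_val / inv_step were cached, the guessed cell may be wrong.
    return (t[idx] <= x < t[idx + 1]) ? idx : fallback_cell(t, x)
end
```

In this sketch, queries outside the knot span always fail the verification and take the fallback; that keeps the example short at the cost of a slower extrapolation path.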

Independent verification (subagent, scripts vs merge-base worktree)

  • QuadraticSpline O(n) rewrite: (k, A) elementwise identical to the legacy findfirst loop on 25 edge cases (duplicates, Float32/Int/Rational, n=2/3); off-knot values bit-identical PR vs merge-base; locator parity on 200 random cases. Three edge-case crashes found are pre-existing on master and were filed as #542 (QuadraticSpline edge-case crashes: n=2 BoundsError, Rational t starting at 0, duplicate knots).
  • Mooncake ext: full mooncake_tests.jl 604/604 on Mooncake 0.5.31; t-gradients match analytic derivatives < 1e-8 on uniform (KIND_UNIFORM_STEP) and non-uniform knots; the new increment_and_get_rdata! method is provably required (deleting it reproduces the fdata-type error); zero-adjoint does not cut genuine derivative flow.
  • Online mutation: push!/append! breaking uniformity on Linear/Constant/Quadratic × cache_parameters — value/derivative/integral exact vs freshly-constructed interpolations (the FFF props kernel walk-corrects against live knots; the DI fast path falls back via verification).

Tests (local, Julia 1.12, FFF#73 dev'd)

| Group | Result |
|-------|--------|
| Core | 17,555 pass (5 pre-existing broken) |
| Methods | 41,417 pass |
| Extensions | 13,178 pass |
| Misc | 10 pass |
| QA | 18 pass (Aqua + AllocCheck) |

Runic clean. Note for merge ordering: this PR requires FindFirstFunctions v3.0.0 (#73) to be merged and registered first — CI here will fail to resolve until then.

@ChrisRackauckas-Claude

Contributor Author

Update: migrated to FFF v3's shim-free API (1ab2ed1c)

SciML/FindFirstFunctions.jl#73 now removes the v2 Base.searchsortedlast(::S, ...) / Base.searchsortedfirst(::S, ...) extensions outright (breaking release, no compat layer). DI's call sites are migrated in lockstep:

  • get_idx calls FindFirstFunctions.search_last / search_first on the cached Auto strategy.
  • The Mooncake @zero_adjoint declarations target typeof(FindFirstFunctions.search_last/search_first) instead of the removed Base methods.
  • Test helpers (test_cached_index, the derivative cached-index check) use the qualified v3 names.

All 5 groups pass locally on Julia 1.12 against the updated FFF branch: Core 17,555 (5 pre-existing broken) / Methods 41,417 / Extensions 13,178 (incl. Mooncake + Zygote + SCT) / Misc 10 / QA 18. Runic clean.

@ChrisRackauckas-Claude

Contributor Author

Follow-up to the FFF rename (150dfb38): FindFirstFunctions renamed its dispatchers search_last/search_first → searchsorted_last/searchsorted_first to keep the "sorted" precondition in the name. DI's call sites (get_idx, the Mooncake @zero_adjoint declarations, and the test helpers) updated to match. All 5 groups still pass locally on Julia 1.12: Core 17,555 / Methods 41,417 / Extensions 13,178 / Misc 10 / QA 18. Runic clean.

@ChrisRackauckas-Claude ChrisRackauckas-Claude force-pushed the fff-v2-cleanup-quadraticspline branch from 150dfb3 to 760c2c6 Compare June 14, 2026 09:42
@ChrisRackauckas-Claude

Contributor Author

FindFirstFunctions v3.0.0 is registered — this is now resolvable

Rebased onto current master (picks up #543, the QuadraticSpline edge-case fix for the issues I'd filed in #542). No conflict, but both that PR and this one touch quadratic_spline_params; I verified the merged function carries both changes — this PR's O(n) running-locator loop and #543's duplicate-knot ArgumentError guard + dtype_sc = typeof(one(eltype(t)) / one(eltype(t))) fix. The two test sets coexist.

Validated against the registered FindFirstFunctions v3.0.0 (resolved from General, git-tree 141f3cfb — byte-identical to merged #73), not a local dev copy, so this mirrors CI exactly:

| Group | Result |
|-------|--------|
| Core | (running) |
| Methods | (running) |
| Extensions | (running) |
| Misc | (running) |
| QA | (running) |

Compat is FindFirstFunctions = "3"; the earlier "register v3 first" caveat no longer applies. Runic clean. Will confirm the group results once the local run finishes.

@ChrisRackauckas-Claude

Contributor Author

Full suite passes against the registered FindFirstFunctions v3.0.0 (resolved from General, not a local dev copy):

| Group | Result |
|-------|--------|
| Core | 17,566 pass (5 pre-existing broken) |
| Methods | 41,417 pass |
| Extensions | 13,178 pass (Mooncake + Zygote + SCT) |
| Misc | 10 pass |
| QA | 18 pass (Aqua + AllocCheck) |

Also audited the API surface statically: all six FindFirstFunctions symbols this PR references (Auto, SearchProperties, searchsorted_last/searchsorted_first, searchsortedlast!, SearchStrategy) exist in the released v3; every call pattern and t_props/Auto field this PR reads resolves; and there are no leftover references to the removed v2 API (Base.searchsorted* strategy methods, the old search_last/search_first names, or assume_linear_t/looks_linear) anywhere in src/ext/test/docs. Ready for review.

@ChrisRackauckas-Claude

Contributor Author

CI triage — DI's own changes are complete; remaining reds are not DI defects

All test groups pass on CI across Julia 1/lts/pre (Core, Methods, Extensions, Misc, QA). Three non-test jobs are red; here's the disposition:

  • Spell Check: mine, fixed (efcacdf7). The typos checker flagged my local variable strat in get_idx as a misspelling of start; renamed to strategy. (Spell Check is green on master, so this was the only regression I introduced.)
  • Documentation: ecosystem lag, not a DI defect. The docs environment pulls ModelingToolkit/OrdinaryDiffEq for examples; MTK depends on StateSelection, which still caps FindFirstFunctions = "1.2.0 - 2". Once this PR requires FFF v3, the docs env can't resolve (StateSelection ... restricted by FindFirstFunctions to versions: uninstalled). This is the expected first-consumer-of-a-breaking-release situation. StateSelection's only FFF usage is findfirstequal, which v3 keeps unchanged, so it's code-safe to widen its compat to "…, 3". It's a JuliaComputing repo, though, so I'm flagging it for @ChrisRackauckas rather than opening a PR there. The DI package, test env, and QA all resolve against registered FFF v3.0.0 cleanly.
  • Downgrade: pre-existing master failure, not from this PR. The Downgrade workflow is red on every master commit since ~June 13 (last green June 8 at bf7c4edd, this PR's merge-base), failing with a --min=@deps floor-resolution Unsatisfiable error. I've spawned an investigation to bisect the master regression and name the conflicting floor; will report separately. It is independent of the FFF v3 change.

Net: nothing further is needed in DI itself for the v3 that landed. Will confirm Spell Check goes green and report the Downgrade root cause.

@ChrisRackauckas ChrisRackauckas marked this pull request as ready for review June 15, 2026 09:07
ChrisRackauckas and others added 17 commits June 15, 2026 05:07
Compares the cached `Auto(t_props)` PR (DataInterpolations) against
Interpolations.jl, Dierckx.jl, BasicInterpolators.jl, and PCHIPInterpolation.jl
across construction, single-query, sorted batch, random batch, and chained
ODE-style workloads at n ∈ {100, 1k, 10k, 100k}, m ∈ {1, 10, 1k, 100k}.

Key results (full numbers in `bench/cross_library_comparison.md`):
- DI's sorted-batch + cached Auto wins ~1700-1900× vs Dierckx on Linear/Cubic
  at n=100k m=100k; loses to Interpolations(uniform) by ~16% on cubic because
  the latter uses O(1) uniform-grid lookup.
- Chained ODE-style at n=100k m=1000: DI beats Dierckx by ~450× and
  PCHIP by ~2× on monotone cubic; this is the workload iguesser was built for.
- DI CubicHermite beats PCHIPInterpolation on every batched cell (~2-5×).
- DI QuadraticSpline is the only consistent loser: O(n²) constructor
  (7s at n=100k vs Dierckx 14ms) and evaluators ~2-5× slower than Dierckx.
  Root cause is the linear-scan findfirst in `spline_coefficients!`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
`SearchProperties` (added per-cache in the previous commit) already runs
the same uniformity probe, so the parallel DI implementation is now pure
duplication. Removes:

  - `linear_lookup::Bool` field on every interpolation cache. The
    type-parameter list shrinks accordingly.
  - `seems_linear(assume_linear_t, t)` / `looks_linear(t; threshold)` in
    `interpolation_utils.jl`.
  - The `assume_linear_t` keyword from every constructor. (Breaking, but
    the PR is already a major refactor; `SearchProperties` runs the same
    probe automatically at construction with a 1e-3 default that matches
    FFF's `Auto` tolerance, and approximate-uniform vectors couldn't
    benefit from `UniformStep` anyway since that path needs exact-uniform
    spacing.)

`test/derivative_tests.jl`'s `func.iguesser.linear_lookup` check (which
gated the per-type chained-lookup invariant assertion to non-uniform
data) is rewritten as `!func.t_props.is_uniform`, the FFF-side
equivalent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Both `get_idx` methods (integer hint and Guesser hint) used to hard-code
strategy choice — `BracketGallop` for the integer-hint path,
`GuesserHint` for the Guesser path. Both branches now build a single
`Auto(A.t_props)` strategy and pass the appropriate hint (`iguess` for
the integer overload, `iguess(t)` for the Guesser overload) to one
`searchsortedlast`/`searchsortedfirst` call.

The benefit is automatic O(1) closed-form lookup on exact-uniform `t`:
`Auto`'s per-query dispatch checks the cached `is_uniform` first and
short-circuits to `UniformStep` (which ignores the hint) when set,
matching what Interpolations.jl's uniform fast path does. For
non-uniform grids `_auto_pick` falls back to a hint-aware strategy
(BracketGallop / ExpFromLeft / SIMDLinearScan) by length and hint
validity, so the chained-ODE win from the previous branch is preserved.

The Guesser-hint path now stores the resulting `idx` back into
`iguess.idx_prev[]`, which `GuesserHint` used to do internally — needed
so the next correlated lookup gets the right `idx_prev` when `Auto`
hasn't gone through the uniform short-circuit.
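
A minimal sketch of a hint-aware lookup that writes its result back for the next correlated query, using Base Julia only. `SketchGuesser`, `hinted_searchsortedlast`, and `lookup!` are hypothetical names; the gallop-then-bisect scheme is an assumption about the general technique, not a copy of FFF's `BracketGallop`.

```julia
# Hypothetical names; not the FindFirstFunctions or DataInterpolations API.
struct SketchGuesser
    idx_prev::Base.RefValue{Int}
end
SketchGuesser() = SketchGuesser(Ref(1))

# Same result as searchsortedlast(t, x), but starts at `hint`, doubles the
# step until x is bracketed, then bisects inside the bracket.
function hinted_searchsortedlast(t::AbstractVector, x, hint::Integer)
    n = length(t)
    n == 0 && return 0
    lo = clamp(hint, 1, n)
    if t[lo] <= x
        # Gallop right, keeping the invariant t[lo] <= x.
        step = 1
        hi = lo + 1
        while hi <= n && t[hi] <= x
            lo = hi
            step *= 2
            hi = lo + step
        end
        hi = min(hi, n + 1)      # n + 1 acts as a "+infinity" sentinel
    else
        # Gallop left, keeping the invariant t[hi] > x.
        step = 1
        hi = lo
        lo = hi - 1
        while lo >= 1 && t[lo] > x
            hi = lo
            step *= 2
            lo = hi - step
        end
        lo = max(lo, 0)          # 0 acts as a "-infinity" sentinel
    end
    # Bisect: (lo == 0 or t[lo] <= x) and (hi == n + 1 or t[hi] > x).
    while hi - lo > 1
        mid = (lo + hi) >>> 1
        if t[mid] <= x
            lo = mid
        else
            hi = mid
        end
    end
    return lo
end

function lookup!(g::SketchGuesser, t, x)
    idx = hinted_searchsortedlast(t, x, g.idx_prev[])
    g.idx_prev[] = idx           # store for the next correlated query
    return idx
end
```

For monotone query sequences the stored index is usually within a cell or two of the answer, so the gallop loop exits after very few comparisons.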

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
`spline_coefficients!` located the active knot index with
`findfirst(x -> x > u, k)` — an O(n) linear scan on a sorted vector — on
every call. `quadratic_spline_params` calls it n times during
construction, so the constructor was O(n²) (~7 s for n=100k vs ~14 ms
for Dierckx). The per-query eval path (`cache_parameters=false`,
default) also paid O(n) per evaluation.

Two changes:

  1. Factor the locator-dependent body of `spline_coefficients!` into
     `_spline_coefficients_body!(N, d, k, u, i)`. The scalar
     `spline_coefficients!` now calls `searchsortedlast(k, u)` —
     O(log n) on the sorted knot vector — and delegates to the body.

  2. `quadratic_spline_params` maintains a running locator (the next
     iteration's `searchsortedlast` index is ≥ the current's, because
     `t` is sorted) and advances it amortised O(1) per knot. Total
     construction is O(n).
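
The running-locator idea in change 2 can be sketched in isolation. This is a simplified illustration, assuming sorted knots and sorted query points; `running_locator` and `rescan_locator` are hypothetical names and do not reproduce the index conventions of `quadratic_spline_params`.

```julia
# O(n + m) total: the pointer only moves forward because `queries` is sorted.
function running_locator(k::AbstractVector, queries::AbstractVector)
    idxs = Vector{Int}(undef, length(queries))
    i = 0                      # last index with k[i] <= u so far (0 = none)
    for (j, u) in enumerate(queries)
        while i < length(k) && k[i + 1] <= u
            i += 1
        end
        idxs[j] = i
    end
    return idxs
end

# The quadratic alternative: a fresh left-to-right scan for every query.
function rescan_locator(k, queries)
    return [something(findfirst(x -> x > u, k), length(k) + 1) - 1 for u in queries]
end
```

Both functions return the same indices as `searchsortedlast(k, u)` per query; only the total work differs.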

Bench (n=100k uniform, cache_parameters=false):
  QuadraticSpline construct: 6914 ms → 7.9 ms  (~880×)
  QuadraticSpline eval:        57.5 μs → 14.4 μs (~4×)

`spline_coefficients!` keeps `N .= zero(u)` at the top — BSpline
derivative paths (`_derivative(::BSplineInterpolation, …)`) read the
entire `sc` vector, so positions outside the body's
`nonzero_coefficient_idxs` window must be zero on every call. Dropping
that zero pass was an attempted further optimisation that silently
broke BSpline derivatives by leaking stale values from previous
queries.
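
A toy reproduction of the stale-buffer failure mode described above, assuming a reusable scratch vector where each call writes only a small window. `fill_window!` and its weights are hypothetical and unrelated to the actual B-spline recurrence.

```julia
# Hypothetical example: write a 2-wide window into a reusable scratch vector.
function fill_window!(N::Vector{Float64}, i::Int; zero_first::Bool)
    zero_first && (N .= 0.0)   # the zero pass that must not be dropped
    N[i] = 0.25
    N[i + 1] = 0.75
    return N
end

N = zeros(6)
fill_window!(N, 1; zero_first = false)
fill_window!(N, 4; zero_first = false)   # positions 1:2 still hold old values
leaky_sum = sum(N)                       # 2.0: stale entries leak in

fill_window!(N, 4; zero_first = true)
clean_sum = sum(N)                       # 1.0: only the active window
```

Any consumer that reads the whole vector, rather than just the active window, sees the difference between the two sums.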

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Breaking change: every interpolation constructor that previously built
`t_props = FindFirstFunctions.SearchProperties(t)` internally now accepts an
optional `search_properties::Union{Nothing,FindFirstFunctions.SearchProperties}`
keyword. When omitted (the default), behaviour is identical to before:
`SearchProperties(t)` is built once and shared between the no-integral and
with-integral inner constructor calls — fixing a pre-existing redundancy.

When supplied, the caller's `SearchProperties` is used as-is, which gives
domain experts the ability to opt into FFF strategies that the data-driven
probes can't detect cheaply:

  - `LinearInterpolation(u, t; search_properties =
        SearchProperties(t; is_uniform = true))`
    opts a `Vector` with float-noise into `UniformStep`'s closed-form O(1)
    lookup, which the probe rejects (because float-noise exceeds the
    1e-12 uniformity tolerance).
  - Sharing a single populated `SearchProperties` across many interpolations
    that share `t` avoids redundant probe work.

This subsumes the old `assume_linear_t` knob (already dropped on this
branch) and is more powerful — callers control every property in
`SearchProperties`, not just the linearity flag.

The inner struct constructors (called twice during construction for the
cumulative-integral pre-build path) now take `t_props` as a positional
argument, so the probe runs at most once per interpolation regardless.
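
The keyword pattern can be illustrated without the real packages. In this sketch `SketchProps` and `SketchLinear` are hypothetical stand-ins for `SearchProperties` and an interpolation cache, and `PROBE_CALLS` exists only to make the number of probe runs observable.

```julia
# Hypothetical stand-ins for SearchProperties and an interpolation cache.
struct SketchProps
    is_uniform::Bool
end

const PROBE_CALLS = Ref(0)   # counts how often the O(n) probe runs

# Data-driven constructor: one scan over the knots.
function SketchProps(t::AbstractVector)
    PROBE_CALLS[] += 1
    d = diff(t)
    return SketchProps(!isempty(d) && all(==(first(d)), d))
end

struct SketchLinear{U, T}
    u::U
    t::T
    t_props::SketchProps
end

function SketchLinear(u, t; search_properties::Union{Nothing, SketchProps} = nothing)
    # Probe only when the caller did not supply properties.
    t_props = search_properties === nothing ? SketchProps(t) : search_properties
    return SketchLinear(u, t, t_props)
end

t = collect(0.0:0.5:5.0)
shared = SketchProps(t)                                   # one probe
a = SketchLinear(sin.(t), t; search_properties = shared)  # no probe
b = SketchLinear(cos.(t), t; search_properties = shared)  # no probe
c = SketchLinear(t .^ 2, t)                               # probes again
```

Passing `SketchProps(true)` directly plays the role of the explicit override: the caller asserts uniformity and the probe is skipped.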

All 15 cache types (`LinearInterpolation`, `QuadraticInterpolation`,
`LagrangeInterpolation`, `AkimaInterpolation`, `ConstantInterpolation`,
`SmoothedConstantInterpolation`, `QuadraticSpline`, `CubicSpline`,
`BSplineInterpolation`, `BSplineApprox`, `CubicHermiteSpline`,
`QuinticHermiteSpline`, `SmoothArcLengthInterpolation`,
`LinearInterpolationIntInv`, `ConstantInterpolationIntInv`) updated.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
…sed bench

  - Add `FastInterpolations` as a build/eval target for every supported
    algorithm in `bench/cross_library_comparison.jl`: Linear, CubicSpline,
    QuadraticSpline, Akima, and PCHIP (the last one as "FastInterpolations
    (PCHIP)" alongside the existing PCHIPInterpolation entry).
  - Add `bench/fast_interpolations_bench.jl`, a port of FastInterpolations'
    own `benchmark/interpolation_benchmark.jl` from upstream commit
    `616b106b`. Mimics the fusion-physics matrix-of-interpolants workload
    they advertise on their README: `mpert × mpert` independent
    interpolants on a uniform 1D grid, evaluated at `n_eval` cubic-spaced
    query points. Compares against Interpolations.jl, DataInterpolations.jl,
    Dierckx.jl, and FastInterpolations' `Series` interpolant. CLI matches
    the upstream: `--linear | --quadratic | --cubic | --constant` ×
    `--tiny | --small | --default | --large`.

The bench/Project.toml gains a `FastInterpolations` dep.

Read-only audit — no upstream changes to FastInterpolations.jl.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Full sweep (4.4 min, n ∈ {100, 1k, 10k, 100k}, m ∈ {1, 10, 1k, 100k},
both uniform and non-uniform grids) regenerated with FastInterpolations
included. Plus a `FastInterpolations.jl advertised benchmark` section
running the matrix-of-interpolants workload they publish on their README
(cubic spline + linear at npsi=64, mpert=100, n_eval=1000), and a
`Findings` section explaining where FastInterpolations beats DI, where DI
matches, and which gaps are out of scope for this PR.

Summary of findings:
  - FastInterpolations.jl Series API is 70-100× faster on matrix-of-
    interpolants workloads — they compute the cell anchor once per query
    and reuse it across thousands of coefficient series. No equivalent in
    DI; would need a separate `SeriesInterpolation` type proposal.
  - Per-query scalar latency on `Vector{Float64}` grids: DI ~100-200 ns,
    FI ~50 ns. The gap is `Auto(props)` dispatch overhead vs FI's direct
    `_search_direct(::_CachedRange, q)` (which compiles to ~3 instructions).
    Closeable in a follow-up by resolving Auto to a concrete strategy at
    construction time.
  - Non-uniform CubicSpline/Akima construction: DI within ~30% of FI at
    n ≥ 10k thanks to the O(n) `spline_coefficients!` fix on this branch.
  - Sorted-batch on non-uniform: DI competitive at large m.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Adds a `strategy::strategyType` field to every interpolation cache (13
caches in `interpolation_caches.jl` + 2 inverses in `integral_inverses.jl`).
The strategy is resolved at construction time via `_resolve_strategy(t)`
to a concrete `FindFirstFunctions.SearchStrategy` singleton (`BracketGallop`).
`get_idx` now reads `A.strategy` directly instead of wrapping `A.t_props`
in `FindFirstFunctions.Auto(...)` per call.

The win is type-level: `Auto`'s per-query `_auto_pick` returned
`Union{BinaryBracket, LinearScan, BracketGallop}` based on `length(v)` and
hint validity, forcing a runtime branch + small-union dispatch on every
`get_idx`. Storing the resolved strategy as a singleton field lets the
compiler inline `searchsortedlast(::BracketGallop, ...)` end-to-end.

Always picks `BracketGallop` (not size-dependent): the `LinearScan` branch
for `length(t) <= 16` would make `_resolve_strategy` return a small union
and propagate that union into the cache's type parameters — breaking
`@inferred` tests downstream. The `LinearScan` benefit at tiny `n` is
~10 ns in absolute terms; not worth the inference instability.
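
The inference problem described above can be reproduced with two hypothetical singleton tags (`TagLinearScan`, `TagBracketGallop`) standing in for the FFF strategy types. The sketch only uses `Base.return_types` to show what the compiler infers.

```julia
struct TagLinearScan end
struct TagBracketGallop end

# Size-dependent choice between singleton types: the result type depends on
# a runtime value, so inference can only produce a Union.
pick_by_length(t::Vector{Float64}) =
    length(t) <= 16 ? TagLinearScan() : TagBracketGallop()

# Fixed choice: a single concrete return type.
pick_fixed(t::Vector{Float64}) = TagBracketGallop()

union_type = only(Base.return_types(pick_by_length, (Vector{Float64},)))
fixed_type = only(Base.return_types(pick_fixed, (Vector{Float64},)))
```

Storing the result of `pick_by_length` in a type-parameterised cache would propagate that `Union` into the cache type, which is what `@inferred` on the constructor would then reject.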

Single-query latency on uniform `Vector` grid (FastInterpolations parity
target, BenchmarkTools median):

| n     | before  | after   | FI     | ratio before -> after |
|-------|---------|---------|--------|-----------------------|
| 100   | 70 ns   | 70 ns   | 50 ns  | 1.4x  -> 1.4x         |
| 1000  | 80 ns   | 70 ns   | 50 ns  | 1.6x  -> 1.4x         |
| 10000 | 90 ns   | 70 ns   | 60 ns  | 1.5x  -> 1.17x        |

Batched paths still pass `FindFirstFunctions.Auto(A.t_props)` to
`searchsortedlast!` — the batched-Auto specialization picks `LinearScan`/
`SIMDLinearScan`/`InterpolationSearch`/`BracketGallop`/`ExpFromLeft` based
on `(gap, has_nan, is_linear)`, which the per-query path's
`BracketGallop` can't replicate. The per-batch Auto probe amortises
across queries; the per-query Auto probe didn't.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Change `_resolve_strategy(t)` from a fixed `BracketGallop()` to
`FindFirstFunctions.Auto(t)`. Auto now resolves a concrete `StrategyKind`
at construction from `length(t)` + `SearchProperties{T}(t)`:

  - Uniform data (`AbstractRange` or `Vector` whose 9-point linearity
    probe is within ~1e-12 of exact uniformity) → `KIND_UNIFORM_STEP`.
    The props-aware kernel uses a precomputed `inv_step` for closed-form
    O(1) lookup (no per-query division).
  - Non-uniform data with `length(t) ≤ 16` → `KIND_LINEAR_SCAN`.
  - Otherwise → `KIND_BRACKET_GALLOP` (the previous default).

`Auto{T}` is parametric on the data ratio type, so each cache's
`strategyType` resolves to a single concrete `Auto{T}` per `t` —
type-stable per dispatch.

Mooncake's `increment_and_get_rdata!` gains a populated-RData method to
handle the new `first_val::T` / `inv_step::T` fields on
`SearchProperties{T}` — they're not differentiable (compile-time
constants from the knot vector) but Mooncake sees them as `Float64`
fields and routes the rdata through the populated branch.

Per-query latency (n=10k, m=1k, ns/query, BenchmarkTools median):

  Workload                            | before | after | FastInterp
  ---                                 | ---    | ---   | ---
  Range, sorted queries               |   89   |  75   |    3.2
  Range, random queries               |   89   |  76   |    3.3
  Range, chained (monotone)           |   89   |  75   |    3.3
  Uniform Vec, sorted queries         |   47   |  32   |    70
  Uniform Vec, random queries         |   50   |  35   |    92
  Uniform Vec, chained                |   48   |  32   |    n/a
  Non-uniform Vec, sorted queries     |   68   |  85   |    75
  Non-uniform Vec, random queries     |   80   |  95   |    87
  Non-uniform Vec, chained            |   67   |  86   |    n/a

Wins: 16% on Range (props-aware UniformStep), 32% on Uniform Vector
(closed-form kernel — beats FastInterp here because DI's Guesser
overhead is amortised over fewer cycles than FastInterp's per-query
binary search). Loss: ~25% on Non-uniform Vector (Auto's per-query
`s.kind === KIND_UNIFORM_STEP` branch adds ~5-20 ns/q on the
BracketGallop path; the closed-form kernel inlines into the function
body and adds register pressure on the cold path).

Net: still 4-25× slower than FastInterp on Range — the remaining gap is
DI's per-query overhead (Guesser hint, extrapolation check, linear-interp
arithmetic), not the strategy. Closing it further would require fusing
the search + interp + extrapolation paths into a single kernel.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
`bench/di_perq_bench.jl` measures single-query latency at n=10k, m=1k
across (range, uniform Vector, non-uniform Vector) × (sorted, random,
chained) — the cells where the Auto + props refactor matters most.

Reports the resolved Auto.kind for each input so it's clear which path
each measurement exercised.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Bump `[compat]` to `FindFirstFunctions = "2, 3"` so DI can pick up the
v3 parametric `SearchProperties{T}` + props-aware UniformStep kernel.
Add `FindFirstFunctions` to `bench/Project.toml` for explicit dev'ing
during cross-library benchmarks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
With DI's strategy now being `Auto{T}` (which carries a populated
`SearchProperties{T}` with `first_val::Float64` and `inv_step::Float64`
fields), Mooncake's analyzer can no longer prove the strategy struct is
non-differentiable. It tries to derive a rule for
`searchsortedlast(::Auto, v, x, h)` by recursion into FFF's strategy
kernels, hitting `Core.Intrinsics.llvmcall` in the SIMD-scan paths
(SIMDLinearScan, BitInterpolationSearch) which Mooncake can't translate.

Declare the searchsortedlast/searchsortedfirst calls dispatched through
`Auto` as `@zero_adjoint`: the return is an `Int` index, gradient flow is
already cut at the integer-indexing boundary in `_interpolate`, so a zero
rrule is correct.
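
A small numerical check of why a zero derivative for the index lookup is reasonable, using central finite differences instead of Mooncake. The knots, values, and helper names (`cell`, `interp`) are made up for illustration; the point is only that the index is piecewise constant in the query while the interpolant's slope comes from the arithmetic around it.

```julia
const t_knots = [0.0, 1.0, 3.0, 6.0]
const u_vals = [0.0, 2.0, 3.0, 9.0]

# Integer cell index: piecewise constant in x.
cell(x) = clamp(searchsortedlast(t_knots, x), 1, length(t_knots) - 1)

function interp(x)
    i = cell(x)                       # contributes no derivative
    w = (x - t_knots[i]) / (t_knots[i + 1] - t_knots[i])
    return u_vals[i] + w * (u_vals[i + 1] - u_vals[i])
end

x0, h = 2.0, 1e-6
d_cell = (cell(x0 + h) - cell(x0 - h)) / (2h)         # 0.0 away from knots
d_interp = (interp(x0 + h) - interp(x0 - h)) / (2h)   # close to the cell slope 0.5
```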

This unblocks `ConstantInterpolation`, `CubicSpline`,
`CubicHermiteSpline`, `QuinticHermiteSpline`, `LagrangeInterpolation`,
`AkimaInterpolation`, `BSplineInterpolation`, and `BSplineApprox`
gradients via Mooncake — interpolations that don't have a `_interpolate`
Mooncake-wrapped rrule and were derived by recursion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
`_resolve_strategy(t) = FindFirstFunctions.Auto(t)` (this PR's hot-path
change) calls the parametric `Auto(::AbstractVector)` constructor added
in FFF v3. That constructor does not exist in FFF v2 (`Auto` only accepts
`SearchProperties` or no args), so `[compat] FindFirstFunctions = "2, 3"`
would resolve to v2 on CI and break with `MethodError: Cannot convert
::Vector{Float64} to ::SearchProperties`.

Pin compat to v3 only. This PR is blocked on FFF v3 release; until v3 is
in the registry, CI will fail with "compatible version not found".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Encode the knot vector's uniformity in `LinearInterpolation`'s cache
type via a new `IsUniform` `Val{B}` parameter, populated by
`_static_uniform_tag` at construction time. For `AbstractRange{<:Real}`
knots the tag is `Val(true)` statically (`@inferred`-clean); for `Vector`
knots it routes through `t_props.is_uniform`, making the construction
return type a `Union{LinearInterpolation{..., true},
LinearInterpolation{..., false}}` — each concrete instance is fully
type-stable per query.

A new `_interpolate(::LinearInterpolation{<:AbstractVector{<:AbstractFloat},
..., true}, t, iguess)` method takes the uniform fast path: closed-form
index lookup via `(t - first_val) * inv_step`, then linear-blend
`u[idx] + α * (u[idx+1] - u[idx])`. This skips both the `get_idx`
search-via-`Auto` round-trip and the `A.t[idx]` load. The result type is
constrained to `<:AbstractFloat` `u` to preserve the existing
slope-form's `Rational`/`Integer` semantics on those eltypes.
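
The closed-form lookup and blend described above can be sketched in plain Julia. This is a minimal illustration, not the package's `_interpolate`: the helper name `uniform_lerp`, the way `α` is derived from the float position, and the end-cell clamp are all assumptions made for the example, and it does not handle NaN or far-out-of-range queries.

```julia
# Hypothetical helper (not DataInterpolations' internals): knots are assumed
# uniform, t[i] = first_val + (i - 1) * step, with inv_step = 1 / step.
function uniform_lerp(u::AbstractVector{<:AbstractFloat}, first_val, inv_step, t)
    n = length(u)
    pos = (t - first_val) * inv_step             # fractional 0-based position
    idx = clamp(floor(Int, pos) + 1, 1, n - 1)   # left knot of the containing cell
    α = pos - (idx - 1)                          # offset within that cell
    return u[idx] + α * (u[idx + 1] - u[idx])    # lerp form: no search, no knot load
end

knots = 0.0:0.5:2.0
vals = [0.0, 1.0, 4.0, 9.0, 16.0]
uniform_lerp(vals, first(knots), inv(step(knots)), 0.75)  # halfway between vals[2] and vals[3]
```

Neither the search nor `t[idx]` appears in the body, which is the saving the commit message attributes to the fast path.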

NaN propagation matches the non-uniform method: a NaN query produces a NaN
derivative via ForwardDiff, and a NaN in a neighbouring `u` entry does not
poison exact-knot queries through `0 * NaN = NaN`.

Per-query latency, `n = 10_000`, `m = 1000`, Float64, sorted queries:

  Workload                              | Before    | After
  --------------------------------------|-----------|--------
  Range knots                           | 76.7 ns/q | 12.1 ns/q
  Uniform Vector knots (`collect(t)`)   | 55.5 ns/q |  6.5 ns/q
  Non-uniform Vector knots              | 88.4 ns/q | 84.9 ns/q

DI ↔ FastInterpolations gap on Range knots: 23.7× → 3.7×. Non-uniform
Vector path is statically unchanged (different `_interpolate` method
specialization) and shows no regression.

A few existing `@inferred` calls in test/interface.jl and
test/interpolation_tests.jl are dropped for `Vector` knot construction
cases — the constructor genuinely returns a `Union` for those, by
design. The query-side `@inferred` calls remain; per-query dispatch is
fully type-stable on every concrete cache instance. The parity test
documents the realistic error bound: the lerp form differs from the
slope form by `O(length(t)) * eps * max(|u|)`, dominated by the
`(t - first_val) * inv_step` multiplication.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
…knots

Two correctness fixes for the statically-dispatched LinearInterpolation
uniform-grid kernel:

- Clamp the closed-form float position in the float domain before
  unsafe_trunc. With Extension extrapolation the query can be far
  outside the knot span, where the position exceeds typemax(Int) and
  unsafe_trunc is UB (returned garbage indices for t = 1e300).

- Verify the guessed cell and its spacing against the live knots before
  using it. push!/append! mutate A.t while t_props (and the IsUniform
  type tag) keep their construction-time values, so the precomputed
  first_val/inv_step can go stale; a caller-forced is_uniform = true on
  non-uniform knots is the same hazard. Previously both silently
  corrupted interpolated values. On verification failure the evaluation
  falls back to the general slope-form path (slower, always correct),
  which is extracted into _linear_slope_interpolate so both methods
  share it. α is now computed cell-locally from the verified left knot,
  which also tightens the lerp-vs-slope roundoff gap.
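
The two fixes can be sketched together in plain Julia. This is an illustration under stated assumptions, not the package's kernel: the function name `checked_uniform_cell`, the convention of returning 0 to signal "fall back to the slope form", and the `≈` spacing tolerance are all hypothetical.

```julia
# Hypothetical sketch: find the cell on a grid that is claimed to be uniform,
# returning 0 when the live knots do not confirm the guess.
function checked_uniform_cell(t::AbstractVector{Float64}, first_val, inv_step, q)
    isnan(q) && return 0                      # this sketch leaves NaN to the fallback
    n = length(t)
    pos = (q - first_val) * inv_step
    # Clamp while the value is still a float: for q = 1e300 the raw position
    # exceeds typemax(Int), where unsafe_trunc gives no meaningful result.
    pos = clamp(pos, 0.0, Float64(n - 2))
    idx = unsafe_trunc(Int, pos) + 1
    # Check the guess against the live knots; first_val/inv_step may be stale.
    inside = t[idx] <= q <= t[idx + 1]
    outside = q < t[1] || q > t[n]            # extrapolation reuses an end cell
    spacing_ok = (t[idx + 1] - t[idx]) * inv_step ≈ 1
    return (inside || outside) && spacing_ok ? idx : 0
end

t = collect(0.0:0.5:2.0)
checked_uniform_cell(t, 0.0, 2.0, 1e300)      # end cell, no integer overflow
push!(t, 10.0)                                # spacing is now stale
checked_uniform_cell(t, 0.0, 2.0, 5.0)        # 0: caller should take the slow path
```

A caller would treat a zero result as the signal to run the general search and slope form instead.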

Regression tests: Extension extrapolation at ±1e300, a knot vector
uniform at the sampled probe points but jittered between them (must not
be classified uniform), and push!-after-construction breaking the
spacing.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Constructors computed t_props (possibly caller-supplied via the
search_properties kwarg) and then called Auto(t), which re-ran the
SearchProperties probe internally and ignored the cached/supplied
props — a redundant O(n) scan and an inconsistency where
A.strategy.props could disagree with A.t_props. Resolution now goes
through Auto(t, t_props) so the two always match.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
DataInterpolations.looks_linear no longer exists; the @docs block
referencing it would fail the docs build. NEWS now records the breaking
assume_linear_t removal, the search_properties kwarg, the
Auto-resolved knot search, the uniform fast path, and the O(n)
QuadraticSpline construction.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
ChrisRackauckas and others added 4 commits June 15, 2026 05:07
…roperties kwarg

- The Type Inference testset had switched every method to Range knots
  to accommodate LinearInterpolation's value-dependent IsUniform tag,
  dropping Vector-knot constructor inference for the other types, which
  do not carry the tag and still infer. Vector knots are restored for
  all methods; LinearInterpolation gets the Range-knot constructor
  check plus a query-side inference check on a Vector-knot instance.
- Add the search_properties keyword bullet to every constructor
  docstring (the assume_linear_t bullet it replaces was removed without
  a replacement).
- Drop conversation/PR references from bench file headers.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
FindFirstFunctions v3 no longer extends Base.searchsortedlast /
Base.searchsortedfirst with strategy methods. get_idx now calls
FindFirstFunctions.search_last / search_first on the cached Auto
strategy, the Mooncake @zero_adjoint declarations target the new
functions, and the test helpers (test_cached_index, the derivative
cached-index check) use the qualified v3 names.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
…hsorted_first

FindFirstFunctions renamed its dispatchers to restore the 'sorted' cue.
Update get_idx, the Mooncake @zero_adjoint declarations, and the test
helpers to the new qualified names.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
The crate-ci/typos spell check flags the abbreviation 'strat' as a
likely misspelling of 'start'/'strata'. Spell out the local variable.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
@ChrisRackauckas-Claude ChrisRackauckas-Claude force-pushed the fff-v2-cleanup-quadraticspline branch from efcacdf to 74b0b5a Compare June 15, 2026 09:23
@ChrisRackauckas-Claude

Contributor Author

Rebased onto current master (4d74209f), picking up #545/#546 (the Optim/Mooncake/StaticArrays compat-floor bumps) and #547 (the SciMLTesting v1.2 folder-based test harness). No conflicts; the rebase correctly followed #547's file moves — my derivative_tests.jl edit moved to test/Methods/, the interpolation_tests.jl/interface.jl edits stayed in Core, no orphaned copies. Compat now carries FindFirstFunctions = "3" alongside master's floor bumps.

Re-verified under the new harness against the registered FindFirstFunctions v3.0.0: Core (interpolation_tests.jl 17,412 + my fast-path parity / jittered-probe / Extension-extrapolation / push!-staleness tests, plus #543's duplicate-knot test) and Methods (incl. the migrated derivative_tests.jl) both pass. Runic clean.

Note: master's Downgrade is still red from the Julia-1.10 floor conflict (#546's Mooncake = "0.4.175" floor + the Optim/Symbolics/Mooncake triple) — pre-existing and independent of this PR, as detailed above.

Every interpolation cache stored both t_props::SearchProperties{T} and
strategy::Auto{T}, where Auto = (kind, props) — so the SearchProperties
lived in the struct twice and the cache carried two T-driven type
parameters (propsType + strategyType) that always moved together.

The cached strategy was only ever read by get_idx; the fast path and the
batched eval paths already work off t_props (and the batched API
re-resolves the kind anyway). So replace strategy::Auto{T} (parametric)
with kind::StrategyKind (a UInt8 enum, non-parametric), dropping the
strategyType parameter from all 15 cache structs + the two integral-
inverse caches. The parametric props payload stays as the single
t_props field.

get_idx now branches on the cached kind: KIND_UNIFORM_STEP reconstructs
the (isbits, stack-allocated) Auto from t_props so the closed-form
O(1) lookup is taken; every other kind ignores the props and dispatches
the bare enum, preserving the hint-aware gallop. The Mooncake extension
gains zero_adjoint declarations for the StrategyKind dispatch form
alongside the existing Auto ones.
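
The shape of the change can be sketched in plain Julia. All names here (`SketchKind`, `SketchProps`, `SketchCache`, `sketch_get_idx`) are hypothetical stand-ins; the real `StrategyKind`, `SearchProperties`, and `Auto` live in FindFirstFunctions and are not reproduced, and the non-uniform arm uses Base's `searchsortedlast` in place of the hint-aware gallop. The sketch assumes non-NaN queries.

```julia
# Hypothetical stand-ins for the cached search state.
@enum SketchKind::UInt8 SKETCH_UNIFORM_STEP SKETCH_BINARY

struct SketchProps{T}
    first_val::T
    inv_step::T
end

struct SketchCache{T}
    t::Vector{T}
    t_props::SketchProps{T}   # the only parametric search field
    kind::SketchKind          # non-parametric one-byte tag
end

function sketch_get_idx(A::SketchCache{Float64}, q)
    n = length(A.t)
    if A.kind === SKETCH_UNIFORM_STEP
        # closed-form O(1) lookup rebuilt from the cached props
        pos = clamp((q - A.t_props.first_val) * A.t_props.inv_step, 0.0, Float64(n - 2))
        return unsafe_trunc(Int, pos) + 1
    else
        # every other kind ignores the props and searches the knots
        return clamp(searchsortedlast(A.t, q), 1, n - 1)
    end
end
```

Both arms return an `Int`, so the runtime branch does not widen the inferred return type, and the struct needs one type parameter for the search state instead of two.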

LinearInterpolation's uniform fast-path _interpolate signature drops one
positional <:Any to match the removed type parameter.

All 5 test groups pass against registered FindFirstFunctions v3.0.0
(Core/Methods/Extensions incl. Mooncake/Zygote/SCT/Misc/QA incl.
AllocCheck). Runic clean.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
@ChrisRackauckas-Claude

Contributor Author

Collapsed the redundant strategy field (8e654164)

Per review feedback: every cache stored both t_props::SearchProperties{T} and strategy::Auto{T}, where Auto = (kind, props) — so the SearchProperties was duplicated in the struct and the cache carried two T-driven type parameters (propsType + strategyType) that always moved together.

Since the cached strategy was only ever read by get_idx (the fast path and batched eval already work off t_props, and the batched API re-resolves the kind anyway), I replaced strategy::Auto{T} with kind::StrategyKind — a UInt8 enum, non-parametric — dropping the strategyType parameter from all 15 cache structs and the two integral-inverse caches. The parametric props payload remains as the single t_props field.

get_idx branches on the cached kind: KIND_UNIFORM_STEP reconstructs the (isbits, stack-allocated) Auto from t_props so the closed-form O(1) lookup is taken; every other kind dispatches the bare enum, preserving the hint-aware gallop. The Mooncake ext gains @zero_adjoint declarations for the StrategyKind dispatch form.

So the answer to "isn't the strategy an enum?" is now yes — it's stored as the enum; the only remaining parametric field is t_props, which is irreducible (it holds the first_val::T/inv_step::T reciprocals the uniform kernel needs).

Re-verified: all 5 groups pass against registered FindFirstFunctions v3.0.0 under the new SciMLTesting harness — Core, Methods, Extensions (Mooncake/Zygote/SCT — validates the new StrategyKind zero-adjoints), Misc, and QA (AllocCheck confirms the Auto reconstruction is allocation-free). Runic clean.

…, not a type param

The static fast-path work encoded knot uniformity in a Val{IsUniform}
*type parameter* on LinearInterpolation. Since a Vector's uniformity is
a runtime property, the constructor returned
Union{LinearInterpolation{...,true}, LinearInterpolation{...,false}} —
type-unstable. That had been worked around by relaxing/removing the
@inferred tests, which is not acceptable.

Follow FFF's own design instead: encode the choice as a runtime enum and
branch on it, so every returned type is inferred.

- Drop the IsUniform type parameter and the is_uniform_static::Val field
  from LinearInterpolation; the cache is a single concrete type again
  (constructor inferred for both Vector and Range knots).
- Select the uniform closed-form path with a runtime branch on the cached
  kind enum inside _interpolate (A.kind === KIND_UNIFORM_STEP), with both
  arms returning the same concrete type so the query stays inferred. This
  mirrors FFF's runtime StrategyKind dispatch (concrete Int return
  regardless of the runtime kind).
- Remove the now-unused _static_uniform_tag helper.
- Restore the inference tests: the Type Inference testset applies @inferred
  to every constructor for Vector and Range knots (plus query inference), and
  the LinearInterpolation testset's relaxed @inferred guards are reverted to
  master's unconditional form. is_uniform_static checks become kind checks.
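
The inference problem can be reproduced with a small self-contained sketch. The structs and the `is_uniform` probe below are hypothetical and much simpler than the package's caches; a `Bool` field stands in for the cached kind enum.

```julia
using Test

# Hypothetical caches: one records uniformity in the type, one in a field.
struct TaggedCache{U}
    t::Vector{Float64}
end
struct FieldCache
    t::Vector{Float64}
    uniform::Bool          # stand-in for the cached kind enum
end

is_uniform(t) = all(≈(t[2] - t[1]), diff(t))     # value-dependent probe

make_tagged(t) = TaggedCache{is_uniform(t)}(t)   # type depends on a runtime value
make_field(t) = FieldCache(t, is_uniform(t))     # always one concrete type

t = [0.0, 0.5, 1.0]
@inferred make_field(t)                               # constructor infers
@test_throws ErrorException @inferred make_tagged(t)  # inferred type is not concrete
```

The field-based constructor passes `@inferred` for any `Vector` input, which is the property the restored tests check.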

Verified: all 7 constructors and all 13 interpolation query paths infer;
full suite green against registered FindFirstFunctions v3.0.0 (Core incl.
the restored Type Inference testset, Methods, Extensions incl.
Mooncake/Zygote/SCT, Misc, QA incl. AllocCheck). Runic clean.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
@ChrisRackauckas-Claude

Contributor Author

Fixed the inference regression — uniformity is now a runtime enum, not a type parameter (50195f85)

The static fast-path work had encoded knot uniformity in a Val{IsUniform} type parameter on LinearInterpolation. Because a Vector's uniformity is a runtime property, the constructor returned Union{LinearInterpolation{…,true}, LinearInterpolation{…,false}} — type-unstable — and that had been papered over by relaxing/removing the @inferred tests. Not acceptable.

Fixed by following FindFirstFunctions' own design (runtime enum dispatch → concrete inferred returns):

  • Dropped the IsUniform type parameter and the is_uniform_static::Val field. LinearInterpolation is a single concrete type again, so the constructor is inferred for both Vector and Range knots (it was returning an abstract Union before).
  • The uniform closed-form path is now selected by a runtime branch on the cached kind enum inside _interpolate (A.kind === KIND_UNIFORM_STEP), both arms returning the same concrete type so the query stays inferred. Same body as before — just reached by a runtime branch instead of type dispatch, exactly like FFF's StrategyKind dispatch returns a concrete Int regardless of the runtime kind.
  • Removed the now-unused _static_uniform_tag.
  • Restored the inference tests: the Type Inference testset applies @inferred to every constructor for Vector and Range knots (plus query inference), and the LinearInterpolation testset's relaxed @inferred guards are reverted to master's unconditional form.

Verified: all 7 constructors and all 13 interpolation query paths @inferred-clean; full suite green against registered FFF v3.0.0 — Core (incl. the restored Type Inference testset, 28 passing), Methods, Extensions (Mooncake 604 / Zygote 11974 / SCT 600 — exercises AD through the runtime-branch fast path), Misc, QA (AllocCheck 8 — the Auto/fast-path reconstruction stays allocation-free). Runic clean.

The single runtime-kind-branch uniform path is right for Vector knots
(uniformity is a value property; a Vector can be push!-mutated or
caller-forced-uniform on jittered data, so it needs the live-knot
cell/spacing verification + slope fallback). But an AbstractRange is
uniform at the *type* level and is immutable + exactly spaced, so it can
take a leaner kernel — dispatched statically on typeof(t), which is
inference-safe (no value-dependent Union, unlike the old IsUniform tag).

Add _interpolate(::LinearInterpolation{<:AbstractVector{<:AbstractFloat},
<:AbstractRange}, ...) → _linear_uniform_range_interpolate, which:
  - skips the runtime kind branch (ranges are always KIND_UNIFORM_STEP),
  - skips the cell/spacing verification (a range can't go stale),
  - computes α directly from the float position (α = f - idx0), avoiding
    the two A.t[idx] range-arithmetic loads,
  - skips the vestigial iguesser store (the closed form never searches).
It keeps full NaN handling (NaN query → NaN derivative; NaN-adjacent u
resolved by exact-knot comparison), matching the other methods.

Vector knots keep the runtime-branch verified path unchanged; non-Float
u and non-uniform knots keep the slope form.
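
The static split can be sketched as two methods selected by the knot type. The name `lerp_on` and both bodies are hypothetical simplifications: the range method omits the NaN handling the commit describes, and the Vector method shows only the slope form rather than the verified uniform kernel.

```julia
# Ranges are immutable and exactly spaced: no verification, no knot loads.
function lerp_on(t::AbstractRange, u::Vector{Float64}, q)
    f = (q - first(t)) / step(t)                               # float position
    idx0 = unsafe_trunc(Int, clamp(f, 0.0, length(t) - 2.0))   # 0-based cell
    α = f - idx0                                               # no t[idx] needed
    return u[idx0 + 1] + α * (u[idx0 + 2] - u[idx0 + 1])
end

# Vectors: uniformity is a value property, so this sketch uses the slope form.
function lerp_on(t::Vector{Float64}, u::Vector{Float64}, q)
    idx = clamp(searchsortedlast(t, q), 1, length(t) - 1)
    slope = (u[idx + 1] - u[idx]) / (t[idx + 1] - t[idx])
    return u[idx] + slope * (q - t[idx])
end

r = 0.0:0.5:2.0
u = [0.0, 1.0, 4.0, 9.0, 16.0]
lerp_on(r, u, 0.75) ≈ lerp_on(collect(r), u, 0.75)   # same value, different kernel
```

Because the method is chosen from `typeof(t)`, each call site compiles to a single kernel and the return type stays concrete.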

Measured per-query (n=10k, monomorphic loop): range A(q) 30.5 -> 17.0
ns/q; uniform Vector unchanged (18.4); non-uniform unchanged. Inference
stays clean (constructor + query inferred for both Range and Vector),
AD works through the kernel (Mooncake/Zygote/SCT green), and it is
allocation-free (AllocCheck). All 5 groups pass against registered
FindFirstFunctions v3.0.0.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
@ChrisRackauckas-Claude

Contributor Author

Performance: static lean fast path for AbstractRange knots (f09c1b89)

Following up on the inference fix — I benchmarked whether the runtime kind branch costs anything, and whether a static/runtime split is warranted (it is).

Findings (per-query, n=10k, monomorphic loop):

| Path | Range knots | Uniform Vector |
|---|---|---|
| runtime-branch A(q) (before) | 30.5 ns/q | 18.4 ns/q |
| + static range fast path (after) | 17.0 ns/q | 18.4 ns/q (unchanged) |

The runtime branch itself is ~free on Vectors (≈0.5 ns). The real opportunity was that a range was paying the Vector kernel's overhead — cell/spacing verification, two A.t[idx] range-arithmetic loads, and the iguesser store — all provably unnecessary for an immutable, exactly-uniform range.

The split (two inference-safe uniform paths):

  • Static, for t::AbstractRange: dispatched on typeof(t) (inference-safe — no value-dependent Union), a lean kernel that skips the branch + verification + A.t loads (α = f - idx0) + the vestigial iguesser store. Keeps full NaN handling.
  • Runtime, for Vector knots: the kind-branch verified kernel — needed because a Vector can be push!-mutated or caller-forced-uniform on jittered data. (search_properties = SearchProperties(t; is_uniform=true) designates uniformity via the runtime kind; a Vector type can't carry a static guarantee.)
  • Non-uniform / non-Float u: slope form.

Verified: all 5 groups green against registered FFF v3.0.0; inference clean for both Range and Vector (constructor + query); AD correct through the new kernel (Mooncake 604 / Zygote 11974 / SCT 600); allocation-free (AllocCheck). Range interior matches the slope form to 3e-15.
