Skip to content

Formally prove the correctness of Sqrt.sol and Cbrt.sol#511

Open
duncancmt wants to merge 144 commits intodcmt/newton-raphson-optimizationfrom
dcmt/codex-prove-sqrt-cbrt
Open

Formally prove the correctness of Sqrt.sol and Cbrt.sol#511
duncancmt wants to merge 144 commits intodcmt/newton-raphson-optimizationfrom
dcmt/codex-prove-sqrt-cbrt

Conversation

@duncancmt
Copy link
Collaborator

No description provided.

duncancmt and others added 28 commits February 26, 2026 18:02
Machine-checked proof that _sqrt converges to within 1 ULP of isqrt(x) for
all uint256 inputs, and that the floor-correction step yields exactly isqrt(x).

Proof structure (zero sorry, no Mathlib):

- FloorBound.lean: Each truncated Babylonian step >= isqrt(x) (AM-GM +
  integrality). Absorbing set {isqrt, isqrt+1} is preserved.

- StepMono.lean: Step is non-decreasing in z for overestimates (z^2 > x),
  justifying the max-propagation upper-bound strategy.

- SqrtCorrect.lean: Definitions matching EVM semantics, native_decide
  verification of all 256 bit-width octaves, lower bound chain through 6
  steps, and floor correction proof.

Also includes verify_sqrt.py, the Python prototype that guided the Lean proof.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Proves the cubic AM-GM inequality (3m-2z)*z^2 <= m^3 and the resulting
floor bound: a single truncated Newton-Raphson step for cube root never
undershoots icbrt(x). This is the core mathematical lemma needed for the
full cbrt convergence proof.

Key technique: the witness identity m^3 - (3m-2z)*z^2 = (m-z)^2*(m+2z)
is proved by expanding both sides via simp [add_mul, mul_add, mul_assoc,
mul_comm, mul_left_comm] then closing with omega -- a 4-line ring-substitute
that works without Mathlib.

Also includes verify_cbrt.py (Python convergence prototype) and updates the
formal/ README to track both sqrt (complete) and cbrt (in progress).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds CbrtCorrect.lean with full convergence and correction proofs:
- native_decide verification of all 256 bit-width octaves
- Lower bound chain through 6 NR iterations (icbrt(x) <= _cbrt(x))
- Floor correction proof (cbrt returns exactly icbrt(x))
- Seed and step positivity invariants

Combined with the cubic AM-GM floor bound from the previous commit,
this gives a complete machine-checked proof that Cbrt.sol is correct
for all uint256 inputs (0 sorry, no Mathlib).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Prove `innerCbrt x ≤ icbrt x + 1` unconditionally for all x < 2^256,
completing the end-to-end formal verification of Cbrt.sol.

The proof uses a per-octave finite certificate scheme (cribbed from the
sqrt proof):

- FiniteCert.lean: 248-entry lookup tables with native_decide proofs
  of error bounds d1..d6 ≤ 1 for octaves 8..255
- CertifiedChain.lean: chains 6 NR steps through the error recurrence
  d_{k+1} = d_k^2/lo + 1, using an analytic d1 bound derived from the
  cubic identity m^3+2s^3-3ms^2 = (m-s)^2(m+2s)
- Wiring.lean: maps x to its certificate octave and produces the
  unconditional theorems floorCbrt_correct_u256 and
  floorCbrt_correct_u256_all

Zero sorry. Full project builds with `lake build`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add .github/workflows/cbrt-formal.yml patterned after sqrt-formal.yml:
  generates FiniteCert.lean from generate_cbrt_cert.py then runs lake build
- Remove FiniteCert.lean from git tracking (auto-generated, like sqrt's
  GeneratedSqrtModel.lean) and add to .gitignore
- Add --output flag to generate_cbrt_cert.py for CI usage
- Rewrite README.md with full proof architecture, end-to-end verify
  instructions, and file inventory

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add the final layer that formally links the Solidity implementation to
the proven-correct mathematical spec:

- generate_cbrt_model.py: parses Cbrt.sol assembly and emits EVM-faithful
  and normalized Nat Lean models of _cbrt, cbrt, cbrtUp (forked from
  the sqrt analog)
- GeneratedCbrtSpec.lean (1015 lines, 0 sorry): proves
  - model_cbrt_evm = model_cbrt on uint256 (no overflow)
  - model_cbrt = innerCbrt (Nat model matches hand-written spec)
  - model_cbrt_floor_evm_correct: EVM floor model = icbrt
  - model_cbrt_up_evm_upper_bound: EVM ceiling model rounds up correctly
- Updated CI to generate model from Cbrt.sol before building
- Updated README with full end-to-end verification instructions

The proof is now end-to-end: from Cbrt.sol Solidity assembly through
auto-generated Lean model to machine-checked correctness on uint256.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
duncancmt and others added 30 commits March 4, 2026 16:33
…t (sorrys 5→4)

Bridge the EVM Karatsuba quotient model to Nat division, showing it
correctly computes floor((res * 2^86 + limb_hi) / d). Handles both the
no-carry case (dividend fits in 256 bits) and the carry case (257-bit
dividend decomposed via three-part division identity).

Tighten preconditions: replace 0 < d / d < WORD_MOD with 2^86 ≤ d and
d < 2^172 to ensure intermediate additions don't overflow. Add evmMod_eq'
to EvmBridge.lean and reusable mul_mod_mul_right / div_of_mul_add helpers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… 5→4→3)

Bridge the EVM base case model to icbrt, proving that 6 NR iterations
from the fixed seed plus floor correction compute
(icbrt(w), w - icbrt(w)³, 3·icbrt(w)²) where w = x_hi_1 / 4.

Proof structure:
- nr_step_ok: one NR step matches cbrtStep and stays in [2^83, 2^88)
- Range bounds via lo_le_icbrt_of_cube_le_pow / icbrt_le_hi_of_pow_lt_cube
- 6-step NR chain using rw (not simp only, which can't match through
  let-bound variables)
- Floor correction via case split: z6=m (cube ≤ w) or z6=m+1 (cube > w)
- EVM bridge for remaining mul/sub/gt operations on m

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ocused)

Replace the unfocused sorry in composition_within_1ulp with a proof that
derives all 6 conjuncts from a well-defined sorry'd helper (r_qc_eq_icbrt).

The sorry'd helper states r_qc = icbrt(x_norm) exactly, combining:
1. No-overshoot: floor divisions ensure r_qc³ ≤ x_norm
2. No-undershoot: (r_qc+1)³ > x_norm from Karatsuba remainder bounds

New proved lemmas:
- r_qc_lt_pow172: r_qc < 2^172 from m < 2^85, r_lo < 2^87
- composition_within_1ulp: all 6 conjuncts via r_qc_eq_icbrt + r_qc_lt_pow172

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…3→2)

Prove the critical-path bridge theorem connecting the EVM cbrt model to
icbrt ± 1. Three private helpers decompose the proof:

- norm_no_overflow: x * 2^(3*shift) < 2^512 via log2 bounds
- three_msq_plus_3m_lt: 3m²+3m < 2^171 via native_decide certificate
  (K = 30704801884924481767496090, verifying K³ ≥ 2^254)
- evm_pipeline_within_1ulp: chains BaseCase → Karatsuba → QC bridge
  theorems to show normalized EVM pipeline = Nat composition

The main theorem unfolds model_cbrt512_evm through u256 wrappers,
applies normalization + pipeline + within_1ulp_denorm/overshoot_denorm.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… 2→4, focused)

Replace the monolithic `r_qc_eq_icbrt` (1 sorry claiming exact equality,
which was FALSE — r_qc = icbrt+1 in ~20% of cases) with 4 correct sub-lemmas:

  A. r_qc_succ_cube_gt: x_norm < (r_qc+1)³ (lower bound)
  B. r_qc_pred_cube_le: (r_qc-1)³ ≤ x_norm (upper bound)
  E1. r_qc_le_r_max: r_qc ≤ R_MAX (cube bound)
  E2. r_qc_no_overshoot_on_cubes: overshoot → not perfect cube

Prove r_qc_properties (composes A+B+E1+E2) and composition_within_1ulp
from these sub-lemmas. Remove unused baseCase_NR_exact_on_perfect_cube.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The _cbrt pipeline (base case → Karatsuba quotient → quadratic correction)
can return icbrt(x) - 1 for certain inputs, but the cube-and-compare step
only corrects +1 overshoots. This test demonstrates the failure with a
specific input found via numerical search.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Same input as cbrt undershoot: _cbrt returns icbrt-1, so cbrtUp's
correction (r += x > r³ ? 1 : 0) gives icbrt instead of icbrt+1
for this non-perfect-cube input.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The _cbrt pipeline could return icbrt(x)−1, causing cbrt() and cbrtUp()
to return wrong results. The root cause: the quadratic correction
c = ⌊r_lo²/R⌋ over-corrects by 1 when the Karatsuba remainder (dropped
bits of x) is large relative to the fractional part of r_lo²/R.

Fix: propagate `rem` from _cbrt_karatsubaQuotient (mirroring how _sqrt
already propagates its remainder), then detect undershoot via the
split-limb comparison ε·3·r_hi < rem·2⁸⁶ (matching _sqrt_correction's
pattern of avoiding 512-bit intermediate products). When detected,
subtract one less in the quadratic correction.

The check is gated on c ≥ 2 because for c ≤ 1 (only near R_MAX),
undershoot is impossible and a false positive would overflow the
cube-and-compare step.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… QC takes rem)

Regenerate the Lean model after the cbrt undershoot fix:
- model_cbrtKaratsubaQuotient_evm now returns Nat × Nat (quotient, remainder)
- model_cbrtQuadraticCorrection_evm now takes 3 params (r_hi, r_lo, rem)

Update bridge proofs:
- CbrtKaratsubaQuotient: prove both .1 = div and .2 = mod (split into
  kq_fst_correct + kq_snd_correct, fully proved, 0 sorry)
- CbrtComposition: QC bridge updated for 3-param signature (sorry, +1)
- GeneratedCbrt512Spec: pipeline destructures KQ tuple, passes rem to QC (sorry, +1)

Sorry count: 4 → 6 (2 new from QC bridge restructure, pending proof of
undershoot correctness)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Prove model_cbrtQuadraticCorrection_evm_correct: the quadratic
correction with undershoot prevention returns a value in
[R + r_lo - c, R + r_lo - c + 1] where c = r_lo²/R.

Key technique: show the undershoot check (split-limb comparison
via evmOr/evmAnd/evmEq/evmLt) produces a value ≤ 1, giving the
±1 bound without needing to prove undershoot *correctness*.

Helper lemmas added:
- qc_const_86: constant fold evmAnd(evmAnd(86,255),255) = 86
- or_le_one, and_le_one: bitwise ops on {0,1} stay in {0,1}
- evmLt_le_one, evmEq_le_one, evmAnd_le_one, evmOr_le_one
- qc_undershoot_le_one: the full undershoot check ≤ 1

The pipeline sorry (evm_pipeline_within_1ulp) remains: it needs
r_1 ≤ icbrt + 1 which requires proving undershoot correctness,
not just boundedness.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The original r_qc_succ_cube_gt (x_norm < (r_qc+1)³) was numerically
false for c≥2 — the quadratic correction can over-subtract. Replaced
with r_qc_succ2_cube_gt proving x_norm < (r_qc+2)³, which gives
icbrt ≤ r_qc + 1. The tighter bound icbrt ≤ r_qc_actual comes from
the undershoot correction in the QC (proved separately).

New file CbrtAlgebraic.lean provides shared algebraic infrastructure:
cube_sum_expand, R_cube_factor, d_pow172_eq_3R_sq, x_norm_decomp.

The proof splits on c ≤ 1 (where 12R > 2^172 suffices) vs c ≥ 2
(where s'² > R(c-1) via the bound 2c·r_lo < R, derived from
c < 32 and r_lo < 2^87).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Prove `r_qc_pred_cube_le`: (r_qc - 1)³ ≤ x_norm, showing the
composition pipeline's result is at most icbrt(x_norm) + 1.

Proof strategy:
- Lower bound x_norm ≥ R³ + 3R²·r_lo (from decomposition, drop remainder)
- Case r_lo ≤ c: trivially (R-1)³ ≤ R³ ≤ x_norm via cube_monotone
- Case r_lo > c: expand (R+t)³ via cube_sum_expand where t = r_lo-c-1,
  reduce to showing 3Rt² + t³ ≤ 3R²(c+1). Use sq_sum_expand to get
  surplus: (c+1)R - t² > 2t(c+1), then chain t³ < t(c+1)R < 3R·surplus.

Also adds sq_sum_expand to CbrtAlgebraic.lean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Prove r_qc_le_r_max: the composition pipeline output never exceeds
R_MAX = icbrt(2^512 - 1). Two-case proof:

Case A (m < M_TOP): tight_numerator_bound shows r_lo ≤ 2^86 + 8,
so r_qc ≤ R + r_lo ≤ M_TOP·2^86 + 8 ≤ R_MAX (since delta ≥ 9).

Case B (m = M_TOP): if r_lo ≤ delta then r_qc ≤ R_MAX trivially;
if r_lo = delta + 1 then correction ≥ 1 compensates.

Helper lemmas added:
- M_TOP definition and cube bounds (native_decide)
- r_max_ge_r_top, r_lo_max_at_m_top (native_decide)
- tight_numerator_bound: (3m+1)·2^86 ≤ 27m² for m ≥ 2^83
  via 3m(9m - 2^86) ≥ 3·2^166 ≥ 2^86

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Prove r_qc_no_overshoot_on_cubes: when the cbrt algorithm overshoots
(r_qc³ > x_norm), x_norm is not a perfect cube. This is needed by the
cbrtUp wrapper to correctly handle the ceiling case.

Proof structure:
- quad_correction_ge_delta: algebraic helper showing (t+delta)² ≥ delta·R
  when 3R²·delta ≤ 3Rt² + t³ (perfect cube constraint). Two cases:
  Case 1 (t² < 6R): cancel 3R to get R·delta < t²+2t ≤ (t+delta)²
  Case 2 (t² ≥ 6R): show t³ ≤ 6Rt·delta via delta ≥ (t²-2R)/R

- perfect_cube_no_overshoot: factored helper that derives the
  contradiction on perfect cubes using the algebraic helper. Factored
  out to avoid kernel deep recursion in the main theorem.

- r_qc_no_overshoot_on_cubes: main theorem. If x_norm = s³ and
  r_qc = s+1 (overshoot by 1), the quadratic correction c = ⌊r_lo²/R⌋
  satisfies c ≥ r_lo - (s-R) = c+1, which is impossible.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Split the single sorry in evm_pipeline_within_1ulp into structured
case analysis on r_1 = r_qc vs r_1 = r_qc + 1. Proved 2 of 6
conjuncts for the r_qc+1 case (WM bounds). Remaining 4 sorrys:
- P2: icbrt ≤ r_qc when no undershoot
- P1: r_qc ≤ icbrt when undershoot fires
- (r_qc+1)³ < WORD_MOD² (from P1 + c>1 implies r_qc < R_MAX)
- overshoot correctness (from P1 strict)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Added sorry'd helper lemmas qc_no_undershoot (P2) and
qc_undershoot_correct (P1) with clean signatures. Wired them
into evm_pipeline_within_1ulp to close all 4 structural sorrys.
The 2 remaining sorrys are purely algebraic:
- P2: when QC returns r_qc, icbrt ≤ r_qc (no undershoot below icbrt)
- P1: when QC returns r_qc+1, r_qc ≤ icbrt ∧ (r_qc+1)³ < WORD_MOD²
      ∧ overshoot implies non-perfect cube

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Proved the overshoot conjunct of qc_undershoot_correct:
  (r_qc+1)³ > x_norm → icbrt³ < x_norm
from r_qc³ < x_norm (strict, still sorry'd) by showing icbrt = r_qc.

Also proved r_qc ≤ icbrt from r_qc³ < x_norm via cube_monotone
contradiction. Aligned composition variables with local m/nat_r_lo
using conv+rw pattern.

Remaining 4 sorrys:
- P2 (qc_no_undershoot): icbrt ≤ r_qc when no undershoot
- c > 1 extraction from hr1_eq (QC model unfolding)
- Algebraic core: r_qc³ < x_norm when undershoot fires
- Cube bound: (r_qc+1)³ < WORD_MOD² (needs c>1 → r_qc < R_MAX)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Added model_cbrtQuadraticCorrection_evm_exact_when_c_le1 theorem:
when c ≤ 1, the QC model returns exactly r_qc (no undershoot).
Used this to prove c > 1 from hr1_eq by contradiction.

Proved the overshoot conjunct using icbrt = r_qc (from r_qc ≤ icbrt
and icbrt ≤ r_qc+1, ruling out icbrt = r_qc+1 via cube bound).

Remaining 3 sorrys:
- P2: icbrt ≤ r_qc when no undershoot (line 118)
- Algebraic core: r_qc³ < x_norm when undershoot fires (line 215)
- Cube bound: (r_qc+1)³ < WORD_MOD² (line 241)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Moved the P1 algebraic core and cube bound sorrys from
GeneratedCbrt512Spec into a single sorry'd theorem
qc_undershoot_cube_lt in CbrtComposition.lean, which has access
to the private sub-lemmas needed for the proof.

Total sorry count across all files: 2
- CbrtComposition: qc_undershoot_cube_lt (algebraic core + cube bound)
- GeneratedCbrt512Spec: qc_no_undershoot (P2)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Set up the full proof structure for qc_undershoot_cube_lt in
CbrtComposition.lean (after all private sub-lemmas). Proved:
- Base case bounds (m, nat_r_lo, nat_rem)
- c > 1 extraction from hr1_eq (reusing exact_when_c_le1)
- r_qc ≤ R_MAX (from E1 sub-lemma)

Two sorry'd conjuncts remain:
1. r_qc³ < x_norm (algebraic core)
2. (r_qc+1)³ < WORD_MOD² (c>1 → r_qc < R_MAX)

Total sorrys across project: 2
- CbrtComposition: qc_undershoot_cube_lt (2 conjuncts)
- GeneratedCbrt512Spec: qc_no_undershoot (P2)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Prove (r_qc+1)³ < WORD_MOD² by showing R + r_lo ≤ R_MAX + 1
(new lemma r_plus_rlo_le_rmax_succ) and combining with c ≥ 2
to get r_qc + 1 ≤ R_MAX, then using cube_monotone + r_max_cube_lt_wm2.

Scaffold r_qc³ < x_norm proof (Conjunct 1) via:
- r_qc_cube_lt_x_norm: separate lemma with full algebraic proof structure
- undershoot_implies_rem_gt_3Reps: key inequality from undershoot check
  (3Rε ≤ rem·2^172 when split-limb comparison fires)

Both new lemmas are sorry'd; the algebraic chain structure is verified.
Sorrys: 3→2 in CbrtComposition (Conjunct 2 proved, 2 new stubs).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Prove both conjuncts of qc_undershoot_cube_lt:
- Conjunct 2 ((r_qc+1)³ < WORD_MOD²): via r_plus_rlo_le_rmax_succ
  showing R+r_lo ≤ R_MAX+1, combined with c≥2 for r_qc+1 ≤ R_MAX.
- Conjunct 1 (r_qc³ < x_norm): via r_qc_cube_lt_x_norm using:
  - x_norm decomposition: x_norm ≥ R³ + 3R²·r_lo + rem·2^172
  - Algebraic chain: 3Rt² + t³ < 3R·r_lo² (from sq_sum_expand, t²<(c+1)R)
  - Undershoot: 3R·r_lo² ≤ 3R²c + rem·2^172 (from div_add_mod + undershoot)

One sorry remains: undershoot_implies_rem_gt_3Reps (the split-limb
comparison extraction from hr1_eq). P2 in GeneratedCbrt512Spec unchanged.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… sorrys 2→1

Fix the split-limb comparison mask in _cbrt_quadraticCorrection from
0x3ffffffffffffffffffffff (2^90-1) to 0x3fffffffffffffffffffff (2^86-1).
The comment said "mask86" but the hex literal was 90 bits. Both values are
correct (the 90-bit mask is sound due to the 4-bit overlap being harmless),
but 2^86-1 gives an exact split at the shift boundary, simplifying the proof.

Prove undershoot_implies_rem_gt_3Reps: when the QC undershoot check fires,
eps3 * R ≤ rem * 2^172. The proof unfolds the QC EVM model, reduces all
operations to Nat, extracts check=1, then case-splits:
- Case 1 (high parts differ): eps3 < rem → eps3*R ≤ rem*2^172
- Case 2 (high parts equal): exact split-limb → product inequality

Helper lemmas: mask_86_eq, evmAnd_mask86_eq_mod, or_and_eq_one_cases,
eps3_lt_rem_implies_prod_le, div_lt_implies_lt, split_limb_implies_prod_le.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant