feat(bn254): skip infinity pairs in pairing check #659

Draft: 0xVolosnikov wants to merge 9 commits into draft-0.4.0 from vv/bn254-pairing-skip-infinity

@0xVolosnikov
Contributor

What ❔

Skips degenerate pairs (G1 == O or G2 == O) in the BN254 pairing-check precompile after subgroup validation. Filtering happens in the parse loop, so malformed encodings still hit the existing rejection path. When every pair turns out to be degenerate, the result is short-circuited to true (empty product = identity in F_T) instead of relying on the upstream multi_pairing contract for empty input.

Gas / native accounting is unchanged (still per-pair as in EIP-1108). State-transition output is bit-identical to the pre-change path on every input.
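
A minimal sketch of the validate-then-filter order described above, using a toy model rather than real BN254 arithmetic. Each point is represented by its discrete logarithm modulo a small prime `R`, so `0` stands in for the point at infinity and the pairing check reduces to testing whether the sum of `a_i * b_i` is zero modulo `R`. The names `decode`, `pairing_check_reference`, and `pairing_check_skip_infinity` are illustrative assumptions and do not come from `bn254_pairing_check.rs`; `decode` only mimics the role of curve/subgroup validation.

```rust
/// Toy group order (a small prime). This is NOT the BN254 scalar field.
const R: u64 = 101;

/// Stand-in for curve/subgroup validation: only canonical values below R decode.
fn decode(x: u64) -> Result<u64, &'static str> {
    if x < R { Ok(x) } else { Err("malformed point") }
}

/// Reference path: every pair contributes to the product, degenerate or not.
fn pairing_check_reference(encoded: &[(u64, u64)]) -> Result<bool, &'static str> {
    let mut acc = 0u64;
    for &(a, b) in encoded {
        let (a, b) = (decode(a)?, decode(b)?);
        acc = (acc + a * b) % R;
    }
    Ok(acc == 0)
}

/// Filtered path: validate first, then drop pairs containing the toy "infinity".
fn pairing_check_skip_infinity(encoded: &[(u64, u64)]) -> Result<bool, &'static str> {
    let mut live = Vec::with_capacity(encoded.len());
    for &(a, b) in encoded {
        // Validation runs before the filter, so a malformed point is rejected
        // even when its partner is the point at infinity.
        let (a, b) = (decode(a)?, decode(b)?);
        if a == 0 || b == 0 {
            continue; // e(O, Q) = e(P, O) = 1, contributes nothing
        }
        live.push((a, b));
    }
    if live.is_empty() {
        return Ok(true); // empty product = identity
    }
    let acc = live.iter().fold(0u64, |acc, &(a, b)| (acc + a * b) % R);
    Ok(acc == 0)
}
```

In this model `(1, 1)` plays the role of `e(G1, G2)` and `(R - 1, 1)` the role of `e(-G1, G2)`, so the cases listed in the unit tests map directly onto small inputs, and both functions agree on every input.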

Unit tests added in basic_system/src/system_functions/bn254_pairing_check.rs:

  • single (O, Q) / (P, O) / (O, O) → true
  • batch of 7 degenerate pairs → true
  • e(G1, G2) ≠ 1, and prefixing / suffixing degenerate pairs must keep it false
  • e(G1, G2) · e(-G1, G2) = 1 with interleaved degenerate pairs → still true
  • malformed (x≠0, y=0) G1 encoding stays rejected (filter cannot bypass curve/subgroup check)

Why ❔

Symmetric to the BLS12-381 optimization in #657. The BN254 pairing precompile has the same shape: every encoded pair flows through the Miller-loop precomputation (G1Affine → G1Prepared / G2Affine → G2Prepared), which is the dominant cost on degenerate inputs. Skipping degenerate pairs at the affine level avoids that work without changing observable output.

Mathematically safe: in the BN254 optimal-Ate pairing e(O, Q) = e(P, O) = 1_F_T, so dropping any pair whose G1 or G2 is the point at infinity does not change the multi-pairing product.
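
Written out, the argument is a direct consequence of bilinearity. The index set $S$ below is notation introduced here for the non-degenerate pairs; it does not appear in the code.

```latex
\[
e(O, Q) = e(0 \cdot P, Q) = e(P, Q)^{0} = 1_{F_T},
\qquad
e(P, O) = e(P, 0 \cdot Q) = e(P, Q)^{0} = 1_{F_T}
\]
\[
\prod_{i=1}^{n} e(P_i, Q_i)
  = \prod_{i \in S} e(P_i, Q_i) \cdot \prod_{i \notin S} 1_{F_T}
  = \prod_{i \in S} e(P_i, Q_i),
\qquad
S = \{\, i : P_i \neq O \ \text{and}\ Q_i \neq O \,\}
\]
```

When $S$ is empty the right-hand side is the empty product, which equals $1_{F_T}$, so the check returns true.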

Is this a breaking change?

  • Yes
  • No

Checklist

  • PR title corresponds to the body of PR.
  • Tests for the changes have been added / updated.
  • Documentation comments have been added / updated.
  • Code has been formatted.

Stacked on

This PR is based on vv/bench-precompiles. Land that first.

Follow-ups not in this PR

  • Bench cycle savings on a degenerate-input BN254 sample via bench_scripts/compare_precompile_stats.py against the base branch.

0xVolosnikov and others added 9 commits May 15, 2026 07:43
…markers

Add two cycle-marker labels that fire only on EE-driven invocations:

- `keccak_execution_environment` around the EVM SHA3 opcode dispatch
  (including the `len == 0` empty-slice shortcut), so per-execution
  cycles can be joined 1:1 with the opcode-level SHA3 sample stream.
  The inner `"keccak"` marker (from `Keccak256Impl::execute`) still
  fires, so bootloader/intrinsic keccak invocations remain attributable
  to the existing `"keccak"` label only.

- `ecrecover_execution_environment` around the EE-precompile ecrecover
  dispatch via a new `EcRecoverEEInvocation` wrapper. Intrinsic
  signature-recovery calls from the bootloader do not go through this
  path, so the marker fires only for EE-triggered calls — no positional
  intrinsic-filter heuristic needed in joiner scripts.

Both markers are pure observability: the underlying system-function
calls are unchanged. Required as STF-side instrumentation for the
follow-up benchmarking-pipeline PR (CI workflow, joiner scripts, and
docs land separately).

Also extract `install_precompile_hook` from `add_precompile_ext` so the
`PRECOMPILE_ADDRESSES_LOWS` sanity check stays centralized for any
future custom invocation type.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…→ state_commitment_update

Wrap the per-block `write_pubdata` call inside the three proving
post-tx-op handlers (single-block, multi-block, sequencing) in a new
`da_commitment` cycle marker. For keccak DA this is where the bulk of
keccak delegations fire (bytes are absorbed into the
`Keccak256CommitmentGenerator` state); for blob DA the call just
appends to a buffer and the KZG work shows up under the pre-existing
`blob_versioned_hash` marker — both cases are now observable as a
single labelled phase.

Rename `verify_and_apply_batch` → `state_commitment_update` to reflect
what the marker actually wraps (`IOTeardown::update_commitment` — the
state-tree merkle commit, which is Blake-heavy and distinct from the
DA commit and the per-blob KZG commit). Applies to all four
post-tx-op variants so the label string is consistent across them.

No state-transition behavior change — pure instrumentation rename.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- `post_tx_op_sequencing.rs`: complete the rename
  `verify_and_apply_batch` → `state_commitment_update` for the
  Ethereum-sequencing handler (the 5th post-tx-op site, missed in the
  prior sweep) and clean up the leftover `// // 3.` double-slash
  comment artifact, restoring cross-STF parity for the
  `state_commitment_update` label.
- `evm_interpreter/src/instructions/system.rs::sha3`: move
  `cycle_marker::wrap!("keccak_execution_environment", ...)` to envelop
  the whole opcode dispatch — including the `pop_2()? / cast_to_usize?
  / spend_gas_and_native?` prelude — so the marker fires on every
  dispatch (incl. stack-underflow / invalid-operand / OOG-on-base-cost
  short-circuits), matching `EvmOpcodeStatsTracer`'s per-dispatch
  sample count. This was the invariant the inline comment claimed but
  the code didn't actually uphold; positional pairing in
  `cycles_per_native_report.py` / `join_precompile_samples.py` now
  truly matches 1:1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ghting

Adds per-execution cycle-sample dumps for both opcode-level and
non-opcode (label) markers, plus an effective-cycles formula that
weights raw RISC-V cycles by delegation counts so dump consumers see
the true prover cost rather than raw cycles alone.

- New `OPCODE_CYCLE_SAMPLES_DIR` env var: writes one `<OPCODE>.cycles`
  file per opcode with raw cycles per execution (append-mode).
- New `LABEL_CYCLE_SAMPLES_DIR` env var: writes both `<label>.cycles`
  (raw) and `<label>.effective.cycles` (raw + delegation weights) per
  non-opcode marker. Effective is required to attribute prover cost to
  delegation-heavy phases (precompiles, keccak SF call).
- Shared `effective_of` closure: `raw + 16×Blake + 4×BigInt + 4×Keccak`
  delegation counts, matching the existing `process_block` headline.
- Aggregate `.bench` table semantics (total/min/max/median over raw)
  unchanged; effective is only consumed via the per-execution dump.
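
A minimal sketch of the weighting, assuming a plain struct of per-marker counters. `CycleSample` and `effective_cycles` are hypothetical names used for illustration and are not the `effective_of` closure itself; only the weights 16, 4, and 4 are taken from the formula above.

```rust
/// Hypothetical per-marker counters: raw RISC-V cycles plus delegation counts.
#[derive(Clone, Copy)]
struct CycleSample {
    raw: u64,
    blake: u64,
    bigint: u64,
    keccak: u64,
}

// Delegation weights as stated in the commit message.
const BLAKE_WEIGHT: u64 = 16;
const BIGINT_WEIGHT: u64 = 4;
const KECCAK_WEIGHT: u64 = 4;

/// Effective cycles = raw cycles plus weighted delegation counts.
fn effective_cycles(s: CycleSample) -> u64 {
    s.raw + BLAKE_WEIGHT * s.blake + BIGINT_WEIGHT * s.bigint + KECCAK_WEIGHT * s.keccak
}
```

As a consistency check against the bench comment on this PR, the `bn254_ecadd` row reports 47,863 raw cycles and 1,363 BigInt delegations, and 47,863 + 4 × 1,363 = 53,315 matches its effective column.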

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per-precompile gas/native observability for the forward (sequencer)
runtime, mirroring what `EvmOpcodeStatsTracer` does at opcode level.

- `PrecompileStatsTracer`: tracks count + min/median/avg/max gas and
  native for each precompile invocation routed through the EE
  precompile dispatch path. Skips the bootloader's intrinsic ecrecover
  call (signature verification) since it doesn't go through the same
  hook surface. `dump_samples` writes `<precompile>.samples` files
  (`gas,native` per line) parallel to `EvmOpcodeStatsTracer` output.
- `Pair<A, B>` tracer combinator: forwards each tracer hook to both
  inner tracers so a single run can collect both opcode and precompile
  stats. Lets `eth_runner` (and bench callers) compose tracers without
  duplicating run logic.
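
The combinator idea can be sketched as follows. The `Tracer` trait here is a deliberately simplified, hypothetical interface with two hooks; the real tracer trait in the repository has a different and larger hook surface, and `OpcodeCounter` / `PrecompileGas` are illustrative stand-ins for the two stats tracers.

```rust
/// Hypothetical, simplified tracer interface (not the repository's trait).
trait Tracer {
    fn on_opcode(&mut self, opcode: u8);
    fn on_precompile_call(&mut self, address: u16, gas: u64);
}

/// Forwards every hook to both inner tracers, in order.
struct Pair<A, B>(A, B);

impl<A: Tracer, B: Tracer> Tracer for Pair<A, B> {
    fn on_opcode(&mut self, opcode: u8) {
        self.0.on_opcode(opcode);
        self.1.on_opcode(opcode);
    }
    fn on_precompile_call(&mut self, address: u16, gas: u64) {
        self.0.on_precompile_call(address, gas);
        self.1.on_precompile_call(address, gas);
    }
}

/// Illustrative opcode-level tracer: counts dispatched opcodes only.
#[derive(Default)]
struct OpcodeCounter {
    opcodes: u64,
}

impl Tracer for OpcodeCounter {
    fn on_opcode(&mut self, _opcode: u8) {
        self.opcodes += 1;
    }
    fn on_precompile_call(&mut self, _address: u16, _gas: u64) {}
}

/// Illustrative precompile-level tracer: counts calls and sums gas.
#[derive(Default)]
struct PrecompileGas {
    calls: u64,
    total_gas: u64,
}

impl Tracer for PrecompileGas {
    fn on_opcode(&mut self, _opcode: u8) {}
    fn on_precompile_call(&mut self, _address: u16, gas: u64) {
        self.calls += 1;
        self.total_gas += gas;
    }
}

/// Drives one composed tracer through a tiny synthetic run.
fn run_demo() -> (u64, u64, u64) {
    let mut tracer = Pair(OpcodeCounter::default(), PrecompileGas::default());
    tracer.on_opcode(0x20);
    tracer.on_opcode(0x01);
    tracer.on_precompile_call(0x08, 45_000);
    (tracer.0.opcodes, tracer.1.calls, tracer.1.total_gas)
}
```

Because `Pair<A, B>` itself implements `Tracer`, pairs can be nested to compose more than two tracers without touching the run loop.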

Both tracers are pure observability — no impact on state-transition or
proving paths. Consumed by the benchmark pipeline that follows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ompile sweeps

Test-side scaffolding so the new tracers and DA schemes can be exercised:

- `tests/rig/src/chain.rs`: snapshot/revert LABELS around the sequencing
  forward run (was around the prover-input run). Under
  `BlobsZKsyncOS` DA the `blob_versioned_hash` marker fires only in
  proving, so without flipping the snapshot the post-RISC-V count
  match trips. Implemented as a Drop guard so an early-return from
  `run_forward_no_panic` still reverts and the next run on this thread
  starts with clean LABELS.
- `tests/instances/eth_runner/src/single_run.rs`: compose
  `EvmOpcodeStatsTracer` + `PrecompileStatsTracer` via `Pair` when both
  are requested via env vars; thread `PRECOMPILE_STATS_PATH` and
  `PRECOMPILE_SAMPLES_DIR` through the dump path; honor
  `BENCH_DA_SCHEME` to pick `BlobsAndPubdataKeccak256` (default) or
  `BlobsZKsyncOS` for dual-DA-scheme bench runs.
- `tests/instances/eth_runner/Cargo.toml`: `bench-fast` profile mirror
  (excluded from root workspace so the root `[profile.bench-fast]`
  doesn't cover it).
- `tests/instances/precompiles/{Cargo.toml,src/lib.rs}`: install
  `PrecompileStatsTracer` inside `run_precompile_inner` so the
  precompile test crate emits per-call gas/native stats during the
  bench pipeline. BLS12-381 / KZG sweeps added under `pectra`.
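
The Drop-guard pattern mentioned for `tests/rig/src/chain.rs` can be sketched like this. The thread-local `LABELS` below is a hypothetical stand-in (a plain vector of label names), and `LabelsSnapshot` and `forward_run` are illustrative names; the real marker storage and `run_forward_no_panic` are not reproduced here.

```rust
use std::cell::RefCell;

thread_local! {
    /// Hypothetical stand-in for the per-thread marker label store.
    static LABELS: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

/// Captures LABELS on creation and restores that state when dropped.
struct LabelsSnapshot {
    saved: Vec<&'static str>,
}

impl LabelsSnapshot {
    fn take() -> Self {
        LabelsSnapshot { saved: LABELS.with(|l| l.borrow().clone()) }
    }
}

impl Drop for LabelsSnapshot {
    fn drop(&mut self) {
        // Runs on every exit path, including early returns via `?` or `return`.
        let saved = std::mem::take(&mut self.saved);
        LABELS.with(|l| *l.borrow_mut() = saved);
    }
}

/// Illustrative forward run that records labels and may bail out early.
fn forward_run(fail_early: bool) -> Result<(), &'static str> {
    let _guard = LabelsSnapshot::take();
    LABELS.with(|l| l.borrow_mut().push("process_transaction"));
    if fail_early {
        return Err("forward run failed");
    }
    LABELS.with(|l| l.borrow_mut().push("run_tx_loop"));
    Ok(())
}

/// Number of labels currently recorded on this thread.
fn labels_len() -> usize {
    LABELS.with(|l| l.borrow().len())
}
```

Binding the guard to `_guard` (rather than `_`) keeps it alive until the end of the function, so the revert happens after the run regardless of how it exits.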

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
End-to-end Python tooling consumed by the bench CI to produce the PR
comment from the artifacts emitted by `cycle_marker`, the new
forward-system tracers, and the eth_runner / precompile test binaries.

- `benchlib.py`: shared helpers (`median_int`, `pct`, `fmt_pct`,
  delegation IDs/coefficients, sample loaders, label-file listing).
  Adopted by all comparison + join scripts to keep the math consistent
  across the pipeline.
- `compare_bench.py`: base/head `.bench` diff, alias `verify_and_apply_batch`
  to `state_commitment_update` and strip `_execution_environment` suffix
  so the diff joins cleanly across the boundary commit; `--no-title`
  and `--sort-by-symbol` flags for the PR-comment "Block sub-phases"
  table.
- `compare_opcode_stats.py` / `compare_opcode_cycles.py` /
  `compare_precompile_stats.py`: per-opcode and per-precompile diff
  tables.
- `join_samples.py` / `join_precompile_samples.py`: per-execution
  joins of gas, native, and cycles. `join_precompile_samples.py`
  prefers `.effective.cycles` and falls back to raw; handles synthetic
  precompile labels (keccak from SHA3 opcode samples); strips the first
  ecrecover marker per `process_transaction` boundary (intrinsic
  sig-verify) when consuming `--bench-file`.
- `cycles_per_native_report.py`: local-only ad-hoc tool that pools one
  or more `(samples_dir, cycles_dir)` pairs into a Markdown report of
  per-execution `cycles / native` ratios per opcode and per precompile
  (median / p95 / max).
- `bench.sh`: convenience wrapper with `baseline`, `quick`, `run`,
  `compare`, `flamegraph` subcommands; threads `PRECOMPILE_STATS_PATH`
  and the new sample-dir env vars through.
- `docs/benchmarking.md`: rewrite as an agent-targeted reference
  describing the effective-cycles formula, env-var-gated dump pipeline,
  comparison script index, and ecrecover intrinsic filter caveat.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI plumbing that runs the bench pipeline on each PR and composes the
PR comment from the artifacts produced by the bench scripts.

- `.github/workflows/bench.yml`: full bench job. Builds base + head
  with `for-tests-benchmarking-pectra` proving binary; runs eth_runner
  under both keccak and blob DA schemes; runs `tests/instances/precompiles`
  with `--test-threads=1` (avoids `MARKER_PATH` truncate-race when
  multiple precompile tests write the same path concurrently); composes
  the PR comment as headline + collapsible sections (Block sub-phases
  via `compare_bench.py --sort-by-symbol`, Precompiles test-crate,
  per-opcode / per-precompile gas/native, per-execution cycles/native
  ratios). Has merge-base fallbacks so it gracefully degrades when the
  base commit predates new dump_bin.sh types / Cargo profiles / env
  vars / script flags.
- `Cargo.toml` (workspace root): `[profile.bench-fast]` (inherits
  release, `lto = false`, `codegen-units = 16`) for the in-workspace
  `-p precompiles` build path; covers everything except `eth_runner`
  which has its own mirror profile.
- `zksync_os/Cargo.toml`: add `pectra` feature gating to
  `proof_running_system/pectra` so the new precompiles can be enabled
  for benchmarking-binary builds without forcing them into the default
  feature set.
- `zksync_os/dump_bin.sh`: new `--type for-tests-benchmarking-pectra`
  builds the proving binary with `pectra` feature for KZG / BLS12-381
  precompile coverage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Skips degenerate pairs (G1 == O or G2 == O) in the BN254 pairing-check
precompile after subgroup validation. Filtering happens in the parse
loop, so malformed encodings still hit the existing rejection path. When
every pair turns out to be degenerate, the result is short-circuited to
`true` (empty product = identity in F_T) instead of relying on the
upstream multi_pairing contract for empty input.

Gas / native accounting is unchanged (still per-pair as in EIP-1108).

Mathematically safe: in the BN254 optimal-Ate pairing
`e(O, Q) = e(P, O) = 1_F_T`, so dropping any pair whose G1 or G2 is the
point at infinity does not change the multi-pairing product.

Symmetric to the BLS12-381 change in #657.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Contributor

Block-level effective cycles

| Benchmark | Symbol | Base Eff | Head Eff (%) | Base Raw | Head Raw (%) | Base Blake | Head Blake (%) | Base Bigint | Head Bigint (%) | Base Keccak | Head Keccak (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| block_19299001 (keccak DA) | process_block | 210,919,899 | 210,919,045 (-0.00%) | 160,850,051 | 160,849,197 (-0.00%) | 410,630 | 410,630 (+0.00%) | 7,681,862 | 7,681,862 (+0.00%) | 3,193,080 | 3,193,080 (+0.00%) |
| block_19299001 (blobs DA) | process_block | 259,680,540 | 259,679,686 (-0.00%) | 197,969,124 | 197,968,270 (-0.00%) | 414,340 | 414,340 (+0.00%) | 10,690,989 | 10,690,989 (+0.00%) | 3,079,505 | 3,079,505 (+0.00%) |
| block_22244135 (keccak DA) | process_block | 139,506,613 | 139,506,613 (+0.00%) | 111,033,801 | 111,033,801 (+0.00%) | 172,040 | 172,040 (+0.00%) | 5,054,163 | 5,054,163 (+0.00%) | 1,375,880 | 1,375,880 (+0.00%) |
| block_22244135 (blobs DA) | process_block | 188,902,256 | 188,902,256 (+0.00%) | 148,522,128 | 148,522,128 (+0.00%) | 174,090 | 174,090 (+0.00%) | 8,085,096 | 8,085,096 (+0.00%) | 1,313,576 | 1,313,576 (+0.00%) |

Block-level sub-phases

| Benchmark | Symbol | Base Eff | Head Eff (%) | Base Raw | Head Raw (%) | Base Blake | Head Blake (%) | Base Bigint | Head Bigint (%) | Base Keccak | Head Keccak (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| block_19299001 (blobs DA) | blob_versioned_hash | 49,568,581 | 49,568,581 (+0.00%) | 37,472,713 | 37,472,713 (+0.00%) | 3,710 | 3,710 (+0.00%) | 3,009,127 | 3,009,127 (+0.00%) | 0 | 0 (+0.00%) |
| block_22244135 (blobs DA) | blob_versioned_hash | 49,849,981 | 49,849,981 (+0.00%) | 37,693,449 | 37,693,449 (+0.00%) | 2,050 | 2,050 (+0.00%) | 3,030,933 | 3,030,933 (+0.00%) | 0 | 0 (+0.00%) |
| block_19299001 (blobs DA) | da_commitment | 1,871,973 | 1,871,973 (+0.00%) | 1,783,013 | 1,783,013 (+0.00%) | 5,560 | 5,560 (+0.00%) | 0 | 0 (+0.00%) | 0 | 0 (+0.00%) |
| block_19299001 (keccak DA) | da_commitment | 2,675,608 | 2,675,608 (+0.00%) | 2,134,944 | 2,134,944 (+0.00%) | 5,560 | 5,560 (+0.00%) | 0 | 0 (+0.00%) | 112,926 | 112,926 (+0.00%) |
| block_22244135 (blobs DA) | da_commitment | 1,138,910 | 1,138,910 (+0.00%) | 1,087,390 | 1,087,390 (+0.00%) | 3,220 | 3,220 (+0.00%) | 0 | 0 (+0.00%) | 0 | 0 (+0.00%) |
| block_22244135 (keccak DA) | da_commitment | 1,590,322 | 1,590,322 (+0.00%) | 1,292,182 | 1,292,182 (+0.00%) | 3,220 | 3,220 (+0.00%) | 0 | 0 (+0.00%) | 61,655 | 61,655 (+0.00%) |
| block_19299001 (keccak DA) | run_tx_loop | 192,093,394 | 192,092,540 (-0.00%) | 144,007,042 | 144,006,188 (-0.00%) | 316,840 | 316,840 (+0.00%) | 7,681,862 | 7,681,862 (+0.00%) | 3,072,366 | 3,072,366 (+0.00%) |
| block_22244135 (keccak DA) | run_tx_loop | 128,127,192 | 128,127,192 (+0.00%) | 100,839,992 | 100,839,992 (+0.00%) | 115,300 | 115,300 (+0.00%) | 5,054,163 | 5,054,163 (+0.00%) | 1,306,437 | 1,306,437 (+0.00%) |
| block_19299001 (blobs DA) | state_commitment_update | 13,043,057 | 13,043,057 (+0.00%) | 11,918,097 | 11,918,097 (+0.00%) | 70,310 | 70,310 (+0.00%) | 0 | 0 (+0.00%) | 0 | 0 (+0.00%) |
| block_19299001 (keccak DA) | state_commitment_update | 13,043,045 | 13,043,045 (+0.00%) | 11,918,085 | 11,918,085 (+0.00%) | 70,310 | 70,310 (+0.00%) | 0 | 0 (+0.00%) | 0 | 0 (+0.00%) |
| block_22244135 (blobs DA) | state_commitment_update | 7,494,676 | 7,494,676 (+0.00%) | 6,841,556 | 6,841,556 (+0.00%) | 40,820 | 40,820 (+0.00%) | 0 | 0 (+0.00%) | 0 | 0 (+0.00%) |
| block_22244135 (keccak DA) | state_commitment_update | 7,493,320 | 7,493,320 (+0.00%) | 6,840,200 | 6,840,200 (+0.00%) | 40,820 | 40,820 (+0.00%) | 0 | 0 (+0.00%) | 0 | 0 (+0.00%) |
| block_19299001 (keccak DA) | system_init | 36,786 | 36,786 (+0.00%) | 36,786 | 36,786 (+0.00%) | 0 | 0 (+0.00%) | 0 | 0 (+0.00%) | 0 | 0 (+0.00%) |
| block_22244135 (keccak DA) | system_init | 36,786 | 36,786 (+0.00%) | 36,786 | 36,786 (+0.00%) | 0 | 0 (+0.00%) | 0 | 0 (+0.00%) | 0 | 0 (+0.00%) |

Precompiles test-crate bench (synthetic workload, all labels)

| Benchmark | Symbol | Base Eff | Head Eff (%) | Base Raw | Head Raw (%) | Base Blake | Head Blake (%) | Base Bigint | Head Bigint (%) | Base Keccak | Head Keccak (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| precompiles | bn254_ecadd | 53,315 | 53,315 (+0.00%) | 47,863 | 47,863 (+0.00%) | 0 | 0 (+0.00%) | 1,363 | 1,363 (+0.00%) | 0 | 0 (+0.00%) |
| precompiles | bn254_ecmul | 731,892 | 731,892 (+0.00%) | 567,704 | 567,704 (+0.00%) | 0 | 0 (+0.00%) | 41,047 | 41,047 (+0.00%) | 0 | 0 (+0.00%) |
| precompiles | bn254_pairing | 71,468,694 | 71,464,456 (-0.01%) | 56,940,550 | 56,936,312 (-0.01%) | 0 | 0 (+0.00%) | 3,632,036 | 3,632,036 (+0.00%) | 0 | 0 (+0.00%) |
| precompiles | da_commitment | 16,431 | 16,431 (+0.00%) | 13,355 | 13,355 (+0.00%) | 30 | 30 (+0.00%) | 0 | 0 (+0.00%) | 649 | 649 (+0.00%) |
| precompiles | ecrecover | 367,587 | 369,109 (+0.41%) | 239,839 | 240,921 (+0.45%) | 0 | 0 (+0.00%) | 31,288 | 31,398 (+0.35%) | 649 | 649 (+0.00%) |
| precompiles | id | 925 | 925 (+0.00%) | 925 | 925 (+0.00%) | 0 | 0 (+0.00%) | 0 | 0 (+0.00%) | 0 | 0 (+0.00%) |
| precompiles | keccak | 31,674 | 31,674 (+0.00%) | 10,902 | 10,902 (+0.00%) | 0 | 0 (+0.00%) | 1 | 1 (+0.00%) | 5,192 | 5,192 (+0.00%) |
| precompiles | modexp | 31,888,536 | 31,888,536 (+0.00%) | 21,230,716 | 21,230,716 (+0.00%) | 0 | 0 (+0.00%) | 2,664,455 | 2,664,455 (+0.00%) | 0 | 0 (+0.00%) |
| precompiles | p256_verify | 747,278 | 747,278 (+0.00%) | 468,586 | 468,586 (+0.00%) | 0 | 0 (+0.00%) | 69,673 | 69,673 (+0.00%) | 0 | 0 (+0.00%) |
| precompiles | process_block | 144,531,964 | 144,523,277 (-0.01%) | 114,924,800 | 114,929,437 (+0.00%) | 5,350 | 5,340 (-0.19%) | 7,328,471 | 7,325,180 (-0.04%) | 51,920 | 51,920 (+0.00%) |
| precompiles | process_transaction | 72,057,106 | 72,057,001 (-0.00%) | 57,307,766 | 57,313,125 (+0.01%) | 160 | 160 (+0.00%) | 3,664,629 | 3,663,263 (-0.04%) | 22,066 | 22,066 (+0.00%) |
| precompiles | ripemd | 8,010 | 8,010 (+0.00%) | 8,010 | 8,010 (+0.00%) | 0 | 0 (+0.00%) | 0 | 0 (+0.00%) | 0 | 0 (+0.00%) |
| precompiles | run_tx_loop | 144,064,544 | 144,057,512 (-0.00%) | 114,573,848 | 114,579,980 (+0.01%) | 180 | 180 (+0.00%) | 7,328,471 | 7,325,180 (-0.04%) | 43,483 | 43,483 (+0.00%) |
| precompiles | sha256 | 13,315 | 13,315 (+0.00%) | 13,315 | 13,315 (+0.00%) | 0 | 0 (+0.00%) | 0 | 0 (+0.00%) | 0 | 0 (+0.00%) |
| precompiles | state_commitment_update | 187,859 | 188,367 (+0.27%) | 148,179 | 148,687 (+0.34%) | 2,480 | 2,480 (+0.00%) | 0 | 0 (+0.00%) | 0 | 0 (+0.00%) |
| precompiles | system_init | 41,514 | 41,514 (+0.00%) | 41,514 | 41,514 (+0.00%) | 0 | 0 (+0.00%) | 0 | 0 (+0.00%) | 0 | 0 (+0.00%) |
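
For reading the `(%)` columns, here is a hedged sketch of a base-to-head percentage delta that reproduces the figures shown. `pct_delta` and `fmt_delta` are hypothetical names and are not a port of the `pct` / `fmt_pct` helpers in `benchlib.py`; in particular, treating a zero base as a 0% change is an assumption inferred from the `0 0 (+0.00%)` cells.

```rust
/// Relative change from base to head, in percent.
/// Assumption: a zero base is reported as 0.0 rather than as an error.
fn pct_delta(base: u64, head: u64) -> f64 {
    if base == 0 {
        return 0.0;
    }
    (head as f64 - base as f64) / base as f64 * 100.0
}

/// Signed, two-decimal rendering such as "+0.41%" or "-0.01%".
fn fmt_delta(base: u64, head: u64) -> String {
    format!("{:+.2}%", pct_delta(base, head))
}
```

For example, the `ecrecover` effective cycles move from 367,587 to 369,109, a delta of 1,522 cycles or about +0.41%, while the `bn254_pairing` change of 4,238 cycles on roughly 71.5 million rounds to -0.01%.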

Per-opcode

Per-precompile

Per-precompile per-execution ratios (head)
cycles = effective (raw + Blake×16 + BigInt×4 + Keccak×4)
precompile                count    med c/g    p95 c/g    p99 c/g    max c/g    med n/g    p95 n/g    p99 n/g    max n/g
------------------------------------------------------------------------------------------------------------------------
modexp                      105       70.6      713.4     2846.8     2847.5      300.0     1200.3     4814.0     4814.0
point_eval                    2     1025.1     1025.1     1025.1     1025.1     1262.1     1262.1     1262.1     1262.1
blake2f                       2      803.7      803.7      803.7      803.7        0.0        0.0        0.0        0.0
ecadd                        57      335.9      358.4      360.0      360.0      350.7      350.7      350.7      350.7
bls12_pairing_check           2      217.2      217.2      217.2      217.2        0.0        0.0        0.0        0.0
ecpairing                    31      168.4      185.6      185.6      185.6      398.2      428.6      428.6      428.6
keccak                     2497      111.7      126.6      139.3      150.6      478.8      558.6      626.8      684.2
ecmul                        37      119.0      124.1      126.5      126.5      127.3      127.3      127.3      127.3
ecrecover                    59      119.1      122.3      123.5      123.5      174.0      174.0      174.0      174.0
sha256                        4       68.4      123.3      123.3      123.3       80.6      131.5      131.5      131.5
p256_verify                  16      107.3      108.3      108.3      108.3      113.6      113.6      113.6      113.6
bls12_g1msm                   2      100.3      100.3      100.3      100.3        0.0        0.0        0.0        0.0
bls12_g2msm                   2       88.1       88.1       88.1       88.1        0.0        0.0        0.0        0.0
bls12_g2add                   2       45.0       45.0       45.0       45.0        0.0        0.0        0.0        0.0
identity                      5       22.7       34.3       34.3       34.3       31.4       48.1       48.1       48.1
bls12_g1add                   2       28.1       28.1       28.1       28.1        0.0        0.0        0.0        0.0
ripemd160                     4        4.4        7.4        7.4        7.4        8.1       13.1       13.1       13.1

0xVolosnikov force-pushed the vv/bench-precompiles branch 2 times, most recently from f58aaf1 to 8b5affe May 18, 2026 13:41
Base automatically changed from vv/bench-precompiles to draft-0.4.0 May 18, 2026 14:51