fix(modexp): skip leading zero bits in first exponent byte#618
Open
antoniolocascio-bot wants to merge 13 commits into
Open
fix(modexp): skip leading zero bits in first exponent byte#618antoniolocascio-bot wants to merge 13 commits into
antoniolocascio-bot wants to merge 13 commits into
Conversation
## What ❔ Reimplement the hints for ecrecover from ethproofs. Instead of having a new full implementation of ecrecover, this PR introduces "hooks" for 3 secp256k1 field operations. These hooks have two implmentations: default, where the implementation is straightforward (same implementation as before this PR) and oracle-based, where we use the new callable oracles. This way, most of the logic for ecrecover is reused. Given that this is a critical part of the system, the PR includes: * PBT for "good" case (showing equivalence when using the right oracle) * Tests for the "bad" case (showing panics if the oracle lies) * Modifies the fuzz target for ecrecover to compare the forward and oracle runs. I've ran the fuzzer for over 1h without finding issues. <!-- What are the changes this PR brings about? --> <!-- Example: This PR adds a PR template to the repo. --> <!-- (For bigger PRs adding more context is appreciated) --> ## Why ❔ <!-- Why are these changes done? What goal do they contribute to? What are the principles behind them? --> <!-- The `Why` has to be clear to non-Matter Labs entities running their own ZK Chain --> <!-- Example: PR templates ensure PR reviewers, observers, and future iterators are in context about the evolution of repos. --> ## Is this a breaking change? - [ ] Yes - [ ] No ## Checklist <!-- Check your PR fulfills the following items. --> <!-- For draft PRs check the boxes as you complete them. --> - [ ] PR title corresponds to the body of PR (we generate changelog entries from PRs). - [ ] Tests for the changes have been added / updated. - [ ] Documentation comments have been added / updated. - [ ] Code has been formatted.
…coding (matter-labs#572) ## What ❔ Remove the `artifacts_len` field and artifact bytes from the pubdata diff compression encoding: - **Format 0** (publish bytecode): remove `artifacts_len` field (4 bytes), publish only the raw code bytes (`unpadded_code_len`) instead of the full bytecode blob (code + padding + artifacts) - **Format 4** (not_publish_bytecode): remove `artifacts_len` from the length calculation — fixes a pre-existing bug where `diff_compression_length()` counted +4 bytes for `artifacts_len` but `diff_compression()` never actually wrote it - Bump `PUBDATA_ENCODING_VERSION` from 1 to 2 - Update format documentation comments and unit test ## Why ❔ Bytecode artifacts (jump table, etc.) are fully deterministic from the raw bytecode and do not need to be published in pubdata. Removing them reduces pubdata size for contract deployments without losing any information — the receiver can recompute artifacts from the code. ## Is this a breaking change? - [x] Yes - [ ] No This changes the pubdata encoding format (version 1 → 2). The settlement layer / receiver must be updated to parse the new format. ## Checklist - [x] PR title corresponds to the body of PR (we generate changelog entries from PRs). - [x] Tests for the changes have been added / updated. - [x] Documentation comments have been added / updated. - [x] Code has been formatted. --------- Co-authored-by: Claude Code <claude-code@anthropic.com>
## What ❔ Per-opcode EVM benchmarking infrastructure that collects gas, native resource, and RISC-V cycle stats for every EVM opcode execution. Enables computing cycles/gas and native/gas ratios to guide proving cost optimizations. **Components:** - `EvmOpcodeStatsTracer` — forward-mode tracer collecting per-opcode gas, native, count with min/max/median - Per-opcode cycle markers in the interpreter loop (`opcode_start!/opcode_end!`) — aggregated in `print_cycle_markers()` with min/max/median - Per-execution sample dump mode — write `(gas, native)` and `cycles` per execution for detailed ratio analysis - `join_samples.py` — joins tracer + cycle samples, computes per-execution cycles/gas with p50/p95/p99/max - `compare_opcode_stats.py` — CI integration: compact diff table when per-opcode stats change - `visualize_opcode_stats.py` — total cycle bar chart + sorted cycles/gas ratio curves - Integration into `eth_runner` single-run and CI bench workflow **Usage:** ```bash # Basic run (prints per-opcode stats table) bash bench_scripts/bench.sh quick # Full per-execution analysis OPCODE_SAMPLES_DIR=samples OPCODE_CYCLE_SAMPLES_DIR=cycle_samples bash bench_scripts/bench.sh quick python3 bench_scripts/join_samples.py samples/ cycle_samples/ --summary --out-dir joined/ python3 bench_scripts/visualize_opcode_stats.py joined/ --out-dir charts/ ``` **Benchmark impact:** +0.34% effective cycles with benchmarking features enabled. Zero overhead in production builds (all behind `#[cfg(feature = "cycle_marker")]`). ## Why ❔ To compute per-opcode `cycles/gas` and `native/gas` ratios for verifying the native resource model against actual RISC-V proving costs. ## Is this a breaking change? - [ ] Yes - [x] No All code is behind feature flags or in benchmarking-only paths. Production proving builds are unaffected — verified by audit of all changed files and benchmark comparison. ## Checklist - [x] PR title corresponds to the body of PR (we generate changelog entries from PRs). - [x] Tests for the changes have been added / updated. - [x] Documentation comments have been added / updated. - [x] Code has been formatted. --------- Co-authored-by: Antonio Locascio <antonio.locascio1@gmail.com> Co-authored-by: vv-dev-ai <vv@matterlabs.dev> Co-authored-by: Claude Code <claude-code@anthropic.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…er-labs#596) ## What Add `compare_opcode_cycles.py` script and wire it into the bench CI workflow so that per-opcode RISC-V cycle diffs are posted on PRs. The bench CI already collected per-opcode cycle stats in `.bench` files on both the base and head sides (via `opcode_start!/opcode_end!` cycle markers), but had no script to compare them. The existing `compare_opcode_stats.py` only compared forward-mode gas/native stats, which don't change when the execution implementation changes without modifying resource accounting (e.g. the custom U256 PR). ## Why Without this, PRs that affect RISC-V execution performance (like the custom U256 migration) show overall effective cycle improvements in the benchmark report but lack visibility into which specific EVM opcodes improved or regressed. This makes it harder to review and reason about performance changes. ## Is this a breaking change? - [ ] Yes - [x] No ## Checklist - [x] PR title corresponds to the body of PR (we generate changelog entries from PRs). - [x] Tests for the changes have been added / updated. - [x] Documentation comments have been added / updated. - [x] Code has been formatted. --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## What ❔ This PR changes the way we compute prover input. Before, we needed to run the slow risc-v simulator to do so. Now, we can do on native architecture. This run is also responsible for computing the pubdata. <!-- What are the changes this PR brings about? --> <!-- Example: This PR adds a PR template to the repo. --> <!-- (For bigger PRs adding more context is appreciated) --> ## Why ❔ <!-- Why are these changes done? What goal do they contribute to? What are the principles behind them? --> <!-- The `Why` has to be clear to non-Matter Labs entities running their own ZK Chain --> <!-- Example: PR templates ensure PR reviewers, observers, and future iterators are in context about the evolution of repos. --> ## Is this a breaking change? - [ ] Yes - [ ] No ## Checklist <!-- Check your PR fulfills the following items. --> <!-- For draft PRs check the boxes as you complete them. --> - [ ] PR title corresponds to the body of PR (we generate changelog entries from PRs). - [ ] Tests for the changes have been added / updated. - [ ] Documentation comments have been added / updated. - [ ] Code has been formatted.
## What ❔ We had an invalid worst case calculation + small cleanup ## Why ❔ <!-- Why are these changes done? What goal do they contribute to? What are the principles behind them? --> <!-- The `Why` has to be clear to non-Matter Labs entities running their own ZK Chain --> <!-- Example: PR templates ensure PR reviewers, observers, and future iterators are in context about the evolution of repos. --> ## Is this a breaking change? - [ ] Yes - [ ] No ## Checklist <!-- Check your PR fulfills the following items. --> <!-- For draft PRs check the boxes as you complete them. --> - [ ] PR title corresponds to the body of PR (we generate changelog entries from PRs). - [ ] Tests for the changes have been added / updated. - [ ] Documentation comments have been added / updated. - [ ] Code has been formatted. --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
feat(ci): parallelize bench workflow
## What ❔ Replace the removed `risc_v_simulator` crate with `riscv_transpiler` from zksync-airbender dev branch (v2). This migrates the entire RISC-V execution and oracle infrastructure to the new VM API. ### Key changes **Dependencies & toolchain:** - `risc_v_simulator` removed from workspace, replaced by `riscv_transpiler` + `common_constants` (git deps pointing to airbender `dev` branch) - Toolchain bumped to `nightly-2026-02-10` in both root and `zksync_os/` (required by airbender v2's `blake2s_u32`) - `zksync_os/reproduce/Dockerfile` updated to match **oracle_provider:** - `ZkEENonDeterminismSource`, `OracleQueryProcessor`, and `ReadWitnessSource` are **no longer generic** over memory type - Processors take `&dyn RamPeek` instead of `&M` where `M: MemorySource` - `DummyMemorySource` now implements `RamPeek` - Implements new `NonDeterminismCSRSource` trait from `riscv_transpiler::vm` (with `write_with_memory_access`, `write_with_memory_access_dyn`) **callable_oracles:** - All query processors (`ArithmeticQuery`, `BlobCommitmentAndProofQuery`, `FieldOpsQuery`, `HashToPrimeSource`, and their `Native*` variants) simplified to non-generic structs - Memory reads use `peek_word(addr)` instead of `memory.get(addr, AccessType, &mut trap)` - Fixed pre-existing overflow check in `read_memory_as_u64` (now uses byte length for bounds check) **zksync_os_runner:** - Rewritten to use `VM::run_basic_unrolled` with `SimpleTape` + `RamWithRomRegion` - Requires `.bin` + `.text` file pair (already produced by `dump_bin.sh` via `objcopy --only-section=.text`) - `DiagnosticsConfig` parameter removed; flamegraph support via feature-gated `run_basic_unrolled_with_flamegraph` - `simulate_witness_tracing` rewritten using `SimpleSnapshotter` **cycle_marker:** - Fully wired to new `CycleMarkerHooks` API from [zksync-airbender#237](matter-labs/zksync-airbender#237) - `print_cycle_markers` now takes a `CycleMarker` argument (scoped via `CycleMarkerHooks::with()`) - Runner uses `VM::<DelegationsCounters, CycleMarkerHooks>` when feature is enabled - Re-exports `CycleMarkerHooks`, `CycleMarker`, `Mark` from `riscv_transpiler::cycle` - Inline asm now uses hardcoded `x0` to match transpiler's strict `csrrw x0, 0x7ff, x0` decoder **forward_system:** - All 12 query processors and system type aliases updated to non-generic oracle types **tests/rig:** - `flamegraph_output` wired through to `run_default_with_flamegraph_path` (replaces old `ProfilerConfig`) - Oracle factory trait methods return non-generic `ZkEENonDeterminismSource` - `run_prover` rewritten to use `execution_utils::unrolled::prove_unrolled_for_machine_configuration_into_program_proof` **Other:** - Removed stabilized `ptr_as_ref_unchecked` feature gate from `evm_interpreter` - Updated `Cargo.lock` for latest `zksync_os_interface` (adds `pubdata_used` to `BlockOutput`) - Rebuilt blake2s test binaries and committed `.text` files - Fixed pre-existing clippy lints (`sort_by_key`, `default_constructed_unit_structs`, `manual_checked_ops`, conditional `P256VerifyErrors` import) ## Why ❔ The `risc_v_simulator` crate was removed from zksync-airbender in [zksync-airbender#233](matter-labs/zksync-airbender#233) and [zksync-airbender#235](matter-labs/zksync-airbender#235). The new `riscv_transpiler` provides a more efficient VM with batch execution, optional JIT compilation (x86_64), and integrated witness snapshotting. This migration is required to stay on the airbender dev branch. ## Is this a breaking change? - [x] Yes - [ ] No **Breaking API changes:** - `zksync_os_runner::run()` and `run_and_get_effective_cycles()` no longer take a `DiagnosticsConfig` parameter - `run_and_get_effective_cycles_from_bytes()` now requires both `img_bytes` and `text_bytes` - `ZkEENonDeterminismSource`, `ReadWitnessSource`, `OracleQueryProcessor` are no longer generic over memory type - `cycle_marker::print_cycle_markers()` now takes a `CycleMarker` argument instead of reading from a global - `MemorySource` trait replaced by `RamPeek` (re-exported from `riscv_transpiler::vm`) ## Checklist - [x] PR title corresponds to the body of PR (we generate changelog entries from PRs). - [x] Tests for the changes have been added / updated. - [x] Documentation comments have been added / updated. - [x] Code has been formatted. --------- Co-authored-by: antoniolocascio <antonio.locascio1@gmail.com>
## What ❔
Move RISC-V binary artifacts from flat files in `zksync_os/` (e.g.
`for_tests.bin`) to a structured `dist/<app>/app.{bin,elf,text}` layout
(e.g. `dist/for_tests/app.bin`).
## Why ❔
This prepares for the airbender-platform migration where `cargo
airbender build` outputs to this layout natively. Splitting this out as
a separate PR keeps the platform migration PR focused on the actual API
changes.
## Changes
- `dump_bin.sh`: output to `dist/<app>/` instead of flat files
- `chain.rs`: add `get_zksync_os_dist_dir()`, update path resolution
- `binary_checker`: update paths to `dist/<app>/app.text`
- `reproduce.sh`: copy from `dist/<app>/app.bin`
- `.gitignore`: add `/dist`
## Is this a breaking change?
- [ ] Yes
- [x] No
The build script output changes but all consumers are updated in this
PR.
## Checklist
- [x] PR title corresponds to the body of PR (we generate changelog
entries from PRs).
- [x] Tests for the changes have been added / updated.
- [x] Documentation comments have been added / updated.
- [x] Code has been formatted.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: vibelyova <265594496+vibelyova@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## What ❔ - Removed unused storage-type markers/bitflags. - Split `storage_types` into focused modules. - Moved `MAX_EVENT_TOPICS` from storage types into system constants and updated imports. ## Why ❔ - Reduces dead code and improves module structure. - Places constants with the correct domain ownership for easier maintenance. ## Is this a breaking change? - [ ] Yes - [x] No ## Checklist - [x] PR title corresponds to the body of PR (we generate changelog entries from PRs). - [ ] Tests for the changes have been added / updated. - [ ] Documentation comments have been added / updated. - [x] Code has been formatted.
## What ❔ Add tests that verify the system correctly validates and rejects invalid/malicious oracle responses. Covers two oracle validation areas: **Block metadata gas limit validation (4 tests):** - `test_block_rejects_excessive_gas_limit` — gas limit just above `MAX_BLOCK_GAS_LIMIT` is rejected - `test_block_accepts_max_gas_limit` — exact boundary value is accepted - `test_block_rejects_u64_max_gas_limit` — extreme overflow value is rejected - `test_empty_block_rejects_excessive_gas_limit` — validation fires before any transactions are processed **DA commitment scheme validation (2 tests):** - `test_da_commitment_scheme_accepts_all_valid_ids` — all valid IDs 0–4 are accepted - `test_da_commitment_scheme_rejects_invalid_ids` — IDs 5, 128, 255 are rejected ## Why ❔ Oracle responses come from an untrusted host and must be validated. These tests confirm that existing validation logic correctly rejects malicious metadata. Gas limit validation prevents resource accounting overflow (ergs = gas × 256). DA commitment scheme validation prevents use of undefined commitment modes. ## Is this a breaking change? - [ ] Yes - [x] No ## Checklist - [x] PR title corresponds to the body of PR (we generate changelog entries from PRs). - [x] Tests for the changes have been added / updated. - [x] Documentation comments have been added / updated. - [x] Code has been formatted. --------- Co-authored-by: Claude Code <claude-code@anthropic.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The forward modexp exponentiation paths (modpow_montgomery and big_wrapping_pow) always iterated all 8 bits of the first nonzero exponent byte, while pricing uses the exponent's adjusted bit length. This caused a bounded metering mismatch where sparse exponents (e.g. 0x01) were charged for fewer bits than actually iterated. Initialize the first-byte mask at the highest set bit instead of always starting at bit 7, aligning execution with pricing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
a788be8 to
eef48f3
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Skip leading zero bits in the first nonzero exponent byte before entering per-bit iteration in
modpow_montgomeryandbig_wrapping_pow.Why
The forward modexp exponentiation paths always iterated all 8 bits of the first nonzero exponent byte (
mask = 1 << 7), while gas pricing uses the exponent's adjusted bit length. For exponents like0x01, pricing charges for 1 bit of work, but execution performed 8 bit-iterations (7 redundant squarings of the identity). This created a bounded metering mismatch.The fix initializes the first-byte mask at the highest set bit (
1 << (7 - leading_zeros)) instead of always at bit 7, so execution work aligns with what's charged.The mismatch is bounded (at most 7 extra no-op squarings per call) and is an optimization, not a correctness issue.
Is this a breaking change?
Checklist