Skip to content

fix(l1): close EIP-8025 stateless witness validation gaps (re-enable 9 zkevm tests)#6541

Open
avilagaston9 wants to merge 6 commits intofeat/l1-ef-tests-bump-zkevm-fixtures-v0.3.3from
fix/l1-stateless-witness-validation-gaps
Open

fix(l1): close EIP-8025 stateless witness validation gaps (re-enable 9 zkevm tests)#6541
avilagaston9 wants to merge 6 commits intofeat/l1-ef-tests-bump-zkevm-fixtures-v0.3.3from
fix/l1-stateless-witness-validation-gaps

Conversation

@avilagaston9
Copy link
Copy Markdown
Contributor

Motivation

PR #6527 widened stateless coverage to all for_amsterdam fixtures and skipped 9
validation_* tests that exposed gaps in ethrex's witness handling. This PR
closes those gaps so the 9 tests pass.

Description

Four spec-aligned changes (refs to EELS execution-specs@projects/zkevm and
geth core/stateless/database.go are inline in the code):

  • Toleranceinto_execution_witness and from_witness drop entries that
    fail RLP-decode instead of failing the whole conversion. Bad/extra entries
    can't be looked up by hash; if execution requires one, the trie or BLOCKHASH
    walk errors there.
  • Header chain contiguityfrom_witness rejects when
    headers[i].parent_hash != keccak(headers[i-1]). Mirrors EELS
    validate_headers.
  • Codes completenessget_account_code / get_code_metadata error on
    missing bytecode instead of silently returning empty code.
  • Generator alignmentgenerate_witness_for_blocks now emits ancestors in
    ascending order (was descending). Required so ethrex-generated witnesses
    satisfy the contiguity check above.

How to test

make -C tooling/ef_tests/blockchain test-stateless

Expected: 8720 passed / 0 failed / 0 ignored, ~155 s. test-levm unaffected.

Checklist

  • Updated STORE_SCHEMA_VERSION — N/A

so the resulting list satisfies `headers[i].parent_hash == keccak(headers[i-1])`,
matching the EELS contract for `validate_headers`. The generator walked the
chain backward (newest -> oldest), so reverse the byte list before returning.

This is a no-op for current consumers because the headers are stored in a
`BTreeMap<u64, BlockHeader>` keyed by number, but it makes the witness valid
for any spec-conformant stateless verifier and is a precondition for adding
the EIP-8025 contiguity check on the consumer side.

Reference: https://github.com/ethereum/execution-specs/blob/projects/zkevm/src/ethereum/forks/amsterdam/stateless.py#L171-L191
…teness:

- Tolerance: when decoding state nodes and ancestor headers in `into_execution_witness`
  and `from_witness`, drop entries that fail to RLP-decode instead of failing the
  whole conversion. A bad/extra entry cannot be looked up by hash anyway; if the
  trie walk or BLOCKHASH path actually requires the dropped entry, the lookup
  fails explicitly there. Mirrors EELS `witness_state.build_node_db` and geth
  `MakeHashDB`, which both store entries keyed by hash without pre-validation.

- Header chain contiguity: in `from_witness`, walk the header byte list in order
  and reject when `headers[i].parent_hash != keccak(headers[i-1])`. A reordered
  or fragmented header chain is not a valid witness even if the by-number
  lookup would otherwise resolve to the right header. Mirrors EELS
  `stateless.validate_headers`. Malformed entries are treated as a chain break
  (subsequent headers won't satisfy the parent_hash check), preserving the
  tolerance behavior for blocks that do not actually request the bad header.

- Codes completeness: `get_account_code` and `get_code_metadata` now error on
  missing bytecode instead of silently defaulting to empty code. EIP-8025 mandates
  that a stateless executor reaching a code lookup whose hash is not in the witness
  treat the witness as incomplete and reject. Matches EELS `witness_state.get_code`
  (raises KeyError on miss) and geth's documented "bytecode lookup will error on
  junk" model.

References:
- https://github.com/ethereum/execution-specs/blob/projects/zkevm/src/ethereum/forks/amsterdam/witness_state.py#L37-L42
- https://github.com/ethereum/execution-specs/blob/projects/zkevm/src/ethereum/forks/amsterdam/witness_state.py#L204-L212
- https://github.com/ethereum/execution-specs/blob/projects/zkevm/src/ethereum/forks/amsterdam/stateless.py#L171-L191
- https://github.com/ethereum/go-ethereum/blob/master/core/stateless/database.go#L26-L67
…ed under

`feature = "stateless"`. They now pass with the witness-consumer alignment
(tolerance + contiguity + codes-completeness) and the generator-side ascending
header order:

  - validation_state_extra_unused_trie_node
  - validation_headers_malformed_rlp_header
  - validation_headers_missing_oldest_blockhash_ancestor
  - validation_headers_missing_parent_header
  - validation_headers_non_contiguous_chain
  - validation_codes_missing_delegated_code_on_insufficient_balance_call
  - validation_codes_missing_external_code_read_target
  - validation_codes_missing_redelegation_old_marker
  - validation_codes_missing_sender_delegation_marker

`make test-stateless`: 8720 passed / 0 failed / 0 ignored (~155 s).
`make test-levm`: unchanged.
…; keep

the spec link but drop the prose. No behavior change.
@github-actions
Copy link
Copy Markdown

Lines of code report

Total lines added: 2
Total lines removed: 4
Total lines changed: 6

Detailed view
+-------------------------------------------------------+-------+------+
| File                                                  | Lines | Diff |
+-------------------------------------------------------+-------+------+
| ethrex/crates/blockchain/blockchain.rs                | 2489  | +2   |
+-------------------------------------------------------+-------+------+
| ethrex/crates/common/types/block_execution_witness.rs | 566   | -4   |
+-------------------------------------------------------+-------+------+

…nversion

comments — keep the prose and the spec links, just remove the symbol-plus-word
labels. Restore the `This is an optimized path for EXTCODESIZE opcode.` doc
line on `get_code_metadata` that was dropped while trimming. No behavior
change.
…ader.parent_hash != expected_parent { ... } }` in `from_witness` into a single let-chain `if`. Required to satisfy `clippy::collapsible_if` under `-D warnings` on rust 1.91 — the CI Lint and Lint L2 jobs were failing on this. Behavior unchanged.
@avilagaston9 avilagaston9 marked this pull request as ready for review April 28, 2026 16:52
@avilagaston9 avilagaston9 requested a review from a team as a code owner April 28, 2026 16:52
@ethrex-project-sync ethrex-project-sync Bot moved this to In Review in ethrex_l1 Apr 28, 2026
@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented Apr 28, 2026

Greptile Summary

This PR closes four EIP-8025 stateless-witness validation gaps, re-enabling 9 previously-skipped validation_* ef-tests: (1) malformed trie nodes/headers are now silently dropped instead of aborting conversion; (2) from_witness validates that the header list forms a contiguous chain in list order; (3) get_account_code / get_code_metadata now hard-error on missing bytecode; and (4) the witness generator emits ancestor headers in ascending order to satisfy the new contiguity check.

Confidence Score: 4/5

Safe to merge; only P2 style issues found, core logic is correct and all targeted tests pass.

All findings are P2 (a misleading comment and using Custom instead of the typed NoncontiguousBlockHeaders variant). The spec-aligned logic changes are well-reasoned and backed by EELS references.

crates/common/types/block_execution_witness.rs — comment accuracy and error variant usage in from_witness.

Important Files Changed

Filename Overview
crates/common/types/block_execution_witness.rs Four EIP-8025 changes: tolerance for malformed trie nodes/headers, contiguous-chain validation in from_witness, and hard errors on missing bytecode. Minor: a misleading comment on the malformed-entry path and a Custom error where the typed NoncontiguousBlockHeaders variant already exists.
crates/blockchain/blockchain.rs Adds block_headers_bytes.reverse() in both generate_witness_for_blocks code paths so the emitted ancestor list is ascending, satisfying the new contiguity check in from_witness.
tooling/ef_tests/blockchain/tests/all.rs Removes the stateless-specific skip list (9 tests) and simplifies to a single #[cfg(not(feature = "sp1"))] empty slice, re-enabling all formerly-skipped validation tests.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["RpcExecutionWitness\n(from RPC / fixture)"] -->|"into_execution_witness()"| B["ExecutionWitness\n(block_headers_bytes in ascending order)"]
    GEN["generate_witness_for_blocks\n(walks chain backward)"] -->|"reverse() → ascending"| B

    B -->|"from_witness()"| C{For each header bytes}
    C -->|"decode fails"| D["prev_hash = None\nskip (tolerance)"]
    D --> C
    C -->|"decode ok\nprev_hash != None AND\nheader.parent_hash != prev_hash"| E["❌ NoncontiguousBlockHeaders\nerror"]
    C -->|"decode ok\nchain link valid"| F["insert into block_headers\nprev_hash = keccak(bytes)"]
    F --> C
    C -->|"done"| G["GuestProgramState"]

    G --> H["get_account_code(hash)"]
    H -->|"hash missing in codes_hashed"| I["❌ Database error\n(was: silent empty code)"]
    H -->|"hash found"| J["✅ Return Code"]
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: crates/common/types/block_execution_witness.rs
Line: 330-334

Comment:
**Misleading inline comment — check is skipped, not failed**

The comment says "the next parent_hash check fails," but when `prev_hash` is set to `None`, the subsequent `if let Some(expected_parent) = prev_hash` guard in the next iteration simply does not fire — the check is bypassed, not triggered-and-failed. A validator reading this could mistake the skipping for active rejection.

```suggestion
            let Ok(header) = BlockHeader::decode(bytes.as_ref()) else {
                // Malformed entry: skip it and reset the chain cursor so the
                // next decodable header is accepted without a parent-hash check.
                prev_hash = None;
                continue;
            };
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: crates/common/types/block_execution_witness.rs
Line: 337-340

Comment:
**Use the existing `NoncontiguousBlockHeaders` error variant**

There is already a typed variant `GuestProgramStateError::NoncontiguousBlockHeaders` (defined a few lines above) with the message "Non-contiguous block headers (there's a gap in the block headers list)". Using `Custom(...)` here bypasses that variant and makes error matching on the call-site harder.

```suggestion
            {
                return Err(GuestProgramStateError::NoncontiguousBlockHeaders);
            }
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "Collapse the nested `if let Some(expecte..." | Re-trigger Greptile

Comment on lines +330 to +334
let Ok(header) = BlockHeader::decode(bytes.as_ref()) else {
// Malformed entry is a chain break; the next parent_hash check fails.
prev_hash = None;
continue;
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Misleading inline comment — check is skipped, not failed

The comment says "the next parent_hash check fails," but when prev_hash is set to None, the subsequent if let Some(expected_parent) = prev_hash guard in the next iteration simply does not fire — the check is bypassed, not triggered-and-failed. A validator reading this could mistake the skipping for active rejection.

Suggested change
let Ok(header) = BlockHeader::decode(bytes.as_ref()) else {
// Malformed entry is a chain break; the next parent_hash check fails.
prev_hash = None;
continue;
};
let Ok(header) = BlockHeader::decode(bytes.as_ref()) else {
// Malformed entry: skip it and reset the chain cursor so the
// next decodable header is accepted without a parent-hash check.
prev_hash = None;
continue;
};
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/common/types/block_execution_witness.rs
Line: 330-334

Comment:
**Misleading inline comment — check is skipped, not failed**

The comment says "the next parent_hash check fails," but when `prev_hash` is set to `None`, the subsequent `if let Some(expected_parent) = prev_hash` guard in the next iteration simply does not fire — the check is bypassed, not triggered-and-failed. A validator reading this could mistake the skipping for active rejection.

```suggestion
            let Ok(header) = BlockHeader::decode(bytes.as_ref()) else {
                // Malformed entry: skip it and reset the chain cursor so the
                // next decodable header is accepted without a parent-hash check.
                prev_hash = None;
                continue;
            };
```

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +337 to +340
{
return Err(GuestProgramStateError::Custom(
"witness headers are not contiguous".to_string(),
));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Use the existing NoncontiguousBlockHeaders error variant

There is already a typed variant GuestProgramStateError::NoncontiguousBlockHeaders (defined a few lines above) with the message "Non-contiguous block headers (there's a gap in the block headers list)". Using Custom(...) here bypasses that variant and makes error matching on the call-site harder.

Suggested change
{
return Err(GuestProgramStateError::Custom(
"witness headers are not contiguous".to_string(),
));
{
return Err(GuestProgramStateError::NoncontiguousBlockHeaders);
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/common/types/block_execution_witness.rs
Line: 337-340

Comment:
**Use the existing `NoncontiguousBlockHeaders` error variant**

There is already a typed variant `GuestProgramStateError::NoncontiguousBlockHeaders` (defined a few lines above) with the message "Non-contiguous block headers (there's a gap in the block headers list)". Using `Custom(...)` here bypasses that variant and makes error matching on the call-site harder.

```suggestion
            {
                return Err(GuestProgramStateError::NoncontiguousBlockHeaders);
            }
```

How can I resolve this? If you propose a fix, please make it concise.

@github-actions
Copy link
Copy Markdown

🤖 Claude Code Review

Now I have a full picture of the PR. Here is my review:


PR #6541 Review: fix(l1): close EIP-8025 stateless witness validation gaps

Overview

This PR closes four stateless witness validation gaps that were preventing 9 validation_* EF-test fixtures from passing under the stateless feature flag. The changes are well-motivated, commit messages are thorough, and spec references (EELS + geth) are provided inline. All 9 previously-skipped tests are re-enabled.


Correctness Analysis

1. Header contiguity check — comment is inaccurate (minor bug)

block_execution_witness.rs, from_witness:

let Ok(header) = BlockHeader::decode(bytes.as_ref()) else {
    // Malformed entry is a chain break; the next parent_hash check fails.
    prev_hash = None;
    continue;
};
if let Some(expected_parent) = prev_hash
    && header.parent_hash != expected_parent
{
    return Err(...);
}

Setting prev_hash = None means the if let Some(expected_parent) guard on the next valid header evaluates to false, so the check is skipped, not "failed" as the comment claims. The real behavior is: a malformed entry resets the chain anchor, and the subsequent valid header starts a new unverified segment.

The commit message repeats the same inaccuracy: "subsequent headers won't satisfy the parent_hash check." They actually bypass it. The behavior is spec-correct (tolerance) but the comment will mislead future readers. Suggested replacement:

// Drop the malformed entry; reset the chain anchor so the next valid
// header starts a new contiguous segment (EIP-8025 tolerance).
prev_hash = None;

2. Malformed-entry reset creates a potential contiguity bypass window

As a consequence of Point 1: a witness shaped as [valid_A, malformed_blob, valid_B_not_contiguous_with_A] passes the contiguity check because the malformed blob resets prev_hash. Whether the spec intends this is not entirely clear from the EELS link cited. Given that validation_headers_non_contiguous_chain (a pure-valid-header case) is now correctly rejected and validation_headers_malformed_rlp_header is correctly accepted, this is likely intentional — but worth a brief note or a reference to the EELS line that makes the "chain-restart-on-junk" behavior explicit.

3. into_execution_witness — removal of the 0x80 (RLP null) special case

Old code:

if b == Bytes::from_static(&[0x80]) {
    return None;
}
let hash = keccak(&b);
Some(Node::decode(&b).map(|node| (hash, node)))

New code:

let node = Node::decode(&b).ok()?;
Some((keccak(&b), node))

The 0x80 case is now implicitly handled because Node::decode on [0x80] (RLP null) returns an error, so ok()? drops it. This is cleaner, but the old comment explained why some implementations emit 0x80 (other implementations allow a Null node). Dropping that comment removes useful cross-client context. Consider keeping one line like:

// EIP-8025 tolerance: drop entries that don't decode (e.g. RLP-null 0x80 from some clients).

4. get_account_code / get_code_metadata — error variant mismatch

The new missing-bytecode paths use GuestProgramStateError::Database, while the header contiguity failure uses GuestProgramStateError::Custom. If the caller dispatches on error type, this inconsistency could matter. Worth confirming these are intentionally different, or aligning them.


Code Quality

5. Duplicated block_headers_bytes.reverse() in blockchain.rs

The identical fix appears at two call sites (~line 1439 and ~line 1676). This is fine for now, but the accompanying comments (and the EELS URL) are copy-pasted verbatim. If a third call site is added later, it's easy to miss. Not blocking, but worth noting.

6. hex::encode(code_hash) on H256

H256 likely already implements Display / LowerHex, so format!("{code_hash:x}") or format!("{code_hash:#x}") may avoid the hex::encode dependency. Minor.


Test Coverage

The test simplification in all.rs is correct: the #[cfg(feature = "stateless")] override with 9 skips is removed, and the previously separate #[cfg(not(any(feature = "sp1", feature = "stateless")))] empty slice collapses into #[cfg(not(feature = "sp1"))]. The result is that the stateless path now runs with the same (empty) extra-skip list as the baseline, which is exactly right.

No new unit tests are added for the individual behaviors (contiguity check, decode tolerance, missing-code error). Given that the EF fixture suite covers these paths, this is acceptable — but a targeted unit test for from_witness contiguity rejection would be a cheap safety net for regressions.


Summary

Area Verdict
Spec alignment (EELS / geth) Correct
EF test re-enablement Correct
Ascending header order fix Correct
Contiguity check logic Correct, comment misleading
Tolerance (drop malformed entries) Correct
Codes completeness (error on miss) Correct
Comment clarity One inaccurate comment (Point 1)
Error variant consistency Minor inconsistency (Point 4)

The logic is sound and the spec alignment looks right. The main ask before merge is fixing the comment in the malformed-header branch — it currently says the next check "fails" when it is actually skipped.


Automated review by Claude (Anthropic) · sonnet · custom prompt

@github-actions
Copy link
Copy Markdown

🤖 Codex Code Review

  1. GuestProgramState::from_witness still lets some non-contiguous header sets through. At crates/common/types/block_execution_witness.rs:329 the new check only enforces parent_hash == keccak(previous header bytes); it never enforces header.number == prev.number + 1. On top of that, a malformed entry resets prev_hash at crates/common/types/block_execution_witness.rs:330, so the next decoded header is not checked against the previous chain at all. That would still accept witnesses like [98, <garbage>, 100] or forged-number chains like [98, 100(parent=hash98)] unless block 99 is actually queried. The fallback check does exist, but its Err(NoncontiguousBlockHeaders) path is ignored in crates/guest-program/src/common/execution.rs:72. I’d add an explicit number-adjacency check here, or propagate that error instead of discarding it.

  2. The header-order change is not backward-compatible with cached witnesses. Newly generated witnesses are reversed into ascending order at crates/blockchain/blockchain.rs:1439 and crates/blockchain/blockchain.rs:1676, but cached RPC witnesses are still served back verbatim from crates/storage/store.rs:2025 via crates/networking/rpc/debug/execution_witness.rs:84. RpcExecutionWitness::into_execution_witness also preserves self.headers unchanged at crates/common/types/block_execution_witness.rs:222. After an upgrade, the same node can therefore return old descending witnesses for cached blocks and new ascending witnesses for fresh ones, and the new stateless path will reject the cached ones as “not contiguous”. This needs cache invalidation/migration or read-side normalization.

Assumption: I’m reading the EIP-8025 intent here as “tolerate unused malformed entries, but require the usable header chain itself to be contiguous”. Under that interpretation, Item 1 is still a real validation gap.

I couldn’t run cargo check here because rustup tried to write under /home/runner/.rustup, which is read-only in this sandbox. Otherwise, the stricter missing-bytecode handling looks directionally correct.


Automated review by OpenAI Codex · gpt-5.4 · custom prompt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

L1 Ethereum client

Projects

Status: In Review

Development

Successfully merging this pull request may close these issues.

1 participant