perf(l1): batch receipt retrieval in single read txn #6548
azteca1998 wants to merge 9 commits into main
Conversation
Force-pushed 155c4b4 to f47959c
Benchmark Block Execution Results Comparison Against Main
… iteration

Change RECEIPTS key from RLP-encoded (BlockHash, u64) to raw block_hash (32B) || index (8B big-endian u64). This enables cursor-based prefix iteration by block hash, replacing the previous point-lookup loop.
- Add receipt_key() helper for the new fixed-width key format
- Rewrite get_receipts_for_block_from_index to use prefix_iterator
- Add v1→v2 migration (batch-processes old RLP keys, crash-safe)
- Bump STORE_SCHEMA_VERSION to 2
- Remove benchmark code from the previous iteration
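As a rough illustration of the key layout this commit describes (a sketch, not the PR's actual code — `receipt_key` here is a hypothetical stand-in for the helper it mentions), a fixed-width 40-byte key can be built like this:

```rust
/// Sketch of the fixed-width RECEIPTS key: 32-byte block hash followed
/// by the receipt index as an 8-byte big-endian integer (40 bytes total).
fn receipt_key(block_hash: &[u8; 32], index: u64) -> [u8; 40] {
    let mut key = [0u8; 40];
    key[..32].copy_from_slice(block_hash);
    // Big-endian encoding keeps numeric order under lexicographic byte
    // comparison, so a cursor visits receipts in index order.
    key[32..].copy_from_slice(&index.to_be_bytes());
    key
}

fn main() {
    let hash = [0xab_u8; 32];
    // Lexicographic order of keys matches numeric order of indices.
    assert!(receipt_key(&hash, 2) < receipt_key(&hash, 10));
    println!("key length = {}", receipt_key(&hash, 0).len());
}
```

The big-endian choice is what makes prefix iteration yield receipts in index order; a little-endian index would sort 256 before 1.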
Switch get_all_block_rpc_receipts and get_all_block_receipts from per-receipt point lookups to a single get_receipts_for_block() call, which uses prefix_iterator for cursor-based batch retrieval.
The cursor-based batch retrieval is slower for RPC because iterators bypass RocksDB block cache optimizations that point lookups benefit from. Keep cursor iteration only for p2p (get_receipts_for_block), restore per-receipt get() for the RPC handlers.
eth_getTransactionReceipt previously fetched ALL receipts for a block (N point lookups) just to return one. Now uses cursor iteration with a max_count limit to fetch only receipts 0..=index, stopping the cursor early. For a block with 200 txs and a target at index 10, this fetches 11 receipts instead of 200.
- Add max_count parameter to get_receipts_for_block_from_index
- Add target_index parameter to get_all_block_rpc_receipts
- eth_getTransactionReceipt passes Some(index) to stop early
- eth_getBlockReceipts passes None to fetch all
- get_all_block_receipts uses cursor for raw receipt retrieval
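The early-stop behaviour can be sketched as follows (`fetch_receipts` is an illustrative name, not the PR's API):

```rust
/// Sketch of the max_count early stop: with Some(index + 1) the cursor
/// yields only receipts 0..=index; None drains the whole block.
fn fetch_receipts(cursor: impl Iterator<Item = u32>, max_count: Option<usize>) -> Vec<u32> {
    match max_count {
        // take(n) stops pulling from the cursor after n items,
        // so the underlying scan never touches the rest of the block.
        Some(n) => cursor.take(n).collect(),
        None => cursor.collect(),
    }
}

fn main() {
    // Block with 200 receipts, target transaction at index 10.
    let block = 0..200u32;
    assert_eq!(fetch_receipts(block.clone(), Some(11)).len(), 11);
    // eth_getBlockReceipts passes None and gets everything.
    assert_eq!(fetch_receipts(block, None).len(), 200);
}
```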
Without a RocksDB prefix extractor, prefix_iterator_cf seeks to the correct position but doesn't stop at the prefix boundary. The loop was iterating through the entire remaining TRANSACTION_LOCATIONS table after finding the match, causing eth_getTransactionReceipt to take seconds instead of milliseconds.
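The fix can be illustrated with an ordered map standing in for RocksDB (a sketch under that assumption, not the actual `prefix_iterator_cf` code): the seek lands at the prefix, but the loop itself must break once keys stop matching.

```rust
use std::collections::BTreeMap;

// Sketch of prefix-bounded iteration over an ordered key space.
// range(prefix..) seeks to the first key >= prefix, mimicking an
// iterator seek; without the starts_with break the loop would walk
// the entire remaining table, as described above.
fn scan_prefix(db: &BTreeMap<Vec<u8>, u32>, prefix: &[u8]) -> Vec<u32> {
    let mut out = Vec::new();
    for (k, v) in db.range(prefix.to_vec()..) {
        if !k.starts_with(prefix) {
            break; // stop at the prefix boundary
        }
        out.push(*v);
    }
    out
}

fn main() {
    let mut db = BTreeMap::new();
    db.insert(vec![1, 0], 10);
    db.insert(vec![1, 1], 11);
    db.insert(vec![2, 0], 20); // different prefix: must not be returned
    assert_eq!(scan_prefix(&db, &[1]), vec![10, 11]);
}
```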
Force-pushed b550c57 to e020e3f
Greptile Summary
This PR batches receipt retrieval into a single read transaction using cursor-based prefix iteration over a new fixed-width 40-byte key format (block_hash (32B) || index (8B big-endian)).
Confidence Score: 3/5 — Mergeable once the receipts.last() invariant is enforced; migration and storage changes are otherwise sound. A P1 logic bug in eth_getTransactionReceipt means a DB integrity issue silently returns the wrong receipt instead of an error. The migration framework itself is well-designed, with crash safety and idempotency, but the RPC layer regression warrants a fix before merge.
crates/networking/rpc/eth/transaction.rs — missing receipt-count assertion after batch fetch.
| Filename | Overview |
|---|---|
| crates/networking/rpc/eth/transaction.rs | eth_getTransactionReceipt now uses receipts.last() relying on fetch-count invariant; silently returns wrong receipt if DB is short — should assert receipt count. |
| crates/networking/rpc/eth/block.rs | Receipt helpers refactored to batch cursor reads; GetRawReceipts still deserialises full block body unnecessarily after get_all_block_receipts dropped the body parameter. |
| crates/storage/migrations.rs | New migration framework: re-keys RECEIPTS table from RLP to raw 40-byte keys. Crash-restart idempotency via 40-byte length skip is sound; batch write-then-delete pattern with atomic temp-file metadata update is correct. |
| crates/storage/store.rs | Key format change to fixed-width 40-byte receipt keys + new cursor-based get_receipts_for_block_from_index; migration orchestration added to Store::new. No invalid key length guard in the iterator. |
| crates/storage/error.rs | Added MigrationFailed error variant and simplified NotFoundDBVersion to a unit variant; clean and correct. |
| cmd/ethrex/initializers.rs | Added MigrationFailed handler and fixed NotFoundDBVersion pattern (unit variant, no { .. }); straightforward. |
| crates/networking/p2p/rlpx/connection/server.rs | Minimal update: adds None for new max_count parameter to keep existing eth/70 partial-receipt behaviour unchanged. |
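The v1→v2 re-keying that the migrations.rs row above describes can be sketched with the RLP decode stubbed out (old keys are modeled here as 32 hash bytes plus one index byte; the real migration decodes an RLP-encoded (BlockHash, u64), and the function names are illustrative):

```rust
/// Stub for decoding an old-format key into (block_hash, index).
/// Stand-in for the real RLP decode: 33 bytes = 32-byte hash + 1 index byte.
fn decode_old_key(key: &[u8]) -> Option<([u8; 32], u64)> {
    if key.len() != 33 {
        return None;
    }
    let mut hash = [0u8; 32];
    hash.copy_from_slice(&key[..32]);
    Some((hash, key[32] as u64))
}

/// Returns the new 40-byte key, or None if the entry is already
/// migrated. Skipping keys that are already 40 bytes is what makes a
/// crash-restarted migration idempotent.
fn migrate_key(old: &[u8]) -> Option<[u8; 40]> {
    if old.len() == 40 {
        return None; // already migrated: skip
    }
    let (hash, index) = decode_old_key(old)?;
    let mut new = [0u8; 40];
    new[..32].copy_from_slice(&hash);
    new[32..].copy_from_slice(&index.to_be_bytes());
    Some(new)
}

fn main() {
    let mut old = vec![0xaa_u8; 32];
    old.push(5); // index 5 in the stubbed old format
    let new = migrate_key(&old).unwrap();
    assert_eq!(&new[32..], &5u64.to_be_bytes());
    // Re-running the migration over migrated keys is a no-op.
    assert!(migrate_key(&new).is_none());
}
```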
Sequence Diagram
```mermaid
sequenceDiagram
participant C as RPC Client
participant T as transaction.rs
participant B as block.rs
participant S as Store
participant DB as RocksDB
C->>T: eth_getTransactionReceipt(tx_hash)
T->>S: get_transaction_location(tx_hash)
S-->>T: (block_number, block_hash, index)
T->>S: get_block_by_hash(block_hash)
S-->>T: Block {header, body}
T->>B: get_all_block_rpc_receipts(header, body, storage, Some(index))
B->>S: get_receipts_for_block_from_index(block_hash, 0, Some(index+1))
S->>DB: prefix_iterator(RECEIPTS, block_hash[32B])
DB-->>S: receipts[0..=index] cursor scan
S-->>B: Vec<Receipt> up to index+1
B->>B: zip(transactions, receipts).take(index+1)
B-->>T: Vec<RpcReceipt> 0..=index
T->>T: receipts.last() <- receipt at index
T-->>C: JSON receipt
```
Comments Outside Diff (1)
crates/networking/rpc/eth/block.rs, lines 266-280: Block body fetched but not used
`get_all_block_receipts` no longer takes a `body` argument, so the `storage.get_block_body(block_number).await?` call now deserializes the entire block body (with all transactions) only to discard it. Only the header is needed. Consider replacing the two-fetch + destructure pattern with a header-only existence check, or at minimum don't deserialise into `body` when only existence matters.
Reviews (1): Last reviewed commit: "fix(storage): break prefix iterator in g..."
```rust
serde_json::to_value(receipts.last())
    .map_err(|error| RpcErr::Internal(error.to_string()))
}
```
Silent wrong receipt when DB is short
receipts.last() only equals the receipt at index if all_receipts contains exactly fetch_count = index + 1 entries. When the DB returns fewer receipts (e.g. incomplete data or a storage bug), the zip in get_all_block_rpc_receipts silently truncates, and receipts.last() returns the receipt for the last available index — not index — with no error raised. The previous code explicitly returned RpcErr::Internal("Could not get receipt") for this case.
You can guard this in the caller:
```rust
if receipts.len() != (index as usize + 1) {
    return Err(RpcErr::Internal(format!(
        "Expected {} receipts, got {}",
        index + 1,
        receipts.len()
    )));
}
serde_json::to_value(receipts.last())
    .map_err(|error| RpcErr::Internal(error.to_string()))
```
```rust
if !k.starts_with(&prefix) {
    break;
}
// Skip entries before start_index (for eth/70 partial requests)
if k.len() == 40 && start_index > 0 {
```
Keys with unexpected lengths are silently decoded as receipts
When k.len() != 40, the start_index filter is skipped, but Receipt::decode(v.as_ref()) is still attempted. Any malformed or non-receipt key that shares the 32-byte prefix (e.g. a future schema extension) will cause a decode error instead of being cleanly skipped. Consider adding an explicit length guard:
```rust
if k.len() != 40 {
    tracing::warn!("Skipping RECEIPTS entry with unexpected key length {}", k.len());
    continue;
}
```
```rust
let mut receipts = Vec::new();
for result in iter {
```
If the receipts are missing for the block, we return [] instead of an error.
- Skip RECEIPTS entries with unexpected key lengths instead of attempting to decode them
- Add receipt count validation in get_all_block_rpc_receipts so missing receipts produce an error instead of a silent empty result
- Add guard in eth_getTransactionReceipt for short receipt lists
ElFantasma left a comment:
LGTM, but left some suggestions
```rust
.await?;

serde_json::to_value(receipts.get(index as usize))
if receipts.len() != (index as usize + 1) {
```
This length check is now belt-and-suspenders — get_all_block_rpc_receipts already returns RpcErr::Internal when all_receipts.len() != fetch_count (block.rs:66), so the inner zip never silently truncates. Can be dropped. (The greptile P1 about receipts.last() being unsafe is mistaken for the same reason — the length guard is upstream.)
```rust
/// Crash safety: if interrupted mid-migration, metadata still says v1,
/// so the migration restarts from scratch on next boot. Keys that fail
/// RLP decode are assumed to be already migrated and are skipped.
fn migrate_1_to_2(backend: &dyn StorageBackend) -> Result<(), StoreError> {
```
Snapshot semantics worth pinning down: this holds one begin_read() iterator while issuing multiple begin_write() batches that delete old keys and insert new ones into the same RECEIPTS CF. RocksDB transactions backed by snapshots will hold the iterator's view stable, but the contract isn't documented at the StorageBackend trait level. If a backend implementation ever returns a non-snapshot iterator, the migration could either re-read freshly-inserted 40-byte keys (idempotent — they hit the len() == 40 skip below) or skip old keys that were just deleted (data loss). Worth either documenting the snapshot requirement on the trait or restructuring to materialize the old-key list before any writes.
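The "materialize the old-key list before any writes" option can be sketched with an ordered map standing in for the RECEIPTS CF (illustrative only; `rekey_all` and the resize stand-in are not the PR's code):

```rust
use std::collections::BTreeMap;

// Sketch of the restructuring suggested above: collect the old keys in
// a read-only pass first, so later deletes/inserts into the same table
// cannot disturb the iteration — no snapshot semantics required.
fn rekey_all(db: &mut BTreeMap<Vec<u8>, u32>) {
    // Phase 1: read-only pass materializes the keys still in the old format.
    let old_keys: Vec<Vec<u8>> = db.keys().filter(|k| k.len() != 40).cloned().collect();
    // Phase 2: writes run against the materialized list only.
    for k in old_keys {
        if let Some(v) = db.remove(&k) {
            let mut new_key = k.clone();
            new_key.resize(40, 0); // stand-in for real key reconstruction
            db.insert(new_key, v);
        }
    }
}

fn main() {
    let mut db = BTreeMap::new();
    db.insert(vec![1, 2, 3], 7); // old-format key
    rekey_all(&mut db);
    assert!(db.keys().all(|k| k.len() == 40));
}
```

The trade-off versus documenting a snapshot requirement on the trait is memory: the materialized key list must fit in RAM (or be chunked), which connects to the memory concern raised later in this thread.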
```rust
const BATCH_SIZE: usize = 10_000;

let txn = backend.begin_read()?;
let iter = txn.prefix_iterator(RECEIPTS, &[])?;
```
No test exercises the migration body itself. migrations_length_matches_schema_version and run_pending_migrations_noop_when_current cover the framework, not v1→v2. A unit test that seeds old-format keys via (BlockHash, u64).encode_to_vec(), runs migrate_1_to_2, and verifies the new keys round-trip via get_receipt/get_receipts_for_block would catch any future regression in the key encoding (e.g., if to_be_bytes ever drifts from the migration's reconstruction).
```rust
continue;
}
// Skip entries before start_index (for eth/70 partial requests)
if start_index > 0 {
```
For start_index > 0 (eth/70 partial path), the prefix iterator starts at the block-hash boundary and skips entries below start_index linearly. A seek to block_hash || start_index.to_be_bytes() would jump straight to the target — minor on small batches but matters for partial requests deep into a block. Follow-up if prefix_iterator doesn't expose a seek-to-key today.
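The suggested seek target can be sketched like this (a sketch with an ordered map standing in for the iterator; `seek_key` is a hypothetical helper, not the PR's API):

```rust
use std::collections::BTreeMap;

/// Build the seek target block_hash || start_index so the cursor can
/// jump straight to the first wanted receipt instead of skipping
/// earlier entries linearly.
fn seek_key(block_hash: &[u8; 32], start_index: u64) -> Vec<u8> {
    let mut key = Vec::with_capacity(40);
    key.extend_from_slice(block_hash);
    key.extend_from_slice(&start_index.to_be_bytes());
    key
}

fn main() {
    let hash = [7u8; 32];
    let mut db = BTreeMap::new();
    for i in 0..5u64 {
        db.insert(seek_key(&hash, i), i);
    }
    // Seeking lands directly on index 3; no linear skip needed.
    let (_, v) = db.range(seek_key(&hash, 3)..).next().unwrap();
    assert_eq!(*v, 3);
}
```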
- Remove redundant receipt length check in eth_getTransactionReceipt (already validated upstream in get_all_block_rpc_receipts)
- Restructure migrate_1_to_2 to materialize old keys before writing, avoiding dependency on snapshot semantics during concurrent read/write
- Add test for migrate_1_to_2 that seeds old RLP keys, runs migration, and verifies new fixed-width keys round-trip correctly
ilitteri left a comment:
We are testing a potential memory usage issue during the migration. Blocking until we know more.
pablodeymo left a comment:
I changed my approval to match my concern about the migration.
Summary
- Single `spawn_blocking` task with one shared read transaction, eliminating per-receipt `spawn_blocking` + transaction setup overhead
- `get_all_block_rpc_receipts` and `get_all_block_receipts` fetch all receipts in one pass instead of one-by-one

Context
Targets `feat/automatic-db-migrations` (#6519) so we can add a DB migration if the final approach requires schema changes (e.g. new meta entries or prefix-based storage layout).

Test plan
`eth_getTransactionReceipt`, `eth_getBlockReceipts`, `eth_getRawReceipts`