diff --git a/docs/entity-db-sidecar-response.md b/docs/entity-db-sidecar-response.md new file mode 100644 index 0000000..2d38203 --- /dev/null +++ b/docs/entity-db-sidecar-response.md @@ -0,0 +1,1148 @@ +# Response: Sidecar + Commitment Model for Arkiv EntityDB + +## Contents + +- [Abstract](#abstract) +- [1. Relationship to the Original Spec](#1-relationship-to-the-original-spec) +- [2. Sidecar Transport and Commitment Model](#2-sidecar-transport-and-commitment-model) +- [3. Data Flow](#3-data-flow) +- [4. Tiered Entry Points](#4-tiered-entry-points) +- [5. Op Struct and Vocabulary](#5-op-struct-and-vocabulary) +- [6. Property Decomposition](#6-property-decomposition) +- [7. Witness Proofs and Challenge Games](#7-witness-proofs-and-challenge-games) +- [8. Reth Integration Points](#8-reth-integration-points) +- [9. Open Questions](#9-open-questions) +- [10. Migration Path](#10-migration-path) +- [11. Summary](#11-summary) + +--- + +> **Status.** This is a design response to the Arkiv EntityDB precompile + ExEx +> specification. It builds on the throughput analysis in +> `payload-commitment-analysis.md` and proposes a refinement of the data +> path. Does not supersede the original spec — most of it is preserved. + +--- + +## Abstract + +The Arkiv EntityDB spec gets the four-component architecture +(contract / precompile / ExEx / DB) right, and the failure semantics are +well-considered. The one bottleneck it leaves untouched is **calldata +cost**: per the throughput analysis, ~99% of `execute()` gas on a +payload-bearing operation is calldata for the payload bytes themselves. A +127 KB CREATE costs ~2.1M gas today, ~5.25M under EIP-7623 floor pricing. +None of that gas pays for state work — it pays for moving bytes through +the EVM that the EVM never reads. + +This document proposes a refinement that decouples payload bytes from the +EVM-visible portion of the transaction. 
Payloads travel as a transport-level +sidecar, attached at RPC ingress and never entering calldata. Calldata +carries only fixed-size commitments (one `bytes32` per payload). The +Arkiv precompile reads sidecar bytes via reth-provided execution-context +data (modelled on EIP-4844's `BLOBHASH` mechanism) and forwards ops plus +bytes to the DB component in a single call. The contract surface gains +three named entry points sharing a single `Op` struct, distinguished by +how payload bytes reach the precompile (sidecar vs calldata vs none). The +ExEx and DB component are unchanged from the original spec. + +The result is a system that delivers three independently useful properties +through three independent mechanisms: + +- **Integrity** — keccak commitments in signed calldata bind the chain to + specific payload bytes; anyone holding the bytes can verify them. +- **Data availability** — provided by an external DA layer (Celestia, + EigenDA, or a DAC), not by calldata. +- **Censorship resistance** — provided by a tiered entry-point design + where one path admits payloads in calldata for forced inclusion via the + L2 inbox. + +A fourth property — **execution correctness** — is left to future work +and not provided by this design alone. The commitment scheme is the +foundation on which any future fraud-proof or validity-proof system would +build. + +Headline outcomes: + +- ~99% calldata gas reduction on the hot path (per the throughput + analysis, ~39k gas vs ~2.1M for a 127 KB CREATE). +- The three properties above can be shipped independently, in any order. +- No fundamental change to the precompile / ExEx / DB factoring of the + original spec; the changes are confined to the contract surface, the + reth ingress path, and the `Op` struct. + +--- + +## 1. Relationship to the Original Spec + +### What this response preserves + +- Four-component architecture: EntityRegistry contract, Arkiv precompile, + Arkiv ExEx, Go DB component. 
+- Three-state precompile result (`ok` / `revert` / `fatal`) and + halt-on-fatal semantics. +- ExEx-driven block lifecycle with deferred `arkiv_stateRoot` submission + in block N+1. +- Reorg handling via PathScheme reverse diffs and PebbleDB journal replay. +- Validator and archive node profiles. +- Two-level proof chain (entity → `arkiv_stateRoot` → L3 stateRoot at + block N+1). +- The DB component's internal storage model: `arkiv_payload`, + `arkiv_longstr`, `arkiv_bm`, `arkiv_attr`, `arkiv_id`, `arkiv_addr`, + `arkiv_pairs`, `arkiv_exp`, `arkiv_root`, `arkiv_journal`. +- Bitmap GC, journal pruning, retention window. +- Responsibility matrix from §11 of the original (with adjustments noted + in §8 of this document). + +### What this response changes + +- `Op` struct: `bytes payload` is removed. Commitments and payload bytes + are decoupled from `Op` entirely and travel as parallel arguments to + the entry points. `Op` carries only operation metadata. +- Contract surface: a single `execute()` becomes three named entry points + (§4) sharing internal dispatch. Hot path supports chunked commitments + per op (`bytes32[][]`); cold path uses a single bytes blob per op + (`bytes[]`). +- Calldata gas cost on the hot path drops by ~99% on payload-bearing ops. +- Precompile interface: a single `validate_and_process_ops` call replaces + the previous `applyOps` shape; sidecar reads become an internal + capability of the precompile rather than a separately exposed surface. +- New transport-level concept: a per-block sidecar bundle staged at RPC + ingress. +- DA story is moved out of the contract surface entirely; it lives at the + DAL/DAC integration layer (§6.2). + +### What remains open and unchanged + +- DB gas charging model (folded vs separate value transfer; original + §12). +- ExEx submission tx delivery mechanism (mempool vs privileged system tx). +- Halt-state recovery semantics on fatal conditions. +- `MAX_ATTRIBUTES` calibration and storage cost analysis. 
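
As a rough sanity check on the ~99% calldata reduction listed under "What this response changes", the inline cost of a payload can be reproduced from the standard calldata prices. The sketch below is a back-of-envelope calculation, not a measurement. It assumes every payload byte is non-zero (worst case), reads "127 KB" as 127 KiB, ignores base transaction cost and non-payload calldata, and takes the ~39k hot-path figure from the throughput analysis as given. The function names are illustrative only.

```python
# Back-of-envelope calldata cost for a payload carried inline in calldata.
# Assumptions (not from the spec): every payload byte is non-zero (worst case),
# and base transaction cost plus non-payload calldata are ignored.

NONZERO_BYTE_GAS = 16        # standard calldata price per non-zero byte
FLOOR_GAS_PER_TOKEN = 10     # EIP-7623 floor price per calldata token
TOKENS_PER_NONZERO_BYTE = 4  # EIP-7623 weight of one non-zero byte
HOT_PATH_GAS = 39_000        # per-CREATE figure quoted from the throughput analysis


def inline_calldata_gas(payload_bytes: int) -> int:
    """Gas spent on the payload bytes at the standard calldata price."""
    return payload_bytes * NONZERO_BYTE_GAS


def inline_floor_gas(payload_bytes: int) -> int:
    """Gas for the same bytes when the EIP-7623 floor applies."""
    return payload_bytes * TOKENS_PER_NONZERO_BYTE * FLOOR_GAS_PER_TOKEN


def saving(inline_gas: int, hot_path_gas: int = HOT_PATH_GAS) -> float:
    """Fraction of gas saved by moving the payload out of calldata."""
    return 1.0 - hot_path_gas / inline_gas


payload = 127 * 1024  # "127 KB" read as 127 KiB (an assumption)
standard = inline_calldata_gas(payload)
floor = inline_floor_gas(payload)
print(f"standard price: {standard:,} gas, saving {saving(standard):.1%}")
print(f"EIP-7623 floor: {floor:,} gas, saving {saving(floor):.1%}")
```

Under these assumptions the inline cost comes to roughly 2.08M gas at the standard price and 5.2M gas under the floor, close to the ~2.1M and ~5.25M figures in the Abstract. The saving against a 39k hot-path call is about 98% and 99% respectively.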
+ +--- + +## 2. Sidecar Transport and Commitment Model + +### 2.1 Stripping at RPC ingress, not at EVM execution + +Payload bytes are removed from the EVM-visible portion of the transaction +at **RPC ingress** — before the transaction enters the mempool or the +block builder. Stripping mid-execution (rewriting calldata in a +pre-execution hook) is rejected here on principle: it breaks the EVM +property that calldata is what the signer signed, and it pushes +determinism complexity into the executor for no benefit. + +A new RPC method `arkiv_exec` accepts the transaction and its sidecar as +separate fields: + +```json +{ + "method": "arkiv_exec", + "params": [{ + "tx": "0x...", + "sidecar": { + "payloads": ["0x...", "0x...", "..."] + } + }] +} +``` + +At ingress, the sequencer's reth node: + +1. Decodes the transaction; recovers the signer. +2. Decodes the operation batch from calldata. +3. For each declared commitment `c[i]`: + - Asserts `keccak256(sidecar.payloads[i]) == c[i]`. + - On mismatch: rejects the RPC call. The transaction is never + enqueued. +4. Stages the sidecar in a block-scoped sidecar bundle, keyed by tx hash. +5. Hands the transaction to the mempool / block builder. + +By the time the transaction reaches EVM execution, the binding between +calldata commitments and sidecar bytes is already established by the RPC +layer. The Arkiv precompile re-verifies as defence in depth (§2.4) but +relies on the ingress check for the authoritative reject path. + +### 2.2 What is signed + +Standard EVM signature semantics are unchanged. The signer signs the +transaction including its calldata, and the calldata contains the +commitments. The sidecar is not part of the signed envelope — but because +the RPC rejects any submission whose sidecar doesn't match the committed +hashes, the signer's signature transitively binds the sidecar bytes via +the commitments. 
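
The binding described above rests on the ingress check from §2.1. A minimal sketch of steps 3 and 4 of that check follows; transaction decoding and signer recovery are omitted. `SidecarBundle`, `IngressRejected`, `ingress`, and `standin_hash` are hypothetical names introduced for illustration. The hash is passed in as a parameter because Python's standard library has no keccak256: the default used here is NIST SHA3-256, which is a different function and serves only as a stand-in. The entry-count check is also an assumption; the text specifies only the per-entry hash comparison.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence

HashFn = Callable[[bytes], bytes]


def standin_hash(data: bytes) -> bytes:
    # Stand-in only: hashlib.sha3_256 is NIST SHA-3, NOT Ethereum's keccak256.
    # A real node would plug in a Keccak-256 implementation here.
    return hashlib.sha3_256(data).digest()


class IngressRejected(Exception):
    """Raised when the RPC call must be rejected before the tx is enqueued."""


@dataclass
class SidecarBundle:
    # Hypothetical block-scoped staging area, keyed by transaction hash.
    staged: Dict[bytes, List[bytes]] = field(default_factory=dict)


def ingress(
    tx_hash: bytes,
    commitments: Sequence[bytes],
    payloads: Sequence[bytes],
    bundle: SidecarBundle,
    hash_fn: HashFn = standin_hash,
) -> None:
    # Assumed check: one sidecar entry per declared commitment.
    if len(payloads) != len(commitments):
        raise IngressRejected("sidecar entry count does not match commitments")
    # Step 3: every sidecar entry must hash to its declared commitment.
    for i, (commitment, payload) in enumerate(zip(commitments, payloads)):
        if hash_fn(payload) != commitment:
            raise IngressRejected(f"commitment mismatch at sidecar entry {i}")
    # Step 4: stage the sidecar only after all entries have been verified.
    bundle.staged[tx_hash] = list(payloads)
```

Because nothing is staged until every entry has been checked, a rejected submission leaves no trace in the bundle.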
+ +This is the same model as EIP-4844: the transaction signs over +`blob_versioned_hashes`; sidecar blobs must match those hashes; nothing +additional is signed. + +### 2.3 Commitment scheme + +For v0: `commitment = keccak256(payload)`. Cheap, no trusted setup, +universally available, sufficient for the integrity properties described +in §6.1. + +For future upgrade: KZG (EIP-4844-style), +`commitment = versioned_hash(KZG_commit(payload))`. Adds: + +- Cheap **partial opens** via the point-evaluation precompile — useful + for fraud-proof games disputing a chunk of a payload without revealing + the full payload on-chain. +- Native compatibility with EIP-4844 blob posting if Arkiv ever lands + payload data on Ethereum L1 directly. +- Path to data-availability sampling if the sequencer set ever + decentralizes. + +The commitment format should be **versioned from day one**. Reserve the +leading byte of each `bytes32` commitment for a scheme tag: + +- `0x00` — keccak256(payload) +- `0x01` — KZG-versioned hash (future) + +This costs nothing today (the commitment is `bytes32` either way, and the +contract treats commitments as opaque) and eliminates a wire-format break +when KZG is added later. The contract does not interpret the version +byte; only RPC ingress and any future challenge mechanism need to. + +### 2.4 Sidecar access from the precompile + +Reth makes the per-transaction sidecar available to the Arkiv precompile +via execution-context data, modelled on EIP-4844's `BLOBHASH` mechanism. +The precompile reads: + +- The commitment for sidecar entry `i` (a `bytes32`). +- The byte length of sidecar entry `i`. +- The total sidecar entry count. +- The bytes themselves for entry `i`, when forwarding to the DB component. + +This is **not exposed as a separate `SidecarReader` precompile**. There +is one consumer (the Arkiv precompile) and no general use case for other +contracts on the L3 to read sidecars. 
Folding sidecar access into the +Arkiv precompile keeps the reth-side surface narrow and avoids a +public-facing read interface that would be meaningless from any other +caller. + +The defence-in-depth check the Arkiv precompile performs (full chunked +form in §5.4): + +```text +sidecarOffset = 0 +for each payload-bearing op[i]: + for j in 0 .. commitments[i].length: + assert sidecar.commitment(sidecarOffset + j) == commitments[i][j] + sidecarOffset += commitments[i].length +assert sidecarOffset == sidecar.count() +``` + +This is one comparison per chunk (commitments are already cached by +reth from ingress; no rehashing needed). It catches bugs in the +ingress validation, the sidecar bundle, or the contract's commitment +forwarding without measurable cost. 
+ +--- + +## 3. Data Flow + +### 3.1 Hot path — `exec_with_commitments` + +```mermaid +sequenceDiagram + autonumber + participant C as Client + participant R as arkiv_exec RPC + participant B as Sidecar Bundle<br/>(reth memory) + participant M as Mempool + participant E as EVM<br/>(EntityRegistry) + participant P as Arkiv Precompile + participant D as DB Component<br/>(Go process) + participant X as Arkiv ExEx + participant L as DA Layer + + Note over C,L: Hot path — exec_with_commitments + + C->>R: arkiv_exec(tx, sidecar) + R->>R: verify keccak(sidecar[i])<br/>== commitments[i] + R->>B: stage(txHash, sidecar) + R->>M: enqueue tx + R-->>C: tx hash + + M->>E: include tx in block N + E->>E: validate ops + commitments,<br/>accumulate changesetHash + E->>P: validate_and_process_ops(ops, commitments) + P->>B: walk ops, read sidecar<br/>entries for each chunk + P->>P: defence-in-depth<br/>per-chunk commitment recheck + P->>D: arkiv_applyOps(ops, payloads, changesetHash) + D->>D: stage per-tx state,<br/>compute db_gas + D-->>P: status=ok, db_gas, payloadHashes + P-->>E: charge db_gas, return + E->>E: emit events + + Note over E,X: Block N sealed + E-->>X: ChainCommitted + X->>D: arkiv_commitBlock(N) + D->>D: flush staging to trie + D-->>X: arkiv_stateRoot_N + X->>M: setArkivStateRoot(N, root) tx + Note over M: lands in block N+1 + + par DA publication (async) + D->>L: publish(payloads, commitments) + L-->>D: ack + end +``` + +Key points: + +- Steps 2–3 happen at the RPC layer; the mempool only ever sees the slim + transaction with commitments. +- Step 9 (sidecar bytes reaching the precompile) does not cross the IPC + boundary into the DB until step 11; the bytes are forwarded as part + of the `arkiv_applyOps` request body. +- Steps 16–20 are the deferred state-root anchoring, identical to the + original spec's ExEx behaviour. +- DA publication runs asynchronously; block sealing does not wait on DA + acknowledgement in v0 (see §9 q1). 
+ +### 3.2 Cold path — `exec_with_payloads` + +```mermaid +sequenceDiagram + autonumber + participant C as Client + participant L2 as L2 Inbox + participant M as Mempool + participant E as EVM + participant P as Arkiv Precompile + participant D as DB Component + participant X as ExEx + + Note over C,X: Cold path — exec_with_payloads (force-included via L2) + + C->>L2: enqueue tx<br/>(calldata = ops + payloads) + L2->>M: forced inclusion after deadline + M->>E: include tx in block N + + E->>E: for each payload-bearing op i,<br/>derive c = keccak(payloads[i]),<br/>enforce size cap + E->>P: validate_and_process_ops(ops, derived commitments, payloads) + P->>D: arkiv_applyOps(ops, payloads, changesetHash) + D-->>P: status, db_gas, payloadHashes + P-->>E: charge db_gas + + E-->>X: ChainCommitted + X->>D: arkiv_commitBlock(N) + D-->>X: arkiv_stateRoot_N + X->>M: setArkivStateRoot tx +``` + +Key differences from the hot path: + +- No sidecar at all: bytes are in calldata. +- No `arkiv_exec` RPC; the transaction can be submitted via standard + paths or via the L2 inbox for forced inclusion. +- The contract performs the commitment check (step 4) before invoking + the precompile; the precompile does not re-verify because the bytes + came from calldata, not a separate channel. 
+ +### 3.3 Metadata-only path + +`exec_metadata_only` is the cold path minus payloads: the contract +rejects any payload-bearing op, then dispatches `Op[]` directly to the +precompile, which forwards to the DB without any payload data. No +diagram is needed: the flow is the cold path with step 4 and any +payload-related work elided. 
+ +--- + +## 4. Tiered Entry Points + +The contract exposes three named entry points. They share validation +logic and dispatch through a single internal `_exec_internal` after +resolving the byte source for each op. + +### 4.1 `exec_with_commitments(Op[] ops, bytes32[][] commitments)` — hot path + +Default path. Expected to carry ~99% of traffic. + +- `commitments[i]` is the chunk list for `ops[i]`. Empty for metadata + ops; one or more entries for payload-bearing ops. +- `commitments.length == ops.length` (dense; see §5.1). +- Each chunk corresponds to one sidecar entry; addressing is implicitly + positional with a running offset (§5.4). +- Bytes live in the sidecar; the precompile reads them via the + reth-provided sidecar context. +- Cheap (~39k gas per CREATE per the throughput analysis), independent + of payload size. +- Fast. 
+- Per-op chunk count capped at `MAX_CHUNKS_PER_OP` (§5.1) — bounds + on-chain validation cost while still admitting multi-MiB entities. +- **Sequencer-cooperation required**: the transaction cannot be + force-included from L2 because the sidecar is not part of the signed + transaction. Forced inclusion via the L2 inbox would deliver only the + calldata, with no sidecar to match against. + +### 4.2 `exec_with_payloads(Op[] ops, bytes[] payloads)` — cold path + +Censorship-resistance escape hatch. + +- `payloads[i]` is the raw byte string for `ops[i]`. Empty bytes for + metadata ops; non-empty for payload-bearing ops. +- `payloads.length == ops.length` (dense; see §5.1). +- **One commitment per op, no chunking.** The contract derives + `c = keccak256(payloads[i])` for each payload-bearing op and + internally constructs `commitments[i] = [c]`. Chunking inline bytes + buys nothing; the bytes are already contiguous in calldata. +- Per-op size capped at `MAX_COLD_PATH_BYTES_PER_OP` (§5.1) — bounds + the worst-case calldata cost on the L2 inbox. +- Expensive (~2M gas per CREATE for 127 KB payload, dominated by + calldata cost — same as today). +- **Self-contained**: the entire transaction can be force-included via + the L2 inbox; no sidecar dependency. + +Large entities (above the cold-path cap) are hot-path only. Forced +inclusion has a natural size ceiling dictated by L2 calldata gas +budgets; this design surfaces the ceiling explicitly rather than +leaving it as a gas-driven implicit limit. + +This path exists specifically for cases where the sequencer is censoring +a user's transaction. It is **not** a data-availability mechanism — DA is +provided by the DAL/DAC integration described in §6.2. The fact that +cold-path bytes also end up in L2 calldata as part of the L3's batch is +incidental and should not be marketed as DA. + +### 4.3 `exec_metadata_only(Op[] ops)` — metadata-only path + +Always cheap, always force-includable. 
+ +- Accepts only `SetAttributeOp`, `DeleteAttributeOp`, `DeleteOp`, + `ExtendOp`, `ChangeOwnerOp`. Reverts if the batch contains any + payload-bearing op. +- No sidecar reads, no payload arguments. +- Lower gas pricing tier possible (chain parameter). +- Useful for tools and integrations that never produce payloads + (housekeeping, batch attribute updates, lifecycle ops), and for + force-including metadata operations during sequencer outages without + paying calldata cost for payload bytes that don't exist. + +### 4.4 Internal dispatch + +The three entry points converge on a single internal call to the +precompile after normalising their inputs to a uniform shape: + +```text +_exec_internal(ops, commitments, payloads_or_null): + // commitments always populated: + // hot path — supplied by caller + // cold path — derived from keccak256(payloads[i]) per op + // metadata — empty inner arrays everywhere + // payloads_or_null: + // hot path — null (precompile reads from sidecar) + // cold path — supplied by caller + // metadata — null + precompile.validate_and_process_ops(ops, commitments, payloads_or_null, changesetHash) +``` + +Consequences: + +- The precompile receives a uniform shape regardless of entry point. +- Mixed batches within a single entry point may include metadata ops + alongside payload-bearing ones (already true in the original spec). +- Adding a future entry point (e.g., one that supports KZG commitments + alongside keccak) does not require changes to internal dispatch. +- Cross-entry-point batches (some via sidecar, some via calldata) are + not supported in v0. The entry point selects the byte source for all + payload-bearing ops in the batch. A genuine mix can use two + transactions. + +### 4.5 Naming + +The external names should communicate the *guarantee*, not the +*mechanism*. Suggested final names (subject to bikeshedding): + +- `exec` — hot path. +- `exec_durable` or `exec_anchored` — cold path. +- `exec_metadata` — metadata-only path. 
+ +`exec_with_payloads` is technically descriptive but invites the +misreading that it is somehow faster because "the payloads are right +there". Users will reach for it for the wrong reason. Naming around the +property (durability / anchoring) avoids this. + +--- + +## 5. Op Struct and Vocabulary + +### 5.1 A single, shared `Op` struct + +All three entry points use the same `Op` struct. **Commitments and +payload bytes are decoupled from the struct entirely** — they travel as +parallel arguments alongside `ops`, indexed positionally. `Op` carries +only operation metadata that every entry point needs. + +```solidity +type LongString is bytes32; // keccak256 commitment to a string ≤256 bytes + +struct Op { + uint8 opType; // CREATE, UPDATE_PAYLOAD, SET_ATTR, ... + bytes32 entityKey; // 20-byte address padded to 32; zero on CREATE + LongString contentType; // commitment to MIME string; full value in DB + Attribute[] attributes; // type-tagged attribute entries + uint64 expiresAt; + address newOwner; // CHANGE_OWNER only +} + +// Hot path: bytes live in sidecar; commitments declared per-op, chunked. +function exec_with_commitments( + Op[] calldata ops, + bytes32[][] calldata commitments // commitments[i] is chunk list for ops[i] +) external; + +// Cold path: bytes live in calldata; one bytes blob per op (no chunking). +function exec_with_payloads( + Op[] calldata ops, + bytes[] calldata payloads // payloads[i] is bytes for ops[i] +) external; + +// Metadata-only: rejects payload-bearing ops; no commitments, no payloads. +function exec_metadata_only(Op[] calldata ops) external; + +// Chain parameters (subject to calibration; see §9). +uint256 constant MAX_CHUNKS_PER_OP = 256; // 32 MiB per entity at 128 KiB chunks +uint256 constant MAX_COLD_PATH_BYTES_PER_OP = 128 * 1024; // 128 KiB; matches natural blob size +``` + +**Why decoupled:** + +- `Op` is shared across all three entry points. 
Putting commitment data + on `Op` would force metadata ops to carry zero fields they never use, + and would make `Op`'s shape vary by path. +- Hot path needs many commitments per op (chunked); cold path needs one + bytes blob per op (unchunked). Parallel arrays match the actual user + input shape. +- On the cold path, commitments are derived, not supplied; putting them + on `Op` would be misleading. +- `entityHash` composition is purely a function of `(metadata, commitments)` + and never of raw bytes (§5.4), so commitments are fundamentally + output-side data rather than input metadata. + +**Dense outer arrays.** Both `commitments` and `payloads` are required +to satisfy `length == ops.length`. Entries for non-payload-bearing ops +are empty (zero-length inner arrays or empty bytes). Calldata overhead is +~64 bytes per empty entry under standard ABI encoding (a 32-byte offset +word plus a 32-byte zero length word), nearly all of them zero bytes, +which is negligible against the indexing simplification. + +**Validation rules** the contract enforces before dispatch: + +```text +for each op[i] in ops: + if isPayloadBearing(op[i].opType): + hot path: require commitments[i].length >= 1 + require commitments[i].length <= MAX_CHUNKS_PER_OP + cold path: require payloads[i].length > 0 + require payloads[i].length <= MAX_COLD_PATH_BYTES_PER_OP + else: + hot path: require commitments[i].length == 0 + cold path: require payloads[i].length == 0 +``` + +### 5.2 Which ops are payload-bearing + +| Op | Payload-bearing | Allowed in `exec_metadata_only` | +|---------------------|:---------------:|:-------------------------------:| +| `CreateOp` | Yes | No | +| `UpdatePayloadOp` | Yes | No | +| `SetAttributeOp` | No (longString values are commitments; full value in DB) | Yes | +| `DeleteAttributeOp` | No | Yes | +| `DeleteOp` | No | Yes | +| `ExtendOp` | No | Yes | +| `ChangeOwnerOp` | No | Yes | + +Only two ops carry payloads. The hot/cold-path distinction therefore +applies to two operations, not the whole vocabulary. Five of seven ops +are unconditionally cheap and force-includable. 
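
The validation rules in §5.1, combined with the payload-bearing table above, can be sketched as three small checks, one per entry point. This is an illustrative model, not the contract code: op types are represented as the names from the table instead of the `uint8 opType` values (which the text does not enumerate), and `Revert`, `validate_hot`, `validate_cold`, and `validate_metadata_only` are hypothetical names.

```python
from typing import Sequence

MAX_CHUNKS_PER_OP = 256
MAX_COLD_PATH_BYTES_PER_OP = 128 * 1024

# Op names follow the §5.2 table; using strings instead of the uint8 opType
# is a simplification made for this sketch.
PAYLOAD_BEARING = frozenset({"CreateOp", "UpdatePayloadOp"})


class Revert(Exception):
    """Stands in for a contract-level revert."""


def is_payload_bearing(op_type: str) -> bool:
    return op_type in PAYLOAD_BEARING


def _require(cond: bool, msg: str) -> None:
    if not cond:
        raise Revert(msg)


def validate_hot(op_types: Sequence[str], commitments: Sequence[Sequence[bytes]]) -> None:
    """Hot path: per-op chunk lists, dense over ops."""
    _require(len(commitments) == len(op_types), "commitments must be dense over ops")
    for i, op in enumerate(op_types):
        n = len(commitments[i])
        if is_payload_bearing(op):
            _require(1 <= n <= MAX_CHUNKS_PER_OP, f"op {i}: chunk count {n} out of range")
        else:
            _require(n == 0, f"op {i}: metadata op must have no commitments")


def validate_cold(op_types: Sequence[str], payloads: Sequence[bytes]) -> None:
    """Cold path: one bytes blob per op, dense over ops."""
    _require(len(payloads) == len(op_types), "payloads must be dense over ops")
    for i, op in enumerate(op_types):
        size = len(payloads[i])
        if is_payload_bearing(op):
            _require(0 < size <= MAX_COLD_PATH_BYTES_PER_OP, f"op {i}: payload size {size} out of range")
        else:
            _require(size == 0, f"op {i}: metadata op must have empty payload")


def validate_metadata_only(op_types: Sequence[str]) -> None:
    """Metadata-only path: any payload-bearing op reverts the batch."""
    for i, op in enumerate(op_types):
        _require(not is_payload_bearing(op), f"op {i}: payload-bearing op not allowed")
```

For example, `validate_hot(["CreateOp", "DeleteOp"], [[b"\x00" * 32], []])` passes, while a `CreateOp` with an empty chunk list raises `Revert`.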
+ +### 5.3 Mixed batches + +A single `exec_with_commitments` call may contain `CreateOp`, +`UpdatePayloadOp`, and any metadata ops in any combination, with the +parallel `commitments` array carrying empty inner arrays for the +metadata ops. Same shape on the cold path with `payloads` carrying +empty bytes for metadata ops. Entry-point boundaries are not crossed +within a single batch. + +### 5.4 Sidecar indexing and `entityHash` composition + +**Sidecar indexing on the hot path.** The sidecar is a flat array of +entries. Each payload-bearing op consumes a contiguous slice sized by +its chunk count. The precompile walks ops with a running offset: + +```text +sidecarOffset = 0 +for each op[i] in ops: + if isPayloadBearing(op[i].opType): + for j in 0 .. commitments[i].length: + assert sidecar.commitment(sidecarOffset + j) == commitments[i][j] + // collect bytes from sidecar.payload(sidecarOffset + j) + sidecarOffset += commitments[i].length + // else: no sidecar interaction + +assert sidecarOffset == sidecar.count() // no extras, no shortfall +``` + +The trailing equality check ensures no sidecar entries are smuggled in +and no declared chunks are missing — the same invariant 4844 enforces +on blob counts. + +**`entityHash` composition.** `entityHash` is a function of +`(metadata, commitments[])`, **never** of raw payload bytes. The same +logical entity created via the hot path (1 chunk, 1 commitment) and the +cold path (single bytes blob → derived 1-element commitment list) +produces the same `entityHash`, because the contract sees the same +commitment sequence in both cases. Cold-path bytes never participate in +the hash directly. + +For multi-chunk hot-path entities, the commitment list is the +authoritative record. Reordering, omitting, or substituting any chunk +produces a different `entityHash`. + +--- + +## 6. Property Decomposition + +Four orthogonal properties, four independent mechanisms. Each can be +reasoned about and shipped independently. 
None of them depend on the +others to deliver their property. + +```text + integrity ← keccak commitments + sidecar match + data availability ← DAL / DAC (Celestia, EigenDA) + censorship resistance ← cold-path entry + L2 inbox + execution correctness ← future: fraud proofs or validity proofs +``` + +### 6.1 Integrity — commitments + +**Provided by:** keccak commitments in signed calldata; sidecar match +enforced at RPC ingress. + +**Gives:** + +- The chain is bound to specific payload bytes. +- Anyone holding the bytes can verify them against the on-chain + commitment. +- Tampering with stored bytes is detectable on re-fetch. +- Equivocation (sequencer serving different bytes to different parties) + produces a publishable proof of misbehaviour. + +**Does not give:** data availability, execution correctness. + +**Status:** deliverable in v0 with the design in §2–§4. + +### 6.2 Data Availability — DAL / DAC + +**Provided by:** external DA-layer integration — Celestia, EigenDA, or a +Data Availability Committee. + +**Mechanism:** + +- Sequencer publishes payload bytes to the DA layer alongside (or as + part of) block production. +- DA-layer commitment is recorded on-chain or referenced by an existing + on-chain commitment. +- Any party can retrieve bytes from the DA layer independent of the + sequencer. + +**Gives:** + +- "Anyone with the data" stops being conditional on sequencer + cooperation. +- Bytes remain retrievable for the DA layer's retention period. +- Integrity proofs (§6.1) become genuinely useful — there is reliable + data to challenge against. + +**Does not give:** execution correctness, censorship resistance. + +**Status:** integration work; not part of the contract surface. Layered +on top of v0 without contract changes. + +### 6.3 Censorship Resistance — cold path + L2 inbox + +**Provided by:** `exec_with_payloads` (and `exec_metadata_only`) plus +the L3's L2-inbox forced-inclusion mechanism. 
+ +**Mechanism:** + +- User submits the transaction to the L2 inbox. +- L2 inbox enqueues the transaction for L3 inclusion. +- L3 protocol enforces inclusion within a deadline; the sequencer cannot + drop it. +- Transaction is self-contained: no sidecar dependency. + +**Gives:** + +- Guaranteed eventual inclusion of payload-bearing operations even when + the sequencer is hostile or unresponsive. +- Operation lands on-chain with correct commitments and reaches the DB + component when the sequencer next produces a block. + +**Does not give:** data availability beyond what DAL/DAC provides; +execution correctness; recovery for entities whose bytes the sequencer +has already served and discarded. + +**Status:** depends on the rollup stack's forced-inclusion +implementation. Contract surface is ready; protocol integration is a +separate workstream. + +### 6.4 Execution Correctness — future + +**Not provided** by this design. + +To deliver execution correctness on top of the commitments, two +ingredients are required: + +- **Reliable data availability** (which DAL/DAC provides) so challengers + can replay disputed blocks. +- **A proving system** — fraud proofs (Optimism / Arbitrum-style) or + validity proofs (ZK) — that takes (prior state root, transaction + batch, payload bytes) and produces (next state root) + deterministically. + +The DB component and the precompile must both be replayable in the +proving environment for fraud proofs to work. This is a substantial +project not contemplated by the original spec or this response. + +The current design does not preclude either system. The commitment +scheme is the data plane on which any future proving system would +build. This is the correct ordering: ship the foundation now; layer +proofs on top later. + +**Status:** future work; out of scope for v0. Mentioned here so the +property decomposition is honest and complete. + +--- + +## 7. 
Witness Proofs and Challenge Games + +The original spec does not use the term "witness proof", but the +discussion that produced this response did. Its scope and meaning need +disambiguation, because the same word commonly refers to two distinct +things. + +### 7.1 Per-tx integrity check + +What the RPC ingress (and, defensively, the precompile) performs: + +```text +assert keccak256(sidecar.payloads[i]) == calldata.commitments[i] +``` + +A single hash comparison. Calling this a "proof" is a stretch — under +keccak the payload bytes are their own witness. For v0 the term should +be avoided here. Use **commitment match** or **integrity check**. + +### 7.2 Challenge artifact + +What would be exchanged in a fraud-proof game (future, §6.4): + +- Challenger asserts "the sequencer's claimed state at block N is wrong". +- Challenger fetches payload bytes from the DA layer. +- Challenger replays the disputed block locally. +- The disputed step is narrowed (interactive bisection, in classical + fraud-proof designs) to a specific operation. +- Resolution requires verifying that the contested bytes are the bytes + the chain committed to. + +Under keccak, this is still just "reveal the bytes; the verifier hashes +them; compare". Simple. + +Under KZG, this is a point-evaluation: the challenger reveals an opening +at a specific point; the on-chain point-evaluation precompile verifies +it. Cheaper for partial disputes; required if the disputed payload is +too large to put on-chain. + +This artifact is what fraud-proof literature would call a **witness +proof**. In v0 there is no challenge game and therefore no witness proof +in this sense. + +### 7.3 Recommendation + +- Reserve **proof** for §7.2 artifacts. +- Use **commitment** or **integrity check** for §7.1. +- Do not include a `witness_proofs[]` field in the per-tx wire format. + Under keccak, payload bytes are sufficient; under KZG (future), proofs + are challenge-time artifacts, not per-tx data. + +--- + +## 8. 
Reth Integration Points + +This section describes the reth-side changes required to deliver the +sidecar transport. Three concerns: ingress, sidecar storage, precompile +behaviour. The ExEx and DB component are unchanged. + +### 8.1 `arkiv_exec` JSON-RPC method + +A new JSON-RPC method on the sequencer's reth node. Accepts +`(tx, sidecar)`. Performs ingress validation (§2.1) and stages the +sidecar. + +This is the canonical entry point for hot-path transactions. Standard +`eth_sendRawTransaction` is rejected for transactions whose calldata +declares non-zero commitments without an attached sidecar (the contract +will revert at execution time when the sidecar reads return zero, but +the RPC layer should reject earlier with a clear error). For +`exec_with_payloads` and `exec_metadata_only`, standard transaction +submission paths work as normal — no sidecar is involved. + +### 8.2 Sidecar bundle (block-scoped) + +Block-scoped storage in reth, keyed by transaction hash. Holds sidecar +bytes from RPC ingress until the block is sealed. + +**Lifetime:** + +- **Created** at `arkiv_exec` ingress, after commitment validation. +- **Referenced** during EVM execution by the Arkiv precompile. +- **Persisted** to DAL/DAC alongside block production (§6.2). +- **Dropped** after block sealing or on transaction failure. + +**Restart behaviour.** If the node restarts between EVM execution and +block sealing, the in-memory sidecar bundle is lost. This is fine +because the block was not sealed — on restart, transactions are +re-executed from the mempool and sidecars must be re-submitted. The +mempool layer should reject transactions whose sidecars are missing. + +**Reorg behaviour.** Sidecars are scoped to a specific block; on reorg, +the sidecars for reverted blocks are no longer needed (the operations +have been discarded by the DB via `arkiv_revert`). 
For the new chain's +blocks, the ExEx already re-reads operation batches from receipts/logs +as in the original spec; sidecar bytes for those operations must be +re-supplied via `arkiv_exec` from the original submitters, or fetched +from the DA layer if already published. This is identical to +mempool-replay semantics for ordinary transactions and should not +require new machinery. + +**Multi-node consideration.** For the current single-sequencer L3, the +sidecar is in the sequencer's memory and no propagation is needed. For +a future multi-node L3, sidecars must propagate alongside transactions +(mempool gossip with an extension field), exactly as 4844 blob +sidecars do today. Out of scope for v0. + +### 8.3 Arkiv precompile + +The Arkiv precompile is the **single bridge** between EVM execution and +the DB component. It also subsumes what would otherwise be a separate +sidecar-reader surface — sidecar access is an internal capability of +this precompile, not exposed to other contracts. + +#### 8.3.1 Entry point + +```text +validate_and_process_ops(input_blob) → return_blob +``` + +`input_blob` is ABI-encoded and conveys: + +- `Op[] ops` — the operation batch (metadata only; no commitments or + bytes inside the struct). +- `bytes32[][] commitments` — chunk list per op, dense over `ops` + (length equal to `ops.length`). Empty inner arrays for non-payload + ops. On the cold path the contract has already populated this from + `keccak256(payloads[i])` per op (single-element inner arrays). +- `bytes[] payloads` — present in cold-path calls, empty in hot-path + and metadata-only calls. Dense over `ops`. When present, the contract + has already verified `keccak256(payloads[i]) == commitments[i][0]` per + payload-bearing op. +- `bytes32 changesetHash` — the contract's accumulated changeset hash, + forwarded for the DB-side integrity recheck (the existing + fatal-on-mismatch tripwire). + +`return_blob` carries: + +- `db_status` ∈ {`ok`, `revert`, `fatal`}. 
+- `db_gas_used` (charged by the precompile to the EVM gas counter). +- `db_revert_reason` (string, present on `revert`/`fatal`). +- `payloadHashes[]` (consistency check; demoted from authoritative — + see §8.3.5). + +#### 8.3.2 Behaviour + +```text +1. Resolve byte source per op + chunk: + sidecarOffset = 0 + for each op[i] in ops: + if not isPayloadBearing(op[i].opType): continue + for j in 0 .. commitments[i].length: + c = commitments[i][j] + if cold path (payloads non-empty): + // single-chunk; bytes already verified by contract + bytes_ij = payloads[i] // j is always 0 on cold path + else (hot path): + assert sidecar.commitment(sidecarOffset) == c // defence in depth + bytes_ij = sidecar.payload(sidecarOffset) + sidecarOffset += 1 + if hot path: + assert sidecarOffset == sidecar.count() // no extras / shortfall + +2. Build IPC request to DB: + arkiv_applyOps(ops, commitments, payload_bytes_per_chunk, changesetHash) + +3. Call DB synchronously (HTTP/JSON-RPC for v0; UDS+Cap'n Proto later). + +4. Inspect response: + - status == fatal → return Err(...) from the block executor + (halts block production; no EVM state committed). + - status == revert → revert(db_revert_reason); discard staging on DB side. + - db_gas_used > remaining → revert(OutOfGas); signal DB to discard staging. + - Otherwise → charge db_gas_used; return ABI-encoded response. +``` + +The three-state model (`ok` / `revert` / `fatal`) and halt-on-fatal +semantics are unchanged from the original spec. + +#### 8.3.3 Sidecar access — internal, not a separate precompile + +Reth exposes the per-tx sidecar to the Arkiv precompile via +execution-context data. The original `SidecarReader` precompile +proposed in earlier drafts is **dropped**: there is one consumer +(the Arkiv precompile), and exposing sidecar reads to arbitrary +contracts on the L3 has no use case. Folding sidecar access into +the Arkiv precompile keeps the reth-side surface narrow. 
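As a rough illustration of the hot-path byte resolution in §8.3.2, run against an execution-context sidecar of the kind described here, the following Python model walks the per-op commitment lists and consumes sidecar entries in order. The names `Sidecar`, `SidecarEntry`, `stage`, and `resolve_hot_path` are hypothetical, and `hashlib.sha3_256` is only a stand-in for keccak256 (the two use different padding, so digests will not match on-chain values).

```python
import hashlib
from dataclasses import dataclass


def commit(data: bytes) -> bytes:
    # Stand-in only: SHA3-256 is NOT keccak256 (different padding).
    return hashlib.sha3_256(data).digest()


@dataclass(frozen=True)
class SidecarEntry:
    commitment: bytes  # commitment declared and validated at RPC ingress
    payload: bytes     # raw bytes that never enter calldata


class Sidecar:
    """Hypothetical model of the per-tx sidecar accessors."""

    def __init__(self, entries):
        self._entries = list(entries)

    def count(self) -> int:
        return len(self._entries)

    def commitment(self, i: int) -> bytes:
        return self._entries[i].commitment

    def payload(self, i: int) -> bytes:
        return self._entries[i].payload


def stage(payloads):
    # Models ingress: compute the commitment once and stage the bytes.
    return Sidecar(SidecarEntry(commit(p), p) for p in payloads)


def resolve_hot_path(commitments, sidecar):
    """commitments is dense over ops; non-payload ops have an empty list."""
    offset = 0
    resolved = []
    for chunk_list in commitments:
        chunks = []
        for expected in chunk_list:
            if offset >= sidecar.count():
                raise ValueError("sidecar shortfall")
            if sidecar.commitment(offset) != expected:
                raise ValueError(f"commitment mismatch at sidecar entry {offset}")
            chunks.append(sidecar.payload(offset))
            offset += 1
        resolved.append(chunks)
    if offset != sidecar.count():
        raise ValueError("sidecar has unconsumed entries")
    return resolved
```

The final count check is what rejects both extra and missing sidecar entries; the per-entry comparison is the defence-in-depth recheck, since ingress has already validated the same commitments.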
+ +The capabilities the precompile uses internally: + +| Operation | Use | +|---|---| +| `sidecar.commitment(i)` | Read declared commitment for sidecar entry `i`. | +| `sidecar.size(i)` | Read byte length of sidecar entry `i`. | +| `sidecar.payload(i)` | Read the bytes themselves (forwarded to DB IPC). | +| `sidecar.count()` | Total entries staged for the current tx. | + +These are not opcodes; they are reth-internal accessors over the +execution-context sidecar bundle. The mechanism mirrors how +`BLOBHASH` is wired in EIP-4844, but the surface stays inside reth. + +#### 8.3.4 Cold-path bytes + +When called from `exec_with_payloads`, the contract has already derived +`commitments[i][0] = keccak256(payloads[i])` and verified the size cap +for each payload-bearing op. The precompile does not re-hash them (the +contract is the authority that already verified). It forwards bytes +directly to the DB component as part of the `arkiv_applyOps` request. + +#### 8.3.5 `payloadHashes[]` demoted + +In the original spec, the DB returned `payloadHashes[]` so the contract +could populate the entity record. Under sidecar transport, the contract +already has the commitment (from calldata). `payloadHashes[]` becomes +redundant for the contract's primary use, but remains useful as a +**consistency check**: the contract MAY assert that the DB-returned +hash matches the calldata commitment, catching any drift in how the DB +interprets bytes vs how the chain committed to them. Retain the field; +demote it from "authoritative" to "consistency check". + +### 8.4 ExEx unchanged + +No changes to the ExEx from the original spec. Same `ChainCommitted` / +`ChainReverted` / `ChainReorged` handlers. Same deferred submission of +the state-root anchor in block N+1. + +### 8.5 DB component largely unchanged + +No structural changes. The internal storage model, bitmap GC, journal +pruning, retention window, validator/archive profiles are all preserved. 
+ +Two minor adjustments: + +- DA-layer publication is a new responsibility (§6.2). Likely a + separate goroutine triggered on `commitBlock`, publishing payloads + asynchronously to the DA layer. Whether to gate block finalization + on DA-layer confirmation is an open question (§9 q1). +- `arkiv_applyOps` request body now conveys payload bytes (forwarded by + the precompile) rather than the contract embedding them; semantically + equivalent, just different upstream source. + +--- + +## 9. Open Questions + +In addition to the open questions in §12 of the original spec: + +1. **DA-layer publication: synchronous or asynchronous with block + sealing?** Synchronous gives stronger DA guarantees but adds latency + to block production. Asynchronous is faster but introduces a window + where state has progressed but payloads are not yet on the DA layer. + Recommendation: asynchronous in v0 with monitoring; revisit if + integrity-proof flows require stronger synchrony. + +2. **Sidecar size limits at RPC ingress.** Per-tx and per-block limits + should be configurable chain parameters and enforced by the RPC + handler, not just by the contract. The throughput analysis suggests + 25 MiB per tx and 128 MiB per block as starting points; these need + calibration against actual sequencer capacity. + +3. **Cold-path gas pricing.** Calldata for payloads is already expensive + under standard EVM gas rules. Should `exec_with_payloads` apply + additional pricing (higher DB gas multiplier) to discourage casual + use, or rely on calldata cost alone? Calldata cost alone seems + sufficient. + +4. **Mempool gossip of sidecars.** Non-issue for a single-sequencer L3. + Becomes a real concern if the L3 ever has multiple sequencers or + block-builders. Defer until decentralization is contemplated. + +5. **KZG migration trigger.** What conditions justify moving from keccak + to KZG? 
Likely candidates: (a) intent to land payloads on Ethereum L1 + as 4844 blobs, (b) implementation of a fraud-proof game that benefits + from cheap partial opens, (c) data-availability sampling for + sequencer redundancy. Any of these is a meaningful milestone; none + are v0 requirements. + +6. **`exec_metadata_only` exposure on L2 inbox.** Should this entry + point be reachable from the L2 inbox by default, or behind a separate + gas-priced path? Probably the former — metadata ops are small and + rare on the inbox path, and ensuring they're always force-includable + is a useful invariant. + +7. **Versioned commitment byte from day one.** The first byte of each + `bytes32` commitment is reserved for scheme version (`0x00` for + keccak v0). Costs nothing today and eliminates a wire-format break + later. Confirm this is acceptable before landing the commitment + scheme. + +8. **Pre-image discovery for replays after retention.** A challenger + replaying a historical block needs the payload bytes that fed into + it. If the entity has expired and the DA-layer retention window has + passed, the bytes may be irretrievable. This bounds the practical + depth at which fraud proofs (future) can operate. Worth being + explicit about, but does not affect the v0 design. + +9. **Hot-path tx that bypasses `arkiv_exec`.** A user submitting a + hot-path transaction via standard `eth_sendRawTransaction` (without + an attached sidecar) will be rejected at the contract level (sidecar + reads return zero, commitment mismatch). The error path should be a + clean revert with a clear reason string, not a fatal — this is user + error, not node misbehaviour. + +10. **`MAX_CHUNKS_PER_OP` calibration.** v0 default is 256 (yielding 32 + MiB max payload per op at 128 KiB chunks). This bounds on-chain + validation cost and the sidecar size of any single op. 
The right + value depends on the largest realistic entity Arkiv intends to + support and the per-op sidecar gas cost in the precompile's + chunk-walk. Calibrate against expected use cases. + +11. **`MAX_COLD_PATH_BYTES_PER_OP` calibration.** v0 default is 128 KiB, + chosen to match the natural blob size and to keep cold-path + transactions within ~2M gas of calldata cost. Larger entities are + hot-path only; this is intentional, but the exact threshold should + be reviewed against the L2 inbox's gas economics and the typical + size of entities users would realistically need to force-include. + +12. **Single-commitment cold path is intentional.** Chunking inline + bytes adds indexing complexity without transport benefit. If a + later need emerges (e.g., partial revelation in a challenge game), + extending the cold path to chunked is a non-breaking change — + `bytes32[][]` could be exposed as a third argument in a future + entry point — but v0 keeps the cold path single-blob. + +--- + +## 10. Migration Path + +Four phases, each independently shippable. Phases 2 and 3 can land in +either order. + +### Phase 0 — Foundation + +The original spec as written. Single `execute()` entry point, payloads +in calldata, no sidecar. Useful as a stepping stone if the precompile + +ExEx + DB plumbing needs to ship before sidecar transport is ready. +Calldata gas cost is high but acceptable for low-throughput integration +testing. + +### Phase 1 — Sidecar (this document) + +- Add the `arkiv_exec` RPC and the block-scoped sidecar bundle. +- Wire reth-internal sidecar access into the Arkiv precompile. +- Remove `bytes payload` from the `Op` struct entirely; commitments and + payload bytes travel as parallel arguments to the entry points. +- Add the three named contract entry points (§4) with chunked + commitments on the hot path and single bytes per op on the cold path. +- Add chain parameters `MAX_CHUNKS_PER_OP` and + `MAX_COLD_PATH_BYTES_PER_OP`. 
+- Update the precompile entry to `validate_and_process_ops` with the + `(ops, commitments, payloads, changesetHash)` shape. + +This is a contract ABI break; coordinate with all clients. Calldata gas +drops by ~99% on the hot path. + +### Phase 2 — DA-layer integration + +- Wire the sequencer's block production to publish payloads to Celestia + or EigenDA. +- Record DA-layer commitments on-chain (or alongside the existing + state-root anchor). +- Expose DA-proof retrieval via the DB component's API. + +No contract ABI changes. + +### Phase 3 — Force-inclusion via L2 inbox + +- Wire `exec_with_payloads` and `exec_metadata_only` to be reachable + from the L2 inbox. +- Confirm the L3 protocol honours the inbox's inclusion deadline. +- Document the user-facing flow for emergency inclusion. + +No contract ABI changes (entry points already exist from Phase 1). + +### Phase 4 — KZG and proving systems (future) + +- Optional. Add KZG-versioned commitment support (commitment-version + byte `0x01`). +- Layer a fraud-proof or validity-proof system on top of the commitment + plane. + +No contract ABI break (versioned commitments enable this). + +--- + +## 11. Summary + +The original Arkiv EntityDB spec gets the four-component factoring right +and handles failure modes carefully. The single thing it leaves on the +table is calldata cost — per the throughput analysis, ~99% of +`execute()` gas on a payload-bearing op is the payload bytes themselves, +paying calldata gas to traverse an EVM that never reads them. + +This response proposes: + +1. **Sidecar transport** with calldata-only commitments. ~99% gas + reduction on the hot path. EIP-4844-shaped. +2. **Three named entry points sharing a single `Op` struct**, with + commitments and payload bytes decoupled from `Op` and travelling as + parallel arguments. Hot path supports chunked commitments per op + (`bytes32[][]`); cold path uses one bytes blob per op (`bytes[]`). +3. 
**A single Arkiv precompile** (`validate_and_process_ops`) that + absorbs sidecar access as an internal capability — no separate + reader surface — and bridges to the DB component over IPC. +4. **Honest property decomposition**: integrity (commitment), DA + (DAL/DAC), censorship resistance (cold path + L2 inbox), execution + correctness (future). Four properties, four mechanisms, no + overclaiming. + +The result is a system that delivers three real and useful properties +in v0 — integrity, DA via DAL/DAC, censorship resistance — without +making promises about execution correctness it cannot keep, and with a +clear path to add proving systems later without breaking the wire +format. + +--- + +## References + +- `payload-commitment-analysis.md` — throughput analysis showing 99% of + `execute()` gas is payload calldata. +- `entity-registry-spec.md` — current EntityRegistry contract surface. +- `exex-jsonrpc-interface.md` — ExEx ↔ DB interface. +- [EIP-4844: Shard Blob Transactions](https://eips.ethereum.org/EIPS/eip-4844) +- [EIP-7623: Increase Calldata Cost](https://eips.ethereum.org/EIPS/eip-7623) diff --git a/docs/payload-commitment-analysis.md b/docs/payload-commitment-analysis.md new file mode 100644 index 0000000..b3767a9 --- /dev/null +++ b/docs/payload-commitment-analysis.md @@ -0,0 +1,1511 @@ +# Payload Commitment Analysis + +## TL;DR + +Payload calldata is 96% of the gas cost for entity operations. This document +proposes removing payload bytes from calldata entirely and replacing them +with cryptographic commitments (32 bytes each) that the sequencer's storage +layer backs. + +**Contract changes**: `bytes payload` in the Operation struct becomes +`bytes32[] payloadCommitments` — an array of per-blob KZG commitments +(one per 128 KiB chunk). The commitment array is hashed into `coreHash` +via EIP-712 array encoding. A ~127 KB CREATE drops from 2.1M gas to ~39k +gas (54x reduction; 135x under EIP-7623 floor pricing). 
+ +**Transaction format**: A new transaction type separates calldata (operation +metadata the EVM executes) from a payload sidecar (raw bytes the sequencer +stores). The contract introspects sidecar metadata via `BLOBHASH`, +`BLOBSIZE`, and `BLOBCOUNT` opcodes — enough to verify commitments and +enforce storage limits without touching the payload bytes. + +**Storage economics**: An admin-controlled storage cap sets the upper bound +on total outstanding byte-blocks. A backpressure pricing curve (EIP-1559- +style exponential) increases storage fees as utilization approaches the cap. +Gas fees (EIP-1559 basefee) and storage fees are independent dimensions — +storage cost is per-operation, not per-transaction, so batching decisions +are driven by atomicity needs rather than cost gaming. + +**Sequencer pipeline**: A write-ahead pattern gates block finality on +confirmed storage writes. Payloads are staged in the storage backend +(pebble, postgres, rocksdb, or any store with atomic batch semantics) +before EVM execution; the block is only sealed after the storage commit +succeeds. If storage fails, the block is discarded and transactions are +re-queued. + +**Syncing and verification**: Other nodes fetch payloads from the +sequencer's API and verify them against on-chain commitments. The +commitment is the contract between the sequencer and every consumer — +any mismatch is provable. State roots and changeset hashes settle to L1. + +--- + +## Problem Statement + +In the current EntityRegistry design, payload bytes are passed as calldata to +`execute()` and hashed inline via `keccak256(payload)` to produce `coreHash`. +The payload is never written to contract storage — it exists only in calldata +and event logs for off-chain indexing. + +The issue: **calldata cost dominates everything else**. 
+ +Measured calldata breakdown for a single CREATE with a ~127 KB payload +(`application/json`, 3 attributes): + +``` +Calldata breakdown (single CREATE, ~127 KB payload): +╭──────────────────────┬──────────┬────────────────┬─────────────╮ +│ Section │ Bytes │ Gas (Standard) │ Gas (Floor) │ +├──────────────────────┼──────────┼────────────────┼─────────────┤ +│ Selector │ 4 │ 64 │ 160 │ +│ ABI framing │ 224 │ 1,160 │ 2,900 │ +│ Operation fields │ 256 │ 1,276 │ 3,190 │ +│ Attributes (3×192B) │ 576 │ 3,228 │ 8,070 │ +│ Payload │ 129,888 │ 2,077,836 │ 5,194,590 │ +├──────────────────────┼──────────┼────────────────┼─────────────┤ +│ Total calldata │ 130,948 │ 2,083,564 │ 5,229,910 │ +│ + Intrinsic (21,000) │ │ 2,104,564 │ 5,250,910 │ +╰──────────────────────┴──────────┴────────────────┴─────────────╯ + +Payload share: 129,888 of 130,948 bytes (99.2%) +Payload gas: 2,077,836 of 2,083,564 (99.7% of standard calldata gas) +``` + +The payload is 99.2% of the calldata bytes and 99.7% of the calldata gas. +Everything else (selector, ABI framing, operation fields, three full +attributes) totals 1,060 bytes and 5,728 gas. The payload alone is +2,077,836 gas at standard rates. + +Under EIP-7623 floor pricing (40 gas per non-zero byte and 10 gas per zero byte for data-heavy +transactions, 2.5x the standard 16 and 4), the same CREATE costs **5.25M gas**, 2.5x the standard +rate, with the payload accounting for 5.19M of that. + +--- + +## Proposal: Commitment-Based Payload Storage + +Replace on-chain payload calldata with an off-chain payload commitment. The +sequencer (as the sole block producer in the application-specific chain) +witnesses the payload data, computes a cryptographic commitment, and makes +that commitment available to the contract. The contract hashes the commitment +into `coreHash` and `entityHash`, preserving the integrity chain. 
+ +``` +Current flow: + Client → execute([...payload bytes...]) → keccak256(payload) → coreHash + +Proposed flow: + Client → submit payload to sequencer → sequencer computes commitment + Client → execute([...commitment only...]) → commitment → coreHash + Sequencer → stores payload on disk, maintains witness proof +``` + +The contract never sees the raw payload. It receives a fixed-size commitment +(32–48 bytes) that cryptographically binds the entity to the exact payload +data the sequencer witnessed. + +--- + +## Commitment Scheme Options + +### Option A: KZG Polynomial Commitment (Blob-Style) + +The sequencer treats each payload as a polynomial over the BLS12-381 scalar +field and computes a KZG commitment. This mirrors EIP-4844's blob handling. + +**How it works:** + +1. Client submits payload to sequencer as a sidecar (not in the transaction). +2. Sequencer encodes payload as a polynomial of up to 4096 field elements + (128 KiB per blob). Larger payloads span multiple blobs. +3. Sequencer computes `commitment = KZG_COMMIT(polynomial, SRS)` — a 48-byte + G1 point on BLS12-381. +4. Sequencer derives `versioned_hash = 0x01 || SHA256(commitment)[1:]` — a + 32-byte value compatible with EIP-4844's `BLOBHASH` opcode. +5. The contract receives `versioned_hash` (via a sequencer-provided opcode or + precompile, analogous to `BLOBHASH`) and hashes it into `coreHash`. +6. Sequencer stores the raw payload and the KZG proof on disk for the + entity's TTL. + +**On-chain cost (same ~127 KB CREATE, payload moved to sidecar):** + +``` +With commitment approach: + Calldata (op metadata + ABI): ~5,728 gas (selector, fields, 3 attrs) + Calldata (1 blob commitment): ~512 gas (32 bytes non-zero) + SSTORE (commitment): ~22,000 gas + Hashing (keccak256): ~600 gas + Execution overhead: ~10,000 gas + ────────── + Total: ~38,840 gas + + vs. current: 2,104,564 gas (54x reduction) + vs. 
current (EIP-7623 floor): 5,250,910 gas (135x reduction) +``` + +**Verification:** A challenger can invoke the point evaluation precompile +(50,000 gas) to verify that a specific data element is consistent with the +committed polynomial. This proves the sequencer's stored data matches the +on-chain commitment without revealing the full payload. + +**Proof characteristics:** + +| Property | Value | +|------------------|-------------------------------------------| +| Commitment size | 48 bytes (G1 point) → 32 bytes versioned hash | +| Proof size | 48 bytes (constant, regardless of payload size) | +| Prover time | ~42ms per 128 KiB blob | +| Verifier time | ~2ms (2 pairings) | +| On-chain verify | 50,000 gas (point evaluation precompile) | +| Trusted setup | Required (Ethereum's ceremony: 141k+ contributors) | +| Post-quantum | No | +| Erasure coding | Native (homomorphic property) | + +**Trade-offs:** + +- (+) Constant proof size regardless of payload — ideal for large payloads +- (+) Native Ethereum support — `BLOBHASH` opcode, point evaluation precompile +- (+) Homomorphic — supports data availability sampling (DAS) +- (+) Well-audited — powers all EIP-4844 blob transactions today +- (-) Requires trusted setup (mitigated by Ethereum's ceremony) +- (-) Not post-quantum secure +- (-) Payload capped at 128 KiB per blob (multiple blobs for larger payloads) +- (-) Sequencer must run BLS12-381 cryptography (c-kzg-4844 library) + +### Option B: Merkle Root Commitment (Hash-Based) + +The sequencer splits the payload into fixed-size chunks, builds a Merkle tree, +and provides the root as the commitment. + +**How it works:** + +1. Client submits payload to sequencer. +2. Sequencer chunks payload into 32-byte leaves (or a configurable chunk size). +3. Sequencer builds a binary Merkle tree and computes the root. +4. The contract receives the 32-byte Merkle root and hashes it into `coreHash`. +5. 
For verification, the sequencer provides Merkle inclusion proofs for + individual chunks. + +**Proof characteristics:** + +| Property | Value | +|------------------|-------------------------------------------| +| Commitment size | 32 bytes (Merkle root) | +| Proof size | O(log n) — ~384 bytes for 4096 chunks | +| Prover time | O(n) hashes — fast for typical payloads | +| Verifier time | O(log n) hashes | +| On-chain verify | ~5,000–15,000 gas (log n keccak256 ops) | +| Trusted setup | None | +| Post-quantum | Yes (hash-function security only) | +| Erasure coding | Not natively supported | + +**Trade-offs:** + +- (+) No trusted setup — hash function only +- (+) Post-quantum secure +- (+) Simplest implementation — standard library in every language +- (+) Cheapest on-chain verification for small proofs +- (+) Well-understood, battle-tested pattern +- (-) Proof size grows with payload size (logarithmic) +- (-) No homomorphic property — no DAS, no proof aggregation +- (-) Proving erasure-coding correctness requires additional machinery +- (-) Less standard in Ethereum's DA ecosystem (Ethereum chose KZG) + +### Option C: FRI-Based Commitment (STARK-Friendly) + +The sequencer encodes the payload as Reed-Solomon evaluations, commits via +Merkle tree, and produces a FRI proximity proof. 
+ +**Proof characteristics:** + +| Property | Value | +|------------------|-------------------------------------------| +| Commitment size | 32 bytes (Merkle root of RS evaluations) | +| Proof size | O(log^2 n) — typically 10s of KB | +| Prover time | O(n log n) | +| Verifier time | O(log^2 n) hashes | +| Trusted setup | None | +| Post-quantum | Yes | +| Erasure coding | Native (FRI is a Reed-Solomon proof) | + +**Trade-offs:** + +- (+) No trusted setup, post-quantum secure +- (+) Natively proves data is a valid RS codeword — ideal for DAS +- (+) Aligned with STARK ecosystem (potential ZK proof composition later) +- (-) Larger proofs than KZG or Merkle (~10–50 KB) +- (-) Higher prover complexity +- (-) No Ethereum-native precompile support +- (-) Less mature tooling for data availability specifically + +### Recommendation + +For the initial implementation: **Option A (KZG)** as the primary scheme, +with **Option B (Merkle)** as a fallback or for environments without +BLS12-381 support. + +Rationale: + +1. The sequencer is an application-specific chain targeting Ethereum + settlement. KZG is the Ethereum-native commitment scheme with precompile + support already deployed. + +2. The payload sizes (up to 128 KiB per blob) align naturally with + EIP-4844's blob structure. For payloads larger than 128 KiB, multiple + commitments can be chained or aggregated. + +3. The constant 48-byte proof size means verification cost is independent + of payload size — a sequencer storing 1 KB and 100 KB payloads has the + same proof overhead. + +4. KZG's homomorphic property provides a natural path to DAS if the + sequencer network grows beyond a single node. + +5. Ethereum's trusted setup ceremony (141,000+ participants) is a public + good that the sequencer inherits for free. 
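The Option B fallback named in the recommendation can be sketched in a few lines. This is a minimal illustration, not the proposed implementation: `hashlib.sha3_256` stands in for keccak256 (different padding, so roots will not match an on-chain keccak tree), odd levels are padded by duplicating the last node, and there is no domain separation between leaves and internal nodes. A production tree would need to address both simplifications. All function names are hypothetical.

```python
import hashlib

CHUNK = 32  # leaf size in bytes, as in Option B step 2


def h(data: bytes) -> bytes:
    # Stand-in only: SHA3-256 is NOT keccak256.
    return hashlib.sha3_256(data).digest()


def leaves(payload: bytes):
    chunks = [payload[i:i + CHUNK] for i in range(0, len(payload), CHUNK)] or [b""]
    return [h(c) for c in chunks]


def _parent_level(level):
    if len(level) % 2:
        level = level + [level[-1]]  # simplification: duplicate last node
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(level):
    while len(level) > 1:
        level = _parent_level(level)
    return level[0]


def merkle_proof(level, index):
    """Sibling hashes from leaf to root for the leaf at `index`."""
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        proof.append(level[index ^ 1])
        level = _parent_level(level)
        index //= 2
    return proof


def verify(root, leaf, index, proof):
    node = leaf
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root
```

For a 128 KiB payload split into 4096 leaves, the proof holds 12 sibling hashes, which is where the ~384-byte figure in the Option B table comes from.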
+ +--- + +## Contract Changes + +### CoreHash Typehash + +The `coreHash` EIP-712 type string changes to replace `bytes payload` with +a commitment array: + +``` +Current: + CoreHash( + bytes32 entityKey, + address creator, + uint32 createdAt, + bytes32[4] contentType, + bytes payload, ← dynamic, hashed as keccak256(payload) + bytes32 attributesHash + ) + +Proposed: + CoreHash( + bytes32 entityKey, + address creator, + uint32 createdAt, + bytes32[4] contentType, + bytes32[] payloadCommitments, ← array of per-blob commitments + bytes32 attributesHash + ) +``` + +Per EIP-712 encoding rules, `bytes32[]` is encoded as +`keccak256(abi.encodePacked(elements))`, the keccak256 of the concatenated +array members. This produces a single 32-byte value in the ABI encoding, +same as `keccak256(payload)` did before. The typehash string changes, so +all existing entity hashes are invalidated (this is a breaking change to +the encoding scheme, not a migration). + +For a single-blob payload, the encoding is +`keccak256(abi.encodePacked(commitment_0))`, that is `keccak256(commitment_0)`: +the hash of a single 32-byte element, not `commitment_0` itself. For multi-blob: +`keccak256(abi.encodePacked(commitment_0, commitment_1, ..., commitment_n))`. 
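A small Python model of the array encoding may help when cross-checking client implementations. It is a sketch under stated assumptions: `hash_commitments` is a hypothetical name, and `hashlib.sha3_256` is used only as a stand-in for keccak256 (the padding differs, so the digests will not equal the contract's). Because every element is exactly 32 bytes, `abi.encodePacked` over a `bytes32[]` reduces to plain concatenation.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Stand-in only: SHA3-256 is NOT keccak256.
    return hashlib.sha3_256(data).digest()


def hash_commitments(commitments) -> bytes:
    """Model of keccak256(abi.encodePacked(payloadCommitments))."""
    for c in commitments:
        if len(c) != 32:
            raise ValueError("each commitment must be exactly 32 bytes")
    # bytes32 elements pack with no padding or length prefix.
    return h(b"".join(commitments))
```

The digest is always 32 bytes regardless of how many blobs the payload spans, and it depends on the order of the commitments as well as their values.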
+ +### Entity.coreHash() + +```solidity +// Current signature: +function coreHash( + bytes32 key, + address creator, + BlockNumber createdAt, + Mime128 calldata contentType, + bytes calldata payload, // ← full payload bytes + Attribute[] calldata attributes +) internal pure returns (bytes32) + +// Proposed signature: +function coreHash( + bytes32 key, + address creator, + BlockNumber createdAt, + Mime128 calldata contentType, + bytes32[] calldata payloadCommitments, // ← per-blob commitment array + Attribute[] calldata attributes +) internal pure returns (bytes32) +``` + +The function body changes from `keccak256(payload)` to EIP-712 array +encoding of the commitments: + +```solidity +// Current: +keccak256(payload) + +// Proposed: +keccak256(abi.encodePacked(payloadCommitments)) +// Commits to the exact sequence and content of all blobs +``` + +### Operation Struct + +```solidity +// Current: +struct Operation { + uint8 operationType; + bytes32 entityKey; + bytes payload; // ← dynamic bytes, unbounded + Mime128 contentType; + Attribute[] attributes; + BlockNumber expiresAt; + address newOwner; +} + +// Proposed: +struct Operation { + uint8 operationType; + bytes32 entityKey; + bytes32[] payloadCommitments; // ← array of 32-byte blob commitments + Mime128 contentType; + Attribute[] attributes; + BlockNumber expiresAt; + address newOwner; +} +``` + +A single payload may span multiple 128 KiB blobs. Rather than a single +commitment over the entire payload (which would require a custom aggregation +scheme), each blob gets its own KZG commitment and the contract receives the +array. This keeps each commitment aligned with the standard 128 KiB blob +structure and avoids inventing a new commitment format for large payloads. + +The `payloadCommitments` array is empty for EXTEND, TRANSFER, DELETE, EXPIRE. +For CREATE/UPDATE, it contains one entry per 128 KiB blob (or partial final +blob) that makes up the payload. 
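The relationship between payload size and the length of `payloadCommitments` can be sketched as follows. This assumes, as the text does, that a blob carries 128 KiB of raw payload; real KZG blob encodings typically pack fewer usable bytes per field element, and that detail is ignored here. `split_into_blobs` and `blob_count` are hypothetical helper names.

```python
BLOB_SIZE = 128 * 1024  # assumption: 128 KiB of raw payload per blob


def blob_count(payload_len: int, blob_size: int = BLOB_SIZE) -> int:
    # Ceiling division: a partial final blob still needs its own commitment.
    return (payload_len + blob_size - 1) // blob_size


def split_into_blobs(payload: bytes, blob_size: int = BLOB_SIZE):
    # Sequential, non-overlapping slices; the last one may be short.
    return [payload[i:i + blob_size] for i in range(0, len(payload), blob_size)]


if __name__ == "__main__":
    payload = bytes(350 * 1024)
    print([len(b) // 1024 for b in split_into_blobs(payload)])
```

An empty payload yields zero blobs, which matches the empty `payloadCommitments` array expected for EXTEND, TRANSFER, DELETE, and EXPIRE.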
+ +### Multi-Blob Chunking + +A payload larger than 128 KiB is split into sequential blobs at the sidecar +level. Each blob is independently committed: + +``` +Payload: 350 KiB + + blob 0: bytes[0..128K) → commitment_0 (full 128 KiB) + blob 1: bytes[128K..256K) → commitment_1 (full 128 KiB) + blob 2: bytes[256K..350K) → commitment_2 (94 KiB, zero-padded to field elements) + +Operation.payloadCommitments = [commitment_0, commitment_1, commitment_2] +``` + +Small payloads (<=128 KiB) produce a single-element array. The contract +doesn't need to know the blob size; it just hashes the commitment array +into `coreHash`. The sequencer handles the chunking and reassembly. + +### Commitment Generation Cost + +KZG commitment generation is the sequencer's prover cost: it runs once per +blob when the transaction is included. The contract never performs this +computation. + +| Payload size | Blobs | Sequential (c-kzg/BLST) | Parallelized (est. 8 cores) | +|--------------|-------|-------------------------|----------------------------| +| 64 KiB | 1 | ~42ms | ~42ms | +| 128 KiB | 1 | ~42ms | ~42ms | +| 512 KiB | 4 | ~168ms | ~50ms | +| 1 MiB | 8 | ~336ms | ~85ms | +| 10 MiB | 80 | ~3.4s | ~430ms | + +Benchmarks: c-kzg-4844 / BLST on AMD Ryzen 9 5950X. Batch verification of +64 blobs across 16 cores takes ~18ms total (~0.28ms/blob effective), +demonstrating near-linear parallelism. + +Commitment generation is the bottleneck only for very large payloads (10+ MiB) on +a single core. With modest parallelism, the table above implies roughly +23 MiB/s of commitment throughput (10 MiB in ~430ms on an estimated 8 cores), +scaling with core count. For typical payload sizes, disk I/O and network ingress are more +likely bottlenecks in practice. + +### Hashing the Commitment Array into coreHash + +The commitment array replaces `keccak256(payload)` in the EIP-712 encoding. 
+The array is hashed per EIP-712 array rules — `keccak256` of the +concatenated elements: + +```solidity +// In Entity.coreHash(): + +// Current: +keccak256(payload) + +// Proposed: +keccak256(abi.encodePacked(payloadCommitments)) +// i.e. keccak256(commitment_0 || commitment_1 || ... || commitment_n) +``` + +This produces a single `bytes32` that commits to the exact sequence of +blobs. Reordering, omitting, or substituting any blob changes the hash. +The encoding is deterministic and cheap — just a keccak256 over +`n * 32 bytes`. + +### Transaction Type: Decoupling Calldata from Payload Data + +A single `execute()` call can batch multiple CREATE and UPDATE operations, +each carrying its own payload. With the current design, all payloads are +packed into calldata — one large blob of bytes. The proposal requires a +clean separation between the two data planes: + +- **Calldata**: Operation metadata the EVM executes against (commitments, + entity keys, attributes, content types). Small, fixed-size per operation. +- **Payload sidecar**: Raw payload bytes the sequencer stores. Arbitrary + size, never enters the EVM. Indexed positionally — payload 0 corresponds + to the first operation that needs one, payload 1 to the second, etc. + +This mirrors EIP-4844's type-3 transaction structure, where blob data travels +alongside but separate from the transaction's calldata. The key difference: +in the Arkiv sequencer, the sidecar protocol is application-defined, not a +consensus-level transaction type. + +``` +Type-3 transaction (EIP-4844 analogy): + + ┌───────────────────────────────────────────────────────────────┐ + │ Transaction envelope │ + │ │ + │ calldata: execute([ │ + │ { CREATE, payloadCommitments: [0xa, 0xb, 0xc], ... } │ ← op 0: 3 blobs + │ { EXTEND, entityKey: 0xdef..., ... } │ ← op 1: no payload + │ { CREATE, payloadCommitments: [0xd], ... 
} │ ← op 2: 1 blob + │ ]) │ + │ │ + │ sidecar: [ │ + │ blob_0: <128 KiB, op 0 chunk 0>, │ + │ blob_1: <128 KiB, op 0 chunk 1>, │ + │ blob_2: <94 KiB padded, op 0 chunk 2>, │ + │ blob_3: <50 KiB padded, op 2 chunk 0>, │ + │ ] │ + └───────────────────────────────────────────────────────────────┘ +``` + +The sidecar is a flat array of blobs, ordered sequentially across +payload-bearing operations. Op 0 declares 3 commitments and consumes +sidecar blobs 0–2. Op 2 declares 1 commitment and consumes sidecar blob 3. +The contract doesn't need to know sidecar indices — it receives the +commitment array per operation and hashes it into `coreHash`. The sequencer +validates that each commitment matches its corresponding sidecar blob before +block inclusion. + +The total blob count for the transaction is the sum of all +`payloadCommitments.length` across payload-bearing operations. The sequencer +enforces per-tx blob limits at the mempool level. + +### Payload Introspection Opcodes + +With multi-blob payloads, the contract needs to verify commitments and +enforce storage limits across all blobs in an operation. The sidecar is a +flat array of blobs — the contract addresses them by global index. + +**Design: three opcodes** + +``` +BLOBHASH(index) → bytes32 commitment at sidecar blob index (or zero) +BLOBSIZE(index) → uint256 byte length of blob at index (or zero) +BLOBCOUNT() → uint256 total number of blobs in this transaction +``` + +These follow the existing `BLOBHASH` precedent from EIP-4844. `BLOBSIZE` +and `BLOBCOUNT` are additions that let the contract do accounting without +the sequencer having to pass sizes in calldata. + +Gas cost: 3 gas each (same as `BLOBHASH` — reads from tx execution context, +no crypto). 
+ +**What this enables in the contract:** + +```solidity +function _create(Operation calldata op, BlockNumber current, uint256 blobStart) + internal returns (bytes32 key, bytes32 entityHash_, uint256 blobEnd) +{ + uint256 blobCount = op.payloadCommitments.length; + if (blobCount == 0) revert EmptyPayload(); + + // Validate each blob commitment against the sidecar + uint256 totalBytes = 0; + for (uint256 i = 0; i < blobCount; i++) { + uint256 idx = blobStart + i; + require(blobhash(idx) == op.payloadCommitments[i], "commitment mismatch"); + totalBytes += blobsize(idx); + } + + // Contract-enforced storage limits + require(totalBytes <= MAX_PAYLOAD_SIZE, "payload too large"); + + // Storage accounting — charge proportional to actual data + _accountStorage(msg.sender, totalBytes, op.expiresAt); + + // ... hash op.payloadCommitments into coreHash + blobEnd = blobStart + blobCount; +} +``` + +The `blobStart` offset is tracked as the contract iterates through +operations, advancing by each operation's blob count: + +```solidity +uint256 blobOffset = 0; +for (uint32 opSeq = 0; opSeq < ops.length; opSeq++) { + if (ops[opSeq].operationType == Entity.CREATE) { + (, , blobOffset) = _create(ops[opSeq], current, blobOffset); + } else if (ops[opSeq].operationType == Entity.UPDATE) { + (, , blobOffset) = _update(ops[opSeq], current, blobOffset); + } else { + require(ops[opSeq].payloadCommitments.length == 0); + _dispatch(ops[opSeq], current); + } +} +// Final check: all sidecar blobs consumed +require(blobOffset == blobcount(), "unconsumed blobs"); +``` + +The trailing `blobcount()` check ensures the sidecar contains exactly the +blobs the operations reference — no extra blobs smuggled in, no blobs left +unaccounted for. 
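The offset walk above can also be mirrored off-chain, for example in sequencer pre-validation. The following is a minimal Python sketch under stated assumptions, not the contract's implementation: `chunk`, `commit`, and `walk_offsets` are hypothetical names, blobs are left unpadded, and SHA-256 stands in for the KZG versioned hash purely so the example runs on the standard library.

```python
import hashlib

BLOB_SIZE = 128 * 1024  # 128 KiB per blob, as in the chunking section


def chunk(payload: bytes) -> list[bytes]:
    """Split a payload into sequential blobs of at most BLOB_SIZE bytes."""
    return [payload[i:i + BLOB_SIZE] for i in range(0, len(payload), BLOB_SIZE)]


def commit(blob: bytes) -> bytes:
    """Stand-in commitment (SHA-256); the real scheme would be a KZG versioned hash."""
    return hashlib.sha256(blob).digest()


def walk_offsets(ops: list[list[bytes]], sidecar: list[bytes]) -> list[tuple[int, int]]:
    """Return the [start, end) sidecar slice consumed by each operation.

    `ops` holds each operation's declared commitment array (empty for
    operations without a payload). Raises ValueError on a commitment
    mismatch, a short sidecar, or leftover blobs.
    """
    offset = 0
    slices = []
    for commitments in ops:
        end = offset + len(commitments)
        if end > len(sidecar):
            raise ValueError("sidecar too short")
        for i, declared in enumerate(commitments):
            if commit(sidecar[offset + i]) != declared:
                raise ValueError("commitment mismatch")
        slices.append((offset, end))
        offset = end
    if offset != len(sidecar):
        raise ValueError("unconsumed blobs")
    return slices


# Batch from the running example: 350 KiB CREATE, EXTEND (no payload), 50 KiB CREATE.
create_0 = chunk(b"\x01" * (350 * 1024))
create_2 = chunk(b"\x02" * (50 * 1024))
sidecar = create_0 + create_2
ops = [[commit(b) for b in create_0], [], [commit(b) for b in create_2]]
slices = walk_offsets(ops, sidecar)
```

Because each operation's slice is derived only from operation order and commitment-array length, the sketch needs no explicit index field, which matches the reasoning given for the contract.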
+ +**Optional: `PAYLOAD_INFO` precompile (richer metadata)** + +If storage pricing needs non-zero byte granularity, a precompile can +return more metadata per blob: + +``` +Address: 0x0B (or sequencer-assigned precompile slot) +Gas cost: 100 +Input: 32 bytes — uint256 blob index +Output: 96 bytes: + [0:32] bytes32 commitment (versioned hash) + [32:64] uint256 totalBytes (blob length in bytes) + [64:96] uint256 nonZeroBytes (count of non-zero bytes) +``` + +This enables weighted storage pricing (dense data costs more) per the +storage accounting model described below. Add only if the simpler opcode +model proves insufficient. + +### Sidecar Blob Indexing + +The sidecar is a flat array of blobs. Each payload-bearing operation +consumes a contiguous slice of that array, sized by its +`payloadCommitments.length`. The contract tracks a running `blobOffset` +as it iterates through operations. + +``` +Sidecar blob layout for a batch: + + ops: [CREATE_0 (3 blobs), EXTEND_1 (0 blobs), CREATE_2 (1 blob)] + sidecar: [blob_0, blob_1, blob_2, blob_3 ] + ╰─── CREATE_0 ───╯ ╰ CREATE_2 ╯ + + blobOffset after op 0: 3 + blobOffset after op 1: 3 (no blobs consumed) + blobOffset after op 2: 4 + blobcount() == 4 ✓ +``` + +No explicit index field needed in the Operation struct — the offset is +deterministic from the operation ordering and each operation's commitment +array length. + +### Commitment Validation Flow + +End-to-end, for a batch with a 350 KiB CREATE, an EXTEND, and a 50 KiB +CREATE: + +``` +Client submits: + calldata: execute([ + CREATE_0: { payloadCommitments: [0xa, 0xb, 0xc], ... } ← 3 blobs (350 KiB) + EXTEND_1: { payloadCommitments: [], ... } ← no payload + CREATE_2: { payloadCommitments: [0xd], ... } ← 1 blob (50 KiB) + ]) + sidecar: [blob_0 (128K), blob_1 (128K), blob_2 (94K), blob_3 (50K)] + +Sequencer pre-validation: + 1. Total declared blobs: 3 + 0 + 1 = 4 + 2. Assert sidecar.length == 4 + 3. 
For each blob: compute KZG commitment, assert matches declared commitment + 4. Reject on any mismatch + +Contract execution: + blobOffset = 0 + + op 0 (CREATE_0, 3 blobs): + for i in 0..3: + assert blobhash(blobOffset + i) == op.payloadCommitments[i] + totalBytes += blobsize(blobOffset + i) + hash payloadCommitments into coreHash + blobOffset += 3 → now 3 + + op 1 (EXTEND_1, 0 blobs): + assert op.payloadCommitments.length == 0 + no sidecar interaction + blobOffset unchanged → still 3 + + op 2 (CREATE_2, 1 blob): + assert blobhash(3) == op.payloadCommitments[0] + totalBytes += blobsize(3) + hash payloadCommitments into coreHash + blobOffset += 1 → now 4 + + assert blobOffset == blobcount() → 4 == 4 ✓ + +Sequencer post-execution: + Store payload for CREATE_0 (reassembled from blobs 0-2) → storage backend + Store payload for CREATE_2 (blob 3) → storage backend +``` + +--- + +## Sequencer Responsibilities + +### Atomicity: Storage Before Finality + +The critical failure mode: the EVM commits a transaction (on-chain +commitment exists) but the storage write fails (payload lost). The entity +is on-chain with no backing data — irrecoverable. Disk I/O failures, +backend unavailability, or write errors are all plausible in production. + +The sequencer is the sole block producer. It controls when a block is +finalized. This means it can — and must — gate finality on confirmed +storage. The block production pipeline must be: + +``` +Block production pipeline: + + 1. RECEIVE Collect transactions + sidecars from mempool + 2. VALIDATE For each payload-bearing op: + - Compute commitment from sidecar + - Assert matches declared commitment + - Reject transaction on mismatch + 3. STAGE Write all payloads + blob proofs to storage (status = 'pending') + - Must be an atomic batch write or within a transaction + - Backend examples: postgres BEGIN/COMMIT, pebble WriteBatch, + rocksdb WriteBatch, or any store with atomic batch semantics + 4. 
EXECUTE Run all transactions in EVM (in-memory state transition) + - If any tx reverts: mark its staged entries for rollback + 5. CONFIRM Commit the storage write + - If commit fails: ABORT — discard entire block + - Do NOT finalize the block + - Re-queue transactions for next block attempt + 6. FINALIZE Seal block header, broadcast to network + - Only reaches here if storage commit succeeded + - On-chain state and storage are consistent +``` + +The invariant: **no block is finalized unless all payload writes are +durable in the storage backend.** The EVM execution (step 4) happens +in-memory — the state transition isn't persisted until the block is sealed +(step 6). If storage fails at step 5, the sequencer discards the +in-memory EVM state and retries with the next block. + +``` +Failure scenarios: + + Storage write fails (step 5): + → Block discarded, transactions re-queued + → No on-chain state change, no data loss + → Client sees transaction not included, resubmits or waits + + EVM execution fails (step 4, individual tx reverts): + → Reverted tx's staged payload entries rolled back + → Other transactions in the block proceed normally + → Standard EVM revert semantics + + Sequencer crashes between steps 5 and 6: + → Storage has the data (committed) + → Block was never finalized (not broadcast) + → On restart: storage has orphaned 'pending' entries + → Cleanup: remove entries with status = 'pending' and no matching block + → Or: promote to 'confirmed' and rebuild block from staged data + + Sequencer crashes after step 6: + → Both storage and chain are consistent + → Normal recovery +``` + +This is a write-ahead pattern: the storage layer is committed before the +execution layer (EVM state) is finalized. The sequencer's advantage over a +general-purpose chain is that it controls both sides — no coordination +protocol needed, just ordering. + +### Storage Backend + +The storage layer needs two properties: + +1. 
**Atomic batch writes**: All payloads for a block must be written + atomically — either all succeed or none do. This is the foundation of + the write-ahead guarantee. + +2. **Keyed reads by entity key**: The API serves payloads by entity key. + Range scans and complex queries are the indexer's job, not the + sequencer's. + +Several backends fit this profile: + +| Backend | Atomic batches | Character | +|-------------|----------------|------------------------------------| +| Postgres | Transactions | Relational, rich query, mature tooling. Natural if the sequencer already uses it for other state. | +| Pebble | WriteBatch | LSM-tree KV store (Go). Used by go-ethereum (geth) for chain state. Fast sequential writes, embedded (no network hop). | +| RocksDB | WriteBatch | LSM-tree KV store (C++). Pebble is a Go rewrite of RocksDB's design. Broader language support. | +| BadgerDB | WriteBatch | LSM-tree with values separated from keys. Good for large values (payload bytes). | +| SQLite | Transactions | Embedded relational. Simpler than postgres, single-writer by nature. | + +The choice depends on the sequencer's implementation language and what +it already uses for EVM state. If the sequencer is built on geth (Go), +pebble is the path of least resistance — it's already a dependency and +the write patterns (keyed by entity key, atomic batches per block) map +directly to pebble's WriteBatch API. If the sequencer needs richer +querying or already runs postgres for other reasons, postgres works. + +The API layer abstracts this — consumers see HTTP endpoints, not the +storage backend. The backend is an implementation detail that can change +without affecting the protocol. 
+ +### Data Model + +Regardless of backend, the sequencer stores two logical collections: + +``` +Data model (conceptual — adapt to backend's key/value or relational model): + + payloads + ├── entity_key bytes (primary key) + ├── payload bytes (reassembled from blobs) + ├── payload_size uint32 + ├── block_number uint32 (block that created/last updated) + ├── expires_at uint32 (TTL bound — storage obligation) + ├── status enum ('pending' | 'confirmed') + └── updated_at uint32 + + payload_blobs + ├── entity_key bytes (references payloads) + ├── blob_index uint16 (position within payload) + ├── commitment bytes (48-byte KZG commitment) + └── proof bytes (48-byte KZG proof) +``` + +For a KV store like pebble, these map to prefixed keys: + +``` +Key: "payload:" → payload bytes + metadata +Key: "blob::" → commitment + proof +Key: "expiry::" → (index for TTL-based cleanup scans) +``` + +For a relational store, they map to tables with the schema above. + +The `status` field tracks write-ahead state: + +- `'pending'`: Written during block staging (step 3). Not yet backed by a + finalized block. +- `'confirmed'`: Block containing this payload was finalized (step 6). + Safe to serve via API. + +On startup recovery, the sequencer scans for `'pending'` entries not backed +by a finalized block and either promotes or deletes them. + +The sequencer owns this data completely. No external writes. The +`expires_at` field is the contract's `expiresAt` — the storage obligation +boundary. On DELETE/EXPIRE: remove the payload and all associated blob +entries. 
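The `status` lifecycle can be illustrated with SQLite, one of the candidate backends listed earlier. This is a hedged sketch rather than the sequencer's actual storage code: the table is a reduced version of `payloads`, the function names (`open_store`, `stage_block`, `confirm_block`, `recover`) are hypothetical, only CREATE is modelled, and recovery simply promotes entries at or below the last finalized block and deletes the rest.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with a simplified `payloads` table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE payloads ("
        " entity_key BLOB PRIMARY KEY,"
        " payload BLOB NOT NULL,"
        " block_number INTEGER NOT NULL,"
        " status TEXT NOT NULL CHECK (status IN ('pending', 'confirmed')))"
    )
    return db


def stage_block(db: sqlite3.Connection, block_number: int, payloads: dict) -> None:
    """STAGE: write every payload for the block in one atomic batch."""
    with db:  # commits on success, rolls back if any insert raises
        db.executemany(
            "INSERT INTO payloads VALUES (?, ?, ?, 'pending')",
            [(key, data, block_number) for key, data in payloads.items()],
        )


def confirm_block(db: sqlite3.Connection, block_number: int) -> None:
    """FINALIZE: promote a finalized block's entries to 'confirmed'."""
    with db:
        db.execute(
            "UPDATE payloads SET status = 'confirmed'"
            " WHERE block_number = ? AND status = 'pending'",
            (block_number,),
        )


def recover(db: sqlite3.Connection, last_finalized_block: int) -> int:
    """Startup recovery: promote entries backed by a finalized block and
    delete the rest. Returns the number of orphaned entries removed."""
    with db:
        db.execute(
            "UPDATE payloads SET status = 'confirmed'"
            " WHERE status = 'pending' AND block_number <= ?",
            (last_finalized_block,),
        )
        cur = db.execute(
            "DELETE FROM payloads WHERE status = 'pending' AND block_number > ?",
            (last_finalized_block,),
        )
    return cur.rowcount
```

A key-value backend would express the same idea with a write batch per block; the point of the sketch is only that staging is all-or-nothing and that `'pending'` rows are never served until promoted.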
+ +### Entity Lifecycle in Storage + +``` +CREATE (block N, expiresAt = block N+1000) + │ + ├── INSERT payload (status = 'pending') + ├── Block finalized → UPDATE status = 'confirmed' + │ + ├── EXTEND (block N+500, new expiresAt = N+2000) + │ UPDATE expires_at = N+2000 + │ Same payload, same blobs — no new writes except metadata + │ + ├── UPDATE (block N+600, new payload + commitments) + │ Replace payload + blob entries + │ (within same atomic batch as block staging) + │ + └── EXPIRE or DELETE + DELETE FROM payloads WHERE entity_key = ... + (cascades to payload_blobs) +``` + +### Integrity Guarantees + +The write-ahead pipeline plus the on-chain commitment create a two-sided +integrity check: + +**1. Storage → chain (sequencer guarantees):** + +The sequencer will not finalize a block unless the storage backend has +confirmed the write. If the sequencer is honest, every on-chain commitment +has backing data in storage. + +**2. Chain → storage (external verification):** + +Any node syncing from the sequencer can fetch the payload via the API, +recompute the commitment, and check it against the on-chain value. If +mismatch, the sequencer is provably serving wrong data. This catches a +dishonest or buggy sequencer — the commitment is the contract between the +sequencer and every consumer of its API. + +**3. Point-in-time verification:** + +Any party can query the API for a payload and its KZG proof. Specific data +elements can be verified via the point evaluation precompile without +retrieving the full payload. + +**4. After expiry:** + +The sequencer has no obligation to store the payload after expiry. But +historical commitments in the changeset hash chain are permanent on-chain — +they prove a specific payload existed at a specific time, even after the +data is deleted from storage. + +### Syncing Nodes + +Other nodes that want a copy of entity data don't need to replay the chain. 
+They query the sequencer's API, which reads from the storage backend: + +``` +GET /entities/:entityKey/payload → raw bytes +GET /entities/:entityKey/proof → witness proof +GET /entities/:entityKey → metadata + commitment +``` + +A syncing node can verify any response against the on-chain commitment. The +API is the interface — the storage backend is an implementation detail. Nodes +building their own query layer (e.g., an indexer with its own schema) fetch +payloads from the API and store them in whatever representation suits their +access patterns. + +``` +┌───────────┐ ┌───────────────────────────────────┐ +│ Client │────▶│ Sequencer │ +│ │ │ │ +│ tx + │ │ EVM ──▶ storage (payloads) │ +│ sidecar │ │ │ │ │ +└───────────┘ │ │ ▼ │ + │ │ API ──▶ syncing nodes │ + │ ▼ │ + │ L1 (state roots, changeset hashes) │ + └───────────────────────────────────┘ +``` + +The simplicity here is intentional. The sequencer is the sole block producer +and the sole writer to storage. It doesn't need consensus with other nodes +about payload storage — it just needs to serve correct data that matches +on-chain commitments. Any node can verify independently. + +--- + +## Impact on DELETE and EXPIRE Operations + +### Current Behavior + +- **DELETE**: Owner removes entity before expiry. Commitment zeroed from storage. + Changeset chain preserves the final entityHash. +- **EXPIRE**: Anyone removes an expired entity. Same storage cleanup. + +### With Payload Commitments + +Since the sequencer controls block production and knows which entities have +expired, it can automate expiry: + +**Option 1: Sequencer-initiated EXPIRE (automated pruning)** + +At the end of each block (or in a periodic maintenance transaction), the +sequencer scans for entities whose `expiresAt <= current block` and +submits EXPIRE operations. This transforms EXPIRE from a user-initiated +housekeeping operation into a sequencer-automated one. + +``` +Block N processing: + 1. 
Execute user transactions (CREATE, UPDATE, EXTEND, TRANSFER, DELETE) + 2. Sequencer appends EXPIRE ops for all newly-expired entities + 3. DELETE FROM payloads WHERE entity_key IN (expired keys) +``` + +The EXPIRE operation remains in the contract — it's just always called by the +sequencer address rather than arbitrary users. The access control doesn't +change (EXPIRE already allows any caller). The storage deletion is the +sequencer's own cleanup — it happens after the EVM execution confirms the +on-chain commitment is zeroed. + +**Option 2: DELETE becomes a soft-delete** + +If the sequencer automates expiry, explicit DELETE becomes less critical for +storage reclamation. DELETE could be reframed as: + +- **Reduce expiresAt to current block** (making the entity immediately + eligible for sequencer-automated EXPIRE) +- Or retain current behavior (immediate commitment zeroing) + +The second option (retain current DELETE) is simpler and doesn't change +semantics. DELETE remains useful for: "I want this gone now, not at expiry." + +**Recommendation:** Keep DELETE as-is. Add sequencer-automated EXPIRE as a +block-level operation. The two serve different purposes: + +- DELETE = owner-initiated immediate removal ("I want this gone") +- EXPIRE = sequencer-automated TTL enforcement ("time's up") + +The sequencer can restrict EXPIRE to its own address if desired (via a +modifier), but the current permissionless design is fine — it just means +the sequencer handles it so users don't have to. + +--- + +## Sequencer as DB-Chain + +### Architecture + +The sequencer is a single-application chain where the EVM handles +commitments and access control, and a storage backend handles data. There's +no distributed storage protocol — the sequencer owns both the chain and the +store. 
+ +``` +┌──────────────────────────────────────────────────────────────┐ +│ Sequencer │ +│ │ +│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ +│ │ Payload │ │ EVM │ │ Storage │ │ API │ │ +│ │ Ingress │──▶│ (entity │──▶│ (payload │──▶│ (query │ │ +│ │ (sidecar)│ │ registry│ │ + proof │ │ + sync) │ │ +│ └──────────┘ │ contract│ │ store) │ └────┬─────┘ │ +│ └────┬─────┘ └──────────┘ │ │ +│ │ │ │ +│ ┌─────▼──────┐ ┌───────▼──────┐ │ +│ │ State root │ │ Syncing │ │ +│ │ + changeset│ │ nodes / │ │ +│ │ hash │ │ indexers │ │ +│ └─────┬──────┘ └──────────────┘ │ +│ │ │ +└───────────────────────┼──────────────────────────────────────┘ + │ + ┌────▼────┐ + │ L1 │ + │ (state │ + │ roots, │ + │ hashes)│ + └─────────┘ +``` + +**EVM**: Runs the EntityRegistry contract. Sees only commitments and +metadata — never payload bytes. Produces the changeset hash chain and +entity commitments. + +**Storage**: Payload bytes, witness proofs, and metadata. The sequencer +writes on CREATE/UPDATE and deletes on DELETE/EXPIRE. Single-writer — +no contention, no replication conflicts. Backend can be an embedded KV +store (pebble, rocksdb, badger), a relational DB (postgres, sqlite), or +anything with atomic batch write semantics. + +**API**: Serves payload data and proofs to syncing nodes and indexers. Reads +from the storage backend. Responses are independently verifiable against +on-chain commitments. + +**L1 settlement**: State roots and changeset hashes posted periodically. +Any party can verify the sequencer's state against L1. + +### Data Flow + +``` +1. Client submits: + - Transaction: execute([{CREATE, payloadCommitment: 0x..., ...}]) + - Sidecar: raw payload bytes + +2. Sequencer validates: + - Compute commitment from sidecar payload + - Verify it matches payloadCommitment in transaction + - Reject on mismatch + +3. Sequencer executes: + - Run transaction in EVM + - Contract hashes payloadCommitment into coreHash + - Emit EntityOperation event + +4. 
Sequencer writes to storage: + - Store payload, commitment, proof keyed by entity key + - On UPDATE: overwrite existing entry + - On DELETE/EXPIRE: remove entry + +5. Sequencer settles to L1: + - Post state root + changeset hash + +6. Syncing nodes: + - Subscribe to events (via RPC or websocket) + - Fetch payloads from sequencer API + - Verify commitment against on-chain value + - Build their own query representation +``` + +### Rollup Security Model + +The changeset hash chain and state roots posted to L1 form the security +basis. The trust model scales with the verification approach: + +**Optimistic (fraud proof):** The sequencer posts state commitments to L1. +During a challenge window, any party with access to the payload data (from +the API or their own copy) can submit a fraud proof showing state divergence. +The on-chain commitment lets the fraud proof reference specific payloads +without re-uploading them. + +**Validity (ZK proof):** The sequencer generates a ZK proof that state +transitions are correct. Payload availability is still needed for indexers +to serve queries, but the proof itself attests to execution correctness +independently. + +**Redundancy (future):** If the single-sequencer trust model is insufficient, +additional nodes can mirror the storage via the API and attest to +data availability. This doesn't require protocol changes — just more readers +of the same API, each verifying against on-chain commitments. + +### Throughput + +With payload bytes out of calldata, the two throughput dimensions decouple: + +| Dimension | Bottleneck | Approximate capacity | +|--------------------|-------------------------------|------------------------------| +| EVM (commitments) | Block gas limit (tunable) | ~770 CREATEs/block at 30M gas | +| Data (payloads) | Network + storage write I/O | 100s of MB/s (SSD-backed) | +| Proof computation | CPU (parallelizable) | ~42ms per 128 KiB blob | + +The gas limit bounds how many entities can be created per block. 
The data +layer bounds how much content can be stored. These are independent — a block +with 100 CREATEs of 1 MiB payloads each costs the same gas as 100 CREATEs +of 1 byte payloads. The storage backend handles the actual data difference. + +**Block gas limit is a tuning parameter, not a fixed constraint.** On an +application-specific chain, the sequencer operator controls the block gas +limit. Since the only contract is the EntityRegistry and payloads are no +longer in calldata, the per-operation gas cost is small and predictable +(~39k gas per CREATE). The operator can scale the gas limit to match the +sequencer's actual execution capacity: + +| Block gas limit | CREATEs/block | CREATEs/s (2s blocks) | Write speed @ 100 KB avg | Write speed @ 1 MiB avg | +|-----------------|---------------|----------------------|--------------------------|-------------------------| +| 30M (Ethereum) | ~770 | ~385/s | ~37.5 MB/s | ~385 MB/s | +| 100M | ~2,560 | ~1,280/s | ~125 MB/s | ~1.25 GB/s | +| 500M | ~12,800 | ~6,400/s | ~625 MB/s | ~6.25 GB/s * | +| 1B | ~25,600 | ~12,800/s | ~1.25 GB/s * | ~12.5 GB/s * | + +`*` = exceeds typical SSD sequential write throughput (~500 MB/s SATA, +~3.5 GB/s NVMe). At these levels storage I/O becomes the bottleneck, +not EVM execution. Actual throughput is min(EVM capacity, disk write +speed, network ingress). + +The practical ceiling is wherever EVM execution time (commitment hashing, +SSTOREs, event emission) exceeds the target block time, or where storage +I/O saturates. With ~39k gas per operation and no payload processing in +the EVM, the execution is lightweight — the gas limit can be pushed +significantly higher than Ethereum L1's 30M without risking block +production delays. At moderate gas limits (30M–100M) the EVM is the +bottleneck. At aggressive limits (500M+) the bottleneck shifts to +storage write throughput and proof computation. 
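The arithmetic behind the table can be reproduced with a few lines of Python. This is an illustrative sketch: the function names are hypothetical, the ~39k gas per CREATE and 2s block time come from the text, and the disk and network figures passed in are placeholders chosen by the caller rather than measurements.

```python
GAS_PER_CREATE = 39_000  # approximate per-CREATE gas quoted in the text
BLOCK_TIME_S = 2         # block time used in the table
MIB = 1024 * 1024


def creates_per_block(gas_limit: int) -> int:
    """How many CREATEs fit in one block at the given gas limit."""
    return gas_limit // GAS_PER_CREATE


def evm_write_rate(gas_limit: int, avg_payload_bytes: int) -> float:
    """Payload bytes per second the EVM side could admit."""
    return creates_per_block(gas_limit) / BLOCK_TIME_S * avg_payload_bytes


def effective_write_rate(
    gas_limit: int, avg_payload_bytes: int, disk_bps: float, network_bps: float
) -> float:
    """Actual throughput is the minimum of EVM capacity, disk, and network."""
    return min(evm_write_rate(gas_limit, avg_payload_bytes), disk_bps, network_bps)
```

For example, `creates_per_block(30_000_000)` gives 769, which the table rounds to ~770, and at a 1B gas limit with 1 MiB payloads the `min` is decided by the disk figure rather than by the EVM term.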
+ +--- + +## Payload Size Limits and Storage Accounting + +Without calldata as a natural constraint, the system needs explicit payload +size governance. With `BLOBSIZE` available to the contract, enforcement +can happen at two layers: + +### Layer 1: Sequencer pre-validation (mempool policy) + +The sequencer rejects transactions before block inclusion if sidecar +payloads exceed policy limits. These are configurable and can be changed +without contract upgrades: + +| Parameter | Suggested value | Rationale | +|-----------------------|-----------------|-----------------------------------| +| Max payload per op | 10 MiB | Single content upload bound | +| Max payload per tx | 25 MiB | Batch upload bound | +| Max payload per block | 128 MiB | Disk write throughput bound | + +### Layer 2: Contract-enforced limits (consensus rules) + +With `BLOBSIZE`, the contract can enforce hard limits that are part of +the state transition function — not just sequencer policy: + +```solidity +uint256 public constant MAX_PAYLOAD_BYTES = 10_485_760; // 10 MiB +uint256 public constant MIN_TTL_BLOCKS = 100; + +function _create(Operation calldata op, BlockNumber current, uint256 sidecarIdx) + internal returns (bytes32, bytes32) +{ + // payloadSize(): assumed helper that sums blobsize() over this op's sidecar blobs + uint256 size = payloadSize(sidecarIdx); + if (size == 0) revert EmptyPayload(); + if (size > MAX_PAYLOAD_BYTES) revert PayloadTooLarge(size, MAX_PAYLOAD_BYTES); + + // TTL-proportional storage obligation check + uint256 ttlBlocks = BlockNumber.unwrap(op.expiresAt) - BlockNumber.unwrap(current); + if (ttlBlocks < MIN_TTL_BLOCKS) revert TTLTooShort(ttlBlocks, MIN_TTL_BLOCKS); + + // ... rest of create +} +``` + +Contract-level enforcement means a compromised or misconfigured sequencer +cannot include transactions that violate size limits — the EVM rejects them +during execution. This is the same guarantee that gas limits provide on +Ethereum L1. 
+ +### Storage Accounting Model + +The sequencer's real cost is disk-space-over-time: storing N bytes for T +blocks. The contract can track this and price accordingly: + +``` +Storage cost ∝ payloadSize × TTL + +For a 1 MiB payload with TTL of 1000 blocks: + storage_units = 1,048,576 bytes × 1000 blocks = 1,048,576,000 byte-blocks +``` + +With payload sizes derived from `BLOBSIZE`, the contract can maintain a running storage ledger: + +```solidity +// Track total outstanding storage obligation +uint256 public totalStorageUnits; + +// Per-entity tracking (optional — stored in commitment or separate mapping) +mapping(bytes32 entityKey => uint256 storageUnits) internal _entityStorage; +``` + +On CREATE: add `size * ttl` to the ledger. +On EXTEND: add `size * (newExpiry - oldExpiry)` (size from stored metadata). +On DELETE/EXPIRE: subtract remaining obligation. + +### Admin-Controlled Storage Cap + +The sequencer operator knows their actual infrastructure costs — disk +capacity, IOPS budget, backup overhead. The contract should reflect this +via a permissioned admin function that sets the upper bound on total +outstanding storage: + +```solidity +address public admin; // sequencer operator + +uint256 public storageCap; // max total byte-blocks outstanding +uint256 public totalStorageUnits; // current outstanding byte-blocks + +function setStorageCap(uint256 newCap) external { + require(msg.sender == admin); + storageCap = newCap; +} +``` + +Every CREATE and EXTEND checks `totalStorageUnits + delta <= storageCap` +before proceeding. This gives the operator a hard ceiling they can adjust +as infrastructure scales up or down. + +### Backpressure Pricing + +A hard cap is a blunt instrument — it's either open or full. 
A more useful +model is a pricing curve that increases cost as utilization approaches the +cap, creating incremental backpressure: + +``` + ▲ storage fee multiplier + │ + │ ╱ + │ ╱ + │ ╱ + │ ╱ + │ ╱ + │ ╱ + 1x │─────╱ + │ + └──────────────────────────▶ utilization + 0% 100% + (storageCap) +``` + +The fee multiplier scales with how full the system is. When utilization is +low, storage is cheap (base rate). As it approaches the cap, the multiplier +increases — reflecting that the sequencer's marginal cost of additional +storage rises as it approaches infrastructure limits. + +**Linear model (simple):** + +``` +multiplier = 1 + (utilization / storageCap) * MAX_PREMIUM + +At 50% full: multiplier = 1 + 0.5 * MAX_PREMIUM +At 90% full: multiplier = 1 + 0.9 * MAX_PREMIUM +``` + +**Exponential model (EIP-4844-style):** + +``` +multiplier = e^(k * utilization / storageCap) + +Where k controls curve steepness. Similar to EIP-4844's blob base fee: +base_fee = MIN_FEE * e^(excess / UPDATE_FRACTION) +``` + +The exponential model is well understood from EIP-4844's blob fee market +(EIP-1559 itself adjusts the basefee by a bounded percentage per block) and +has the right property: gentle at low utilization, aggressive near the cap. +The admin controls the cap; the curve controls the economics within it. + +```solidity +// Storage fee charged on CREATE/EXTEND (in native token or storage credits) +function storageFee(uint256 sizeBytes, uint256 ttlBlocks) public view returns (uint256) { + uint256 units = sizeBytes * ttlBlocks; + uint256 baseFee = units * BASE_RATE_PER_UNIT; + uint256 utilization = totalStorageUnits * SCALE / storageCap; + // Exponential multiplier: baseFee * e^(k * utilization / SCALE) + return baseFee * exp(k * utilization / SCALE) / SCALE; +} +``` + +The admin can also adjust `BASE_RATE_PER_UNIT` to reflect actual +infrastructure costs. This makes storage pricing a direct pass-through of +the sequencer operator's costs to users, with the curve providing market- +based rationing when demand approaches capacity. 
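To compare the two curves numerically, the formulas above can be evaluated directly. The sketch below is an illustration in floating point, assuming hypothetical function names and caller-chosen values for `max_premium`, `k`, and `base_rate`; an on-chain version would use fixed-point arithmetic as in the Solidity snippet.

```python
import math


def linear_multiplier(used: float, cap: float, max_premium: float) -> float:
    """multiplier = 1 + (used / cap) * MAX_PREMIUM"""
    return 1.0 + (used / cap) * max_premium


def exponential_multiplier(used: float, cap: float, k: float) -> float:
    """multiplier = e^(k * used / cap); k controls curve steepness."""
    return math.exp(k * used / cap)


def storage_fee(
    size_bytes: int, ttl_blocks: int, used: float, cap: float, base_rate: float, k: float
) -> float:
    """Fee = byte-blocks * base rate * exponential utilization multiplier."""
    units = size_bytes * ttl_blocks
    return units * base_rate * exponential_multiplier(used, cap, k)
```

With `k = 3.0`, the exponential multiplier is 1.0 on an empty system and rises faster between 50% and 90% utilization than between 0% and 50%, which is the "gentle at low utilization, aggressive near the cap" behavior described above.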
+ +### Interaction with EIP-1559 Transaction Basefee + +The sequencer chain inherits EIP-1559 gas pricing for the EVM execution +layer. This creates two independent fee dimensions: + +``` +Total user cost = gas fee (EIP-1559) + storage fee (utilization curve) + +Where: + gas fee = gas_used × basefee (EVM execution: commitments, SSTOREs) + storage fee = f(payload_size, ttl, util) (sequencer storage obligation) +``` + +**Why two dimensions work better than one:** + +With payload in calldata (current model), there's only one fee dimension: +gas. A user submitting 10 small entities in one transaction pays a single +basefee but high calldata cost. A user submitting 10 transactions pays 10x +the basefee. This creates a perverse incentive to batch everything into +single large transactions. + +With payload commitments, the expensive part (storage) is per-operation, +not per-transaction. The basefee covers EVM execution only — and each +CREATE/EXTEND operation within a batch carries its own storage fee based on +payload size and TTL. Batching multiple operations into one transaction +saves on basefee (one tx overhead instead of many) without gaming storage +costs. + +``` +Per-tx cost comparison: + + Current (payload in calldata): + 1 tx × 3 CREATEs: basefee × 1 + calldata(payload_0 + payload_1 + payload_2) + 3 tx × 1 CREATE: basefee × 3 + calldata(payload_0) + ... 
+ calldata(payload_2) + → Batching saves 2× basefee AND packs calldata more efficiently + → Strong incentive to batch, penalizes small frequent uploads + + Proposed (payload commitments): + 1 tx × 3 CREATEs: basefee × 1 + storage_fee(p0) + storage_fee(p1) + storage_fee(p2) + 3 tx × 1 CREATE: basefee × 3 + storage_fee(p0) + storage_fee(p1) + storage_fee(p2) + → Batching saves 2× basefee only (small — ~21k gas × basefee per tx) + → Storage fees identical either way — per-op, not per-tx + → Users choose batch vs individual based on atomicity needs, not cost gaming +``` + +This separation means the basefee auction operates on a much smaller gas +surface (commitments + SSTOREs + events, ~39k gas per CREATE). The basefee +stays low because the heavy cost (payload storage) has its own fee market. +Block space contention is about operation throughput, not data throughput. + +**Basefee dynamics in an application-specific chain:** + +Because the sequencer is the sole block producer, the EIP-1559 basefee +mechanism behaves differently than on Ethereum L1: + +- No competing applications — all gas is EntityRegistry operations. +- The sequencer controls block gas limit and can tune it to the contract's + actual execution profile. +- Basefee will converge to a level reflecting demand for operation slots, + not demand for generic blockspace. +- During low demand, basefee drops to the protocol minimum. The storage fee + still applies — the sequencer's disk costs don't go to zero when the + chain is quiet. + +The two-fee model means the sequencer can price its real costs (storage is +expensive, compute is cheap) rather than collapsing everything into gas. 
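The batching comparison above can be checked with a small cost model. This is a sketch under assumptions, not the chain's fee logic: `total_cost` is a hypothetical name, the storage fee uses a flat rate per byte-block with no utilization multiplier, and treating the ~39k figure as per-operation gas on top of the 21k per-transaction overhead is an assumption made for illustration.

```python
INTRINSIC_GAS = 21_000  # per-transaction overhead mentioned in the comparison
GAS_PER_CREATE = 39_000  # assumed per-operation gas (see lead-in)


def total_cost(
    num_txs: int, payload_sizes: list[int], ttl_blocks: int, basefee: int, storage_rate: int
) -> tuple[int, int]:
    """Return (gas_fee, storage_fee) for CREATEs spread over `num_txs` transactions.

    Gas is charged per transaction and per operation; storage is charged
    per operation only, as size * TTL * rate.
    """
    gas = num_txs * INTRINSIC_GAS + len(payload_sizes) * GAS_PER_CREATE
    gas_fee = gas * basefee
    storage_fee = sum(size * ttl_blocks * storage_rate for size in payload_sizes)
    return gas_fee, storage_fee


sizes = [100_000, 200_000, 300_000]
batched = total_cost(1, sizes, ttl_blocks=1000, basefee=10, storage_rate=1)
separate = total_cost(3, sizes, ttl_blocks=1000, basefee=10, storage_rate=1)
```

In this model the storage component is identical for `batched` and `separate`, and the gas components differ only by two transactions' worth of intrinsic gas, which is the point the comparison makes.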
+ +### Non-Zero Byte Accounting + +If the full `PAYLOAD_INFO` precompile is available (with `nonZeroBytes`), +the storage fee can weight dense data higher than sparse data: + +``` +Effective size = nonZeroBytes × DENSE_WEIGHT + zeroBytes × SPARSE_WEIGHT + +Where: + DENSE_WEIGHT = 4 (incompressible data, higher disk cost) + SPARSE_WEIGHT = 1 (compressible, lower effective cost) +``` + +This mirrors Ethereum's own calldata pricing (16 gas/non-zero byte vs +4 gas/zero byte) and incentivizes clients to avoid padding payloads with +non-zero filler. The effective size feeds into the storage fee calculation +in place of raw `totalBytes`. + +--- + +## Off-Chain Indexer / Syncing Nodes + +### Current Model + +``` +1. Subscribe to EntityOperation events +2. For CREATE/UPDATE: decode payload from transaction calldata +3. Store entity data in local database +4. Verify local state against changeSetHash() +``` + +### Proposed Model + +``` +1. Subscribe to EntityOperation events +2. For CREATE/UPDATE: fetch payload from sequencer API +3. Verify payload against on-chain payloadCommitment +4. Store entity data in local database (own schema, own representation) +5. Verify local state against changeSetHash() +``` + +The indexer fetches payload from the sequencer's API +rather than decoding it from calldata. This is a simpler dependency — a +single HTTP endpoint — and removes the need for archival node access. + +### Verification + +Any node can independently verify that the sequencer is serving correct data: + +``` +For each CREATE/UPDATE event: + 1. GET /entities/:entityKey/payload from sequencer API + 2. Compute commitment locally from the returned bytes + 3. Compare against on-chain payloadCommitment + 4. If mismatch: sequencer is provably serving wrong data +``` + +This is stronger than the current model. Today, calldata is inherently +correct (the EVM guarantees it). 
With the commitment model, the indexer +actively verifies the sequencer — the commitment is the contract between +the sequencer and every consumer of its API. + +--- + +## Migration Path + +### Phase 1: Add payloadCommitment to Operation struct + +- Add `bytes32 payloadCommitment` field to `Operation` +- Require `payloadCommitment == keccak256(payload)` in CREATE/UPDATE +- Both fields coexist — backwards compatible +- Zero contract risk — just an additional check + +### Phase 2: Sequencer sidecar protocol + +- Sequencer accepts payload via sidecar channel +- Sequencer validates commitment before inclusion +- Clients stop sending payload in calldata +- Contract stops reading `payload` field — uses `payloadCommitment` only + +### Phase 3: Remove payload from calldata + +- Remove `bytes payload` from Operation struct +- Update CORE_HASH_TYPEHASH (breaking change to hash scheme) +- Update all tests and cross-language encoding specs +- Deploy new contract version + +### Phase 4: KZG commitment (optional upgrade) + +- Replace `keccak256(payload)` commitment with KZG versioned hash +- Add point evaluation verification path for disputes +- Enable DAS if sequencer network grows + +--- + +## Open Questions + +1. **Commitment scheme finality**: Should the contract enforce a specific + commitment scheme (e.g., require versioned hash format 0x01...), or + accept any `bytes32` and leave validation to the sequencer? + +2. **Multi-blob payloads**: For payloads exceeding 128 KiB (one KZG blob), + should the contract accept a single aggregated commitment or an array of + per-blob commitments? + +3. **Sequencer rotation**: If the sequencer is replaced, how are payload + storage obligations transferred? The new sequencer needs all payload + data for active entities. + +4. **Redundancy**: Should the system require N-of-M storage attestation + (data availability committee) from launch, or start with a single + sequencer and add redundancy later? + +5. 
**Proof-of-storage**: Should the sequencer periodically prove it still + holds payload data (e.g., respond to random challenges), or is the + initial witness proof sufficient? + +6. **UPDATE semantics**: When an entity is updated, should the old payload + be retained (historical access) or immediately prunable? The old + `coreHash` is replaced — only the changeset chain references the old + entityHash. + +7. **Payload addressing**: Should payloads be content-addressed + (`keccak256(payload)` as the key) or entity-addressed (entity key + + version as the key)? Content addressing enables deduplication but + complicates deletion. + +8. **Gas limit implications**: With payloads out of calldata, what should + the sequencer's block gas limit be? The current Ethereum-default 30M + may be artificially low for an application-specific chain that doesn't + need gas for payload data. + +9. **Precompile vs opcode surface area**: The `PAYLOADHASH` / `PAYLOADSIZE` + opcodes are the minimal viable surface. Should we also expose: + - `PAYLOADCOUNT()` — number of sidecar entries in this tx (for batch + validation without counting manually)? + - Content-type or MIME hash from the sidecar (or is calldata sufficient + for metadata)? + - A `PAYLOAD_SLICE(index, offset, length)` precompile for bounded reads + (e.g., reading just a header prefix without full payload access)? + +10. **Sidecar ordering guarantees**: With implicit indexing, the sidecar + order must match the operation order exactly. If the sequencer reorders + operations for optimization (e.g., grouping by entity key), the sidecar + indices shift. Should the protocol forbid operation reordering, or use + explicit indices to allow it? + +11. **Storage accounting granularity**: Should storage tracking be + per-entity (allowing per-entity quotas and billing) or aggregate-only + (simpler, less state)? 
Per-entity tracking adds one mapping slot per + entity but enables the contract to subtract storage units on + DELETE/EXPIRE precisely. + +12. **Commitment in Commitment struct**: Should the on-chain `Commitment` + store the `payloadSize` alongside `coreHash`? This would let EXTEND + operations adjust storage accounting without a `PAYLOADSIZE` call (the + original size is needed to compute the delta). Adds 32 bytes (one slot) + to the commitment but avoids re-reading sidecar metadata for lifecycle + operations that don't carry a new payload. + +--- + +## References + +- [EIP-4844: Shard Blob Transactions](https://eips.ethereum.org/EIPS/eip-4844) +- [EIP-7594: PeerDAS](https://eips.ethereum.org/EIPS/eip-7594) +- [EIP-7623: Increase Calldata Cost](https://eips.ethereum.org/EIPS/eip-7623) +- [KZG Polynomial Commitments — Dankrad Feist](https://dankradfeist.de/ethereum/2020/06/16/kate-polynomial-commitments.html) +- [c-kzg-4844 Reference Implementation](https://github.com/ethereum/c-kzg-4844) +- [Paradigm: Data Availability Sampling](https://www.paradigm.xyz/2022/08/das) +- [Nomos: FRI-based Commitments for DA](https://blog.nomos.tech/fri-based-commitments-for-data-availability/)