
feat(api): align stats/daily + whale/tx with the legacy indexer contract #60

Merged
satyakwok merged 2 commits into main from feat/stats-parity-phase1 on Jun 8, 2026


@satyakwok satyakwok commented Jun 8, 2026

Why

The explorer frontend consumes a fixed set of read endpoints and should be able to run against indexer-rs as the sole indexer. Three gaps blocked that for /stats/daily and /whale/tx.

Changes

  1. /stats/daily shape — was object-wrapped + empty ({"daily":[]} with day_bucket/block_count). Now a bare array [{date, blocks, transactions}]: calendar date derived from the day_bucket epoch in SQL (to_timestamp(day_bucket*86400) at UTC), numeric counts — matching the legacy indexer the frontend was built against.
  2. MV auto-refresh: stats_daily_mv (migration 0002) had a refresh() helper with no caller, so it stayed empty despite millions of indexed blocks. The indexer now refreshes it on an interval (INDEXER_STATS_REFRESH_SECS, default 300): first tick does a plain REFRESH (Postgres rejects CONCURRENTLY on a never-populated MV), subsequent ticks use CONCURRENTLY so reads aren't blocked.
  3. /whale/tx alias — added alongside /whale/transfers (the legacy path name).

/accounts/active and /contracts/* already share the legacy shape.
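
The date derivation in change 1 happens in SQL, but the same mapping can be sketched in plain Rust to check a bucket by hand. This is a minimal sketch, assuming day_bucket counts whole days since the Unix epoch (as the to_timestamp(day_bucket*86400) expression implies). The function name day_bucket_to_date is hypothetical and not part of indexer-rs; the arithmetic is the well-known days-to-civil-date conversion, not code from this PR.

```rust
/// Convert a day bucket (whole days since 1970-01-01 UTC) into a
/// `YYYY-MM-DD` string. Hypothetical helper for illustration only.
fn day_bucket_to_date(day_bucket: i64) -> String {
    // Shift the epoch to 0000-03-01 so leap days fall at the end of a cycle.
    let z = day_bucket + 719_468;
    let era = z.div_euclid(146_097); // 400-year eras
    let doe = z.rem_euclid(146_097); // day of era, 0..=146_096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // 0..=399
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of (March-based) year
    let mp = (5 * doy + 2) / 153; // March-based month index, 0..=11
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    // January and February belong to the following civil year.
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    format!("{year:04}-{month:02}-{day:02}")
}
```

For example, bucket 0 maps to 1970-01-01 and bucket 20612 maps to 2026-06-08.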

Tests

  • stats/daily serialises as a flat, bare array with legacy field names (no wrapper, no day_bucket).
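
The shape that test pins down can be illustrated without the real handler. The sketch below is a std-only stand-in: the DailyRow struct and to_legacy_json function are hypothetical, and the actual handler presumably relies on its JSON library rather than hand-built strings. It assumes dates are plain YYYY-MM-DD strings, so no escaping is done.

```rust
/// Hypothetical stand-in for the API row; field names follow the PR description.
struct DailyRow {
    date: String,
    blocks: i64,
    transactions: i64,
}

/// Render rows as a bare JSON array (no `{"daily": ...}` wrapper).
/// Assumes `date` needs no escaping, which holds for YYYY-MM-DD values.
fn to_legacy_json(rows: &[DailyRow]) -> String {
    let items: Vec<String> = rows
        .iter()
        .map(|r| {
            format!(
                r#"{{"date":"{}","blocks":{},"transactions":{}}}"#,
                r.date, r.blocks, r.transactions
            )
        })
        .collect();
    format!("[{}]", items.join(","))
}
```

An empty view therefore serialises as `[]` rather than `{"daily":[]}`, which is the difference the frontend cares about.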

Follow-up

Contract leaderboards (/contracts/recent, /contracts/pioneers) return the correct shape but no data yet — contract indexing is a separate change (Phase 2).

Summary by CodeRabbit

  • New Features

    • Added the /whale/tx endpoint as an alias of /whale/transfers (the legacy path name) for whale transfer data
  • API Updates

    • Changed the /stats/daily response structure: it now returns daily statistics as a flat JSON array of objects with date, blocks, and transactions fields, matching the legacy indexer

The explorer frontend consumes a fixed set of read endpoints and can be
pointed at either indexer. Three gaps stopped indexer-rs from serving them:

1. /stats/daily returned an object-wrapped, never-populated payload
   ({"daily":[]} with day_bucket/block_count fields). It now returns a bare
   array [{date, blocks, transactions}] — calendar date derived from the
   day_bucket epoch in SQL (to_timestamp at UTC), numeric counts — matching
   the legacy indexer the frontend was built against.

2. stats_daily_mv was never refreshed (the refresh helper had no caller), so
   it stayed empty despite millions of indexed blocks. The indexer now
   refreshes it on an interval (INDEXER_STATS_REFRESH_SECS, default 300s):
   first tick does a plain REFRESH (CONCURRENTLY is rejected on a
   never-populated MV), subsequent ticks use CONCURRENTLY so reads aren't
   blocked.

3. Added /whale/tx as an alias of /whale/transfers — the legacy path name.

/accounts/active + /contracts/* already share the legacy shape. Contract
leaderboards (recent/pioneers) still need data population — tracked separately.

coderabbitai Bot commented Jun 8, 2026


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 6c53b6d6-a338-4e88-9ab8-6cbee0b3c572

📥 Commits

Reviewing files that changed from the base of the PR and between 8fab28b and 8863d6a.

📒 Files selected for processing (1)
  • scripts/smoke.sh
📝 Walkthrough

Walkthrough

This PR introduces a periodic materialized view refresh and redesigns the daily statistics data contract. The database layer (crates/db/src/stats.rs) restructures StatsDailyRow to expose date, blocks, and transactions instead of legacy bucket/range/gas fields, and adds two refresh variants: refresh_full() for initial population and refresh() for concurrent periodic updates. The API layer (crates/api/src/routes/stats.rs) exposes this redesigned schema as a bare JSON array through /stats/daily. The indexer (bin/indexer.rs) adds configurable refresh scheduling with a background task that runs a full refresh on each tick until one succeeds, then switches to concurrent refresh on subsequent ticks. A minor routing alias (/whale/tx) is added to leaderboards.
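
The refresh scheduling described in this walkthrough boils down to a one-bit state machine. The sketch below is illustrative only: RefreshState and its methods are hypothetical names, and the SQL strings are assumed from standard PostgreSQL syntax rather than copied from crates/db/src/stats.rs. PostgreSQL also requires a unique index on the view for CONCURRENTLY; whether migration 0002 creates one is not shown in this PR.

```rust
/// Tracks whether stats_daily_mv has been populated at least once.
/// Hypothetical type for illustration; not the indexer's actual code.
struct RefreshState {
    populated: bool,
}

impl RefreshState {
    fn new() -> Self {
        Self { populated: false }
    }

    /// Statement to run on the next tick (assumed SQL, standard syntax).
    fn next_sql(&self) -> &'static str {
        if self.populated {
            // Safe once the view holds data; does not block readers.
            "REFRESH MATERIALIZED VIEW CONCURRENTLY stats_daily_mv"
        } else {
            // A never-populated view must be refreshed without CONCURRENTLY.
            "REFRESH MATERIALIZED VIEW stats_daily_mv"
        }
    }

    /// Record the outcome of a tick; only a success flips the flag,
    /// so a failed first refresh is retried in plain mode.
    fn record(&mut self, result: Result<(), String>) {
        if result.is_ok() {
            self.populated = true;
        }
    }
}
```

Keeping the flag unchanged on failure means a database outage at startup does not push the task into CONCURRENTLY mode against a still-empty view.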

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ❓ Inconclusive | The description provides clear context (Why), detailed changes, test information, and follow-ups, but does not follow the repository's required template structure with Scope and Checks sections. | Consider using the repository template structure with Scope checkboxes and Checks validation (forge build, test, fmt, slither, storage) to align with project conventions. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically summarizes the main change: aligning stats/daily and whale/tx endpoints with the legacy indexer contract format. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 90.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/db/src/stats.rs (1)

22-30: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Guard limit against non-positive values before binding.

LIMIT $1 is parameterized, but a non-positive value still misbehaves: PostgreSQL rejects a negative LIMIT with an error ("LIMIT must not be negative"), and LIMIT 0 returns no rows. Clamp to >= 1 before binding.

Proposed fix
 pub async fn daily(pool: &PgPool, limit: i64) -> DbResult<Vec<StatsDailyRow>> {
+    let limit = limit.max(1);
     let rows = sqlx::query(
         "SELECT to_char(to_timestamp(day_bucket * 86400) AT TIME ZONE 'UTC', 'YYYY-MM-DD') AS date, \
                 block_count AS blocks, \
                 tx_count AS transactions \
          FROM stats_daily_mv ORDER BY day_bucket DESC LIMIT $1",
     )
     .bind(limit)

As per coding guidelines, "**/db/**: ... Unbounded SELECTs without LIMIT in user-facing endpoints."
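
The clamp can also be factored into a small helper so it is testable without a database. This is a minimal sketch: sanitize_limit is a hypothetical name, and the MAX_DAILY_LIMIT upper bound is an assumption that goes beyond what the review asks for (the review only requires a lower bound of 1).

```rust
/// Hypothetical upper bound on rows returned by /stats/daily.
/// Not part of the PR; chosen here only for illustration.
const MAX_DAILY_LIMIT: i64 = 365;

/// Force a caller-supplied limit into the range 1..=MAX_DAILY_LIMIT
/// before it is bound to `LIMIT $1`.
fn sanitize_limit(limit: i64) -> i64 {
    limit.clamp(1, MAX_DAILY_LIMIT)
}
```

With this helper, sanitize_limit(-1) and sanitize_limit(0) both yield 1, while an in-range value such as 30 passes through unchanged.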

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/db/src/stats.rs` around lines 22 - 30, the daily function currently
binds the user-provided limit directly into the SQL LIMIT, so non-positive
values reach PostgreSQL unchecked (a negative LIMIT is rejected with an error,
and 0 returns no rows); clamp the limit to a safe minimum (>= 1) before binding
(e.g., compute a sanitized_limit = if limit < 1 { 1 } else { limit } or use
limit.max(1)) and use that sanitized_limit in the .bind(...) call for the query
in daily so the SQL always receives a positive LIMIT.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@bin/indexer.rs`:
- Around line 246-255: Wrap each call to indexer_db::stats::refresh and
indexer_db::stats::refresh_full with a tokio timeout (e.g.,
tokio::time::timeout(Duration::from_secs(...), ...)) so the task doesn’t await a
hanging DB call; on timeout treat it like an Err path (log a warning including
that it timed out) and continue so stats_refresh_handle.await can remain
responsive during shutdown; apply the same change for the other occurrence at
the second refresh call (the refresh_full site) and pick a sensible timeout
constant used by this task.
- Around line 53-54: Validate that IndexerConfig.indexer_stats_refresh_secs is >
0 at startup (fail fast with a clear error if it is 0) instead of passing zero
into tokio::time::interval; use the existing default_stats_refresh_secs for the
default. In the stats refresh task (the loop that creates tick =
tokio::time::interval(Duration::from_secs(cfg.indexer_stats_refresh_secs))) make
the DB refresh calls (indexer_db::stats::refresh and refresh_full)
cancellation-aware by spawning or converting the DB future into a selectable
future and using tokio::select! to race it with cancel.cancelled() (and
optionally a timeout via tokio::time::timeout) so the task returns promptly when
the CancellationToken is triggered; ensure main still awaits
stats_refresh_handle but that the handle will complete quickly on cancel.

In `@crates/api/src/routes/stats.rs`:
- Around line 33-52: The smoke test expectations still assume a wrapped object
(.daily) and the old field name (.daily[0].day_bucket); update scripts/smoke.sh
to assert against the new bare-array response: replace any occurrences of
`.daily` with `.` (root array) and replace `.daily[0].day_bucket` with
`.[0].date` (or the correct DailyRow field name `date`) so the smoke checks
match the handler in crates/api/src/routes/stats.rs that returns
Json<Vec<DailyRow>>.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: be021a99-a8d1-4381-9f9f-a6224b8a6534

📥 Commits

Reviewing files that changed from the base of the PR and between 58bb0c3 and 8fab28b.

📒 Files selected for processing (4)
  • bin/indexer.rs
  • crates/api/src/routes/leaderboards.rs
  • crates/api/src/routes/stats.rs
  • crates/db/src/stats.rs

Comment thread bin/indexer.rs
Comment on lines +53 to +54
#[serde(default = "default_stats_refresh_secs")]
indexer_stats_refresh_secs: u64,

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Does tokio::time::interval allow a zero Duration, or does it panic/error?

💡 Result:

The tokio::time::interval function panics if provided with a zero Duration [1][2][3]. This behavior is explicitly documented in the official Tokio API documentation [1][4]. The implementation enforces this requirement using an assertion, ensuring that the period must be greater than zero [5][6][7]. This also applies to tokio::time::interval_at [4][5].


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="bin/indexer.rs"

echo "== File exists =="
ls -l "$FILE"

echo
echo "== Lines 1-120 =="
nl -ba "$FILE" | sed -n '1,120p'

echo
echo "== Lines 200-310 =="
nl -ba "$FILE" | sed -n '200,310p'

echo
echo "== Find refresh/statistics interval creation =="
rg -n "interval(_at)?|indexer_stats_refresh_secs|stats_refresh_handle|refresh.*db|cancel|shutdown" "$FILE" -S

Repository: Sentriscloud/indexer-rs

Length of output: 231


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="bin/indexer.rs"
[ -f "$FILE" ] || { echo "Missing $FILE"; exit 1; }

echo "== File exists =="
ls -l "$FILE"

echo
echo "== Lines 1-140 =="
awk 'NR>=1 && NR<=140 {printf "%d:%s\n", NR, $0}' "$FILE"

echo
echo "== Lines 200-320 =="
awk 'NR>=200 && NR<=320 {printf "%d:%s\n", NR, $0}' "$FILE"

echo
echo "== Find refresh/statistics interval creation (and DB refresh calls) =="
rg -n "interval(_at)?|indexer_stats_refresh_secs|stats_refresh_handle|refresh.*db|cancel|shutdown|SELECT|UPDATE|INSERT" "$FILE" -S

Repository: Sentriscloud/indexer-rs

Length of output: 11879


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find where indexer_db::stats::refresh and refresh_full live
rg -n "mod stats|pub mod stats|fn refresh_full|fn refresh\\b|refresh_full\\(" -S . -g'*.rs' --hidden || true

echo
echo "== Candidate refresh implementations =="
rg -n "refresh_full\\s*\\(|refresh\\s*\\(" -S . -g'*.rs' --hidden | head -n 80

echo
echo "== Show indexer_db stats module if present =="
fd -t f "stats.rs" . || true
fd -t f "stats*" . || true

# Print likely files
for f in $(fd -t f "stats.rs" . 2>/dev/null || true); do
  echo
  echo "----- $f (first 220 lines) -----"
  awk 'NR>=1 && NR<=220 {print NR ":" $0}' "$f"
done

Repository: Sentriscloud/indexer-rs

Length of output: 6702


Guard INDEXER_STATS_REFRESH_SECS > 0 and make stats refresh shutdown-cancel responsive

  • cfg.indexer_stats_refresh_secs == 0 will panic when constructing tokio::time::interval(Duration::from_secs(cfg.indexer_stats_refresh_secs)); fail fast on startup.
Proposed fix
     let cfg: IndexerConfig = Figment::new().merge(Env::raw()).extract()?;
+    if cfg.indexer_stats_refresh_secs == 0 {
+        anyhow::bail!("INDEXER_STATS_REFRESH_SECS must be >= 1");
+    }
  • Shutdown can block: after a tick fires, the refresh task awaits indexer_db::stats::refresh(_full) directly (no timeout) and doesn’t observe cancel.cancelled() until that DB await completes; main then awaits stats_refresh_handle after cancel.cancel().
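
The fail-fast check from the first bullet can be isolated into a pure function that is easy to unit test. A minimal sketch, assuming the validated value is later handed to the interval constructor; refresh_period is a hypothetical name, and the String error type stands in for whatever error type the binary actually uses.

```rust
use std::time::Duration;

/// Validate INDEXER_STATS_REFRESH_SECS before building an interval from it.
/// Returns an error for 0, which would otherwise panic at interval creation.
fn refresh_period(secs: u64) -> Result<Duration, String> {
    if secs == 0 {
        return Err("INDEXER_STATS_REFRESH_SECS must be >= 1".to_string());
    }
    Ok(Duration::from_secs(secs))
}
```

Calling this once at startup turns a misconfiguration into a readable error instead of a panic inside the background task.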
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@bin/indexer.rs` around lines 53 - 54, Validate that
IndexerConfig.indexer_stats_refresh_secs is > 0 at startup (fail fast with a
clear error if it is 0) instead of passing zero into tokio::time::interval; use
the existing default_stats_refresh_secs for the default. In the stats refresh
task (the loop that creates tick =
tokio::time::interval(Duration::from_secs(cfg.indexer_stats_refresh_secs))) make
the DB refresh calls (indexer_db::stats::refresh and refresh_full)
cancellation-aware by spawning or converting the DB future into a selectable
future and using tokio::select! to race it with cancel.cancelled() (and
optionally a timeout via tokio::time::timeout) so the task returns promptly when
the CancellationToken is triggered; ensure main still awaits
stats_refresh_handle but that the handle will complete quickly on cancel.

Comment thread bin/indexer.rs
Comment on lines +246 to +255
let res = if populated {
indexer_db::stats::refresh(&pool).await
} else {
indexer_db::stats::refresh_full(&pool).await
};
match res {
Ok(()) => populated = true,
Err(e) => {
tracing::warn!(error = %e, "stats_daily_mv refresh failed");
}

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add a timeout around each MV refresh call to keep shutdown responsive.

The refresh future is awaited directly; if the DB call hangs, this task cannot observe cancellation and stats_refresh_handle.await can block shutdown indefinitely.

Proposed fix
                     _ = tick.tick() => {
-                        let res = if populated {
-                            indexer_db::stats::refresh(&pool).await
-                        } else {
-                            indexer_db::stats::refresh_full(&pool).await
-                        };
+                        let res = tokio::time::timeout(Duration::from_secs(30), async {
+                            if populated {
+                                indexer_db::stats::refresh(&pool).await
+                            } else {
+                                indexer_db::stats::refresh_full(&pool).await
+                            }
+                        }).await;
                         match res {
-                            Ok(()) => populated = true,
-                            Err(e) => {
+                            Ok(Ok(())) => populated = true,
+                            Ok(Err(e)) => {
                                 tracing::warn!(error = %e, "stats_daily_mv refresh failed");
                             }
+                            Err(_) => {
+                                tracing::warn!("stats_daily_mv refresh timed out");
+                            }
                         }
                     }

Also applies to: 267-267
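
The three-way outcome in the proposed fix (success, refresh error, timeout) can be tried out without an async runtime. The sketch below swaps tokio::time::timeout for a std thread plus mpsc::Receiver::recv_timeout, so it is an analogue of the pattern rather than the code in bin/indexer.rs. RefreshOutcome and run_with_timeout are hypothetical names. One difference to keep in mind: here the worker thread keeps running after the deadline, so only the wait is bounded.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Outcome of one bounded refresh attempt (hypothetical type).
#[derive(Debug, PartialEq)]
enum RefreshOutcome {
    Refreshed,
    Failed(String),
    TimedOut,
}

/// Run `job` on a worker thread and wait at most `limit` for its result.
fn run_with_timeout<F>(job: F, limit: Duration) -> RefreshOutcome
where
    F: FnOnce() -> Result<(), String> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // A send error only means the caller already gave up waiting.
        let _ = tx.send(job());
    });
    match rx.recv_timeout(limit) {
        Ok(Ok(())) => RefreshOutcome::Refreshed,
        Ok(Err(e)) => RefreshOutcome::Failed(e),
        Err(RecvTimeoutError::Timeout) => RefreshOutcome::TimedOut,
        // The worker exited (for example, panicked) without sending a result.
        Err(RecvTimeoutError::Disconnected) => {
            RefreshOutcome::Failed("worker exited without a result".to_string())
        }
    }
}
```

As in the proposed fix, a timeout is handled like a failed refresh: it is reported and the loop moves on to the next tick instead of blocking shutdown.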

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@bin/indexer.rs` around lines 246 - 255, Wrap each call to
indexer_db::stats::refresh and indexer_db::stats::refresh_full with a tokio
timeout (e.g., tokio::time::timeout(Duration::from_secs(...), ...)) so the task
doesn’t await a hanging DB call; on timeout treat it like an Err path (log a
warning including that it timed out) and continue so stats_refresh_handle.await
can remain responsive during shutdown; apply the same change for the other
occurrence at the second refresh call (the refresh_full site) and pick a
sensible timeout constant used by this task.

Comment thread crates/api/src/routes/stats.rs
@satyakwok satyakwok merged commit d4b7828 into main Jun 8, 2026
7 of 8 checks passed
@satyakwok satyakwok deleted the feat/stats-parity-phase1 branch June 8, 2026 03:46