feat(api): align stats/daily + whale/tx with the legacy indexer contract #60
The explorer frontend consumes a fixed set of read endpoints and can be
pointed at either indexer. Three gaps stopped indexer-rs from serving them:
1. /stats/daily returned an object-wrapped, never-populated payload
({"daily":[]} with day_bucket/block_count fields). It now returns a bare
array [{date, blocks, transactions}] — calendar date derived from the
day_bucket epoch in SQL (to_timestamp at UTC), numeric counts — matching
the legacy indexer the frontend was built against.
2. stats_daily_mv was never refreshed (the refresh helper had no caller), so
it stayed empty despite millions of indexed blocks. The indexer now
refreshes it on an interval (INDEXER_STATS_REFRESH_SECS, default 300s):
first tick does a plain REFRESH (CONCURRENTLY is rejected on a
never-populated MV), subsequent ticks use CONCURRENTLY so reads aren't
blocked.
3. Added /whale/tx as an alias of /whale/transfers — the legacy path name.
/accounts/active + /contracts/* already share the legacy shape. Contract
leaderboards (recent/pioneers) still need data population — tracked separately.
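The first-tick versus later-tick behaviour described in item 2 can be sketched as a small state machine. This is a std-only illustration, not the code in bin/indexer.rs: `RefreshMode`, `refresh_mode`, `refresh_sql`, and `simulate` are hypothetical names, and the SQL strings are the standard PostgreSQL statements rather than text copied from the PR.

```rust
/// Which flavour of refresh a tick should issue (hypothetical type).
#[derive(Debug, PartialEq, Clone, Copy)]
enum RefreshMode {
    /// Plain REFRESH: required while the view has never been populated.
    Full,
    /// REFRESH ... CONCURRENTLY: does not block readers.
    Concurrent,
}

/// Pick the mode from the "has a refresh ever succeeded?" flag.
fn refresh_mode(populated: bool) -> RefreshMode {
    if populated {
        RefreshMode::Concurrent
    } else {
        RefreshMode::Full
    }
}

/// Statement each mode would run against stats_daily_mv.
fn refresh_sql(mode: RefreshMode) -> &'static str {
    match mode {
        RefreshMode::Full => "REFRESH MATERIALIZED VIEW stats_daily_mv",
        RefreshMode::Concurrent => "REFRESH MATERIALIZED VIEW CONCURRENTLY stats_daily_mv",
    }
}

/// Replay a sequence of ticks; `outcomes[i]` says whether refresh i succeeded.
/// The flag flips only after a success, so a failed first refresh is retried
/// as a plain REFRESH on the next tick.
fn simulate(outcomes: &[bool]) -> Vec<RefreshMode> {
    let mut populated = false;
    let mut modes = Vec::with_capacity(outcomes.len());
    for &succeeded in outcomes {
        modes.push(refresh_mode(populated));
        if succeeded {
            populated = true;
        }
    }
    modes
}

fn main() {
    // First refresh fails, second succeeds, third can go concurrent.
    for mode in simulate(&[false, true, true]) {
        println!("{}", refresh_sql(mode));
    }
}
```

Keeping the mode decision in a pure function makes the "never populated" edge case easy to test without a database.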
📝 Walkthrough
This PR introduces a periodic materialized view refresh system and redesigns the daily statistics data contract. The database layer (crates/db/src/stats.rs) restructures StatsDailyRow to expose date, blocks, and transactions instead of legacy bucket/range/gas fields, and adds two refresh variants: refresh_full() for initial population and refresh() for concurrent periodic updates. The API layer (crates/api/src/routes/stats.rs) exposes this redesigned schema as a bare JSON array through /stats/daily. The indexer (bin/indexer.rs) adds configurable refresh scheduling with a background task that performs full refresh on startup, then switches to concurrent refresh on subsequent ticks. A minor routing alias (/whale/tx) is added to leaderboards.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks: ✅ 4 passed | ❌ 1 failed (inconclusive)
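To make the redesigned /stats/daily contract from the walkthrough concrete, here is a minimal sketch of the bare-array shape. It is an illustration only: the real handler returns Json<Vec<DailyRow>> through serde, whereas this std-only version builds the JSON by hand, and the `i64` field types and the `to_json` helper are assumptions.

```rust
/// Hypothetical stand-in for the API row; field types are assumed.
struct DailyRow {
    date: String, // calendar date, e.g. "2024-01-01"
    blocks: i64,
    transactions: i64,
}

/// Serialise rows as a bare JSON array with the legacy field names.
/// Hand-rolled for the sketch; it assumes `date` needs no JSON escaping,
/// which holds for YYYY-MM-DD strings.
fn to_json(rows: &[DailyRow]) -> String {
    let items: Vec<String> = rows
        .iter()
        .map(|r| {
            format!(
                "{{\"date\":\"{}\",\"blocks\":{},\"transactions\":{}}}",
                r.date, r.blocks, r.transactions
            )
        })
        .collect();
    format!("[{}]", items.join(","))
}

fn main() {
    let rows = vec![DailyRow {
        date: "2024-01-01".to_string(),
        blocks: 10,
        transactions: 25,
    }];
    println!("{}", to_json(&rows));
}
```

An empty view serialises as `[]` rather than `{"daily":[]}`, which is the difference the frontend depends on.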
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/db/src/stats.rs (1)
22-30: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Guard limit against non-positive values before binding.
LIMIT $1 is parameterized, but a non-positive value still reaches the database unchecked: PostgreSQL rejects a negative LIMIT with an error, and LIMIT 0 returns no rows. Clamp to >= 1 before the query bind.
Proposed fix
  pub async fn daily(pool: &PgPool, limit: i64) -> DbResult<Vec<StatsDailyRow>> {
+     let limit = limit.max(1);
      let rows = sqlx::query(
          "SELECT to_char(to_timestamp(day_bucket * 86400) AT TIME ZONE 'UTC', 'YYYY-MM-DD') AS date, \
           block_count AS blocks, \
           tx_count AS transactions \
           FROM stats_daily_mv ORDER BY day_bucket DESC LIMIT $1",
      )
      .bind(limit)
As per coding guidelines, "**/db/**: ... Unbounded SELECTs without LIMIT in user-facing endpoints."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/db/src/stats.rs` around lines 22 - 30, the daily function currently binds the user-provided limit directly into the SQL LIMIT, which passes non-positive values to the database unchecked; clamp the limit to a safe minimum (>= 1) before binding (e.g., compute a sanitized_limit = if limit < 1 { 1 } else { limit } or use limit.max(1)) and use that sanitized_limit in the .bind(...) call for the query in daily so the SQL always receives a positive LIMIT.
Source: Coding guidelines
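The clamp suggested above can be sketched in isolation. `sanitize_limit` and `MAX_DAILY_LIMIT` are hypothetical: the review only asks for a lower bound of 1, and the upper cap is an extra assumption added here to show how both ends could be bounded in one call.

```rust
/// Hypothetical upper bound on rows returned by /stats/daily.
/// The review comment only requires the lower bound.
const MAX_DAILY_LIMIT: i64 = 365;

/// Force a caller-supplied limit into [1, MAX_DAILY_LIMIT] before it is
/// bound into `LIMIT $1`, so the query always receives a positive value.
fn sanitize_limit(limit: i64) -> i64 {
    limit.clamp(1, MAX_DAILY_LIMIT)
}

fn main() {
    for raw in [-1, 0, 30, 10_000] {
        println!("{raw} -> {}", sanitize_limit(raw));
    }
}
```

With only the lower bound, `limit.max(1)` as in the proposed fix is equivalent for small values.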
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@bin/indexer.rs`:
- Around line 246-255: Wrap each call to indexer_db::stats::refresh and
indexer_db::stats::refresh_full with a tokio timeout (e.g.,
tokio::time::timeout(Duration::from_secs(...), ...)) so the task doesn’t await a
hanging DB call; on timeout treat it like an Err path (log a warning including
that it timed out) and continue so stats_refresh_handle.await can remain
responsive during shutdown; apply the same change for the other occurrence at
the second refresh call (the refresh_full site) and pick a sensible timeout
constant used by this task.
- Around line 53-54: Validate that IndexerConfig.indexer_stats_refresh_secs is >
0 at startup (fail fast with a clear error if it is 0) instead of passing zero
into tokio::time::interval; use the existing default_stats_refresh_secs for the
default. In the stats refresh task (the loop that creates tick =
tokio::time::interval(Duration::from_secs(cfg.indexer_stats_refresh_secs))) make
the DB refresh calls (indexer_db::stats::refresh and refresh_full)
cancellation-aware by spawning or converting the DB future into a selectable
future and using tokio::select! to race it with cancel.cancelled() (and
optionally a timeout via tokio::time::timeout) so the task returns promptly when
the CancellationToken is triggered; ensure main still awaits
stats_refresh_handle but that the handle will complete quickly on cancel.
In `@crates/api/src/routes/stats.rs`:
- Around line 33-52: The smoke test expectations still assume a wrapped object
(.daily) and the old field name (.daily[0].day_bucket); update scripts/smoke.sh
to assert against the new bare-array response: replace any occurrences of
`.daily` with `.` (root array) and replace `.daily[0].day_bucket` with
`.[0].date` (or the correct DailyRow field name `date`) so the smoke checks
match the handler in crates/api/src/routes/stats.rs that returns
Json<Vec<DailyRow>>.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: be021a99-a8d1-4381-9f9f-a6224b8a6534
📒 Files selected for processing (4)
bin/indexer.rs
crates/api/src/routes/leaderboards.rs
crates/api/src/routes/stats.rs
crates/db/src/stats.rs
#[serde(default = "default_stats_refresh_secs")]
indexer_stats_refresh_secs: u64,
🧩 Analysis chain
🌐 Web query:
Does tokio::time::interval allow a zero Duration, or does it panic/error?
💡 Result:
The tokio::time::interval function panics if provided with a zero Duration [1][2][3]. This behavior is explicitly documented in the official Tokio API documentation [1][4]. The implementation enforces this requirement using an assertion, ensuring that the period must be greater than zero [5][6][7]. This also applies to tokio::time::interval_at [4][5].
Citations:
- 1: https://docs.rs/tokio/latest/tokio/time/fn.interval.html
- 2: https://docs.rs/tokio/1.25.0/tokio/time/fn.interval.html
- 3: https://docs.rs/tokio/1.50.0/tokio/time/fn.interval.html
- 4: https://docs.rs/tokio/latest/tokio/time/fn.interval_at.html
- 5: https://tikv.github.io/doc/src/tokio/time/interval.rs.html
- 6: https://rust.velas.com/src/tokio/time/interval.rs.html
- 7: https://relm4.org/docs/stable/src/tokio/time/interval.rs.html
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="bin/indexer.rs"
echo "== File exists =="
ls -l "$FILE"
echo
echo "== Lines 1-120 =="
nl -ba "$FILE" | sed -n '1,120p'
echo
echo "== Lines 200-310 =="
nl -ba "$FILE" | sed -n '200,310p'
echo
echo "== Find refresh/statistics interval creation =="
rg -n "interval(_at)?|indexer_stats_refresh_secs|stats_refresh_handle|refresh.*db|cancel|shutdown" "$FILE" -S
Repository: Sentriscloud/indexer-rs
Length of output: 231
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="bin/indexer.rs"
[ -f "$FILE" ] || { echo "Missing $FILE"; exit 1; }
echo "== File exists =="
ls -l "$FILE"
echo
echo "== Lines 1-140 =="
awk 'NR>=1 && NR<=140 {printf "%d:%s\n", NR, $0}' "$FILE"
echo
echo "== Lines 200-320 =="
awk 'NR>=200 && NR<=320 {printf "%d:%s\n", NR, $0}' "$FILE"
echo
echo "== Find refresh/statistics interval creation (and DB refresh calls) =="
rg -n "interval(_at)?|indexer_stats_refresh_secs|stats_refresh_handle|refresh.*db|cancel|shutdown|SELECT|UPDATE|INSERT" "$FILE" -S
Repository: Sentriscloud/indexer-rs
Length of output: 11879
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find where indexer_db::stats::refresh and refresh_full live
rg -n "mod stats|pub mod stats|fn refresh_full|fn refresh\\b|refresh_full\\(" -S . -g'*.rs' --hidden || true
echo
echo "== Candidate refresh implementations =="
rg -n "refresh_full\\s*\\(|refresh\\s*\\(" -S . -g'*.rs' --hidden | head -n 80
echo
echo "== Show indexer_db stats module if present =="
fd -t f "stats.rs" . || true
fd -t f "stats*" . || true
# Print likely files
for f in $(fd -t f "stats.rs" . 2>/dev/null || true); do
echo
echo "----- $f (first 220 lines) -----"
awk 'NR>=1 && NR<=220 {print NR ":" $0}' "$f"
done
Repository: Sentriscloud/indexer-rs
Length of output: 6702
Guard INDEXER_STATS_REFRESH_SECS > 0 and make stats refresh shutdown-cancel responsive
- cfg.indexer_stats_refresh_secs == 0 will panic when constructing tokio::time::interval(Duration::from_secs(cfg.indexer_stats_refresh_secs)); fail fast on startup.
Proposed fix
  let cfg: IndexerConfig = Figment::new().merge(Env::raw()).extract()?;
+ if cfg.indexer_stats_refresh_secs == 0 {
+     anyhow::bail!("INDEXER_STATS_REFRESH_SECS must be >= 1");
+ }
- Shutdown can block: after a tick fires, the refresh task awaits indexer_db::stats::refresh(_full) directly (no timeout) and doesn’t observe cancel.cancelled() until that DB await completes; main then awaits stats_refresh_handle after cancel.cancel().
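The startup guard from the first point can be sketched as a small validation function. This is a std-only illustration under assumptions: `stats_refresh_period` is a hypothetical helper, and it returns a plain `String` error where the proposed fix uses `anyhow::bail!`.

```rust
use std::time::Duration;

/// Turn the configured refresh period into a Duration, rejecting zero.
/// A zero period must never reach tokio::time::interval, which panics on it.
fn stats_refresh_period(secs: u64) -> Result<Duration, String> {
    if secs == 0 {
        return Err("INDEXER_STATS_REFRESH_SECS must be >= 1".to_string());
    }
    Ok(Duration::from_secs(secs))
}

fn main() {
    // 300 is the default mentioned in the PR description.
    match stats_refresh_period(300) {
        Ok(period) => println!("refreshing every {period:?}"),
        Err(e) => eprintln!("config error: {e}"),
    }
}
```

Validating once at startup turns a runtime panic inside the background task into a clear configuration error.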
let res = if populated {
    indexer_db::stats::refresh(&pool).await
} else {
    indexer_db::stats::refresh_full(&pool).await
};
match res {
    Ok(()) => populated = true,
    Err(e) => {
        tracing::warn!(error = %e, "stats_daily_mv refresh failed");
    }
Add a timeout around each MV refresh call to keep shutdown responsive.
The refresh future is awaited directly; if the DB call hangs, this task cannot observe cancellation and stats_refresh_handle.await can block shutdown indefinitely.
Proposed fix
_ = tick.tick() => {
- let res = if populated {
- indexer_db::stats::refresh(&pool).await
- } else {
- indexer_db::stats::refresh_full(&pool).await
- };
+ let res = tokio::time::timeout(Duration::from_secs(30), async {
+ if populated {
+ indexer_db::stats::refresh(&pool).await
+ } else {
+ indexer_db::stats::refresh_full(&pool).await
+ }
+ }).await;
match res {
- Ok(()) => populated = true,
- Err(e) => {
+ Ok(Ok(())) => populated = true,
+ Ok(Err(e)) => {
tracing::warn!(error = %e, "stats_daily_mv refresh failed");
}
+ Err(_) => {
+ tracing::warn!("stats_daily_mv refresh timed out");
+ }
}
}
Also applies to: 267-267
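The three-way outcome in the proposed fix (success, refresh error, timeout) can be tried out without an async runtime. The sketch below is a std-only analogue, not tokio: it runs the job on a thread and waits with `recv_timeout`, and `RefreshOutcome` and `run_with_timeout` are hypothetical names. One difference matters: `tokio::time::timeout` drops the inner future when it expires, whereas this version leaves the worker thread running in the background.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Mirrors the three match arms in the proposed fix (hypothetical type).
#[derive(Debug, PartialEq)]
enum RefreshOutcome {
    Refreshed,      // Ok(Ok(()))
    Failed(String), // Ok(Err(e))
    TimedOut,       // Err(_) from the timeout
}

/// Run `job` on a worker thread and wait at most `limit` for its result.
fn run_with_timeout<F>(job: F, limit: Duration) -> RefreshOutcome
where
    F: FnOnce() -> Result<(), String> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone after a timeout; ignore that.
        let _ = tx.send(job());
    });
    match rx.recv_timeout(limit) {
        Ok(Ok(())) => RefreshOutcome::Refreshed,
        Ok(Err(e)) => RefreshOutcome::Failed(e),
        Err(mpsc::RecvTimeoutError::Timeout) => RefreshOutcome::TimedOut,
        // The worker ended without sending, e.g. because the job panicked.
        Err(mpsc::RecvTimeoutError::Disconnected) => {
            RefreshOutcome::Failed("refresh worker exited without a result".to_string())
        }
    }
}

fn main() {
    // Simulated hanging refresh: the caller gets control back after 20 ms.
    let outcome = run_with_timeout(
        || {
            thread::sleep(Duration::from_millis(500));
            Ok(())
        },
        Duration::from_millis(20),
    );
    println!("{outcome:?}");
}
```

Only the `Refreshed` arm should flip the populated flag; a timeout is logged and retried on the next tick, just like a refresh error.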
Why
The explorer frontend consumes a fixed set of read endpoints and should be able to run against indexer-rs as the sole indexer. Three gaps blocked that for /stats/daily and /whale/tx.
Changes
- /stats/daily shape: was object-wrapped + empty ({"daily":[]} with day_bucket/block_count). Now a bare array [{date, blocks, transactions}]: calendar date derived from the day_bucket epoch in SQL (to_timestamp(day_bucket*86400) at UTC), numeric counts, matching the legacy indexer the frontend was built against.
- stats_daily_mv (migration 0002) had a refresh() helper with no caller, so it stayed empty despite millions of indexed blocks. The indexer now refreshes it on an interval (INDEXER_STATS_REFRESH_SECS, default 300): first tick does a plain REFRESH (Postgres rejects CONCURRENTLY on a never-populated MV), subsequent ticks use CONCURRENTLY so reads aren't blocked.
- /whale/tx alias: added alongside /whale/transfers (the legacy path name). /accounts/active and /contracts/* already share the legacy shape.
Tests
- stats/daily serialises as a flat, bare array with legacy field names (no wrapper, no day_bucket).
Follow-up
Contract leaderboards (/contracts/recent, /contracts/pioneers) return the correct shape but no data yet; contract indexing is a separate change (Phase 2).
Summary by CodeRabbit
New Features
- /whale/tx API endpoint alias for convenient access to whale transfer data and leaderboard rankings
API Updates
- /stats/daily endpoint response structure: now returns daily statistics as a flat JSON array containing date, blocks, and transactions fields for improved consistency
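The date derivation listed under Changes happens in SQL, but the same mapping is easy to check in plain Rust. The sketch below is an illustration, not code from the PR: it assumes day_bucket counts whole days since the Unix epoch (consistent with multiplying by 86400), and `civil_from_days` and `date_from_day_bucket` are hypothetical helpers using the well-known days-to-civil-date algorithm for the proleptic Gregorian calendar.

```rust
/// Convert days since 1970-01-01 (UTC) to (year, month, day).
fn civil_from_days(days: i64) -> (i64, i64, i64) {
    let z = days + 719_468; // shift the epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, 0..=146_096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // 0..=399
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, March-based
    let mp = (5 * doy + 2) / 153; // March-based month, 0..=11
    let day = doy - (153 * mp + 2) / 5 + 1; // 1..=31
    let month = if mp < 10 { mp + 3 } else { mp - 9 }; // 1..=12
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}

/// Format a day bucket as YYYY-MM-DD, the `date` field of /stats/daily.
fn date_from_day_bucket(day_bucket: i64) -> String {
    let (y, m, d) = civil_from_days(day_bucket);
    format!("{:04}-{:02}-{:02}", y, m, d)
}

fn main() {
    // 19723 * 86400 = 1_704_067_200 seconds, i.e. midnight UTC on 2024-01-01.
    println!("{}", date_from_day_bucket(19_723));
}
```

A helper like this can be useful in tests that compare API output against known buckets without a database round trip.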