fix: fetch site state backups from legacy delegates on upgrade #10

Merged
sanity merged 2 commits into main from fix-legacy-state-backup
Apr 16, 2026

@sanity sanity commented Apr 16, 2026

Problem

Sites vanish permanently after a delegate WASM upgrade when the network has garbage-collected the contract state. The legacy delegate migration (fire_legacy_migration) requests only GetPublicKey, GetKnownSites, and GetSigningKey from old delegates, but not GetSiteState. As a result, the delegate backup (the only surviving copy of the user's site data when network state is gone) is stranded under the old delegate key and never fetched.

Reported by Ivvor in #9: two sites with full keys disappeared after a recent update. Exported backup keys also failed because the contract state was unreachable under both the old and new contract keys.

Approach

Three changes, each providing defense in depth:

  1. Proactive backup fetch during legacy migration: When legacy KnownSites arrives, immediately request GetSiteState for each prefix from the responding legacy delegate. This runs in parallel with network GETs for the same prefix.

  2. Defensive fallback in NotFound handler: request_site_state_backup() (called when a network GET returns NotFound) now queries ALL legacy delegates in addition to the current one. This handles the case where a user imports a key after the upgrade.

  3. Forward-persist restored backups: handle_restored_site_state() now also backs up the restored state to the current delegate, so it survives future delegate WASM upgrades without requiring the legacy migration to succeed again.
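A minimal sketch of the request fan-out in changes 1 and 2, under stated assumptions. The `DelegateKey` type, the shape of `DelegateRequest::GetSiteState`, and both function names are hypothetical stand-ins for illustration and are not taken from the actual code.

```rust
/// Hypothetical identifier for a delegate (current or legacy).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DelegateKey(pub String);

/// Hypothetical, reduced request type; the real enum has more variants.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum DelegateRequest {
    GetSiteState { prefix: String },
}

/// Change 1: when a legacy delegate answers KnownSites, ask that same
/// delegate for the state backup of every prefix it reported.
pub fn backup_requests_for_known_sites(
    responder: &DelegateKey,
    prefixes: &[String],
) -> Vec<(DelegateKey, DelegateRequest)> {
    prefixes
        .iter()
        .map(|prefix| {
            (
                responder.clone(),
                DelegateRequest::GetSiteState { prefix: prefix.clone() },
            )
        })
        .collect()
}

/// Change 2: when a network GET returns NotFound, query the current
/// delegate first and then every legacy delegate for the same prefix.
pub fn backup_requests_on_not_found(
    current: &DelegateKey,
    legacy: &[DelegateKey],
    prefix: &str,
) -> Vec<(DelegateKey, DelegateRequest)> {
    std::iter::once(current)
        .chain(legacy.iter())
        .map(|key| {
            (
                key.clone(),
                DelegateRequest::GetSiteState { prefix: prefix.to_string() },
            )
        })
        .collect()
}
```

Returning the list of (delegate, request) pairs instead of sending them keeps the sketch free of any messaging API; how the real code dispatches these requests is not shown here.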

handle_restored_site_state only overwrites default (empty) state, so a successful network GET always wins over a stale delegate backup.
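The precedence rule can be sketched as follows. This is an illustration only: `SiteState`, `RestoreOutcome`, and `apply_restored_backup` are hypothetical names, and the real handle_restored_site_state may structure the check differently.

```rust
/// Hypothetical site state; `Default` represents the empty state.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct SiteState {
    pub pages: Vec<String>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum RestoreOutcome {
    /// The backup replaced empty local state; the caller should also
    /// back it up to the current delegate (change 3).
    AdoptedShouldPersist,
    /// Local state was already populated (for example by a successful
    /// network GET), so the backup was ignored.
    Ignored,
}

/// Adopt a delegate backup only if the local state is still the default.
pub fn apply_restored_backup(local: &mut SiteState, backup: SiteState) -> RestoreOutcome {
    if *local == SiteState::default() {
        *local = backup;
        RestoreOutcome::AdoptedShouldPersist
    } else {
        RestoreOutcome::Ignored
    }
}
```

Because the check compares against the default value, the order in which the network GET and the delegate backup arrive does not matter for non-empty network state: whichever arrives, populated state is never replaced by a backup.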

Also documents the migration coverage requirement: every Get* variant in DelegateRequest that reads persisted data MUST be covered by the legacy migration path, otherwise that data is silently lost on upgrade.
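One way such a coverage requirement could be checked mechanically is sketched below. It assumes unit variants for brevity (the real DelegateRequest variants likely carry payloads), and `StoreSiteState`, `ALL`, `reads_persisted_data`, and `uncovered_by_migration` are hypothetical names introduced for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DelegateRequest {
    GetPublicKey,
    GetKnownSites,
    GetSigningKey,
    GetSiteState,
    /// Hypothetical write-side variant, included to show a non-Get case.
    StoreSiteState,
}

impl DelegateRequest {
    /// Hand-maintained list of all variants (an assumption of this sketch).
    pub const ALL: [DelegateRequest; 5] = [
        Self::GetPublicKey,
        Self::GetKnownSites,
        Self::GetSigningKey,
        Self::GetSiteState,
        Self::StoreSiteState,
    ];

    /// No wildcard arm: adding a variant fails to compile until it is
    /// classified here.
    pub fn reads_persisted_data(self) -> bool {
        match self {
            Self::GetPublicKey
            | Self::GetKnownSites
            | Self::GetSigningKey
            | Self::GetSiteState => true,
            Self::StoreSiteState => false,
        }
    }
}

/// Variants that read persisted data but are missing from `covered`,
/// the set of requests the legacy migration sends.
pub fn uncovered_by_migration(covered: &[DelegateRequest]) -> Vec<DelegateRequest> {
    DelegateRequest::ALL
        .iter()
        .copied()
        .filter(|r| r.reads_persisted_data() && !covered.contains(r))
        .collect()
}
```

Called with the three requests the migration sent before this fix, `uncovered_by_migration` returns `GetSiteState`; a unit test asserting that the result is empty would turn the documented rule into a failing test. The `ALL` array still has to be updated by hand, so this narrows the gap without closing it entirely.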

Testing

  • cargo test passes (5 tests)
  • cargo clippy --all-targets -- -D warnings clean
  • cargo check -p delta-ui --target wasm32-unknown-unknown clean

Manual verification needed: upgrade a node with existing sites and verify they survive.

Closes #9

[AI-assisted - Claude]

sanity and others added 2 commits April 16, 2026 12:25
The legacy delegate migration only requested GetPublicKey, GetKnownSites,
and GetSigningKey -- but NOT GetSiteState. When the network had
garbage-collected the site's contract state (or the user was offline),
the only surviving copy was the delegate backup stored under the old
delegate key, and it was never fetched. Sites appeared to vanish
permanently after a delegate WASM upgrade.

Three changes:

1. When legacy KnownSites arrives, immediately request GetSiteState for
   each prefix from the responding legacy delegate. The backup arrives
   as a fallback if all network GETs fail.

2. request_site_state_backup (called from the NotFound handler) now also
   queries all legacy delegates, not just the current one. This handles
   the case where a user imports a key after the upgrade.

3. handle_restored_site_state now persists the backup to the current
   delegate so it survives future upgrades.

handle_restored_site_state only overwrites default (empty) state, so a
successful network GET always wins over a stale backup.

Closes #9

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add migration coverage documentation to DelegateRequest enum and
AGENTS.md. Every Get* variant that reads persisted data must be covered
by the legacy delegate migration path, otherwise that data type is
silently lost on delegate WASM upgrade.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@sanity sanity merged commit f3888c7 into main Apr 16, 2026
2 of 3 checks passed
sanity added a commit that referenced this pull request Apr 17, 2026
Without cargo:rerun-if-changed directives, Cargo cached the build
script output and the baked-in BUILD_TIMESTAMP_ISO/GIT_COMMIT drifted
from what was actually compiled. This masked whether the published
webapp contained recent fixes.

April 2026 incident: published Delta after #10 merged, but the deployed
webapp showed BUILD_TIMESTAMP=2026-04-15 and GIT_COMMIT=403a816 (the
stdlib-bump, not the fix). The baked-in timestamp was stale, and so
was the rest of the WASM: Cargo had reused stale compiled output.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
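
The `cargo:rerun-if-changed` directives mentioned in the commit above are lines a build script prints to stdout. A minimal sketch of generating them follows; the helper names are hypothetical, and the watched paths in the usage note are assumptions, not the paths the actual build script uses.

```rust
/// Build the directive lines for a set of watched paths.
/// Cargo reruns the build script when any listed path changes.
pub fn rerun_directives(watched: &[&str]) -> Vec<String> {
    watched
        .iter()
        .map(|path| format!("cargo:rerun-if-changed={path}"))
        .collect()
}

/// Print the directives to stdout, where Cargo reads them.
pub fn emit_rerun_directives(watched: &[&str]) {
    for line in rerun_directives(watched) {
        println!("{line}");
    }
}
```

In a build.rs, `main` would call something like `emit_rerun_directives(&["build.rs", ".git/HEAD"])` before emitting the timestamp and commit values. Note that `.git/HEAD` usually holds only a ref name, so a real script would probably also need to watch the ref file it points to.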

Linked issue: Sites vanishing on upgrade