feat: implement instance backup and restore system #91
Open
hippoley wants to merge 2 commits into Yuan-lab-LLM:main from
Implement the backup system whose schema and models were already defined but had no working code. This adds full CRUD operations for manual instance backups with async Kubernetes Job-based backup/restore.

New files:
- repository/backup_repository.go: data access layer (Create, Get, List, Update, Delete, Count)
- services/backup_service.go: business logic with ownership checks, per-instance backup limit (20), async K8s Jobs for tar.gz archiving and restore, soft-delete with async cleanup
- handlers/backup_handler.go: RESTful HTTP handlers wired under /instances/:id/backups
- services/backup_service_test.go: 13 unit tests covering validation, ownership, limits, soft-delete, race-condition defense, and restore preconditions

Modified files:
- cmd/server/main.go: register backup repo, service, handler and routes
- utils/response.go: map backup-specific errors to proper HTTP status codes (400/404)

Key design decisions:
- Backup jobs run as K8s Jobs (busybox) with HostPath volume mounts, matching the existing PVC storage model
- Soft-delete pattern: DeleteBackup marks status='deleted' then launches async file cleanup, preventing data loss on accidental deletion
- Race-condition defense: markBackupCompleted/markBackupFailed re-read DB state before updating, skipping writes if the backup was concurrently soft-deleted
- Restore clears the target directory before extraction to guarantee a clean state
- Nil-guard on pvcService in all async goroutines so unit tests run without a live K8s cluster

Closes: backup system part 1 (manual backup/restore)
Ref: backups table in 001_init_schema.sql
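The race-condition defense described in this commit can be illustrated with a minimal sketch. It is not the PR's code: the `Backup` struct, the in-memory `memRepo`, and the `markCompleted` helper are hypothetical stand-ins, and the real repository presumably talks to the `backups` table. The point is only the order of operations: re-read the row, check for a concurrent soft-delete, then write.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Backup is a hypothetical, trimmed-down model used only for this sketch.
type Backup struct {
	ID     int
	Status string // "creating", "completed", "failed", or "deleted"
}

var errNotFound = errors.New("backup not found")

// memRepo is an in-memory stand-in for the backup repository.
type memRepo struct {
	mu   sync.Mutex
	rows map[int]Backup
}

// Get returns the current state of one backup row.
func (r *memRepo) Get(id int) (Backup, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	b, ok := r.rows[id]
	if !ok {
		return Backup{}, errNotFound
	}
	return b, nil
}

// Update overwrites the stored row.
func (r *memRepo) Update(b Backup) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.rows[b.ID] = b
}

// markCompleted re-reads the row before writing and skips the write when the
// backup was soft-deleted while the async job was running. It reports whether
// the status was actually changed.
func markCompleted(r *memRepo, id int) (bool, error) {
	current, err := r.Get(id)
	if err != nil {
		return false, err
	}
	if current.Status == "deleted" {
		return false, nil // do not resurrect a soft-deleted backup
	}
	current.Status = "completed"
	r.Update(current)
	return true, nil
}

func main() {
	repo := &memRepo{rows: map[int]Backup{
		1: {ID: 1, Status: "creating"},
		2: {ID: 2, Status: "deleted"}, // deleted while its job was still running
	}}
	for _, id := range []int{1, 2} {
		changed, err := markCompleted(repo, id)
		b, _ := repo.Get(id)
		fmt.Println(id, changed, b.Status, err)
	}
}
```

Note that a read followed by a write is not atomic on its own. Against a real database, a conditional update (for example, one that only matches rows whose status is not 'deleted') would close the remaining window; whether the PR does this is not stated here.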
…D API

Part 2 of the instance backup system:
- BackupScheduleRepository: CRUD + ListAllActive for backup_schedules table
- BackupScheduler: background loop (60s tick) with minimal cron parser supporting @hourly/@daily/@weekly/@monthly and standard 5-field expressions
- Idempotency guard: 90s gap check prevents double-fire within same cron window
- Expiry cleanup: soft-deletes only completed backups past expires_at
- CreateScheduledBackup: system-actor backup creation with auto-computed expiry
- BackupScheduleHandler: 4 HTTP endpoints for schedule CRUD with validation
- Runtime hardening: panic recovery in tick, sync.Once for Stop(), mutex for tick overlap prevention
- 10 new tests covering cron parsing, validation, idempotency, cleanup safety, panic recovery, and double-stop protection
c57c70a to 43c1504
Instance Backup System
Implements the complete backup system for OpenClaw instances, filling in the empty backups and backup_schedules DB schema that was already defined in 001_init_schema.sql.

Part 1: Manual Backup & Restore
- BackupRepository: CRUD operations for the backups table
- BackupService: full lifecycle (create, list, get, delete, restore)
  - Race-condition defense (markBackupCompleted/markBackupFailed re-read DB state before writing)
- BackupHandler: 5 HTTP endpoints (create, list, get, delete, restore)

Part 2: Scheduled Backups & Expiry Cleanup
- BackupScheduleRepository: CRUD + ListAllActive() for the backup_schedules table
- BackupScheduler: background loop (60s tick interval) with a minimal cron expression parser
  - @hourly, @daily, @weekly, @monthly presets and standard 5-field cron (minute hour dom month dow)
  - *, numbers, ranges (1-5), steps (*/6, 1-5/2), lists (1,3,5)
- CreateScheduledBackup: system-actor backup creation (no user-ownership check) with auto-computed expires_at
- BackupScheduleHandler: 4 HTTP endpoints for schedule CRUD with cron expression and retention_days validation

Risk Mitigations (built-in)
- Expiry cleanup only soft-deletes backups with status = "completed" AND expires_at < now(); it never touches backups in creating status
- retention_days >= 1 is enforced at the API level
- tick() calls recover() in a deferred function so the scheduler goroutine survives unexpected panics
- sync.Once protects close(stopChan) from a double-close panic
- sync.Mutex.TryLock() skips a tick if the previous one is still running

Zero Breaking Changes
Test Coverage