Backup / restore-backup / trash — data-safety v2 (8 tracks, 276 tests)#62
Merged
Captures the design for `loseit backup` / `loseit restore-backup` and the trash-on-delete safety, plus a parallel-tracked implementation plan with BDD acceptance scenarios written in CLI language.

The plan was originally drafted under the assumption that Lose It! had only a per-day diary endpoint. A live capture (clicking "My Week" on the home dashboard) confirmed `getDailyDetailsIncludingPendingForDateRange` is real and dispatched in production. Adding T0 to decode it flips the fetch primitive from N RPCs/grain to 1.

Sanitized request/response fixtures included; both have user_id/user_name replaced with the test-config placeholders so the conformance suite can byte-compare against them.
Reviewer asked for the BDDs to show what files should look like, not
just that they exist. Each scenario now spells out:
- Exact TOON top-level keys in order (schema_version, account,
grain, generated_at, entries[N]), with placeholder values for
fields whose exact contents are environmental (timestamps, hex
IDs, email-like strings).
- Concrete stdout snippets per scenario — the "fetch / fallback /
skip" rows and the trailing summary block.
- The trash.jsonl line shape as a JSON object with every required
key, including the nested entry block.
- The diary --output json shape including created_at / modified_at.
The goal is interface validation: each BDD now tells a reviewer the
exact surface a passing implementation must produce, not just "the
file exists." Specific values (counts, paths) are exact where they
must be; environmental values are abstract.
Implements track T1 of the backup-spec: the on-disk contract for grain files, foods cache, and index file (spec §4).

Adds:
- `lose_it.backup._fs`: dataclasses (GrainDoc, FoodsDoc, IndexDoc, GrainEntry, FoodCacheEntry, AccountRef, GrainBounds), read/write pairs through `toon_format`, schema-version guard (`SchemaVersionMismatch`), and an atomic-write primitive (`atomic_write_text`: tmp -> fsync -> os.replace, leaves no `*.tmp*` siblings).
- Entries are canonically sorted on write by (day_num asc, meal_ordinal asc, created_at asc) so diffs between two snapshots are stable.
- Top-level key order is pinned to spec §4 in every writer so the CLI BDDs that grep the first lines hold.
- Bumps `version.txt` to 0.3.0 (minor: new module).

Tests: 13 hermetic conformance cases covering round-trip, key-order, sort-on-write, schema-version refusal on both reader and writer, and the atomic-write postcondition.
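The tmp -> fsync -> os.replace sequence named above can be sketched roughly as follows. This is a minimal illustration, not the code in `lose_it.backup._fs`: the temp-file naming, the cleanup strategy, and the dict-based `canonical_sort` helper are assumptions made for the example.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, text: str) -> None:
    """Write text so a reader sees either the old file or the new one, never a partial."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives next to the target so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_name, path)  # atomic swap into place
    except BaseException:
        # A failed write must not leave a *.tmp* sibling behind.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise


def canonical_sort(entries):
    """Hypothetical helper: order entries by (day_num, meal_ordinal, created_at)."""
    return sorted(
        entries, key=lambda e: (e["day_num"], e["meal_ordinal"], e["created_at"])
    )
```

Creating the temp file in the destination directory matters because `os.replace` is only atomic within a single filesystem.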
Adds the wire decoder for getDailyDetailsIncludingPendingForDateRange and exposes it through LoseIt.diary_range so callers can pull a window of diary days with a single RPC instead of N per-day fetches. Also defines TooMuchData for the T2 fetch primitive to catch 413/429/5xx and bisect into a smaller grain.

- core.daily: build_range_payload, get_daily_details_range, parse_entries_by_day, TooMuchData
- core.init: get_init_day_keys (parses full day_num/day_key window from a single getInitializationData response)
- client: LoseIt.diary_range bootstraps the day-key cache lazily; a full backup loop now costs 1 init + N range RPCs, not N*per-day
- conformance fixture for the request (byte-pinned) + response
Routes every delete through a trash sink BEFORE the wire call fires, so deleted entries are always recoverable. Per docs/backup-spec.md §9.

New module: src/lose_it/trash.py
- TrashSink Protocol (runtime_checkable)
- TrashReceipt, DeleteResult, DeleteSafetyError
- LocalFileTrashSink: appends JSONL to ~/.local/share/loseit/trash.jsonl with chmod 600.
- ConsoleTrashSink: TOON/JSON to stdout/stderr.
- ChainedTrashSink: fan-out; all-must-succeed, no rollback (§9.7 q3).

SDK rewrite:
- LoseIt.delete_entry(entry, *, trash_sink=..., acknowledge_no_trash, confirm) -> DeleteResult. Default sink: LocalFileTrashSink. Stash succeeds first; only then does the wire delete fire. If the sink raises, the wire call is skipped and the exception propagates.
- LoseIt.restore_trash(*, trash_file, line, keep, dry_run): replays a trash record through log_food and (optionally) consumes the line.

CLI:
- loseit delete grows --trash-file / --print-deleted / --no-trash / --i-know-this-is-unrecoverable. --no-trash without ack exits 2 with the exact BDD-pinned stderr.
- loseit restore-trash: new command. Default consumes the last line.

Tests:
- tests/conformance/test_trash.py: unit tests pin the stash-before-wire-call invariant + the chained-sink no-rollback contract.
- tests/conformance/test_cli_trash.py: Typer CliRunner integration.
- test_cli{,_toon}.py env fixtures isolate HOME so the default sink writes inside tmp_path instead of the developer's real homedir.

Version: 0.2.0 -> 0.3.0 (minor; adds visible CLI surface).
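The stash-before-wire-call invariant can be illustrated with a small sketch. It is not the shipped `src/lose_it/trash.py`: the `stash` method name, the single-key record shape, and the `delete_with_trash` helper are hypothetical, and only the class names `TrashSink` and `LocalFileTrashSink` come from the description above.

```python
import json
import os
from pathlib import Path
from typing import Any, Callable, Protocol, runtime_checkable


@runtime_checkable
class TrashSink(Protocol):
    """Anything that can durably record an entry before it is deleted."""

    def stash(self, record: dict[str, Any]) -> None: ...


class LocalFileTrashSink:
    """Appends one JSON object per line to a file readable only by its owner."""

    def __init__(self, path: Path) -> None:
        self.path = Path(path)

    def stash(self, record: dict[str, Any]) -> None:
        self.path.parent.mkdir(parents=True, exist_ok=True)
        # 0o600 applies when the file is created; chmod below covers an existing file.
        fd = os.open(self.path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
        with os.fdopen(fd, "a", encoding="utf-8") as handle:
            handle.write(json.dumps(record, sort_keys=True) + "\n")
        os.chmod(self.path, 0o600)


def delete_with_trash(
    entry: dict[str, Any],
    sink: TrashSink,
    wire_delete: Callable[[dict[str, Any]], None],
) -> None:
    """Hypothetical helper: stash first, delete second."""
    # If stash raises, the exception propagates and wire_delete never runs.
    sink.stash({"entry": entry})
    wire_delete(entry)
```

Because the wire call is the last statement, a sink failure leaves the server-side entry untouched, which is the property the unit tests are described as pinning.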
T4 of the backup-spec implementation plan. Pulls FLE.f4 / FLE.f5 (GWT epoch-ms longs) into ``FoodLogEntry.created_at`` / ``modified_at`` as aware UTC datetimes, and projects them into ``to_dict()`` as ISO 8601 strings. Unblocks T7 (safe-mode restore), whose upsert match key is (food_id, created_at +/- 10 minutes). Bumps version 0.2.0 -> 0.3.0 (visible field surface on a public dataclass). New fields default to ``None`` so existing fixtures and tests continue to construct ``FoodLogEntry`` without supplying timestamps.
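As a rough illustration of the epoch-ms projection, the sketch below converts a millisecond count to an aware UTC datetime and then to an ISO 8601 string. The helper names `epoch_ms_to_utc` and `to_iso` are made up for this example and are not claimed to match the SDK's internals.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_ms_to_utc(ms: Optional[int]) -> Optional[datetime]:
    """Decode an epoch-milliseconds long into an aware UTC datetime."""
    if ms is None:
        return None  # mirrors the "defaults to None" behaviour of the new fields
    # Integer timedelta arithmetic avoids float rounding on large values.
    return _EPOCH + timedelta(milliseconds=ms)


def to_iso(moment: Optional[datetime]) -> Optional[str]:
    """Project a datetime into the ISO 8601 string used in dict output."""
    return None if moment is None else moment.isoformat()
```

For example, `to_iso(epoch_ms_to_utc(3_888_000_000))` yields `1970-02-15T00:00:00+00:00`; the input is an arbitrary illustrative value, not a captured one, and shows how a small integer decodes to a date early in 1970.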
main shipped v0.3.0 via PR #60 while delete-safeguards was being integrated; bumping to 0.4.0 keeps the tag-on-merge CI happy when the eventual delete-safeguards -> main PR lands.
Adds `lose_it.backup.discover_earliest_day` (T3 of the backup-spec impl plan): a pure-logic probe that finds the earliest day a user has diary entries on.

The algorithm uses the bulk range RPC (T0) hierarchically:
- one yearly `diary_range(Jan-1, Dec-31)` probe per candidate year,
- once a year hits, monthly `diary_range(month_start, month_end)` probes,
- once a month hits, day-by-day `diary(d)` walk to the exact day.

Day-by-day rather than binary-search is load-bearing (spec §5.2): a user who only logged Aug-14 and Aug-15 of their first month would be silently skipped by a midpoint probe.

Falls back to a 12-month monthly fan-out when a yearly probe raises `TooMuchData` (the heavy-logger spec §5 branch).

Hermetic conformance tests cover the four spec-called-out scenarios (typical profile, no entries ever, late-year start, oversize year) plus the day-by-day walk invariant, driven by a `FakeLoseIt` double.

Bumps version.txt to 0.5.0 ahead of the backup feature integration.
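A minimal sketch of the year -> month -> day probe, under stated assumptions: the client is duck-typed with `diary_range(start, end)` returning a mapping of date to entry list and `diary(day)` returning a list, candidate years are walked oldest-first between two caller-supplied bounds, and `TooMuchData` is a local stand-in. None of these details are confirmed by the description above beyond the probe order itself.

```python
import calendar
from datetime import date, timedelta
from typing import Optional


class TooMuchData(Exception):
    """Local stand-in for the SDK exception of the same name."""


def _has_entries(days) -> bool:
    return any(entries for entries in days.values())


def _first_day_in_month(client, year: int, month: int) -> Optional[date]:
    last = calendar.monthrange(year, month)[1]
    start, end = date(year, month, 1), date(year, month, last)
    if not _has_entries(client.diary_range(start, end)):
        return None
    # Walk day by day: a midpoint probe could skip a month with only two logged days.
    day = start
    while day <= end:
        if client.diary(day):
            return day
        day += timedelta(days=1)
    return None


def discover_earliest_day(client, first_year: int, last_year: int) -> Optional[date]:
    for year in range(first_year, last_year + 1):
        try:
            hit = _has_entries(
                client.diary_range(date(year, 1, 1), date(year, 12, 31))
            )
        except TooMuchData:
            hit = True  # oversize year: go straight to the 12-month fan-out
        if not hit:
            continue
        for month in range(1, 13):
            found = _first_day_in_month(client, year, month)
            if found is not None:
                return found
    return None
```

An empty year costs one RPC here, while the year that contains the first entry costs at most twelve monthly probes plus one month of per-day reads.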
…retry) (#68)

Adds T2 of the backup track: src/lose_it/backup/_fetch.py with the Grain value type (day/week/month constructors + canonical splitter), fetch_grain (one diary_range call per grain, recurses on TooMuchData through month -> weeks -> days, re-raises if the day-grain floor fails), update_food_cache (spec §6.3 once-per-UTC-day describe gate with today_utc injectable for tests), and the to_grain_entry / grain_entry_sort_key helpers used by the T6 orchestrator.

Sort key substitutes modified_at for created_at: T4's empirical analysis showed FoodLogEntry.created_at (FLE.f4) is not a real timestamp; only modified_at (FLE.f5) is a real UTC epoch. The substitution lives in grain_entry_sort_key with a comment, and test_to_grain_entry_uses_modified_at_when_created_at_is_bogus pins the decision.

Tests: 20 new hermetic conformance cases covering clean-fetch RPC count, month->week and week->day recursive fallback, day-grain abort, describe cadence (gate hit + gate miss + dedupe + new id), and the sort-key invariants. Full suite is 232 passing.
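The month -> weeks -> days fallback can be sketched as a short recursion. The `Grain` shape below (a kind plus inclusive start/end dates), the splitter that cuts a month into 7-day chunks from its first day, and the dict returned by `diary_range` are all assumptions made for illustration; the real `_fetch.py` may define weeks and return types differently.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date, timedelta


class TooMuchData(Exception):
    """Local stand-in for the SDK exception of the same name."""


@dataclass(frozen=True)
class Grain:
    kind: str  # "month", "week", or "day"
    start: date
    end: date  # inclusive

    def split(self) -> list[Grain]:
        """Return the next-smaller grains, or [] at the day-grain floor."""
        if self.kind == "day":
            return []
        step, child = (7, "week") if self.kind == "month" else (1, "day")
        parts = []
        cursor = self.start
        while cursor <= self.end:
            stop = min(cursor + timedelta(days=step - 1), self.end)
            parts.append(Grain(child, cursor, stop))
            cursor = stop + timedelta(days=1)
        return parts


def fetch_grain(client, grain: Grain) -> dict:
    """One diary_range call per grain; recurse into smaller grains on TooMuchData."""
    try:
        return client.diary_range(grain.start, grain.end)
    except TooMuchData:
        children = grain.split()
        if not children:
            raise  # the day-grain floor failed: nothing smaller to try
        merged = {}
        for child in children:
            merged.update(fetch_grain(client, child))
        return merged
```

In the clean case this costs exactly one RPC per grain; the recursion only adds calls for the grains that actually overflow.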
#69) Adds `lose_it.backup._upsert` with the pure-function half of T7: upsert_match (boolean) and plan_day (per-day matched/missing partition) for safe-mode restore. Per the empirical FoodLogEntry analysis, modified_at substitutes for created_at as the time half of the key — the captured f4 values are not real epoch-ms timestamps. plan_day enforces the additive-only contract (spec §7.4): server-only entries are never enumerated, and each server entry can claim at most one archive entry. Bumps version.txt to 0.5.0.
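A sketch of the matching rule and the per-day partition, assuming a simplified `Entry` with only `food_id` and `modified_at`, an inclusive ±10 minute window, and a greedy first-match claim order. The real `upsert_match` and `plan_day` signatures and return shapes are not given in the description, so treat every name below other than those two as hypothetical.

```python
from collections.abc import Iterable
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

DEFAULT_WINDOW = timedelta(minutes=10)


@dataclass(frozen=True)
class Entry:
    """Simplified stand-in for a diary entry: just the two halves of the key."""

    food_id: str
    modified_at: datetime


def upsert_match(archive: Entry, server: Entry, window: timedelta = DEFAULT_WINDOW) -> bool:
    """True when both entries share a food_id and their modified_at values fall within the window."""
    return (
        archive.food_id == server.food_id
        and abs(archive.modified_at - server.modified_at) <= window
    )


def plan_day(
    archive_entries: Iterable[Entry],
    server_entries: Iterable[Entry],
    window: timedelta = DEFAULT_WINDOW,
) -> tuple[list[Entry], list[Entry]]:
    """Partition archive entries into (matched, missing) for one day."""
    unclaimed = list(server_entries)
    matched, missing = [], []
    for candidate in archive_entries:
        index = next(
            (i for i, s in enumerate(unclaimed) if upsert_match(candidate, s, window)),
            None,
        )
        if index is None:
            missing.append(candidate)
        else:
            unclaimed.pop(index)  # a server entry can claim at most one archive entry
            matched.append(candidate)
    # Leftover server-only entries are deliberately not returned: restore is additive-only.
    return matched, missing
```

With two archive copies of the same food logged minutes apart and a single server entry, only the first copy matches and the second lands in `missing`, so a restore would re-log it.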
Composes T1 + T2 + T3 into the end-to-end ``LoseIt.backup`` flow plus the cheap-mode restore (``--skip-restore-on-nonempty-grain-time-ranges``). Safe-mode restore raises ``NotImplementedError`` with a clear pointer at the cheap-mode flag until T7 ships.

* ``src/lose_it/backup/_orchestrator.py``: new module owning the backup walk, resume logic, discovery cache handshake, and cheap-mode restore loop. Silent; all per-grain decisions go through an optional ``progress(report)`` callback.
* ``client.py``: adds ``LoseIt.backup`` and ``LoseIt.restore_backup`` façades; both are 1-1 forwarders to the orchestrator.
* ``tests/conformance/test_backup_orchestrator.py``: 16 hermetic tests using a structural ``FakeLoseIt`` double.
* ``version.txt`` → 0.6.0 per the impl plan's track-T6 bookkeeping.
…store (#71)

* sdk: hide raw PK arrays from FoodLogEntry.to_dict(); add round-trip functional test (#60)

* sdk: hide raw PK arrays from FoodLogEntry.to_dict(); add round-trip functional test

`FoodLogEntry.to_dict()` projected the food/entry SimplePrimaryKeys verbatim as 16-int byte arrays, noise that no caller of the JSON/TOON output actually consumed:
- Food PK: never crosses the SDK boundary as bytes. The only external use case is round-tripping a food reference, which `food_id` (hex) already serves via `LoseIt.{get_food,describe_food,log_food}`.
- Entry PK: no LoseIt RPC accepts it as input on its own. Even `deleteFoodLogEntry` requires the full entry body, which forces a fresh diary read; the entry PK comes along for the ride. There is no external workflow where a caller can act on an entry PK alone.

So `to_dict()` now emits `food_id` (32-char hex, same shape as `FoodSearchResult.food_id`) and drops both raw `pk` arrays + the entry-side hex entirely. The raw bytes stay on the dataclass where the envelope builders need them.

Added a live-API functional test (`tests/functional/test_readme_example.py`) mirroring the README's SDK example: pinned to 2018-03-15 for isolation, asserts empty → log → present → delete → empty using only the documented SDK surface (`search`/`log_food`/`diary`/`delete_entry`). Matching diary entries by `food_id` proves that the externally visible identifier is sufficient for the round-trip: no PK bytes ever escape the SDK.

Also extended the README example to use `food_id` for the match-and-delete loop instead of a name substring, so the docs and the test stay aligned.

* test(functional): use servings-based second log; gram path needs gram-stored food

The README's SDK example logs `serving_amount=61, serving_unit=ServingUnit.g` as the second log, but the first hit for `li.search("tortilla")` is the Xtreme Wellness wrap (serving-stored with no per-serving-g cross-class slot), so the gram path raises `PortionError` before the diary read-back even runs.

The round-trip test isn't gating on the unit-conversion code (that's covered by `test_entries_serving_unit.py` in the conformance suite). It's gating on `search → log → diary → delete → diary` only needing `food_id` to round-trip, with no PK arrays. So the second log now uses `servings=2.0` instead, which works regardless of the food's native unit.

Verified live against the real API on 2018-03-15: passes.

* chore: bump version to 0.3.0

* feat(cli+sdk): backup / restore-backup CLI + safe-mode upsert restore

Wires T7's pure plan_day function into the orchestrator as restore_backup_safe, exposes it through LoseIt.restore_backup (replacing the prior NotImplementedError stub), and adds the loseit backup and loseit restore-backup CLI commands per spec §3.

  * src/lose_it/backup/_orchestrator.py: new restore_backup_safe + a SafeRestoreGrainReport report shape; RestoreSummary gains per-day counters that safe mode populates.
  * src/lose_it/client.py: restore_backup default routes through safe mode, with skip_restore_on_nonempty_grain_time_ranges=True falling back to cheap mode. New upsert_window kwarg surfaces the ±10m fuzz.
  * src/lose_it/cli.py: adds backup, restore-backup commands. Renders per-grain rows + summary block per spec §3.1 / §3.2; supports --dry-run, --quiet-skips, -o text|json|toon.
  * tests/conformance/test_backup_restore_safe.py: 7 unit tests for safe-mode (missing → log, idempotent re-run, ±10m window, additive vs. server-only, dry-run, strict_account, pk round-trip).
  * tests/conformance/test_cli_backup.py: 7 CLI integration tests hermetic via monkeypatched _open_loseit.
  * version.txt: 0.6.0 → 0.7.0.
Main shipped PR #60 (0.3.0) during the 8-track integration. delete-safeguards bumped through 0.4.0, 0.5.0, 0.6.0, 0.7.0 in successive track merges. Resolving the version.txt conflict in favor of 0.7.0 since CI tags v{version} on merge to main, and v0.3.0 is already taken.
…xpand delete

Documents the new commands shipped in delete-safeguards (8-track impl):
- delete: trash-sink behavior + recovery flow + --trash-file / --no-trash flags
- backup: rolling archive to ~/.local/share/loseit/backup/YYYY/MM.toon, one RPC per grain via the bulk-fetch endpoint, --quiet-skips, --dry-run
- restore-backup: safe-mode (upsert by food_id + modified_at ± 10m, the empirical correction for FoodLogEntry.f4 not being a real timestamp) + cheap-mode
- restore-trash: undo the most recent loseit delete, --line/--keep/--dry-run

Also updates the TOC and the loseit --help command-list rendering.
phitoduck added a commit that referenced this pull request Jun 13, 2026
Two ruff failures kept CI red on main after the delete-safeguards merge (PR #62) and the path-consolidation merge (PR #72):
- UP035: src/lose_it/backup/_upsert.py imported Iterable from typing. Pyupgrade rule says use collections.abc.Iterable.
- I001: after the swap, the import block became un-sorted.

Both fixed; ruff check is now clean. Tests stay at 276/276.
Summary
Implements the data-safety spec at `docs/backup-spec.md` across 8 parallel tracks (T0-T8). 276/276 unit tests green. Version 0.7.0.

What ships
- `loseit backup`: captures the diary into a per-grain TOON archive (`~/.local/share/loseit/backup/YYYY/MM.toon` etc.). Resumable, atomic per-grain, one RPC per grain by default via the bulk-fetch endpoint.
- `loseit restore-backup`: restores from an archive. Default safe mode is entry-level upsert (food_id + modified_at ± 10m); cheap mode (`--skip-restore-on-nonempty-grain-time-ranges`) skips grains the server already has data for.
- `loseit delete` now routes through a TrashSink before the wire delete fires. Default sink (`LocalFileTrashSink`) appends a JSONL record to `~/.local/share/loseit/trash.jsonl` (mode 0600). `--no-trash` requires `--i-know-this-is-unrecoverable`.
- `loseit restore-trash`: replays the most recent (or `--line N`) trash record.

Track-by-track PR trail (already merged into `delete-safeguards`)
- … `diary_range` SDK method (1 RPC per grain)
- … `created_at` / `modified_at` on `FoodLogEntry`
- … `food_id + modified_at ± 10m`)

Empirical findings during implementation
- `getDailyDetailsIncludingPendingForDateRange` returns one `DailyDetails` per day in the requested range. Spec §6.1 was rewritten so this is the default; per-day `getDailyDetailsIncludingPendingForDate` is the recursion floor.
- `FoodLogEntry.f4` is NOT a real "created" timestamp. Empirical analysis of captured fixtures shows values clustering around 1970-02-15 (decoded as epoch-ms). Only `f5` (modified_at) is a real timestamp (~June 2026 in the fixtures). The spec called for `(food_id, created_at ± 10m)` as the upsert key; the implementation uses `(food_id, modified_at ± 10m)` instead. Field is still surfaced for forward-compat. A future live re-capture should verify what f4 actually is: it could be a counter, hash, or different epoch.

What did NOT ship (deliberately, per spec)
- … `schema_version`, then we design.
- … `TrashSink` (spec §9.7 open question).
- … `year` grain (oversized for the bulk endpoint by design).

Test plan for the reviewer
- … `docs/backup-impl-plan.md` §7 against the safe Feb-2016 window.
- … `loseit diary --output json`) and noting whether `created_at` is ~now or ~1970.

⛔ DO NOT MERGE until reviewed by Eric.
Per Eric's explicit instruction: this PR is for review only. Eric will merge it manually once satisfied.