log_cache: skip unparseable entries instead of aborting fold #1

Open
Justin3546 wants to merge 1 commit into evanpurkhiser:main from Justin3546:skip-corrupt-log-entries

@Justin3546

Summary

fold_state_from_append_log currently bails out on the first line of things.log that doesn't deserialize into WireItem. On accounts with long Things Cloud history, the log can include legacy entries that the current schema doesn't recognize (historical ACTIONGROUP-* IDs, fields that are now required but used to be null, older variants of OneOrMany, etc.), and any one of them kills every command the CLI runs — including --no-sync.

On my account, the log has 21,733 entries and 9 of them trip this. The result is that the CLI exits with just Corrupt log entry at …/things.log and never produces output.

This PR changes the fold loop to:

  • Warn to stderr with the line/column of the parse error
  • Advance safe_offset past the bad line
  • Continue folding the rest of the log

Conceptually: the append-log is a best-effort replay log, and one unknown entry shouldn't be fatal. The behavior for partial trailing lines (the existing fold_state_ignores_trailing_partial_line test) is unchanged — those are still stopped-at, not skipped, since they may complete on the next sync.
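
For readers who want the shape of the new loop without opening the diff, here is a minimal, self-contained sketch of the skip-and-continue behavior described above. It is not the code in this PR: `Entry`, `parse_entry`, `FoldResult`, and `fold_log` are hypothetical names, and a toy `key=value` parser stands in for the JSON deserialization into WireItem. Only the control flow is meant to match: warn, advance the offset past the bad line, keep folding, and stop at an unterminated trailing line.

```rust
/// Hypothetical stand-in for the real WireItem type.
#[derive(Debug, PartialEq)]
struct Entry {
    key: String,
    value: i64,
}

/// Toy parser standing in for JSON deserialization: expects `key=value`.
fn parse_entry(line: &str) -> Result<Entry, String> {
    let (key, value) = line
        .split_once('=')
        .ok_or_else(|| "missing '='".to_string())?;
    let value = value.trim().parse::<i64>().map_err(|e| e.to_string())?;
    Ok(Entry { key: key.to_string(), value })
}

/// Outcome of one fold pass (hypothetical shape, for illustration only).
struct FoldResult {
    entries: Vec<Entry>,
    /// Byte offset just past the last newline-terminated line consumed.
    safe_offset: usize,
    skipped: usize,
}

fn fold_log(log: &str) -> FoldResult {
    let mut result = FoldResult { entries: Vec::new(), safe_offset: 0, skipped: 0 };
    let mut rest = log;
    let mut line_no = 0usize;

    // Only newline-terminated lines count as complete. A trailing partial
    // line is left alone so a later sync can finish it.
    while let Some(newline) = rest.find('\n') {
        line_no += 1;
        match parse_entry(&rest[..newline]) {
            Ok(entry) => result.entries.push(entry),
            Err(err) => {
                // Previously this would have aborted the fold; now warn and skip.
                eprintln!("warning: skipping unparseable log entry at line {}: {}", line_no, err);
                result.skipped += 1;
            }
        }
        // Advance past the line whether it parsed or not, so the cached
        // offset still moves forward.
        result.safe_offset += newline + 1;
        rest = &rest[newline + 1..];
    }
    result
}
```

With an input of good / corrupt / good lines, this sketch keeps both good entries and ends with `safe_offset` equal to the input length. With an input that ends in an unterminated line, the offset stops at the start of that line and nothing is counted as skipped.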

Test plan

  • Added fold_state_skips_unparseable_line_and_continues, which seeds a log with good / corrupt / good entries and asserts that both good entries end up in state and that the state-cache offset advances to EOF.
  • Existing fold_state_ignores_trailing_partial_line still passes.
  • Verified against a real account with ~10MB of log history: 9 entries skipped with informative warnings, things3 today / inbox / projects render correctly.

The append-log can contain legacy entries that don't deserialize into
the current WireItem schema — for example, historical ACTIONGROUP-*
IDs, entries with fields that have since been narrowed from optional to
required, or variants that no longer match. These are all best-effort
replay data; a single entry the parser doesn't understand shouldn't
abort the whole command.

Previously fold_state_from_append_log returned on the first
deserialization error, which made the CLI unusable against accounts
with any old-format history in their log.

Now we log a warning to stderr with the error location and skip the
entry, advancing safe_offset so the state cache still moves forward.
Fold continues, and the remaining entries apply normally.

Added a test that seeds a log with a known-bad entry between two good
ones and verifies both good entries end up in state.