feat(parser): migrate Codex + editor-extension providers #883

Open
mariusvniekerk wants to merge 5 commits into fam/bespoke from fam/codex-editors

@mariusvniekerk (Collaborator) commented Jun 26, 2026

Migrates codex, gemini, copilot, the Copilot IDE variants (VS Code and Visual Studio), and positron onto facade providers.

Includes the Codex incremental-fingerprint expectation update: a Codex re-sync is a full re-parse that stores the raw file size and hash, with the parsed-prefix versus partial-tail distinction enforced at parse-diff time via CodexTranscriptConsumedSize rather than in the stored fingerprint.

Codex sessions have an append-only JSONL transcript plus a session_index.jsonl title sidecar. Moving Codex behind a concrete provider keeps that composite source identity and incremental append capability explicit at the provider boundary.

The provider preserves dated and archived discovery, live-over-archived lookup, shallow index watch planning, index-event classification, index-aware mtimes, source hashing, full parse output, and append parsing with full-parse fallback signals.

fix(parser): preserve codex provider sidecar semantics

Codex index changes are part of source freshness, so the provider cannot treat unchanged transcript size as no new data when the index mtime drove the fingerprint. The provider also needs to keep legacy live-over-archived UUID behavior and classify removed transcript paths syntactically.

Index events now conservatively refresh sibling Codex sources because this provider layer has no DB state for title diffing; the sync engine can still apply its DB-aware filtering before provider dispatch is fully authoritative.

Validation: go test -tags "fts5" ./internal/parser -run TestCodexProvider -count=1; go vet ./...; git diff --check. go test -tags "fts5" ./internal/parser -count=1 currently fails on TestProviderMigrationModes because inherited lower provider branches such as claude still need their branch-local shadow opt-ins.

fix(parser): make codex provider sidecars authoritative

The Codex provider could not safely infer sidecar-only freshness from a single max mtime. Rather than advertise append-only parsing with incomplete sidecar state, keep provider-authoritative Codex parses on the full-parse path until the facade can model sidecar dirtiness explicitly.

Also route persisted path lookup and changed-path classification through the same UUID canonicalization as discovery so archived duplicates do not win over live dated transcripts.
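The live-over-archived rule that discovery, persisted path lookup, and changed-path classification share can be illustrated with a minimal sketch. It is not the provider's code: the `archived_sessions` directory name, the UUID-in-filename pattern, and the helper names `isArchived` and `canonicalByUUID` are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"path/filepath"
	"regexp"
	"sort"
	"strings"
)

// uuidRe matches a UUID-shaped session id inside a file name (assumed naming).
var uuidRe = regexp.MustCompile(`[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}`)

// isArchived reports whether a path has an "archived_sessions" component.
// The directory name is an assumption for this sketch.
func isArchived(path string) bool {
	for _, part := range strings.Split(filepath.ToSlash(path), "/") {
		if part == "archived_sessions" {
			return true
		}
	}
	return false
}

// canonicalByUUID keeps one path per UUID and prefers a live copy over an
// archived one. Paths whose file name carries no UUID are skipped here.
func canonicalByUUID(paths []string) map[string]string {
	sorted := append([]string(nil), paths...)
	sort.Strings(sorted) // deterministic choice among copies of the same kind
	out := make(map[string]string)
	for _, p := range sorted {
		id := strings.ToLower(uuidRe.FindString(filepath.Base(p)))
		if id == "" {
			continue
		}
		cur, seen := out[id]
		if !seen || (isArchived(cur) && !isArchived(p)) {
			out[id] = p
		}
	}
	return out
}

func main() {
	got := canonicalByUUID([]string{
		"archived_sessions/rollout-0f8fad5b-d9cb-469f-a165-70867728950e.jsonl",
		"sessions/2026/06/26/rollout-0f8fad5b-d9cb-469f-a165-70867728950e.jsonl",
	})
	for id, p := range got {
		fmt.Println(id, "->", p)
	}
}
```

Sharing one canonicalization helper across all three call sites is what keeps an archived duplicate from winning in one code path while the live copy wins in another.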

Validation: go test -tags "fts5" ./internal/parser -run 'Test(CodexProvider|ProviderMigrationModes)' -count=1; go test -tags "fts5" ./internal/parser -count=1; go vet ./...; git diff --check

test(sync): compare codex shadow parity

Codex is shadow-compared on this branch, so add source-level migration coverage that compares provider observation with ParseCodexSession.

The fixture uses the real sessions/YYYY/MM/DD layout plus sibling session_index.jsonl, proving the provider preserves title sidecar behavior, parser output, and data-version planning.

Validation: go test -tags "fts5" ./internal/parser ./internal/sync -run 'TestObserveProviderSourceMatchesCodexLegacyParser|TestCodexProvider|TestParseCodex|TestProviderMigrationModes' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go fmt ./...; go vet ./...; ./custom-gcl run --config .golangci.nilaway.yml ./internal/parser/... ./internal/sync/...; git diff --check

fix(parser): accept codex legacy-shaped sources

Provider-authoritative Codex sync still has to rediscover sessions that were stored by the legacy parser even when their rollout filename does not expose a UUID-shaped session id. Without that compatibility path, the later dispatch migration can drop or fail to reprocess valid Codex transcripts that ParseCodexSession can read from session metadata.

Keep the UUID-aware source contract as the preferred path and fall back to root-scoped JSONL sources only when Codex path metadata does not apply, so normal duplicate canonicalization remains unchanged while legacy-shaped fixtures stay reachable.

Validation: go test ./internal/parser -count=1; go fmt ./...; go vet ./...; go test -tags "fts5" ./internal/parser ./internal/sync -run 'TestCodexProvider|TestSyncEngineCodex|TestSyncSingleSessionHashCodex|TestSyncEngineSkipCache' -count=1; git diff --check

refactor(parser): fold codex into provider

Make the Codex provider own its source discovery, lookup, and parse
behavior instead of shimming the package-level free functions. Delete
DiscoverCodexSessions, FindCodexSourceFile, ParseCodexSession, and
ParseCodexSessionFrom: discovery and find-source bodies move onto the
codex source set (discoverSessionPaths, findSourceFile), and parse moves
onto the provider (parseSession, parseSessionFrom). Drop the Codex
AgentDef DiscoverFunc/FindSourceFunc hooks and make Codex
provider-authoritative; ShallowWatchRootsFunc and the exec-source helpers
(IsCodexExecSessionFile, ResolveCodexShallowWatchRoots, the one-time
codex_exec skip migration) stay since only the four parser entrypoints
must go.

A provider has no database handle, so the engine reproduces the DB-aware
and mtime-aware bookkeeping the legacy single-session JSONL path
performed, scoped to Codex to preserve behavior exactly:

  - shouldSkipProviderSourceByDB folds the session_index.jsonl sidecar
    into a DB-stored fingerprint skip, so an unchanged transcript is not
    reparsed when only the shared index mtime advanced and this session's
    title did not change, and a resync still skips after the in-memory
    skip cache is cleared.
  - The provider Parse force-replaces stored rows because Codex emits a
    full parse (it does not advertise incremental append); a late
    token_count line appended to an existing turn rewrites the stored
    message instead of being dropped by an append-only write.
  - Index events keep flowing through the engine's DB-aware
    classifyCodexIndexPath rather than the provider's broad index
    fan-out: the engine fans out only to sessions whose stored title
    changed and pins the chosen on-disk copy (SourceRefForPath) so the
    provider's live-over-archived canonicalization cannot resurrect a
    stale duplicate over the stored copy.
  - SyncAllSince re-expands a UUID's live and archived duplicates
    (AllSourcePathsForUUID) before the mtime cutoff filter, restoring the
    legacy discover-then-filter order so a changed archived copy newer
    than the cutoff is not lost behind an older live copy.
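The last bullet's ordering concern can be sketched as follows. This is an illustration only, not the engine's code: `sourceCopy`, `expandThenFilter`, and `canonicalThenFilter` are hypothetical names, and mtimes are plain integers to keep the example small.

```go
package main

import "fmt"

// sourceCopy is one on-disk copy of a session transcript (hypothetical type).
type sourceCopy struct {
	UUID     string
	Path     string
	Archived bool
	Mtime    int64 // modification time, e.g. Unix seconds
}

// expandThenFilter applies the cutoff to every copy, so a changed archived
// duplicate survives even when the live copy is older than the cutoff.
func expandThenFilter(copies []sourceCopy, cutoff int64) []sourceCopy {
	var out []sourceCopy
	for _, c := range copies {
		if c.Mtime > cutoff {
			out = append(out, c)
		}
	}
	return out
}

// canonicalThenFilter picks the live copy per UUID first and filters second.
// It shows the failure mode: the newer archived copy is never considered.
func canonicalThenFilter(copies []sourceCopy, cutoff int64) []sourceCopy {
	chosen := make(map[string]sourceCopy)
	var order []string
	for _, c := range copies {
		cur, seen := chosen[c.UUID]
		if !seen {
			order = append(order, c.UUID)
		}
		if !seen || (cur.Archived && !c.Archived) {
			chosen[c.UUID] = c
		}
	}
	var out []sourceCopy
	for _, id := range order {
		if c := chosen[id]; c.Mtime > cutoff {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	copies := []sourceCopy{
		{UUID: "u1", Path: "sessions/live.jsonl", Mtime: 100},
		{UUID: "u1", Path: "archived_sessions/old.jsonl", Archived: true, Mtime: 300},
	}
	fmt.Println("canonical-then-filter:", len(canonicalThenFilter(copies, 200)))
	fmt.Println("expand-then-filter:", len(expandThenFilter(copies, 200)))
}
```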

Route parse-diff, the token-use disk probe, and the SSH remote resolve
script through provider Discover/FindSource for provider-authoritative
agents that no longer carry a DiscoverFunc, so Codex sources stay
discoverable, resolvable on disk, and transferable (including the
session_index.jsonl sidecar).

Replace the deleted shadow-baseline test with provider-API coverage
(provider Discover/Parse through ObserveProviderSource) plus a guard that
the four legacy entrypoints stay gone, route the package and engine tests
through the provider methods, and remove codex_provider.go from the
pending shim scan list. This also fixes the previously known-failing
TestSyncPathsCodexIndexEventRefreshesStoredDuplicate, since the index
event now honors the stored archived copy.

test(sync): host shared shadow source helper at codex fold

The per-provider shadow/parse tests share writeProviderShadowSourceFile
to write source fixtures. The Codex fold is the lowest branch that calls
it, so the canonical definition lives here; later provider folds inherit
it instead of redeclaring their own copies.

test(sync): remove unused codex stat assignments

The pre-commit lint hook rejects two Codex appended-fixture tests because they assign os.Stat results back to info without using the value. The tests already assert the append and close operations that matter for setup.

Removing the unused assignments keeps staticcheck clean for the Codex provider migration branch.

fix(parser): pin codex duplicate sources

Codex discovery and raw-ID lookup should still prefer the live dated transcript, but exact filesystem events and DB-stored source hints are different: the caller has already selected a concrete source path. Canonicalizing those paths back to a stale live duplicate can overwrite an updated archived transcript.

Changed-path classification now returns the source pinned to the event path, and non-fresh stored path/fingerprint lookup returns the exact source so SyncSingleSession preserves the archived path already recorded in the database.

Validation: go test -tags "fts5" ./internal/parser -run 'TestCodexProvider(FindSourcePinsExactArchivedDuplicate|ChangedPathPinsArchivedDuplicate|SourceMethods|DiscoverDedupesLiveAndArchivedByUUID)' -count=1; go test -tags "fts5" ./internal/sync -run 'TestSync(PathsCodexArchivedDuplicateEventPinsChangedFile|SingleSessionCodexPreservesStoredArchivedDuplicate|PathsCodexIndexEventRefreshesStoredDuplicate|AllSinceCodexKeepsChangedArchivedDuplicate)' -count=1; go test -tags "fts5" ./internal/parser -run 'TestCodexProvider|TestParseCodex|TestDiscoverCodex' -count=1; go test -tags "fts5" ./internal/sync -run 'Test.*Codex.*' -count=1; go vet ./...; git diff --check

fix(sync): keep codex freshness skips out of cache

Codex provider DB-fresh skips are successful freshness decisions, not parse failures or intentional no-session skips. Recording them in the persistent skip cache can hide a later parser data-version bump because the cache check runs before the DB freshness check.

Keep DB-fresh provider skips non-cacheable and make existing skip-cache entries fall through when a stored row at that path has a stale data version. The same bypass helper still preserves the existing stale-project self-healing behavior.

Validation: go test -tags "fts5" ./internal/sync -run 'TestProcessFile(SkipCacheReparsesStaleCodex(Project|DataVersion)|CodexDBFreshSkipIsNotCached)|Test.*Codex.*' -count=1; go test -tags "fts5" ./internal/parser -run 'TestCodexProvider|TestParseCodex|TestDiscoverCodex' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go vet ./...; git diff --check
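A minimal sketch of the two rules in this commit, under stated assumptions: `syncState`, `recordSkip`, `skipCacheApplies`, and `currentDataVersion` are hypothetical names, and the real skip cache is persistent rather than an in-memory map.

```go
package main

import "fmt"

// currentDataVersion stands in for the parser's data version (assumed value).
const currentDataVersion = 2

// storedRow is the subset of a stored session row this sketch needs.
type storedRow struct {
	DataVersion int
}

// syncState holds a skip cache and stored rows, both keyed by source path.
type syncState struct {
	skipCache map[string]bool
	rows      map[string]storedRow
}

// recordSkip caches a skip unless it was a DB-fresh decision, which must
// stay non-cacheable so a later data-version bump is not hidden.
func (s *syncState) recordSkip(path string, dbFresh bool) {
	if dbFresh {
		return
	}
	s.skipCache[path] = true
}

// skipCacheApplies reports whether a cached skip may be honored. An entry
// falls through when the stored row at that path has a stale data version.
func (s *syncState) skipCacheApplies(path string) bool {
	if !s.skipCache[path] {
		return false
	}
	if row, ok := s.rows[path]; ok && row.DataVersion < currentDataVersion {
		return false
	}
	return true
}

func main() {
	s := &syncState{skipCache: map[string]bool{}, rows: map[string]storedRow{}}
	s.recordSkip("fresh.jsonl", true)
	s.recordSkip("broken.jsonl", false)
	fmt.Println(s.skipCacheApplies("fresh.jsonl"), s.skipCacheApplies("broken.jsonl"))
}
```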

fix(sync): surface codex provider discovery failures

Provider-backed parse-diff should not report a clean or incomplete diff when provider discovery failed. Returning that error keeps requested provider-authoritative agents honest and matches the expectation that parse-diff is a verification surface, not a best-effort sync.

Also pin coverage for stale Codex index entries whose transcripts no longer resolve, so the existing empty-candidate guard cannot regress into an invalid empty work item.

Validation: go test -tags "fts5" ./internal/sync -run 'Test(ParseDiffProviderDiscoveryErrorFails|ClassifyCodexIndexPathSkipsMissingTranscript|ProcessFile(SkipCacheReparsesStaleCodex(Project|DataVersion)|CodexDBFreshSkipIsNotCached))' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go vet ./...; git diff --check

fix(sync): drop duplicate shadowCallerProvider Discover in codex test

Gemini and Copilot are direct local file sources, but each has source-shape details that were still coupled to the legacy adapter path. Moving them behind concrete providers keeps Gemini tmp/<project>/chats discovery and Copilot bare-vs-directory precedence explicit.

The providers preserve raw and full ID lookup, changed-path classification, source hashing, Gemini project hints, Copilot workspace.yaml freshness, aggregate usage events, and parser output normalization.

fix(parser): preserve gemini copilot provider freshness

Gemini and Copilot now advertise provider-owned watch classification, so remove and rename events need to map back to syntactic source refs even after the filesystem entry has disappeared. Without that fallback, watcher-driven sync can leave stale provider sessions until a wider resync happens.

Copilot also exposes a composite fingerprint that includes workspace.yaml freshness and shutdown aggregate usage. The provider parse result has to carry that same file metadata and usage event slice because sync consumes ParseResult, not only ParsedSession.

Validation: go test -tags "fts5" ./internal/parser -count=1; go vet ./...; git diff --check

fix(parser): include gemini project metadata freshness

Gemini project names can come from projects.json or trustedFolders.json, so treating only the transcript as the provider source leaves metadata-only changes stale. The provider now watches those root-level sidecars, classifies their changes back to discovered sessions, and folds their contents into the source fingerprint.

Validation: go test -tags "fts5" ./internal/parser -count=1; go vet ./...; git diff --check

fix(parser): hash copilot workspace metadata

Copilot workspace.yaml can change the provider-visible title without changing the event stream. Size and mtime are useful freshness guards, but the provider hash should also include the workspace file contents so same-length title edits cannot be skipped.

Validation: go test -tags "fts5" ./internal/parser -count=1; go vet ./...; git diff --check
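A minimal sketch of a content-inclusive composite hash, assuming SHA-256 and a length-prefixed framing; the provider's actual hash function and layout are not stated in this PR, and `compositeHash` is a hypothetical name.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// compositeHash hashes the transcript and the workspace metadata together.
// Each part is prefixed with its length so that moving bytes across the
// boundary between the two parts changes the digest.
func compositeHash(transcript, workspace []byte) string {
	h := sha256.New()
	for _, part := range [][]byte{transcript, workspace} {
		var n [8]byte
		binary.BigEndian.PutUint64(n[:], uint64(len(part)))
		h.Write(n[:])
		h.Write(part)
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	events := []byte(`{"type":"message"}` + "\n")
	before := compositeHash(events, []byte("summary: aaa\n"))
	after := compositeHash(events, []byte("summary: bbb\n"))
	// The two workspace files have the same length, yet the digests differ.
	fmt.Println(before != after)
}
```

A size-and-mtime check alone can miss the edit above when the editor preserves the mtime or the clock resolution is coarse, which is the case the content hash covers.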

fix(sync): bridge provider path classification

Concrete providers own source sidecars that legacy path classifiers do not know about. SyncPaths now falls back to provider changed-path classification after the legacy classifiers miss, and provider-classified files force a full parse so metadata-only events can refresh stored session state.

Legacy classification remains authoritative when it recognizes a path, preserving existing project extraction and optimized sidecar filters while still letting migrated providers cover new sidecar surfaces.

Validation: go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go vet ./...; git diff --check

fix(sync): preserve provider sidecar reparses

Provider sidecar events can map to the same session file as a legacy path event in one watcher batch. Keeping only the first classified file made the result order-dependent and could drop the force-parse signal that metadata-only changes rely on.

Per-file forced parses also need to bypass the generic skip cache, not just the agent-specific mtime checks, because sidecar updates may leave the transcript mtime untouched while still changing parsed session metadata.
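The order-independence fix can be sketched as a merge that ORs the force-parse flag across duplicates in one watcher batch. `classified` and `mergeClassified` are hypothetical names; the engine's real work-item type carries more fields.

```go
package main

import "fmt"

// classified is a simplified work item produced by path classification.
type classified struct {
	Path       string
	ForceParse bool
}

// mergeClassified keeps one item per path in first-seen order and ORs the
// ForceParse flag, so the result does not depend on event order.
func mergeClassified(batch []classified) []classified {
	index := make(map[string]int)
	var out []classified
	for _, c := range batch {
		if i, ok := index[c.Path]; ok {
			out[i].ForceParse = out[i].ForceParse || c.ForceParse
			continue
		}
		index[c.Path] = len(out)
		out = append(out, c)
	}
	return out
}

func main() {
	batch := []classified{
		{Path: "chats/session-1.json"},                   // transcript event
		{Path: "chats/session-1.json", ForceParse: true}, // sidecar fan-out
	}
	fmt.Printf("%+v\n", mergeClassified(batch))
}
```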

Validation: go test -tags "fts5" ./internal/sync -run 'TestSyncPathsGeminiProjectMetadataEventRefreshesProject' -count=1; go test -tags "fts5" ./internal/sync -count=1; go test -tags "fts5" ./internal/parser -run 'Test(Gemini|Copilot|ProviderMigration)' -count=1; go vet ./...; git diff --check

fix(sync): skip removed provider source events

Provider changed-path classification can return syntactic source refs for deleted files so providers can model remove events. While legacy file processing is still authoritative, enqueueing an exact missing source path makes SyncPaths fail at the initial stat instead of treating the watcher remove as a no-op.

Keep sidecar fanout intact for existing sources, because metadata changes such as Gemini projects.json still need to force a reparse even when the transcript mtime is unchanged.
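A minimal sketch of the remove-event guard, assuming the check is a plain existence test before enqueueing; `dropMissing` and `statExists` are hypothetical names, and the existence test is injected so the filter can be exercised without touching the filesystem.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// statExists treats a path as present unless os.Stat reports that it does
// not exist. Other stat errors are left for later processing to surface.
func statExists(path string) bool {
	_, err := os.Stat(path)
	return err == nil || !errors.Is(err, fs.ErrNotExist)
}

// dropMissing filters provider-classified source paths down to those that
// still exist, so a watcher remove event becomes a no-op instead of a
// failed initial stat. Sidecar fan-out to existing sources is kept as is.
func dropMissing(paths []string, exists func(string) bool) []string {
	var out []string
	for _, p := range paths {
		if exists(p) {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	paths := []string{os.Args[0], "/no/such/dir/session.json"}
	fmt.Println(dropMissing(paths, statExists))
}
```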

Validation: go test -tags "fts5" ./internal/sync -run 'TestEngine_ClassifyPathsProvider(RemoveSkipsMissingGeminiSource|SidecarKeepsExistingGeminiSources)|TestSyncPathsGeminiProjectMetadataEventRefreshesProject' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go fmt ./...; go vet ./...; ./custom-gcl run --config .golangci.nilaway.yml ./internal/parser/... ./internal/sync/...; git diff --check

test(sync): compare gemini copilot shadow parity

Gemini and Copilot are migrated through concrete providers on this branch, so reviewers need a sync-level parity check that exercises the provider observation contract rather than only parser-local behavior.

The fixtures cover their sidecar-sensitive source shapes: Gemini project metadata feeds the resolved project hint, and Copilot workspace.yaml participates in both title selection and the composite fingerprint.

Validation: go test -tags "fts5" ./internal/sync -run 'TestObserveProviderSourceMatches(Gemini|Copilot)LegacyParser' -count=1; go test -tags "fts5" ./internal/parser -run 'Test(Gemini|Copilot)Provider' -count=1; go vet ./...; git diff --check. Full go test -tags "fts5" ./internal/parser ./internal/sync -count=1 currently fails in existing TestSyncPathsCodexIndexEventRefreshesStoredDuplicate.

refactor(parser): fold gemini and copilot into providers

Move Gemini and Copilot source discovery, lookup, and parse ownership onto
the concrete geminiProvider and copilotProvider and delete the six
package-level legacy entrypoints: DiscoverGeminiSessions,
FindGeminiSourceFile, ParseGeminiSession, DiscoverCopilotSessions,
FindCopilotSourceFile, and ParseCopilotSession.

Discovery and find-source bodies now live as provider-owned source-set
helpers (discoverSessionPaths and findSourceFile on each source set), the
gemini confirmGeminiSessionID guard moves to the provider file, and the
parsers become the providers' parseSession methods. The copilot source set's
bare/dir precedence and dedup, and the gemini session-filename matching, are
reproduced on the provider exactly as before.

Gemini project resolution is preserved on the provider: sourceRef already
resolves the project via BuildGeminiProjectMap/ResolveGeminiProject for both
discovery and changed-path classification, so removing the engine's gemini
project-map plumbing loses no project names. BuildGeminiProjectMap and
ResolveGeminiProject stay exported package helpers used by the provider.

Make both Gemini and Copilot provider-authoritative and drop their legacy
sync dispatch: the classifyOnePath copilot and gemini blocks (and the now
unused geminiProjectsByDir parameter threaded through classifyOnePath and
classifyPaths), the processFile case arms, and the processGemini,
processCopilot, and shouldSkipCopilot methods. copilotEffectiveMtime stays as
a shared composite-mtime helper used by discoveredFileMtime.

Wire the provider facade into parse-diff: agents that dropped their
DiscoverFunc are now discovered through discoverProviderSources (filtered to
the resolved, provider-discoverable agents), and resolveParseDiffAgents
accepts file-based agents backed by a shadow-compare or
provider-authoritative provider. Without this, a provider-authoritative agent
would silently fall out of parse-diff once its DiscoverFunc was removed.

Drop the Gemini and Copilot AgentDef DiscoverFunc/FindSourceFunc hooks, remove
both files from the pending shim scan list, delete the shared shadow-baseline
test file, and replace it with provider-API coverage plus guards asserting the
legacy entrypoints stay gone. Package and engine tests route through the
provider methods via new test helpers.

test(sync): drop duplicate shadow source helper def

The canonical writeProviderShadowSourceFile now lives at the Codex fold,
so this redeclaration in provider_shadow_test.go conflicts with it. Drop
the local copy and its now-unused os and path/filepath imports; callers use
the inherited shared helper.

test(sync): restore provider-aware classify tests at gemini fold

The original restack mis-merged engine_test.go on this branch, reverting
the OpenCode SQLite, OpenCode removed-file, Claude stat-error, and Vibe
meta-only classification tests to their stale pre-fold shapes (fake
opencode.db bytes instead of a seeded session, dropped
seedOpenCodeSQLiteSession helper) and re-adding a classify_vibe_test.go
that exists on no lower branch. Those stale tests asserted the legacy
direct-classification behavior and failed against the provider-routed
path. Restore the correct versions inherited from the codex branch, keep
this branch's two new Gemini provider classify tests, and drop the
spurious classify_vibe_test.go.

test(sync): restore gemini provider classify tests at gemini fold

Re-add the two Gemini changed-path classify tests
(TestEngine_ClassifyPathsProviderRemoveSkipsMissingGeminiSource and
TestEngine_ClassifyPathsProviderSidecarKeepsExistingGeminiSources) that
were dropped while restoring this branch's mis-merged engine_test.go to
its provider-aware shape.

fix(sync): skip fresh gemini copilot before hashing

Gemini and Copilot lost their legacy DB freshness gates when the provider-authoritative path took over. That made unchanged sessions reach provider fingerprinting and parsing during normal full syncs, which is unnecessary work and no longer matches the old processGemini/processCopilot behavior.

Restore the cheap pre-fingerprint checks for those two agents: Gemini compares the stored file path size and mtime, while Copilot compares transcript size plus the workspace.yaml effective mtime. Force-parse paths still flow through the provider so sidecar-driven reparses and parse-diff are not suppressed.

Validation: go test -tags "fts5" ./internal/sync -run 'TestProcessFileProviderAuthoritativeSkipsFresh(Gemini|Copilot)BeforeFingerprint|TestProcessCodexAppendedStaleProject(DoesFullReparse|CarriesForceReplace)' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go vet ./...; git diff --check
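The pre-fingerprint gates can be sketched as follows. The type and function names are hypothetical, and defining the Copilot effective mtime as the later of the transcript and workspace.yaml mtimes is an assumption of this sketch, not something the PR states.

```go
package main

import "fmt"

// storedFile is the size and mtime recorded for a session in the database.
type storedFile struct {
	Size    int64
	MtimeNS int64
}

// geminiFresh compares the stored size and mtime with the file on disk.
func geminiFresh(stored storedFile, size, mtimeNS int64) bool {
	return stored.Size == size && stored.MtimeNS == mtimeNS
}

// effectiveMtime returns the later of the two mtimes (assumed definition).
func effectiveMtime(transcriptNS, workspaceNS int64) int64 {
	if workspaceNS > transcriptNS {
		return workspaceNS
	}
	return transcriptNS
}

// copilotFresh compares transcript size plus the effective mtime.
func copilotFresh(stored storedFile, size, transcriptNS, workspaceNS int64) bool {
	return stored.Size == size && stored.MtimeNS == effectiveMtime(transcriptNS, workspaceNS)
}

// shouldParse lets force-parse requests bypass the cheap freshness gate.
func shouldParse(fresh, force bool) bool {
	return force || !fresh
}

func main() {
	stored := storedFile{Size: 512, MtimeNS: 2000}
	fresh := copilotFresh(stored, 512, 1000, 2000)
	fmt.Println("fresh:", fresh, "parse:", shouldParse(fresh, false))
	fmt.Println("forced parse:", shouldParse(fresh, true))
}
```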

fix(sync): restore discover fields on shadowCallerProvider

The rebase onto origin/main dropped the discoverSources and discoverErr
fields from the shadowCallerProvider test struct while keeping the Discover
method that reads them, leaving this branch and every branch stacked above
it uncompilable. Restore the two fields so the Discover stub resolves.

VS Code Copilot and Visual Studio Copilot both needed concrete providers because their source identity is richer than a plain parser callback. VS Code needs workspace and global chat discovery with .jsonl preference, while Visual Studio needs virtual per-conversation trace sources with sibling-aware freshness.

The providers preserve raw and full ID lookup, watch classification, source hashing, VS Code project hints, Visual Studio physical trace fan-out, strict composite trace fingerprints, force-replace parse semantics, and parser output normalization.

fix(parser): classify copilot ide source changes

The Copilot IDE providers advertised changed-path classification, but the initial migration only accepted source paths that still existed. That dropped deletion and metadata-only events before the sync layer could make a refresh or removal decision.

Classify syntactically valid removed VS Code chat files and Visual Studio trace files, fan workspace.json changes out to current workspace chat sessions, and cover Visual Studio physical trace fan-out with multiple conversations.

fix(parser): include vscode workspace metadata freshness

VS Code Copilot project names come from workspace.json, so classifying manifest writes is not enough if the source fingerprint still only reflects the chat transcript. An unchanged chat file could skip the parse that refreshes Session.Project.

Fold workspace.json size, mtime, and content hash into workspace chat fingerprints while leaving global chat fingerprints unchanged, and cover metadata-only freshness in the provider tests.

fix(sync): refresh vscode copilot workspace metadata

VS Code Copilot was provider-aware for workspace.json freshness, but this stack still runs legacy sync writes. Without mirroring that freshness in the legacy process path, metadata-only workspace renames could be classified but then skipped against the unchanged chat transcript.

Move the Copilot IDE providers into shadow compare on their migration branch, preserve .jsonl priority during provider changed-path classification, and store composite workspace freshness for VS Code Copilot sessions while both shapes run.

Validation: go test -tags "fts5" ./internal/sync -run 'TestSyncPathsVSCodeCopilot(JSONLPriority|WorkspaceMetadataRefreshesProject)' -count=1; go test -tags "fts5" ./internal/parser -run 'Test(VSCodeCopilotProvider|VisualStudioCopilotProvider|ProviderMigrationModes)' -count=1; go test -tags "fts5" ./internal/sync -count=1; go test -tags "fts5" ./internal/parser -count=1; go vet ./...; git diff --check

test(sync): compare copilot ide shadow parity

VS Code Copilot and Visual Studio Copilot are already opted into shadow comparison on this branch, but provider method tests alone do not prove the migration path still matches the legacy parser output consumed by sync.

Cover the workspace-backed VS Code JSONL source and Visual Studio virtual trace source through ObserveProviderSource so reviewers can see provider observation, data-version planning, and legacy parser parity in one place.

Validation: go test -tags "fts5" ./internal/parser ./internal/sync -run 'TestObserveProviderSourceMatches(VSCodeCopilot|VisualStudioCopilot)LegacyParser|TestCopilotIDEProvider|Test(VSCodeCopilotProvider|VisualStudioCopilotProvider)' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go fmt ./...; go vet ./...; ./custom-gcl run --config .golangci.nilaway.yml ./internal/parser/... ./internal/sync/...; git diff --check

refactor(parser): fold copilot IDE providers

Move VSCode Copilot and Visual Studio Copilot source discovery, lookup, and
parse ownership onto their concrete providers and delete the seven legacy
package-level free functions: DiscoverVSCodeCopilotSessions,
FindVSCodeCopilotSourceFile, ParseVSCodeCopilotSession,
DiscoverVisualStudioCopilotSessions, FindVisualStudioCopilotSourceFile,
ParseVisualStudioCopilotConversation, and ParseVisualStudioCopilotVirtualPath.

VSCode Copilot: discoverSessionFiles and findSourceFile become source-set
helpers, parseSession becomes a provider method, and the shared
discoverVSCodeSessionFiles helper stays in discovery.go.

Visual Studio Copilot: discoverSessionFiles and findSourceFile become
source-set helpers (over the retained findVisualStudioCopilotTraceSourceFile
and discoverVisualStudioCopilotSessionFiles helpers), and parseConversation
becomes a provider method. The virtual-path resolution is reproduced on the
provider via the provider-neutral ParseVirtualSourcePath helper plus the
trace-file and conversation-ID predicates (splitVisualStudioCopilotVirtualPath),
replacing the deleted ParseVisualStudioCopilotVirtualPath. External callers
(session export, direct service, parsediff, engine skip-path checks) use the
new exported SplitVisualStudioCopilotVirtualPath, which wraps the same neutral
splitter. The provider's discovery now surfaces an unreadable physical trace
file as a source so the read failure is reported instead of being dropped.

Make both providers provider-authoritative and drop their legacy sync dispatch:
the classifyOnePath VSCode block, classifyVisualStudioCopilotPath and its call,
the processFile case arms, processVSCodeCopilot and its vscodeCopilot* helpers,
processVisualStudioCopilot, the vscodeJSONLSiblingExists helper, and the
now-dead legacy-preamble references to these agents.

Drop the AgentDef DiscoverFunc/FindSourceFunc hooks for both, remove both
provider files from the pending shim scan list, and replace the shadow-baseline
test with provider API coverage plus a guard asserting the legacy entrypoints
stay gone. Re-home the shared writeProviderShadowSourceFile test helper into
provider_shadow_test.go so the sync test package builds.

fix(parser): preserve copilot provider metadata

Provider-authoritative Copilot sync consumes ParseResult side channels, not only fields stored on ParsedSession. VS Code Copilot was parsing aggregate token usage but returning an empty ParseResult.UsageEvents slice, so a provider resync could erase usage rows.

Visual Studio Copilot single-session resyncs carry the stored project through Source.ProjectHint. Honoring that hint prevents the provider default from overwriting preserved project metadata, while VS Code now also carries the composite fingerprint size and mtime alongside the hash.

Validation: go test -tags "fts5" ./internal/parser -run 'Test(VSCodeCopilotProviderSourceMethods|VisualStudioCopilotProviderSourceMethods)' -count=1; go test -tags "fts5" ./internal/sync -run 'TestSyncPathsVSCodeCopilotPersistsUsageEvents|TestSyncSingleSessionContextVisualStudioCopilotPreservesProject' -count=1; go test -tags "fts5" ./internal/parser -run 'Test.*Copilot.*Provider|TestParseVSCodeCopilotSession_TokenUsage|TestParseVisualStudioCopilot' -count=1; go test -tags "fts5" ./internal/sync -run 'Test.*(VSCodeCopilot|VisualStudioCopilot).*' -count=1; go vet ./...; git diff --check

test(parser): guard visual studio copilot session fold

The Copilot IDE fold deleted ParseVisualStudioCopilotSession along with the other Visual Studio Copilot legacy entrypoints, but the regression guard did not name that symbol. Adding it prevents a future shim from reappearing unnoticed.

Validation: go test -tags "fts5" ./internal/parser -run 'TestCopilotIDEProvidersOwnLegacyEntrypoints|Test(VSCodeCopilotProviderSourceMethods|VisualStudioCopilotProviderSourceMethods)' -count=1; git diff --check

Fold Positron onto a concrete provider-authoritative implementation and
delete the duplicated legacy parser path so there is a single source of
truth for its workspaceStorage-only layout and parse behavior. Discovery,
source lookup, and parse move onto the provider; the package-level
DiscoverPositronSessions, FindPositronSourceFile, and ParsePositronSession
free functions are removed and positron.go is deleted. The engine's
positron-specific dispatch, effective-mtime, and skip-cache blocks are
removed in favor of the provider Fingerprint, which folds workspace.json
size, mtime, and a chat+workspace composite hash into the source
fingerprint so a workspace-only project rename still re-syncs.

To keep that composite freshness once positron has no legacy mtime block,
the SyncAllSince mtime filter resolves provider-authoritative sources
through the provider Fingerprint (discoveredFileEffectiveMtime) instead of
the legacy per-agent mtime path. Codex is excluded from that path: its
Fingerprint folds the shared session_index.jsonl mtime into every session,
which is correct for the skip cache but defeats the per-copy mtime
discrimination the incremental-sync cutoff needs to preserve a changed
archived duplicate, so codex keeps its raw per-file mtime and the index
refresh stays handled separately by codexIndexRefresh. The OpenCode
incremental-sync test asserts the resulting composite freshness, where a
part-only edit advances the source mtime past the cutoff and re-syncs.

Codex does not advertise incremental append, so re-syncing an appended
transcript is a full re-parse that stores the raw file size and hash,
including the ignored partial trailing line. The parsed-snapshot versus
partial-tail distinction is enforced at parse-diff time via
CodexTranscriptConsumedSize, not in the stored fingerprint. Align the
regression with the provider-folded behavior.
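
The parsed-prefix versus partial-tail distinction can be illustrated with a small sketch. It is not the body of CodexTranscriptConsumedSize: `consumedSize` is a hypothetical helper that assumes the consumed prefix of a JSONL transcript ends at the last newline.

```go
package main

import (
	"bytes"
	"fmt"
)

// consumedSize returns how many bytes of a JSONL transcript belong to
// complete, newline-terminated lines. A partial trailing line is excluded.
func consumedSize(data []byte) int {
	// LastIndexByte returns -1 when no newline exists, which yields 0.
	return bytes.LastIndexByte(data, '\n') + 1
}

func main() {
	transcript := []byte("{\"turn\":1}\n{\"turn\":2}\n{\"tur")
	fmt.Println("raw size:", len(transcript))
	fmt.Println("consumed size:", consumedSize(transcript))
}
```

Under this reading, the stored fingerprint records the raw size while a parse-diff comparison uses the consumed size, so a half-written last line does not register as a parser discrepancy.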