
refactor(parser): require explicit provider factories + migration cleanup#885

Open
mariusvniekerk wants to merge 7 commits into fam/sqlite-containers from fam/explicit-registry

mariusvniekerk (Collaborator) commented Jun 26, 2026

Closes out the migration. Replaces the legacy default factory with an explicit per-agent factory registry that panics on a missing factory, and routes the import-only ChatGPT and Claude.ai export providers through dedicated export interfaces.

Removes the migration scaffolding now that every provider is folded: the anti-shim guard test and the dead DiscoverFunc/FindSourceFunc registry hooks. Also includes the source-pinning and lookup hardening tail.

Where to look: the explicit providerFactoryForDef switch and the import-only provider.

Make the provider registry force every agent onto an explicit facade path
instead of silently inheriting a legacy fallback factory. Remove the
legacy provider fallback entirely so an unhandled AgentDef is a loud
construction error, and represent the non-filesystem export parsers
(Claude.ai, ChatGPT) with explicit import-only providers. Mark the
concrete providers authoritative in the migration manifest and drop the
legacy-only marker.
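
The explicit-switch shape described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `AgentDef` and `ProviderFactory` are simplified stand-ins, the agent names in the switch are arbitrary, and `tryFactory` is a hypothetical helper that exists only to make the panic observable.

```go
package main

import "fmt"

// AgentType and AgentDef are simplified stand-ins for the real parser types.
type AgentType string

type AgentDef struct {
	Type AgentType
}

// ProviderFactory is a hypothetical factory shape that returns a provider label.
type ProviderFactory func() string

// providerFactoryForDef maps every known agent onto an explicit factory.
// There is no default fallback: an unhandled AgentDef panics at construction.
func providerFactoryForDef(def AgentDef) ProviderFactory {
	switch def.Type {
	case "codex":
		return func() string { return "codex provider" }
	case "chatgpt":
		return func() string { return "chatgpt import-only provider" }
	default:
		panic(fmt.Sprintf("parser: no provider factory for agent %q", def.Type))
	}
}

// tryFactory (hypothetical) converts the panic into an error for demonstration.
func tryFactory(def AgentDef) (f ProviderFactory, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%v", r)
		}
	}()
	return providerFactoryForDef(def), nil
}

func main() {
	f, _ := tryFactory(AgentDef{Type: "codex"})
	fmt.Println(f())

	_, err := tryFactory(AgentDef{Type: "unknown"})
	fmt.Println(err)
}
```

The point of the design is that adding an agent without a factory fails at startup rather than silently inheriting legacy behavior.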

Route FindSourceFile and SourceMtime through provider FindSource and
Fingerprint so the stack tip exercises provider-owned source identities
and composite mtimes rather than parallel legacy dispatch.

Retire the test scaffolding that depended on the removed legacy types:
per-provider tests assert concrete construction, the obsolete
legacy-fallback and legacy-only-mode registry tests are dropped, and the
zero-legacy anti-shim gate runs with an empty pending list.

With codex's legacy ShallowWatchRootsFunc removed in favor of the provider
WatchPlan, fix collectProviderWatchRoots so a WatchPlan root that does not
exist yet but lives under an already-watched ancestor no longer marks the
whole directory for unwatched polling. A not-yet-created per-session
recursive root otherwise regressed parity by polling two codex dirs that
share a watched state-directory root; the ancestor watch observes the
target's creation and a later sync establishes the deeper watch.

refactor(parser): fold export parsers onto the import-only providers

ChatGPT and Claude.ai sessions never come from disk discovery; they enter
the archive only through a one-shot import. Move the ParseChatGPTExport and
ParseClaudeAIExport free functions onto the import-only provider as the
ChatGPTExportParser and ClaudeAIExportParser methods, and route the importer
and tests through NewProvider plus a type assertion.

This removes the last provider-specific legacy parse free functions, so the
parser package now owns every agent's parse behavior on provider receivers
rather than on standalone entrypoints.

refactor(sync): remove the dead provider shadow-compare harness

No provider runs in shadow-compare mode: every agent is now either
provider-authoritative or import-only. The shadow harness that dual-ran a
side-effect-free provider parse against the legacy result and recorded the
diff was therefore never invoked at runtime.

Delete the harness end to end: the ObserveProviderSource entry point and its
comparison machinery, the Engine.observeProviderShadow hook and its two call
sites, the ProviderShadowRecorder config/field wiring, and the
ProviderMigrationShadowCompare mode (collapsing every switch that paired it
with provider-authoritative). The provider outcome validation and effect
planning helpers that the live parse path still relies on move to
provider_effects.go, which is all that file ever held that was reachable.

test(parser): drop the facade-migration anti-shim scaffolding

With every provider folded onto receiver methods and zero provider-specific
legacy parse free functions left, the migration's enforcement tests have
served their purpose. They assert the absence of named functions and that
provider files do not shim legacy entrypoints, which is only meaningful
while the stack is mid-migration; after merge they are pure maintenance drag
that breaks whenever a symbol is legitimately renamed.

Delete the per-provider Test*ProviderOwnsLegacyEntrypoints guards and the
shared anti-shim scan (provider_shim_scan_test.go). The providers' behavioral
tests remain and are what actually protect the parse paths going forward.

feat(parser): migrate aider, omp, reasonix to providers

origin/main carries three agents the facade stack never migrated: Aider,
OhMyPi, and Reasonix. After rebasing onto it, the explicit provider registry
panicked on startup because those agents had no concrete factory, and the
migration manifest still listed them as legacy-only against a manifest that no
longer defines that mode.

Give each a concrete provider so the zero-legacy registry stays intact:

- OMP shares Pi's JSONL session format, so the Pi provider is parameterized by
  AgentDef (type and ID prefix) and serves both Pi and OhMyPi; ParseOMPSession
  is folded away.
- Reasonix gets a single-file provider whose composite fingerprint folds the
  .jsonl.meta sidecar (mirroring reasonixEffectiveInfo) and whose changed-path
  classifier reproduces the project/global/archive/subagent layouts and the
  sidecar-to-transcript mapping.
- Aider gets a multi-session provider that fans one history file out into one
  session per run under "<history>#<idx>" virtual paths and force-replaces on
  parse, mirroring the Shelley shape. Per-run skip is handled by
  dropUnchangedSharedSQLiteResults (content-hash compare); remote-sync identity
  stability is preserved by threading the path rewriter through ProviderConfig
  so per-run IDs stay stable across temp extraction dirs.
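
As a rough illustration of the `<history>#<idx>` virtual-path shape used for Aider runs, the sketch below builds and splits such paths. The function names and the example history path are hypothetical; the real provider may encode or validate these differently.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// aiderVirtualPath (hypothetical name) joins a history file and a run index.
func aiderVirtualPath(history string, idx int) string {
	return history + "#" + strconv.Itoa(idx)
}

// splitAiderVirtualPath (hypothetical name) recovers the physical history file
// and run index. It splits on the last '#' so a '#' in the directory name
// does not confuse the parse.
func splitAiderVirtualPath(p string) (history string, idx int, ok bool) {
	i := strings.LastIndex(p, "#")
	if i < 0 {
		return "", 0, false
	}
	n, err := strconv.Atoi(p[i+1:])
	if err != nil || n < 0 {
		return "", 0, false
	}
	return p[:i], n, true
}

func main() {
	vp := aiderVirtualPath("/repo/.aider.chat.history.md", 2)
	fmt.Println(vp)

	history, idx, ok := splitAiderVirtualPath(vp)
	fmt.Println(history, idx, ok)
}
```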

The three manifest entries flip to provider-authoritative and the legacy engine
methods (processAider, processReasonix, aiderFileUnchanged, aiderIdentityPath,
classifyAiderPath) plus the now-dead legacy processFile fall-through are
removed.

The two codex append regression tests that were re-pointed onto processFile no
longer consume their re-stat result; drop the unused assignment to satisfy
staticcheck.

test(sync): restore provider runtime regressions

The shadow-compare harness removal also deleted coverage for live provider-authoritative runtime behavior. Restore those checks against the final processFile and SyncSingleSession paths with a small fake provider so future migrations cannot drop source lookup, retry data-version, skip-cache, skip-reason, not-found, or force-parse behavior silently.

Also assert the checked-in migration manifest only contains final provider modes and remove stale shadow-compare wording from live sync comments.

Validation: go test -tags "fts5" ./internal/sync -run 'TestProcessFileProvider|TestSyncSingleSessionProviderAuthoritativeBypassesProviderSkipCache' -count=1; go test -tags "fts5" ./internal/parser -run 'TestProviderMigrationModes' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go fmt ./...; go vet ./...; git diff --check

fix(sync): make provider source lookup authoritative

Stored file_path values can be stale after provider migration, especially for virtual DB-backed sources and remote-canonical Aider histories. Treat them as lookup hints instead of source-of-truth paths so provider-owned identity and freshness decide what can be resynced.

Aider now resolves remote canonical physical and virtual hints using the same path-rewriter identity model as parse, Hermes verifies state.db contains the requested raw ID before claiming it, and DB-backed providers fall through from stale hints to raw-ID lookup while fresh deleted rows remain not found.
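
The "hint, not source of truth" rule can be sketched as below. This is a deliberately reduced model under stated assumptions: the `provider` type, its in-memory map, and `findSource` are hypothetical, and the real lookup tiers (StoredFilePath, FingerprintKey, RawSessionID) carry more state than shown.

```go
package main

import "fmt"

// provider is a hypothetical stand-in that owns the mapping from raw session
// IDs to their current source paths.
type provider struct {
	sources map[string]string // raw session ID -> current source path
}

// findSource treats storedPath as a lookup hint only. The provider's own
// view decides whether the session exists and where it lives now.
func (p provider) findSource(storedPath, rawID string) (string, bool) {
	current, ok := p.sources[rawID]
	if !ok {
		return "", false // provider does not know the session: not found
	}
	if storedPath == current {
		return storedPath, true // hint confirmed by the provider
	}
	return current, true // stale hint: fall through to the raw-ID result
}

func main() {
	p := provider{sources: map[string]string{"s1": "/data/state.db#s1"}}

	fmt.Println(p.findSource("/old/state.db#s1", "s1")) // stale hint
	fmt.Println(p.findSource("/old/state.db#s2", "s2")) // deleted session
}
```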

Validation: go test -tags "fts5" ./internal/parser ./internal/sync -run 'TestDBBackedProviderStoredVirtualPathFreshness|TestDBBackedProviderRejectsInvalidStoredVirtualPaths|TestFindSourceFileProviderAuthoritativePrefersProviderOverStoredPath|TestSyncForgeMissingConversation' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go fmt ./...; go vet ./...; git diff --check

fix(watch): preserve provider watch root semantics

Provider-authoritative agents must drive live watcher setup from their WatchPlan, otherwise legacy flags such as Cowork's ShallowWatch can silently narrow recursive coverage. Keep the provider root shape when collecting watcher roots so recursive provider roots stay recursive.

Missing provider roots are only treated as covered when an existing recursive watch covers the subtree or a shallow root can observe direct creation of that missing root. This preserves Codex's shallow parent behavior without letting shallow ancestors hide deeper missing recursive roots from polling.
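
The coverage rule for missing roots can be sketched as follows. This is an illustrative approximation, not the PR's collectProviderWatchRoots: `watchRoot`, `isStrictlyUnder`, and `missingRootCovered` are hypothetical names, and the example paths are made up.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// watchRoot is a simplified stand-in for a root the watcher already observes.
type watchRoot struct {
	Path      string
	Recursive bool
}

// isStrictlyUnder reports whether p is a descendant of root (not root itself).
func isStrictlyUnder(root, p string) bool {
	rel, err := filepath.Rel(root, p)
	if err != nil || rel == "." || rel == ".." {
		return false
	}
	return !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// missingRootCovered reports whether a not-yet-created root needs no polling:
// either a recursive watch covers its subtree, or a shallow watch sits on its
// direct parent and can therefore observe the root being created.
func missingRootCovered(missing string, watched []watchRoot) bool {
	missing = filepath.Clean(missing)
	for _, w := range watched {
		root := filepath.Clean(w.Path)
		if w.Recursive {
			if isStrictlyUnder(root, missing) {
				return true
			}
			continue
		}
		if filepath.Dir(missing) == root {
			return true
		}
	}
	return false
}

func main() {
	shallow := []watchRoot{{Path: "/home/u/.codex", Recursive: false}}
	fmt.Println(missingRootCovered("/home/u/.codex/sessions", shallow))
	fmt.Println(missingRootCovered("/home/u/.codex/sessions/2026/06", shallow))

	recursive := []watchRoot{{Path: "/home/u/.codex", Recursive: true}}
	fmt.Println(missingRootCovered("/home/u/.codex/sessions/2026/06", recursive))
}
```

The second call is the case the commit guards against: a shallow ancestor cannot see a deeper directory appear, so that root still needs polling.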

Validation: go test -tags "fts5" ./cmd/agentsview -run 'TestCollectWatchRoots(UsesCoworkProviderRecursiveRoot|UsesGeminiProviderMetadataRoot|UsesAntigravityCLIHistoryRoot|PreservesDirsSharingWatchRoot|HermesSessionsWatchesStateDBParent)|TestMissingWatchRootCoverageDoesNotTreatShallowAncestorAsRecursive' -count=1; go test -tags "fts5" ./cmd/agentsview ./internal/parser -run 'TestCollectWatchRoots|TestMissingWatchRootCoverage|TestCoworkProviderSourceMethods|TestGeminiProvider|TestAntigravityCLI|TestSQLiteFanoutSourceSetUsesFindDBPathAsCanonical' -count=1; go test -tags "fts5" ./cmd/agentsview ./internal/parser ./internal/sync -count=1; go fmt ./...; go vet ./...; git diff --check

test(provider): cover non-sync provider callers

Provider migration removed legacy discovery and source lookup hooks, so non-sync callers need explicit coverage that they continue to use provider capabilities. Add contracts for parse-diff supported-agent resolution, token-use raw disk probing, and SSH remote directory resolution across the provider-authoritative agents called out by review.

The Cursor token-use case uses a real provider-owned source layout to prove an unsynced raw or canonical session ID resolves through provider FindSource. Comments now describe the shared disk source lookup instead of implying FindSourceFunc is still the only path.

Validation: go test -tags "fts5" ./cmd/agentsview ./internal/sync ./internal/ssh -run 'TestParseDiffSupportedAgentsIncludesProviderAuthoritativeAgents|TestParseDiffProviderAuthoritativeAgentsAreDiscoverable|TestParseDiffDiscoversProviderSources|TestResolveSessionID_ProviderAuthoritativeCursorOnDiskNotInDB|TestAgentHasDiskSourceLookupIncludesProviderAuthoritativeAgents|TestBuildResolveScript' -count=1; go test -tags "fts5" ./cmd/agentsview ./internal/sync ./internal/ssh -count=1; go fmt ./...; go vet ./...; git diff --check

docs(provider): spell out source identity contract

Provider implementers need the source-hint and freshness rules at the API boundary, not only embedded in migration review context. Document that stored file paths and fingerprint keys are advisory, provider lookup is authoritative, and persisted source identity must stay compatible with source metadata, skip-cache, data-version, PostgreSQL, and session metadata consumers.

Also update the facade design note so the current stack tip no longer claims shadow-compare mode still exists.

Validation: go test -tags "fts5" ./internal/parser -run 'TestProvider|TestProviderMigrationModes' -count=1; go fmt ./...; go vet ./...; git diff --check; mdformat applied by commit hook

test(provider): enforce anti-shim ownership policy

The migration should not rely on per-agent AgentDef hooks or exported provider-specific facade functions once a provider is authoritative. Add one maintained package-wide guard for that policy and remove the obsolete Kiro-only scan.

Aider and Reasonix were still wired through legacy DiscoverFunc/FindSourceFunc despite having provider-authoritative implementations. Remove those hook assignments and make their provider-specific parser/discovery/lookup helpers package-local so runtime callers go through the provider registry.

Validation: go test -tags "fts5" ./internal/parser -run 'TestProviderAuthoritativeAgentDefsDoNotExposeLegacyHooks|TestNoExportedProviderFacadeShims|TestNoProviderFacadeShimNamePolicyDocumentsAllowedHelpers|TestProviderAntiShimScanReadsExpectedPackage|TestAider|TestReasonix|TestProviderMigrationModes' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -run 'Aider|Reasonix|TestProviderAuthoritativeAgentDefsDoNotExposeLegacyHooks|TestNoExportedProviderFacadeShims|TestProviderMigrationModes' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go fmt ./...; go vet ./...; git diff --check; mdformat applied by commit hook

fix(parser): make import export capabilities agent-specific

ChatGPT and Claude.ai imports should advertise only the export parser surface that belongs to their agent. The shared import-only provider type made both interface assertions succeed, which hid capability loss during migration review and let callers treat either provider as supporting the other export format.
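
A minimal sketch of why a shared type defeats capability checks, and how per-agent types fix it. The interface and method names below are assumptions chosen for illustration; only the general idea (callers probe with a type assertion) comes from the commit message.

```go
package main

import "fmt"

// Hypothetical export-parser capability interfaces, one per agent.
type ChatGPTExportParser interface {
	ParseChatGPTExport(path string) error
}

type ClaudeAIExportParser interface {
	ParseClaudeAIExport(path string) error
}

// Each import-only provider implements only its own agent's interface.
type chatGPTImportProvider struct{}

func (chatGPTImportProvider) ParseChatGPTExport(string) error { return nil }

type claudeAIImportProvider struct{}

func (claudeAIImportProvider) ParseClaudeAIExport(string) error { return nil }

// exportCapabilities probes a provider the way a caller would: by asserting
// the capability interface instead of checking the agent name.
func exportCapabilities(p any) (chatgpt, claude bool) {
	_, chatgpt = p.(ChatGPTExportParser)
	_, claude = p.(ClaudeAIExportParser)
	return chatgpt, claude
}

func main() {
	fmt.Println(exportCapabilities(chatGPTImportProvider{}))
	fmt.Println(exportCapabilities(claudeAIImportProvider{}))
}
```

With a single shared provider type carrying both methods, both assertions would succeed for either agent, which is the capability leak the fix removes.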

Validation: go test -tags "fts5" ./internal/parser -run 'TestImportOnlyProviderExportCapabilitiesAreAgentSpecific|TestParseChatGPTExport|TestParseClaudeAIExport' -count=1; go test -tags "fts5" ./internal/parser -count=1; go fmt ./...; go vet ./...; git diff --check

fix(parser): hash reasonix metadata sidecars

Reasonix metadata-only edits can change session fields without touching the transcript bytes. Provider fingerprinting already folded sidecar size and mtime, but the hash stayed transcript-only, so stored freshness state could miss metadata-only changes under hash-based comparisons.

Keep missing-sidecar hashes compatible with the existing transcript-only value, and add provider coverage for layout classification plus deleted sidecar/transcript behavior so the migration contract is explicit.

Validation: go test -tags "fts5" ./internal/parser -run 'TestReasonixProvider(Fingerprint|ChangedPath)' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -run 'Reasonix' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go fmt ./...; go vet ./...; git diff --check

fix(sync): keep aider off mtime-only skip cache

Aider histories fan out one physical file into multiple virtual run rows. Letting the generic skip cache key the physical history by mtime alone can bypass the provider fingerprint and per-run DB checks, hiding same-mtime content changes, missing run rows, stale hashes, or stale data versions.

Disable generic skip caching for Aider and rely on the provider parse plus dropUnchangedSharedSQLiteResults hash/data-version filtering. This preserves correctness at the cost of reparsing the shared history before dropping unchanged runs.
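
The per-run filtering step can be pictured as below. This is a sketch of the idea only: `runResult` and `dropUnchangedRuns` are hypothetical, and the real dropUnchangedSharedSQLiteResults also consults data versions, which this model leaves out.

```go
package main

import "fmt"

// runResult is a simplified stand-in for one parsed Aider run.
type runResult struct {
	VirtualPath string
	ContentHash string
}

// dropUnchangedRuns keeps only runs whose stored hash is missing or differs.
// The whole history file is reparsed first; unchanged runs are dropped after.
func dropUnchangedRuns(parsed []runResult, stored map[string]string) []runResult {
	var changed []runResult
	for _, r := range parsed {
		if h, ok := stored[r.VirtualPath]; ok && h == r.ContentHash {
			continue // same content as the stored row: nothing to write
		}
		changed = append(changed, r)
	}
	return changed
}

func main() {
	parsed := []runResult{
		{VirtualPath: "history.md#0", ContentHash: "aaa"},
		{VirtualPath: "history.md#1", ContentHash: "bbb-new"},
		{VirtualPath: "history.md#2", ContentHash: "ccc"},
	}
	stored := map[string]string{
		"history.md#0": "aaa",     // unchanged
		"history.md#1": "bbb-old", // stale hash
		// run #2 has no stored row
	}
	for _, r := range dropUnchangedRuns(parsed, stored) {
		fmt.Println(r.VirtualPath)
	}
}
```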

Validation: go test -tags "fts5" ./internal/sync -run 'TestProcessFileAiderProvider' -count=1; go test -tags "fts5" ./internal/parser -run 'TestAiderProviderFindSourceUsesCanonicalIdentity|TestAiderProviderRemoteIdentityStable' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -run 'Aider' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go fmt ./...; go vet ./...; git diff --check

refactor(parser): extract multi-session container base

Shelley and Aider both surface one physical container (a SQLite DB / a chat history file) as many virtual member sessions, and each hand-rolled the same source-set scaffolding: discovery, watch plan, changed-path classification, the StoredFilePath/FingerprintKey/RawSessionID FindSource tiers, fingerprinting, and the container/member parse fan-out.

Introduce multiSessionContainerSourceSet, a reusable source set, provider, and factory configured entirely through functional options (with*()), so a new special case is a new option rather than a wider signature or a new interface method. Aider's remote-sync identity (PathRewriter) and its canonical-path stored fallback, and Shelley's member-presence check, are all expressed as options.
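
The functional-options pattern named above looks roughly like this. The struct fields and option names are hypothetical simplifications; only the `with*()` convention and the idea of expressing a path rewriter and a member-presence check as options come from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// containerSourceSet is a reduced stand-in for the reusable base.
type containerSourceSet struct {
	name          string
	rewritePath   func(string) string
	memberPresent func(id string) bool
}

// containerOption mutates the source set during construction.
type containerOption func(*containerSourceSet)

// withPathRewriter (hypothetical) overrides how paths map to stable identities.
func withPathRewriter(f func(string) string) containerOption {
	return func(s *containerSourceSet) { s.rewritePath = f }
}

// withMemberPresence (hypothetical) overrides the member existence check.
func withMemberPresence(f func(string) bool) containerOption {
	return func(s *containerSourceSet) { s.memberPresent = f }
}

// newContainerSourceSet applies defaults first, then each option in order,
// so a provider only states what differs from the defaults.
func newContainerSourceSet(name string, opts ...containerOption) *containerSourceSet {
	s := &containerSourceSet{
		name:          name,
		rewritePath:   func(p string) string { return p },
		memberPresent: func(string) bool { return true },
	}
	for _, opt := range opts {
		opt(s)
	}
	return s
}

func main() {
	plain := newContainerSourceSet("shelley")
	fmt.Println(plain.rewritePath("/tmp/x/history.md"))

	remote := newContainerSourceSet("aider", withPathRewriter(func(p string) string {
		return strings.Replace(p, "/tmp/x", "/remote/home", 1)
	}))
	fmt.Println(remote.rewritePath("/tmp/x/history.md"))
}
```

A new special case becomes one more option function rather than another constructor parameter or interface method.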

Fold shelley_provider.go and aider_provider.go onto the base. Per-provider code drops ~45% each (shelley 532->289, aider 501->274); the 469-line base is a one-time cost the remaining container-family providers (zed, kiro, opencode, db-backed, copilot) can reuse.

refactor(parser): add generic SourceSet provider plumbing

The multi-session container base shipped with its own ProviderFactory and a delegating Provider that forwarded six methods to the source set. Every future reusable base (single-file sidecar, and the rest) would re-hand-roll that same factory + forwarding shell.

Extract it once: a SourceSet interface (the Provider source methods plus Parse, minus the Definition/Capabilities/config plumbing), a sourceSetProvider wrapper that supplies ProviderBase and applies the two normalizations every provider shares (raw-session-ID injection on FindSource, the request/config machine fallback on Parse), and a generic sourceSetFactory built from def + caps + a per-config SourceSet constructor.

multiSessionContainerSourceSet now implements SourceSet directly (gaining a Parse method); newMultiSessionProviderFactory becomes a thin adapter over newSourceSetFactory. ParseIncremental stays on the ProviderBase unsupported default until a base needs it.
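
The wrapper's two shared normalizations can be sketched like this. The types are heavily reduced and the method signatures are assumptions; the real SourceSet carries more source methods and a richer Parse result.

```go
package main

import "fmt"

// SourceRef and ParseRequest are reduced stand-ins for the real types.
type SourceRef struct {
	Path         string
	RawSessionID string
}

type ParseRequest struct {
	Source  SourceRef
	Machine string
}

// SourceSet is a reduced form of the interface: lookup plus parse.
type SourceSet interface {
	FindSource(rawID string) (SourceRef, bool)
	Parse(req ParseRequest) (string, error)
}

// sourceSetProvider wraps any SourceSet and applies the shared normalizations.
type sourceSetProvider struct {
	set           SourceSet
	configMachine string
}

// FindSource injects the requested raw session ID when the set left it empty.
func (p sourceSetProvider) FindSource(rawID string) (SourceRef, bool) {
	ref, ok := p.set.FindSource(rawID)
	if ok && ref.RawSessionID == "" {
		ref.RawSessionID = rawID
	}
	return ref, ok
}

// Parse falls back to the configured machine when the request names none.
func (p sourceSetProvider) Parse(req ParseRequest) (string, error) {
	if req.Machine == "" {
		req.Machine = p.configMachine
	}
	return p.set.Parse(req)
}

// fakeSet is a toy SourceSet used only for this demonstration.
type fakeSet struct{}

func (fakeSet) FindSource(rawID string) (SourceRef, bool) {
	return SourceRef{Path: "/sessions/" + rawID + ".jsonl"}, true
}

func (fakeSet) Parse(req ParseRequest) (string, error) {
	return req.Machine + ":" + req.Source.Path, nil
}

func main() {
	p := sourceSetProvider{set: fakeSet{}, configMachine: "laptop"}
	ref, _ := p.FindSource("abc")
	out, _ := p.Parse(ParseRequest{Source: ref})
	fmt.Println(ref.RawSessionID, out)
}
```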

refactor(parser): add single-file source-set base, fold reasonix

Add singleFileSourceSet, the second reusable SourceSet base, for providers whose physical source is one file that parses into exactly one session (no virtual member paths, no fan-out). Like multiSessionContainerSourceSet it is configured through functional options (withFile*()) and plugs into newSourceSetFactory. The sidecar/composite fingerprint variance across this family stays inside each provider's withFileFingerprint closure, so the base carries no sidecar knowledge until a shared helper is warranted.

Fold reasonix onto it as the validation provider: discovery, the .jsonl.meta sidecar changed-path mapping, the WatchSubdirs-aware changed-path resolution, the composite fingerprint, and the project-hint + fingerprint stamping in parse all become option closures. The provider file drops from 457 to ~280 lines; behavior is preserved (full parser and sync suites green).

refactor(parser): fold zed onto multi-session container base

Zed is a SQLite-DB-per-root thread container like Shelley. Replace its hand-rolled factory/provider/source set with multiSessionContainerSourceSet option closures, preserving the zed specifics: the container path is root/threads/threads.db, the member fingerprint mtime comes from ZedSQLiteSourceMtime, and the container fan-out stamps the DB's own content hash (computed in the parseContainer closure, not the request fingerprint, so missing-fingerprint parses still hash the rows). sqliteDBCompositeMtime stays put; it is shared with shelley and kiro.

refactor(parser): fold visual studio copilot onto container base

Visual Studio Copilot surfaces conversations from shared trace files, but unlike the SQLite containers it discovers one source per conversation (deduped across traces, newest wins) plus a bare source per unreadable trace. Two base generalizations support that: withSourceDiscovery lets a provider emit member-level matches at discovery time, and multiSessionMatch now carries a ProjectHint surfaced on the SourceRef.

The multi-session parse closures now receive the full ParseRequest instead of just the machine string, mirroring the single-file base. This lets vs_copilot honor req.Source.ProjectHint, which the engine sets to the DB-preserved project on single-session re-sync so a user's project override is not reverted. shelley, zed, and aider closures are updated to the new signature (they read req.Machine).

parseConversation becomes a free function and the test helpers call it directly. vs_copilot keeps its virtual-paths-always-strict changed-path classification and stamps the shared trace hash on every fanned-out conversation. Provider file drops 454->250; full parser and sync suites green.

refactor(parser): fold cowork onto single-file base

Cowork sources are single Claude-format transcripts, but one transcript can yield several sessions (main plus subagents) and a parse drives removals via excluded session IDs. Generalize singleFileSourceSet's parse contract to return ([]ParseResult, []string excluded) instead of one *ParseResult, and add withAlwaysCompleteResultSet so a parse that only excludes sessions still reports a complete (not skipped) result set. SourcesForChangedPath now derives allowMissing from jsonlMissingPathFallbackAllowed(req) rather than hardcoding true, which cowork (and vibe) require and reasonix is indifferent to.

reasonix's parse closure is updated to the new slice signature. Cowork's hand-rolled factory/provider/source set become option closures; parseSession becomes the free function parseCoworkSession; the metadata-to-transcript changed-path mapping, composite mtime, and project-hint-from-metadata are all preserved. Provider file drops 523->352; full parser and sync suites green.

refactor(parser): fold vibe onto single-file base

Vibe sessions are single messages.jsonl transcripts with a sibling meta.json. Fold onto singleFileSourceSet: discovery, the messages.jsonl/meta.json changed-path mapping (strict vs session-dir-name fallback under allowMissing), the composite fingerprint, and the fallback-ID exclusion become option closures. The single result plus exclusions and the skip-on-no-session behavior ride the base's multi-result parse contract without withAlwaysCompleteResultSet, since vibe still skips when no session is parsed.

parseSession and parseVibeResult become the free functions parseVibeSession and parseVibeResultFile; the test helpers call them and the provider's Discover directly instead of a concrete *vibeProvider. Provider file drops 507->300; full parser and sync suites green.

refactor(parser): make JSONLSourceSet a SourceSet, fold amp

Add a ParseFile option to JSONLSourceSetOptions and a Parse method to JSONLSourceSet, so the directory-of-files source set (and DirectoryJSONLSourceSet) implements the full SourceSet interface and rides the generic sourceSetFactory. Parse mirrors the single-file base: empty results with no exclusions is a clean no-session skip; req.Machine is resolved by sourceSetProvider. Amp is folded as the template: its hand-rolled factory, provider, five forwarding methods, and Parse collapse to newSourceSetFactory plus an ampParseFile closure; parseSession becomes the free function parseAmpSession. Provider file drops 139->63 lines.

refactor(parser): fold qwen provider onto SourceSet factory

Replace the hand-rolled qwenProvider struct, factory struct, var _
Provider assertion, forwarding methods, and Parse with the generic
newSourceSetFactory plus a ParseFile option on the JSONL source set.
The ParseFile closure passes req.Source.ProjectHint as the project
hint. Convert parseSession from a *qwenProvider method to the free
function parseQwenSession and update the test helper accordingly.

refactor(parser): fold workbuddy onto SourceSet convergence

Replace the hand-rolled workBuddyProvider struct, factory, and forwarding
methods with newSourceSetFactory plus a ParseFile option, matching the amp
provider. Convert parseSession from a method to the free function
parseWorkBuddySession.

Add an optional LookupIDValid predicate to JSONLSourceSetOptions so the
generic FindSource fallback accepts WorkBuddy composite subagent IDs
(<id>:subagent:<id>), which IsValidSessionID rejects. The option defaults to
IsValidSessionID, preserving behavior for all other source-set providers.
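
An optional predicate with a default can be sketched as below. `isValidSessionID` here is a made-up approximation of the package's IsValidSessionID (its real rules are not stated in this PR), and the options struct is reduced to the one field under discussion.

```go
package main

import (
	"fmt"
	"strings"
)

// isValidSessionID is a hypothetical approximation: letters, digits, '-', '_'.
func isValidSessionID(id string) bool {
	if id == "" {
		return false
	}
	for _, r := range id {
		ok := r == '-' || r == '_' ||
			(r >= '0' && r <= '9') || (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z')
		if !ok {
			return false
		}
	}
	return true
}

// jsonlSourceSetOptions is reduced to the optional lookup predicate.
type jsonlSourceSetOptions struct {
	LookupIDValid func(string) bool
}

// lookupIDValid applies the configured predicate, defaulting when unset.
func (o jsonlSourceSetOptions) lookupIDValid(id string) bool {
	if o.LookupIDValid == nil {
		return isValidSessionID(id)
	}
	return o.LookupIDValid(id)
}

// workBuddyLookupIDValid additionally accepts <id>:subagent:<id> composites.
func workBuddyLookupIDValid(id string) bool {
	parent, child, found := strings.Cut(id, ":subagent:")
	if !found {
		return isValidSessionID(id)
	}
	return isValidSessionID(parent) && isValidSessionID(child)
}

func main() {
	generic := jsonlSourceSetOptions{}
	workBuddy := jsonlSourceSetOptions{LookupIDValid: workBuddyLookupIDValid}

	id := "abc123:subagent:def456"
	fmt.Println(generic.lookupIDValid(id), workBuddy.lookupIDValid(id))
}
```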

refactor(parser): fold deepseek_tui onto SourceSet convergence

refactor(parser): fold zencoder onto SourceSet convergence

refactor(parser): thread context through JSONLSourceSet ParseFile

ParseFile now receives the parse context so directory-of-files folds can do context-aware work such as git-root project resolution; the existing single-result closures ignore it. Also adds a RawSessionIDForLookup hook that normalizes a stored raw session ID before the FindSource discovery comparison, for providers whose stored IDs carry a suffix the discovered filename stem lacks.

refactor(parser): fold iflow onto SourceSet convergence

iFlow now rides the generic source-set factory via DirectoryJSONLSourceSet's ParseFile. Its multi-result parse, relationship inference, and git-aware project resolution move into the ParseFile closure using the threaded context, and the subagent-base-ID normalization its custom FindSource performed is carried by the RawSessionIDForLookup hook.

refactor(parser): fold kimi onto SourceSet convergence

Kimi rides the generic source-set factory via ParseFile. Its colon-joined raw session IDs cannot be matched by the filename-stem discovery scan, so the wire.jsonl path reconstruction (including the agents/ subagent layout) moves to a new RawSessionIDSourceFiles hook that FindSource consults before the scan.

refactor(parser): fold kiro_ide onto SourceSet convergence

Kiro IDE rides the generic source-set factory via ParseFile. Its two on-disk layouts (old <ws-hash>:<file-hash> .chat IDs and new UUID .json files under workspace-sessions/) are resolved by reconstructing candidate paths in the RawSessionIDSourceFiles hook, since the colon-joined old IDs cannot be matched by the filename-stem discovery scan.

refactor(parser): fold qwenpaw onto SourceSet convergence

QwenPaw rides the generic source-set factory via ParseFile. Its colon-joined raw IDs are reconstructed through RawSessionIDSourceFiles; a DB-recorded file_path outside the configured roots is honored via the new StoredPathFallbackRoot hook, which synthesizes the implicit <root>/<workspace>/sessions/ layout; and the wholesale-rewrite outcome is carried by the ForceReplace option.

refactor(parser): convert JSONLSourceSet to functional options

JSONLSourceSet and DirectoryJSONLSourceSet are now built with with*() option closures plus default bundles (withContentHashing, withSymlinkFollowing) instead of a struct literal, matching the multiSessionContainer and singleFile bases. Each source only states what differs from the zero-value defaults. Constructors are lowercased (newJSONLSourceSet / newDirectoryJSONLSourceSet) as they are package-internal. Behavior is unchanged; SiblingMetadataSourceSet keeps its struct-based constructor via the shared jsonlSourceSetFromOptions.

fix(parser): make aider discovery opt-in

Aider had no central session store, so the registry gave it DefaultDirs [""], which resolves to $HOME and drove an always-on bounded walk. For a passive viewer that is a poor default: background refreshes (usage reports, desktop launches) enumerate $HOME and trigger macOS privacy prompts for Documents/Downloads/Music/Photos. Drop the default so aider is discovered only when the user opts in via AIDER_DIR or aider_dirs; a configured broad root still gets the bounded, protected-folder-pruned walk. This aligns the provider-migration branch with the opt-in behavior shipped on main in #844, which this branch predates.

fix(ssh): resolve remote aider history without DefaultDirs

Making aider discovery opt-in emptied its DefaultDirs, but the SSH resolve
script only emitted the AIDER_DIR history-file snippet while iterating that
list, so an explicitly configured remote AIDER_DIR stopped resolving any
.aider.chat.history.md files for transfer. Handle aider independently of
DefaultDirs, emitting buildAiderResolveSnippet whenever its EnvVar is set while
still avoiding any default $HOME scan.

Also cover JSONLSourceSet.FindSource RawSessionIDForLookup normalization, which
runs before both the LookupIDValid gate and the SessionIDFromPath discovery
comparison.

Every agent is now provider-authoritative, so no AgentDef sets DiscoverFunc
or FindSourceFunc. The fields, the always-false branches that consulted them
(full-sync discovery, parse-diff discovery and discoverability, SSH resolve,
source lookup, token-use disk probing), and the now-empty parseDiffDatabaseSources
were left behind by the staged migration. Provider discovery
(discoverProviderSources / parseDiffProviderSources) and provider lookup
(findProviderSourceFile) already own every one of these paths.

Also drop the migration-era scaffolding tests that only asserted this legacy
code no longer exists: the provider_anti_shim_test.go suite (legacy-hook nil
checks plus the AST scan for exported Discover/Find/Parse/Process facade
functions) and the per-agent require.Nil(def.DiscoverFunc) registry assertions.
With the fields gone, their absence is enforced by the compiler.

Hermes FindSource aborted the whole lookup when a state.db query
errored, even though parseArchive deliberately falls back to the
transcript parser when state.db is unreadable or schema-incompatible. A
valid transcript session sitting next to a corrupt or legacy state.db
could no longer be located for resync.

Treat a state.db lookup error as non-claiming for that root and continue
to the transcript lookup, matching the parser's documented fallback.
Guard provider.findRequests with require.Len before indexing so a
missing request fails as a clear length assertion rather than a panic.

Drive the parse-diff provider-authoritative contract from the registry
so it covers every current file-based provider-authoritative agent
instead of a hand-maintained subset. Add a Codex regression test pinning
PreferStoredSource to the stored archived duplicate, contrasted with the
canonicalize-to-live behavior it opts out of.
… single-file doc

Single-session lookup picked the lexicographically last trace file while
discovery picked the newest by mtime, so a conversation could resolve to a
different virtual path on resync when filename and mtime order disagreed.
Extract a shared visualStudioCopilotCandidateWins selector and use it from
both paths, and add a test where mtime and path order diverge.

Also correct the single-file source-set doc comment, which still claimed
exactly one session per file even though Cowork fans one file into many.

The facade-design spec's top note says shadow-compare was removed, but the
Migration Plan section still prescribed opting providers into shadow-compare,
which reads as current guidance. Add a historical-scope banner so future
migration authors do not try to use the retired mode.

An unrelated edit to the Kiro session-count guard was swept into the prior
commit. It flipped the len(r.results) > 1 guard to != 1 with a comment
claiming it corrects the zero-session case, but the empty-result branch
returns earlier so the change was inert and the comment misleading. Restore
the > 1 guard: a zero-session container stays counted as one discovered
source, matching the legacy syncKiroSQLite tally and how every other
zero-session file is counted.