Skip to content

feat(parser): migrate Qwen + QwenPaw providers#879

Open
mariusvniekerk wants to merge 3 commits into
fam/cli-jsonlfrom
fam/qwen-family
Open

feat(parser): migrate Qwen + QwenPaw providers#879
mariusvniekerk wants to merge 3 commits into
fam/cli-jsonlfrom
fam/qwen-family

Conversation

@mariusvniekerk

@mariusvniekerk mariusvniekerk commented Jun 26, 2026

Copy link
Copy Markdown
Collaborator

Migrates Qwen and QwenPaw onto facade providers, preserving their nested session-directory layouts and raw-session lookup behavior.

Qwen uses a nested project/chats JSONL source shape, so it is a good next provider facade slice after the shallow and directory JSONL migrations. Moving it behind a concrete provider keeps discovery, lookup, fingerprinting, and parse output explicit without introducing another source framework.

Legacy discovery accepts any one-level .jsonl file under chats while raw-session lookup still validates the requested ID before matching filename-derived IDs. The provider keeps that asymmetry, symlinked project directory and file behavior, project hints, and existing parser normalization intact.

Validation: go fmt ./...; go test -tags "fts5" ./internal/parser -run TestQwenProvider -count=1; go test -tags "fts5" ./internal/parser -count=1; make test-short; go vet ./...; git diff --check

test(parser): opt qwen into provider shadow

Qwen now has a concrete facade provider on this branch, so its migration mode should enter shadow comparison instead of remaining an additive legacy-only provider.

Lower provider opt-ins stay inherited and later branches remain responsible for their own concrete providers.

Validation: go test -tags "fts5" ./internal/parser -run TestProviderMigrationModes -count=1; go test -tags "fts5" ./internal/parser -count=1; go vet ./...; git diff --check

test(sync): compare qwen shadow parity

Qwen is shadow-compared on this branch, so add the source-level migration proof that provider observation matches ParseQwenSession for its nested project/chats layout.

This keeps reviewers focused on behavioral parity while later branches continue migrating their own provider shapes.

Validation: go test -tags "fts5" ./internal/parser ./internal/sync -run 'TestObserveProviderSourceMatchesQwenLegacyParser|TestQwenProvider|TestParseQwen' -count=1; go fmt ./...; go vet ./...; git diff --check; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; ./custom-gcl run --config .golangci.nilaway.yml ./internal/parser/... ./internal/sync/...

refactor(parser): fold qwen into provider

Qwen already had a concrete provider, but the branch still kept exported legacy parser/source functions and legacy sync dispatch. That left the migration additive instead of making the provider shape authoritative.\n\nMove Qwen parsing behind the provider method, remove the registry callbacks and sync processor/classifier, and replace the shadow comparison with provider API coverage plus a guard that the old entrypoints stay gone.\n\nValidation: go test -tags "fts5" ./internal/parser -run 'TestQwen|TestParseQwenSession' -count=1 -v; go test -tags "fts5" ./internal/sync -run 'TestEngine_ClassifyPathsQwenSession|TestProviderMigration|TestObserveProvider|TestSyncSingle.*Qwen|TestQwen' -count=1 -v; go test -tags "fts5" ./internal/parser ./internal/sync ./cmd/agentsview -count=1; go vet ./...; git diff --check

fix(parser): thread ctx through qwen source lookups
QwenPaw stores rewritten JSON session snapshots under workspace-scoped sessions directories, with a second one-level subdirectory namespace for console sessions. Moving it behind a concrete provider keeps those raw ID shapes explicit while continuing to use the shared file-source mechanics where they fit.

The provider preserves workspace project hints, root and console source lookup, hidden/deeper layout rejection, symlinked workspace discovery, content-hash fingerprinting, and force-replace parse semantics for rewritten session files.

fix(parser): prune qwenpaw source traversal

QwenPaw legacy discovery only walks valid workspace directories, the sessions directory, and one real non-hidden namespace below sessions. The provider migration reused the generic recursive JSONL walker, which preserved emitted source filtering but still allowed traversal and event classification through deeper or symlinked session namespaces.

Add a shared traversal predicate to the JSONL source helper so providers can keep discovery and changed-path classification aligned when their source layouts are narrower than an unbounded recursive scan.

test(parser): opt qwenpaw into provider shadow

QwenPaw now has a concrete facade provider on this branch, so its migration mode should enter shadow comparison instead of remaining legacy-only and additive.

Earlier provider opt-ins stay inherited and later branches own their modes.

Validation: go test -tags "fts5" ./internal/parser -run TestProviderMigrationModes -count=1; go test -tags "fts5" ./internal/parser -count=1; go vet ./...; git diff --check

test(sync): compare qwenpaw shadow parity

QwenPaw is shadow-compared on this branch, so add source-level migration coverage that compares provider observation with ParseQwenPawSession.

The test covers both root session files and console subdirectory session files so the path-derived workspace/session IDs and planned data-version behavior stay visible during review.

Validation: go test -tags "fts5" ./internal/parser ./internal/sync -run 'TestObserveProviderSourceMatchesQwenPawLegacyParser|TestQwenPawProvider|TestParseQwenPaw|TestSyncSingleSession_QwenPaw|TestWriteBatchQwenPaw' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go fmt ./...; go vet ./...; git diff --check; ./custom-gcl run --config .golangci.nilaway.yml ./internal/parser/... ./internal/sync/...

refactor(parser): fold qwenpaw into provider

Move QwenPaw parse and source-lookup ownership onto the concrete
qwenPawProvider and delete the package-level legacy entrypoints
(DiscoverQwenPawSessions, FindQwenPawSourceFile, ParseQwenPawSession)
plus their unexported discovery/traversal helpers.

ParseQwenPawSession becomes the provider parseSession method, and the
rawID resolution from FindQwenPawSourceFile moves onto the provider as
sourceFileForRawID with its traversal guard. The provider's existing
JSONLSourceSet already reproduces workspace/sessions/console discovery,
symlink handling, and hidden-subdir pruning, so the legacy
DiscoverQwenPawSessions free function is dropped entirely.

Route QwenPaw sync classification and processing through the
provider-neutral runtime by removing its legacy engine dispatch: the
classifyOnePath workspace/sessions block, the processFile case arm, and
the processQwenPaw method. The provider's SourcesForChangedPath and
forceReplace-on-parse capability preserve changed-path remapping and the
full-rewrite write semantics the legacy path provided.

Make QwenPaw provider-authoritative in the migration manifest, drop its
AgentDef DiscoverFunc/FindSourceFunc hooks, remove it from the pending
shim scan list, and replace the shadow-baseline test with provider API
coverage plus a guard that the legacy symbols stay gone.

To preserve single-session resync parity, FindSource now resolves a
DB-stored file_path that points outside any configured QWENPAW_DIR by
synthesizing a source from the path's implicit
<root>/<workspace>/sessions/ layout, recovering the workspace as
ProjectHint so a reparse keeps the canonical qwenpaw:<workspace>:<stem>
ID instead of orphaning it under an empty workspace.

fix(parser): thread ctx through qwenpaw source lookups
Update qwen and qwenpaw provider call sites to the exported source-set framework API.
@roborev-ci

roborev-ci Bot commented Jun 26, 2026

Copy link
Copy Markdown

roborev: Combined Review (56db670)

Medium findings remain for Qwen provider parity; no security issues were identified.

Medium

  • internal/parser/qwen_provider.go:23 - The migrated Qwen source set never enables content hashing. qwenParseFile only copies req.Fingerprint.Hash, but Fingerprint will not populate it, so Qwen syncs now clear or omit sessions.file_hash even though the removed legacy processQwen always computed it.

    • Fix: Add WithContentHashing() to newQwenSourceSet and cover the real provider.Fingerprint path in the provider test.
  • internal/parser/qwen_provider.go:23 - Qwen no longer supports explicit resync from an existing DB file_path outside the current configured roots. FindSourceFile can still return that stored path, but provider-authoritative processing then calls FindSource and JSONLSourceSet rejects paths not under Roots, causing SyncSingleSession to fail where legacy processQwen parsed the stored path directly.

    • Fix: Add a Qwen WithStoredPathFallbackRoot that derives the implicit projects root from <root>/<encoded-project>/chats/<session>.jsonl, mirroring the QwenPaw fallback behavior.

Panel: ci_default_security | Synthesis: codex, 9s | Members: codex_default (codex/default, done, 5m52s), codex_security (codex/security, done, 2m4s) | Total: 8m5s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

1 participant