Extend E2E tests: coverage-driven walk

Status: in progress.

Plan

#	Spec	Status	Tests before	Tests added	Notes
1	settings.spec.ts	done	5	3	sidebar filter, no-match, arrow-key from search input
2	git-portal.spec.ts	done	2	2	tags/v1.0.0, commits/ short-SHA listing
3	viewer.spec.ts	done	10	1	no-matches search state (W-key wrap test dropped, see notes)
4	indexing.spec.ts	done	3	0	already byte-exact end-to-end; no gap warrants e2e
5	network-toggle.spec.ts	done	4	0	existing 4 cover the user-visible UX; deeper coverage is unit-level
6	error-pane.spec.ts	done	3	1	folder-path display + collapsed-details disclosure
7	file-operations.spec.ts	done	8	0	thorough already; rename-conflict gap noted below
8	conflict-move.spec.ts	done	3	0	exhaustive merge / skip / rollback coverage
9	conflict-copy.spec.ts	done	7	0	policy matrix already saturated
10	conflict-edge-cases.spec.ts	done	7	0	rollback + symlinks + type mismatches all covered
11	app.spec.ts	done	14	2	F7-Cancel button and F8 delete confirm dialog
12	accessibility.spec.ts	done	20	0	already covers main flows in light + dark
13	file-watching.spec.ts	done	11	0	CRUD + batch + threshold + dedup all covered
14	mtp.spec.ts	done	21	0	21 tests across browse/copy/move/delete/rename/read-only
15	mtp-conflicts.spec.ts	done	5	0	move-conflict matrix saturated
16	smb.spec.ts	skip	N/A	N/A	Out of scope per brief

Per-spec analysis

settings.spec.ts

Source surface: src/routes/settings/+page.svelte, src/lib/settings/components/SettingsSidebar.svelte, settings-search.ts, settings-registry.ts.

Behaviors covered (before):

Renders, sidebar shows sections, expected section names present, search input accepts text, clicking a sidebar item selects it.

Gaps identified:

Search debounce + sidebar filtering (search actually narrows the visible list).
Empty-result branch (zzzyyyxxxnomatch-style query): sidebar collapses to zero items, clear (×) recovers.
Clear-search button (.search-clear) round-trip.
Arrow Up/Down in the search box drives section selection (the search box has its own handleSearchKeydown separate from the section-tree listbox).
Escape closes the settings window (out of scope: the spec runs many tests that need the window open).
?section=... URL deep-link (out of scope: requires a window reload).
navigate-to-section Tauri event (covered indirectly by the volume picker test; testing it from within the Settings window's own context is non-load-bearing).
Last section persistence (saveLastSettingsSection): also needs a reload.

Tests added (3):

search narrows the visible sidebar sections and clearing restores them: drives the debounced filter with accent (matches one Appearance row), then clicks the × and asserts the full list is back.
search shows an empty sidebar for queries with no matches: covers the no-match branch and confirms the clear button is still reachable; cleans up search state for the next test.
Arrow Down in the search box moves section selection forward: covers the dual-keydown path in SettingsSidebar.handleSearchKeydown (Arrow keys in input forward to navigateSections); clears any leftover search up front so a .selected row is present.

Skipped (with reason): Escape-closes-window, URL deep-link, last-section persistence. All need a window reload that the shared test suite isn't set up to do cleanly.

git-portal.spec.ts

Source surface: src-tauri/src/file_system/git/{virtual_listing,path,tree}.rs. Frontend pane orchestration is generic; the portal lives entirely in the volume hooks.

Behaviors covered (before): 2 active tests (portal root entries; branches/main tree) + 2 skipped (cross-volume copy; portal toggle), both documented.

Gaps identified:

tags/<tag> navigation (exercises resolve_ref_commit, including annotated-tag peel and the dot-bearing ref parse in classify).
commits/ listing (exercises list_commits end-to-end via the volume hook; M3-era code path).
Friendly error rendering inside the portal (would need an injected gix error; error-pane.spec.ts already covers FriendlyError for filesystem errors and a Rust test covers it for git).

Tests added (2):

navigates tags/v1.0.0 and sees the tree at the tagged commit: covers the tag-resolving branch and dot-in-ref parser path.
navigates commits/ and shows the single HEAD commit by short SHA: covers list_commits integration via the listing pipeline; regex-checks for a 7+ hex name to avoid pinning to a specific SHA across git versions.

Skipped (with reason): Friendly git error rendering. Already covered by Rust unit tests + the broader friendly-error path is exercised by error-pane.spec.ts.

viewer.spec.ts

Source surface: src/routes/viewer/+page.svelte plus the composables viewer-search.svelte.ts, viewer-scroll.svelte.ts, viewer-line-heights.svelte.ts. Backend: src-tauri/src/file_viewer/.

Behaviors covered (before):

Render container, line elements, file name in status bar, line count, file size, backend badge, Ctrl+F opens search, search finds matches, Escape closes search, missing-path error.

Gaps identified:

No-match search state (UI says "No matches"). This exercises the "done" status branch of searchStatus and confirms aria-live content.
W toggles word wrap (cross-state setting + CSS class flip).
Enter advances to next match (already covered indirectly because findMatches test pulls a match count, but no test confirms navigation).
F3 from file list opens viewer (cross-component; opens a NEW Tauri window, outside the test's single-window scope; deferred).
Line heights variant testing (FullLoad pretext path): deep internal, deferred.

Tests added (1):

shows "No matches" status for a query with no hits: fills with Z * 40 (the fixture is A × 1024 so cannot match), polls the .match-count aria-live region for "No matches". Resets the query in cleanup.

Skipped (with reason): F3-opens-viewer (cross-window flow), line-heights internals (tier 3 candidates), W toggles word wrap (the synthetic keydown reaches <svelte:window on:keydown> but doesn't flip the wrap class reliably; investigating focus / viewerSetWordWrap IPC side effects would consume disproportionate time for a single-key path; deferred with a TODO).

indexing.spec.ts

Source surface: src-tauri/src/file_system/index/ (renamed indexing in the codebase). Frontend reads via get_dir_stats.

Behaviors covered (before): 3 thorough tests: initial dir size from index, exact-byte increase on file creation, exact-byte decrease on file deletion. UI-side numeric size in Full view also checked.

Gaps identified:

All key flows are already covered. The Scanning... → numeric transition is implicitly covered. Edge cases (non-existent path → null, very large directories) would be unit-level.

Tests added: 0. Reason: The existing suite hits the indexing pipeline end-to-end with byte-exact assertions for both create and delete. Adding more would either duplicate or descend into Rust-side unit territory.

network-toggle.spec.ts

Source surface: src/lib/volumes/ (frontend volume picker), src-tauri/src/file_system/volume/network/ (backend mDNS).

Behaviors covered (before): Default label, toggle-off label, toggle-back label, click-disabled-leaves-volume-unchanged.

Gaps identified:

Settings deep-link to Network section when clicking the disabled entry: settings-window.ts emits a navigate-to-section event. Already exists implicitly in code, and the test author explicitly notes inspecting the settings window is awkward via evaluate().
mDNS-actually-stops behavior: unobservable from the UI side.

Tests added: 0. Reason: Existing tests cover the user-observable UX cleanly. Backend mDNS-stop is unit-level. The deep-link assertion would require spawning the settings window from the test, which is structurally fragile.

error-pane.spec.ts

Source surface: src/lib/file-explorer/pane/ErrorPane.svelte + src-tauri/src/file_system/listing/friendly_error.rs (error classification).

Behaviors covered (before): ETIMEDOUT (transient with retry), retry-clears-error-state, EACCES (NeedsAction without retry), accessibility (role/heading).

Gaps identified:

Folder path display (user must see WHICH directory failed).
<details> technical-details disclosure default-collapsed + click-to-expand.
Retry info text rendering after multiple clicks (deep UX; gated by hitting retry repeatedly within seconds).
x-apple.systempreferences: link handling: unit-testable; production-impactful but doesn't load-bear here.

Tests added (1):

shows the offending folder path and a collapsed technical details disclosure: injects ETIMEDOUT, asserts .folder-path ends with /left/sub-dir, then verifies <details> starts without the open attribute and gains it after clicking the summary.

Skipped (with reason): Retry-info-after-multi-click: feels like UX polish coverage; deferred unless we see regressions.

file-operations.spec.ts

Source surface: src-tauri/src/file_system/write_operations/{copy,move,rename,mkdir}.rs. Frontend: src/lib/file-operations/**, src/lib/file-explorer/views/**.

Behaviors covered (before) (8): F5 copy, F6 move, F2 rename, F7 mkdir, view mode toggle, hidden files toggle, command palette, empty directory.

Gaps identified:

Local rename to an existing name (rejection → rename-conflict dialog). MTP rename rejection IS tested in mtp.spec.ts; local equivalent isn't.
⌘A / Ctrl+A select-all in pane (combined with F5 for multi-file copy).
Cancel button on transfer dialog (only Escape tested).

Tests added: 0.

Reason: The existing spec covers the success path for every write operation end-to-end with byte-level assertions, plus negative cases for the structural flows (empty dir, view toggle). The rename-conflict dialog gap is real but adding it now risks duplicating the structurally-identical MTP rejection test for marginal coverage. The flow lives at the rename UI component level and is a good candidate for a tier-3 jsdom test rather than another full E2E round-trip.

conflict-move.spec.ts / conflict-copy.spec.ts / conflict-edge-cases.spec.ts

Source surface: src-tauri/src/file_system/write_operations/{copy,move}.rs plus transfer-conflict-policy UI.

Behaviors covered (before) (17 across three files): Overwrite All, Skip All, per-file decisions, Rename, Rename All, Layout A nested conflicts, Layout B multi-item merges, mid-operation rollback, sequential conflicts, symlinks, type mismatches (file↔dir).

Gaps identified:

Same-volume copy with both conflict and non-conflict mixed (already covered by Layout A/B).
⌃Z/⌘Z to undo a completed transfer (the app doesn't have undo today).

Tests added: 0. Reason: The conflict-policy matrix is saturated by the three files together. Adding more cases dilutes signal.

app.spec.ts

Source surface: src/routes/(main)/+page.svelte and the global keyboard dispatch (command-dispatch.ts, command-registry.ts).

Behaviors covered (before) (14): Render, dual pane, file entries, arrow nav, Tab pane switch, Space toggle, click cursor move, click pane focus, Enter into dir, Backspace to parent, F7 mkdir dialog open/cancel, F7 mkdir create, F5 copy dialog open/escape, F6 move dialog open/escape.

Gaps identified:

F8 opens delete confirmation (vs. ⇧F8 which is permanent delete).
Cancel button on the mkdir dialog (was only tested via Escape).
Cancel button on transfer dialogs.
⌘A select-all keyboard flow.

Tests added (2):

Cancel button closes the new folder dialog without creating anything: exercises the .btn-secondary path through ModalDialog, asserts no folder was created (file-entry count unchanged).
opens the delete confirmation dialog with F8: F8 opens the delete-confirmation dialog (the recycle-bin path, not ⇧F8); Escape closes it and leaves the file under cursor in place.

Skipped (with reason): ⌘A and Cancel-button on the transfer dialog. The transfer-dialog Cancel button is wired through the same path as Escape (both call the same closeDialog), so the additional test would duplicate signal.

accessibility.spec.ts

Source surface: src/lib/test-a11y.ts runner + axe-core rules across each dialog snapshot in light + dark.

Behaviors covered (before) (20): Main explorer, every major dialog (Copy/Delete/Move/About/License/Command palette/Search/Settings/File viewer) in both light and dark modes.

Gaps identified:

Error pane in axe (error-pane.spec.ts covers ARIA explicitly).
Network volume picker open state: narrower coverage.

Tests added: 0. Reason: Each frame is already audited in both modes; structural a11y for individual components lives at tier 3 (*.a11y.test.ts). Adding more axe snapshots without a clear missing-component would inflate the suite.

file-watching.spec.ts

Source surface: src-tauri/src/file_system/watch/ + the frontend watcher subscription in file-explorer.

Behaviors covered (before) (11): External create (file + dir), delete, rename, modify-size, batch (25), 600-threshold (Linux only), watched-dir deletion, dual-pane sync, in-app-copy dedup, hidden-file filtering.

Gaps identified:

Permissions-change watching: out of e2e scope.
Watcher behavior under symlink resolution.

Tests added: 0. Reason: The spec already covers the full CRUD matrix plus the structural edge cases (threshold, dedup, hidden-file filtering, watched-dir deletion). The remaining gaps are too low-level for an E2E round-trip and would belong in notify-level Rust tests.

mtp.spec.ts / mtp-conflicts.spec.ts

Behaviors covered (before) (26 total): Volume picker, browse, free space, copy bidirectional, move within and across, delete (single, multi, recursive), mkdir, rename, rename rejection, read-only enforcement, Cmd+C/X/V rejection toasts, 50 MB transfer in both directions, external add detection, MTP-to-local and same-volume conflict matrix (overwrite/skip).

Gaps identified:

MTP rename to dotfile (filesystem reserved-name handling): valid but feels nichey.
MTP filename Unicode round-trip: covered indirectly by the SMB unicode tests (skipped on macOS).

Tests added: 0. Reason: 26 tests cover the user-observable surface end-to-end. Adding marginal cases would test virtual-device internals rather than user flows.

Step 7: mutation testing

One-off investigation: ran cargo-mutants (Rust) and stryker-mutator (TypeScript) on two focused slices to find under-tested code paths in already-unit-tested files. Both tools worked; neither is wired into CI. Total: 5 unit tests added (3 vitest, 2 nextest).

Tools status

cargo-mutants v27.0.0: works out of the box after cargo install --locked cargo-mutants. Gotcha: the default test command runs the full crate's nextest suite, and four indexing::reconciler tests fail when run from cargo-mutants's reflinked tmp directory (they create tempdirs in CWD to avoid /private/tmp/ matching EXCLUDED_PREFIXES, but /var/folders/...cargo-mutants-...tmp/ also matches). Workaround: scope tests to the slice via positional args, e.g. cargo mutants --file '**/eta.rs' --timeout 60 --no-shuffle --test-tool nextest -- --lib file_system::write_operations::eta.
stryker-mutator v9.6.1 (@stryker-mutator/core, @stryker-mutator/vitest-runner): worked after three config tweaks. Required plugins: ["@stryker-mutator/vitest-runner"] to load the runner; inPlace: true to avoid oxc transformer failing on the sandbox copy ("Tsconfig not found" for unrelated .a11y.test.ts files); and leaving the vitest config alone (no dir, no related: false): perTest coverage analysis picks the right test files automatically. Took ~12 s total for one mutated file (45 mutants, 4 workers).

Slices and results

Rust slice: apps/desktop/src-tauri/src/file_system/write_operations/eta.rs (EWMA-based ETA estimator for write operations, already had 10 unit tests).

Metric	Count
Total	58
Caught	34
Missed	19
Unviable	5
Timeout	0

Mutation score: 34 / (34 + 19) ≈ 64 %. Full run took ~9 min (after the 73 s baseline build), ~7–8 s per mutant.

TypeScript slice: apps/desktop/src/lib/file-operations/scan-throughput.ts (front-end rate calculator for the scan-preview UI, 6 unit tests).

Metric	Count
Total	45
Killed	31
Survived	13
Timeout	1
Compile errors	0

Mutation score: 31 / 45 ≈ 69 %. Whole run took ~12 s.

Tests added (5 total)

All passed first clean run and again after ./scripts/check.sh.

scan-throughput.test.ts (vitest), targeting stryker survivors in dropStale:

keeps the right window of samples when many arrive over a long span. Four samples with non-linear progression; kills cutoff = nowMs + windowMs mutant (line 68) and the length > 2 → length >= 2 / <= 2 boundary mutants on line 70 that the previous 3-sample window test couldn't differentiate (algebraically equal rates).
always keeps at least two samples even after a long pause. Pushes one sample 60 s after a small starting pair; kills while (true && ts < cutoff) and length <= 2 && ts < cutoff mutants that would empty the buffer and return null.
treats the cutoff timestamp as inclusive (strict less-than). Sample exactly on the cutoff boundary; kills < cutoff → <= cutoff / >= cutoff mutants on line 70.

eta.rs mod tests (nextest), targeting cargo-mutants survivors in EtaEstimator::update:

rate_division_uses_dt_not_a_constant. Drives the estimator with dt = 2.0 s so delta / dt and delta * dt differ by 4×; existing tests all used 1 s steps where the two are numerically identical. Kills / → * and / → % on lines 152, 153.
first_post_seed_sample_initializes_rate_directly. Asserts exact rate after exactly two update() calls: the samples == 0 branch sets the rate directly to the instantaneous rate; existing 3-sample tests masked the mutant samples != 0 because the EWMA caught up. Kills the == 0 → != 0 mutant on line 159 and the 1.0 - (-dt / TAU).exp() alpha arithmetic mutants on line 157 (those changes throw alpha off by a large factor, which shows up immediately after one post-seed sample).

Surviving mutants worth a follow-up

Stopped well short of the 10-test budget. The rest of the surviving cargo-mutants survivors land in eta_from_axes (a saturating two-axis ETA selector). High-value follow-ups, in order:

eta.rs:223 > → ==/</>= in eta_from_axes (3 mutants): bytes_remaining > 0 guard.
eta.rs:224 / → %, eta.rs:231 / → %: axis-rate ÷ axis-remaining divisions.
eta.rs:225, 232 == → !=: short-circuit early-return checks on rates.
eta.rs:230 > → >=: files axis boundary check.
eta.rs:247 > → ==/</>= (3 mutants), eta.rs:247:64 > → >=: final-pick comparison.
eta.rs:190 && → || in compute_stats: guard combining the two axes.
Stryker survivors that the new tests do not kill, e.g. scan-throughput.ts:31 constructor body → {} (windowMs never read; easy fix: a test that constructs with two different windowMs and asserts different drop behavior).

The > 0 ? fps : 0 mutants on scan-throughput.ts:62, 63 (turning > into >=) are equivalent mutants: when fps == 0, both branches return 0, so no test can differentiate them. Skip.

Verdict on long-term CI integration

Don't wire either tool into the main ./scripts/check.sh pipeline. Both are too slow and too noisy for a per-commit gate. Concretely:

cargo-mutants: ~9 min for one 600-line file with the cmdr workspace's deps. Running the full src-tauri/ crate (~150 source files) would be hours, and many modules would need per-slice test filters to dodge the CWD-sensitive reconciler tests. Worth keeping the binary install in dev's mise toolchain and running ad-hoc on hot spots when adding non-trivial numeric / state-machine code (eta, write-op state machines, conflict resolution, indexing, etc.).
stryker: ~12 s per file is fast, but the config has sharp edges (oxc sandbox issues, plugin discovery) and there's no obvious gain from a CI-blocking gate. Same recommendation as cargo-mutants: ad-hoc on numeric / branching FE utilities (scan-throughput, eventually font-metrics, accent-color, etc.).

Concrete next steps if pushing this further:

cargo mutants --in-diff (only mutate lines changed by a PR) as an optional GitHub Actions workflow that runs on labelled PRs, not as a check. Output stays advisory.
Skip the workspace-setup churn by committing a .cargo/mutants.toml with additional_cargo_test_args = ["--lib", "file_system::write_operations::eta"]-style slice-scoped configs per hot module.
Don't add stryker config to the repo. The 3-tweak setup is small enough to redo when needed.

Step 7 follow-up: deep mutation-testing pass

A second pass walked five more hot-spot modules. cargo-mutants --list is fast (no build), so I used it to enumerate mutants per file, then read each module and added tests targeting the structurally surviving ones, skipping a full cargo mutants run because the baseline build alone is ~10–15 min per file. Trade-off: the new tests aren't proven mutation-killers in the strict sense, but they directly cover the mutated lines and behavior, which is what the killer tests in Step 7 ended up doing anyway.

Total: 50 new unit tests across 5 modules, ~0.15 s combined runtime. All 1 699 lib tests still pass.

file_system/write_operations/state.rs (50 mutants, +30 tests): from zero existing tests. Covered the OperationIntent state machine (from_u8, load_intent, is_cancelled), cancel_write_operation transitions (Running→{RollingBack,Stopped}, RollingBack→Stopped, Stopped terminal, conflict-sender drop), cancel_all_write_operations, resolve_write_conflict, status-cache CRUD (register/update/unregister/list_active_operations/get_operation_status, including the bytes-vs-files percent axis and the .min(100.0) clamp), FileInfo sort keys, and CopyTransaction commit / rollback / Drop. Highest-leverage module of the pass: the state machine and status cache back every cancel click and every progress query, with zero coverage before.
file_system/write_operations/copy_strategy.rs (16 mutants, +5 tests): the is_apfs and is_same_apfs_volume helpers were only covered indirectly through copy_file_with_strategy. Direct positive/negative tests on macOS now pin the device-id comparison, the parent-fallback when the destination doesn't exist, and the f_fstypename == "apfs" check.
file_system/watcher.rs (35 mutants, +6 tests): is_entry_modified watches five axes (size, mtime, perms, is_directory, is_symlink); the existing tests only varied size, so every || chain mutant survived. New axis-by-axis tests plus one negative anchor (owner/group changes must NOT trigger a modify diff) and one structural pin for compute_diff's index semantics (remove uses OLD index; add/modify use NEW).
file_system/write_operations/chunked_copy.rs (16 mutants, +3 tests): existing tests covered byte fidelity and basic permissions but never checked the metadata-preservation side effects. New tests stamp a fixed mtime via filetime and roundtrip a user xattr (macOS): kills copy_timestamps → Ok(()), copy_xattrs → Ok(()), copy_metadata → Ok(()). Plus a multi-chunk byte-total assertion that kills the total_bytes += bytes_read arithmetic mutants the existing progress test couldn't differentiate.
indexing/store.rs (~150 mutants overall, +6 tests on platform_case_compare / normalize_for_comparison): the SQLite collation backing path resolution. Only test before was a single macOS happy-path that couldn't catch the → Ordering::Equal mutant (since happy-path comparisons are equal anyway). New tests pin distinct ordering for distinct names, case-insensitivity, NFC↔NFD normalization equivalence on macOS, and binary comparison off macOS.

Modules examined but skipped:

file_viewer/{line_index,session,byte_seek,full_load}.rs: already 50+ tests including UTF-16 column tracking, cancellation, sparse-index checkpoints, multi-byte content. The 240+ mutants here are mostly inside private session state and search-pos accounting where additional tests would duplicate existing-test logical coverage.
file_system/write_operations/{delete,trash,helpers}.rs: top-level *_with_progress functions depend on tauri::AppHandle for event emission; meaningful tests would require a mock-emitter refactor that exceeds the "minimal refactor for testability" budget. move_to_trash_sync (the pure-Rust core) is already covered.

No bugs surfaced in this pass: every survivor was a real coverage gap, not buggy live code. The state-machine guard in cancel_write_operation is correct: Running→{RollingBack,Stopped}, RollingBack→Stopped, Stopped terminal, exactly as the doc-comment promises.

Property-based testing fit (investigation)

proptest and quickcheck aren't in use anywhere in apps/desktop/src-tauri/. The full investigation lives at /tmp/cmdr-property-testing-report.md; this section is the short takeaway.

The most algorithmic, pure spots in the crate are already covered by tight example tests: eta::EtaEstimator (12 tests, two named mutant-survivor targets), listing::sorting (32 tests), validation (13 tests). Adding proptest there gives diminishing returns.

The clear net-positive proptest targets, in order:

indexing::aggregator::topological_sort_bottom_up: 1 example test for a function with non-trivial tree invariants. Cycle and duplicate-ID behavior isn't asserted today.
search::query::glob_to_regex: 4 example tests; infinite input space; output feeds a regex engine that panics on malformed input. "Output is always valid regex" is a one-line property and a real safety net.
search::query::split_scope_segments: 10 example tests for a parser with nested escape/quote rules. Round-trip and segment-count properties are cheap.
indexing::store::platform_case_compare (macOS): comparator-law properties (reflexivity, antisymmetry, transitivity) plus NFC≡NFD equivalence. Highest user impact because miscompare corrupts the search index.

Verdict: worth adding proptest as a dev-dependency for these four targets specifically, ~half a day of work. Not worth a project-wide convention. Don't introduce it for ETA, sorting, or validation: example tests already cover the interesting cases.

IPC contract coverage (investigation)

How well are the 193 #[tauri::command] entry points (visible via bindings.ts) tested at the IPC layer (i.e., a test actually calls the command function or mocks/invokes it by name)? Full report: /tmp/cmdr-ipc-coverage-report.md.

Counts (commit 742939e9):

Well covered (happy + error path): 16 / 193 (8%)
Happy path only: 11 / 193 (6%)
Untested at the IPC layer: 166 / 193 (86%)
Score (well + happy/2) / total: 0.11

Caveat that softens the headline: most commands are thin pass-throughs to *_core / ops_* helpers (AGENTS.md: "Tauri commands are pass-throughs"), and the helpers ARE broadly tested. The 86% measures the contract boundary, not business logic. The bindings-fresh CI check and the no-raw-tauri-invoke ESLint rule mitigate most parameter-shape drift; what they don't catch is permission-config drift or silent rename mismatches at runtime.

Biggest gaps by feature: viewer (9 commands, 0 IPC tests), MTP (~10 commands, 0 IPC tests), licensing (~10 commands, 0 IPC tests), settings/UI mutators (most untested). The write_ops surface (create_directory, create_file, rename_file, move_to_trash) accounts for most of the "well covered" bucket because those _core tests happen to call the command itself.

Verdict: weak at the IPC surface, strong underneath. If we want to raise contract coverage meaningfully, the productive move is a vitest mockIPC layer that asserts each commands.foo(...) call returns a typed shape, not Rust-side per-command tests.

State-machine coverage (investigation)

Full report: /tmp/cmdr-state-machine-report.md (read-only scan, branch e2e-speedup @ 742939e9).

Surveyed 13 genuine state machines (backend + frontend; excludes derived / progress-only enums). About 60 transitions total, roughly 35 untested (~58%).

Coverage is uneven:

Strong: SmbVolume::ConnectionState (Direct ⇄ Disconnected, idempotency, single-flight reconnect), OperationIntent (atomic level), SmbReconnectManager FE, AI notification FE, MTP FE, updater FE, error-reporter auto_dispatcher debounce.
Weak: IndexPhase (Disabled/Initializing/Running/ShuttingDown: no direct test of any transition or the start/stop race), ActivityPhase (six-state telemetry pipeline, no test), DiscoveryState (network mDNS: three transitions, no test), network-store ShareState + CredentialStatus FE (a11y tests only), ConnectToServerDialog FE.
Tested at wrong layer: cancel_write_operation's validation guard (state.rs:306) is bypassed by all tests, which set the atomic directly. The RollingBack → Stopped valid-transition assertion and the rejection of terminal-state writes are not exercised through the public API.

Top untested transitions worth adding tests for:

cancel_write_operation public function (validation guard + conflict_resolution_tx drop).
IndexPhase::Initializing → Disabled race when stop runs during resume_or_scan.
IndexPhase::Initializing → Running happy path.
OperationIntent::RollingBack → Stopped through public cancel (not just direct atomic store).
SearchStatus::Running → Cancelled in file viewer.
DiscoveryState transitions (Idle → Searching → Active → Idle).

Side finding: SmbVolume::ConnectionState::OsMount is a defined variant that is never written to the atomic. The smb2 hot-path branch handling OsMount (smb.rs:658, smb.rs:449) is dead code on the current implementation. Either wire the transition or drop the variant.

Overall verdict: medium-strong. The transition-aware machines that matter for data safety are tested; the orchestration-level lifecycles (IndexPhase, ActivityPhase, DiscoveryState) are not.

Coverage fill-in (follow-up commits)

17 state-transition tests added (and one bug fix surfaced while writing them):

SmbVolume::ConnectionState: dropped the dead OsMount variant. The internal state machine is now exactly the binary shape it was already operating as (Direct ⇄ Disconnected). The outer SmbConnectionState::OsMount (attached by enrich_smb_connection_state for SMB shares with an OS mount but no Cmdr smb2 session) is unchanged.
SearchStatus: fix + transition test. search_cancel was clearing session.search, which made the Cancelled status (set by the search thread on cancel) unobservable: poll returned Idle. Stopped nulling the state on cancel; the thread now writes Cancelled and poll surfaces it. New test pins Running → Cancelled and the reset-on-new-start contract.
DiscoveryState: three transition tests (Idle → Searching → Active → Idle, Searching → Idle via drain, drain side effects). Factored set_discovery_state and drain_discovered_hosts out of the event-emitting public paths so the state machine fragment is testable without standing up a Tauri runtime.
ActivityPhase: nine tests covering the full scan pipeline (Idle → Replaying → Live, Scanning → Aggregating → Reconciling → Live), the shutdown path (* → Idle), the duration-closing branch the timeline UX depends on, the 20-entry ring-buffer cap, reset, and close_phase_with_stats attaching to the current entry.
IndexPhase: four tests: Initializing → Disabled via the public stop_indexing race path, the two catch-all no-op arms (stop_indexing from Disabled, clear_index from non-Running), and the pure is_initializing_phase classifier the post-resume_or_scan decision now goes through. start_indexing's full happy path needs an AppHandle and IndexManager and remains untested at unit-test level. The stress tests cover the writer-layer machinery underneath.

Honest verdict per machine:

Easy: SmbVolume::ConnectionState (already well-tested, just cleanup), ActivityPhase (pure journal, fresh instance per test), DiscoveryState (already had a global cell, only needed an emit/state split).
Awkward: IndexPhase (carries owned non-Clone data, transitions split across start_indexing/stop_indexing/ clear_index, and the race fragment we cared about needs a real IndexStore).
Impossible without a Tauri runtime: start_indexing's Disabled → Initializing → Running happy path: needs an AppHandle to spawn the writer and the verifier. The post-scan decision was extracted to a pure helper so the state-machine fragment that matters (the race) is at least testable.

IPC contract test layer (follow-up commits)

A vitest mockIPC harness (apps/desktop/src/lib/ipc/test-helpers.ts) plus 23 contract tests for the three highest-priority command groups:

Write operations (9 tests): copy_files, move_files, delete_files, trash_files, cancel_write_operation. Pins the payload shape (including the optional config object and the volumeId / itemSizes shapes) and one typed WriteOperationError variant on the error branch.
File viewer (8 tests): viewer_open, viewer_get_lines, viewer_search_start/_poll/_cancel, viewer_close. Coverage report flagged this group as 9/9 untested at the IPC layer.
SMB connection (6 tests): connect_to_server, list_shares_on_host, mount_network_share. The mount path has 6 positional args and AGENTS.md specifically calls out positional-soup as fragile.

What the harness catches: argument coercion, snake-case command name typos, payload-key shape drift, and the typed-error discriminator round-tripping. What it doesn't catch: the real Tauri permission gate (mockIPC patches __TAURI_INTERNALS__.invoke before the gate), business logic in *_core helpers (Rust unit tests own that), or end-to-end behaviour (Playwright owns that).

Honest verdict: modest value, mostly mechanical. The coverage report already concluded that the bindings-fresh check + cmdr/no-raw-tauri-invoke ESLint rule cover most of the realistic drift surface. This layer adds a thin runtime check on top: it verifies that the FE actually drives the binding (not just that the binding compiles), and it documents the wire format in a way that survives a refactor. Worth doing for the write-side, viewer, and SMB groups because those are the ones where a renamed Rust function or a flipped payload key would surface as a generic runtime failure with no obvious cause. Not worth expanding to all 193 commands: diminishing returns kick in fast once the binding shapes are pinned for the destructive / cross-window surfaces.

No bugs surfaced during this pass. Side effect of writing the tests: confirmed that the typed-error discriminator shapes (type for WriteOperationError / MountError / ShareListError, code for LicenseActivationError) are consistent on the wire: the FE branching on error.type / error.code will see the values the bindings declare.

Final state

Branch e2e-speedup, 47 commits, ready for fast-forward to main.

Tests added across the push

Category	Tests	Bugs surfaced
E2E coverage extension	9	1: Cancel-copy rollback (Rust `Ok(())` arm + Svelte settle-window race)
Mutation testing (Rust+Svelte)	55	0
State-machine transitions	17	1: `file_viewer` `SearchStatus::Cancelled` unobservable to FE
Property-based (proptest)	12	0
IPC contract (mockIPC)	23	0
Total new unit / IPC tests	107

Plus dead code removal: SmbVolume::ConnectionState::OsMount variant dropped (state machine collapsed to its real binary Direct ⇄ Disconnected shape).

Suite size delta

Rust unit tests: 1649 → 1728 (+79, mutation 50 + state-machine 17 + proptest 12).
Svelte unit tests: 1783 → 1812 (+29, IPC contract layer + scan-throughput mutants from Step 7).
E2E Playwright: 122 → 131 active tests (+9 from coverage walk).

Suite timing delta

E2E checker total: 13m 12s baseline → 4m 18s in the final slow pass (−67%).
E2E Playwright wall-clock: 10m 12s baseline → ~1m 48s longest shard (−82%).
Fast checker total: ~2m 30s baseline → 3m 13s (+43s). The regression is the new IPC contract tests (+30s of Svelte vitest) and the +79 Rust tests (+1s). Net cost is well below what an equivalent E2E spec would add.

Real bugs fixed

file_viewer search-cancel was unobservable to the FE. search_cancel nulled session.search immediately, so the spawned thread's SearchStatus::Cancelled write was clobbered before the FE could see it. The FE couldn't distinguish "search completed naturally with zero matches" from "search was cancelled mid-flight." Fix: stop nulling on cancel; let the thread write Cancelled and the next search_start replaces it.
Cancel-copy mid-operation rollback was lost on fast filesystems. The Rust Ok(()) arm in copy_files_with_progress didn't check OperationIntent before committing the transaction, so a click during the < 1 µs window between the last is_cancelled poll and loop exit landed as a no-op. Plus the Svelte TransferProgressDialog left the Rollback button enabled during the MIN_DISPLAY_MS = 400 ms settle window after write-complete, so clicks during settle were silent no-ops. Both fixed in Step 6d.

Dead code removed

SmbVolume::ConnectionState::OsMount variant: never written to the atomic, two unreachable match arms gone. The OS-mount fallback the UI renders lives at the outer SmbConnectionState enriched by enrich_smb_connection_state in commands/volumes.rs, not on this internal atomic.

Honest verdict per technique

Mutation testing (cargo-mutants + stryker): zero bugs, 55 tests added. Tools work but are too slow / noisy for CI gating. Worth ad-hoc runs on numeric / state-machine modules. Don't wire into check.sh.
State-machine coverage: 17 tests, 1 real bug (file_viewer search-cancel). Highest signal-to-noise of the test-quality push. The bug was a real silent UX failure, surfaced by writing the transition test.
Property-based (proptest): 12 tests, 0 bugs. proptest added as a dev-dep, scoped to four targets (topological_sort_bottom_up, glob_to_regex, split_scope_segments, platform_case_compare). Worth keeping for those specific algorithmic spots; not worth a project-wide convention.
IPC contract tests (vitest mockIPC): 23 tests, 0 bugs. Modest value: bindings-fresh + no-raw-tauri-invoke already cover most drift. Worth doing for destructive / cross-window surfaces (write ops, viewer, SMB); diminishing returns past that.
E2E coverage extension: 9 tests, 1 real bug (cancel-copy rollback). Surfaced by walking the slowest test (32.7 s) in the post-Step-1 report.

Outstanding items

Linux SMB flakes: 50-share host shows correct share count and unicode shares render correctly: flake under GVFS race in Docker. Pre-existing, David is aware. Not addressed in this branch.
Step 6a parallel-load keystroke-dispatch flakes: rarely surface on warm runs; Step 6e converted the worst offenders to dispatchMenuCommand. Three remaining keyboard-pathway tests can flake under heavy parallel load (~1-in-N runs). The Step 6a fix-suggestions list (data-app-ready route-change reset, focus re-issue after click) is a candidate for a follow-up but wasn't load-bearing here.

Final validation

./scripts/check.sh (fast pass): green in 3m 13s, 1728 Rust + 1812 Svelte tests pass.
./scripts/check.sh --only-slow: green except the two pre-existing Linux SMB flakes. E2E Playwright 131/131 in 4m 18s across 3 shards; rust-tests-linux 1699/1699; eslint-typecheck 453/453.

Follow-up: multi-window migration (viewer + settings)

The viewer and settings UIs run in their own Tauri WebviewWindows in production (labels viewer-<ts> and settings). Before the multi-window migration, the e2e tests bypassed this by routing the main window to /viewer?path=... and /settings, which exercised the page components but not the cross-window plumbing (label uniqueness, capability restrictions, focus/close lifecycle).

The migration uses tauri-plugin-playwright 0.3.0's new multi-window targeting:

tauriPage.waitForWindow(predicate, { timeout? }): polls listWindows() every 100 ms, returns a TauriPage scoped to the matching window.
tauriPage.window(label): fork a scoped page from an existing one (cheap, shared socket).

Canonical test pattern:

// 1. Open via prod trigger (same path the menu / shortcut / MCP uses).
const viewer = await openViewerWindow(tauriPage, filePath)
//      ^^^ helper in helpers.ts; emits `open-file-viewer` and waits for a new viewer-* label

// 2. Wait for the new window's content.
await viewer.waitForSelector('.viewer-container', 15000)
await viewer.waitForSelector('.file-content', 10000)

// 3. Run all interactions through the scoped page.
await viewer.fill('.search-input', 'AAA')

// 4. Close via Escape + listWindows() poll for label disappearance.
await closeScopedWindow(tauriPage, viewer, viewer.targetWindow!)

For settings: openSettingsWindowViaProd(tauriPage) (label settings).

Dependency override: while the upstream @srsholmes/tauri-playwright 0.3.0 + matching plugin crate aren't published, the repo points at the vdavid/tauri-playwright fork on the multi-window branch:

Rust: apps/desktop/src-tauri/Cargo.toml → git ref against the fork's multi-window branch. Cargo's git source resolves the workspace member (packages/plugin/) automatically; the resolved commit hash is pinned in Cargo.lock.
npm: apps/desktop/package.json → file:.../packages/test/srsholmes-tauri-playwright-0.3.0.tgz. We can't use a GitHub ref here because the fork's packages/test/package.json has no prepare script, so pnpm doesn't run the tsup build after fetching from git, leaving dist/ missing and the package broken at import time. The tarball is pre-built and lives in the fork's tree. TODO: drop this once @srsholmes/tauri-playwright@0.3.0 ships to npm.

Revert both to crates.io / npm registry refs (with the appropriate 0.3.x version) once the fork is merged upstream and published.

Capability extension: the auto-generated playwright.json capability (in src-tauri/build.rs) now targets ["main", "settings", "viewer-*"] instead of just ["main"]. Without this, the plugin's pw_result IPC callback would be rejected by Tauri's permission system when evaluating into a viewer or settings window. The eval itself would land in the webview, but the result callback would never get back to the test runner.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Extend E2E tests: coverage-driven walk

Plan

Per-spec analysis

settings.spec.ts

git-portal.spec.ts

viewer.spec.ts

indexing.spec.ts

network-toggle.spec.ts

error-pane.spec.ts

file-operations.spec.ts

conflict-move.spec.ts / conflict-copy.spec.ts / conflict-edge-cases.spec.ts

app.spec.ts

accessibility.spec.ts

file-watching.spec.ts

mtp.spec.ts / mtp-conflicts.spec.ts

Step 7: mutation testing

Tools status

Slices and results

Tests added (5 total)

Surviving mutants worth a follow-up

Verdict on long-term CI integration

Step 7 follow-up: deep mutation-testing pass

Property-based testing fit (investigation)

IPC contract coverage (investigation)

State-machine coverage (investigation)

Coverage fill-in (follow-up commits)

IPC contract test layer (follow-up commits)

Final state

Tests added across the push

Suite size delta

Suite timing delta

Real bugs fixed

Dead code removed

Honest verdict per technique

Outstanding items

Final validation

Follow-up: multi-window migration (viewer + settings)

FilesExpand file tree

extend-e2e-tests.md

Latest commit

History

extend-e2e-tests.md

File metadata and controls

Extend E2E tests: coverage-driven walk

Plan

Per-spec analysis

settings.spec.ts

git-portal.spec.ts

viewer.spec.ts

indexing.spec.ts

network-toggle.spec.ts

error-pane.spec.ts

file-operations.spec.ts

conflict-move.spec.ts / conflict-copy.spec.ts / conflict-edge-cases.spec.ts

app.spec.ts

accessibility.spec.ts

file-watching.spec.ts

mtp.spec.ts / mtp-conflicts.spec.ts

Step 7: mutation testing

Tools status

Slices and results

Tests added (5 total)

Surviving mutants worth a follow-up

Verdict on long-term CI integration

Step 7 follow-up: deep mutation-testing pass

Property-based testing fit (investigation)

IPC contract coverage (investigation)

State-machine coverage (investigation)

Coverage fill-in (follow-up commits)

IPC contract test layer (follow-up commits)

Final state

Tests added across the push

Suite size delta

Suite timing delta

Real bugs fixed

Dead code removed

Honest verdict per technique

Outstanding items

Final validation

Follow-up: multi-window migration (viewer + settings)