Status: in progress.
| # | Spec | Status | Tests before | Tests added | Notes |
|---|---|---|---|---|---|
| 1 | settings.spec.ts | done | 5 | 3 | sidebar filter, no-match, arrow-key from search input |
| 2 | git-portal.spec.ts | done | 2 | 2 | tags/v1.0.0, commits/ short-SHA listing |
| 3 | viewer.spec.ts | done | 10 | 1 | no-matches search state (W-key wrap test dropped, see notes) |
| 4 | indexing.spec.ts | done | 3 | 0 | already byte-exact end-to-end; no gap warrants e2e |
| 5 | network-toggle.spec.ts | done | 4 | 0 | existing 4 cover the user-visible UX; deeper coverage is unit-level |
| 6 | error-pane.spec.ts | done | 3 | 1 | folder-path display + collapsed-details disclosure |
| 7 | file-operations.spec.ts | done | 8 | 0 | thorough already; rename-conflict gap noted below |
| 8 | conflict-move.spec.ts | done | 3 | 0 | exhaustive merge / skip / rollback coverage |
| 9 | conflict-copy.spec.ts | done | 7 | 0 | policy matrix already saturated |
| 10 | conflict-edge-cases.spec.ts | done | 7 | 0 | rollback + symlinks + type mismatches all covered |
| 11 | app.spec.ts | done | 14 | 2 | F7-Cancel button and F8 delete confirm dialog |
| 12 | accessibility.spec.ts | done | 20 | 0 | already covers main flows in light + dark |
| 13 | file-watching.spec.ts | done | 11 | 0 | CRUD + batch + threshold + dedup all covered |
| 14 | mtp.spec.ts | done | 21 | 0 | 21 tests across browse/copy/move/delete/rename/read-only |
| 15 | mtp-conflicts.spec.ts | done | 5 | 0 | move-conflict matrix saturated |
| 16 | smb.spec.ts | skip | N/A | N/A | Out of scope per brief |
Source surface: src/routes/settings/+page.svelte, src/lib/settings/components/SettingsSidebar.svelte,
settings-search.ts, settings-registry.ts.
Behaviors covered (before):
- Renders, sidebar shows sections, expected section names present, search input accepts text, clicking a sidebar item selects it.
Gaps identified:
- Search debounce + sidebar filtering (search actually narrows the visible list).
- Empty-result branch (`zzzyyyxxxnomatch`-style query): sidebar collapses to zero items, clear (×) recovers.
- Clear-search button (`.search-clear`) round-trip.
- Arrow Up/Down in the search box drives section selection (the search box has its own `handleSearchKeydown`, separate from the section-tree listbox).
- Escape closes the settings window (out of scope: the spec runs many tests that need the window open).
- `?section=...` URL deep-link (out of scope: requires a window reload).
- `navigate-to-section` Tauri event (covered indirectly by the volume picker test; testing it from within the Settings window's own context is non-load-bearing).
- Last section persistence (`saveLastSettingsSection`): also needs a reload.
Tests added (3):
- `search narrows the visible sidebar sections and clearing restores them`: drives the debounced filter with `accent` (matches one Appearance row), then clicks the `×` and asserts the full list is back.
- `search shows an empty sidebar for queries with no matches`: covers the no-match branch and confirms the clear button is still reachable; cleans up search state for the next test.
- `Arrow Down in the search box moves section selection forward`: covers the dual-keydown path in `SettingsSidebar.handleSearchKeydown` (Arrow keys in input forward to `navigateSections`); clears any leftover search up front so a `.selected` row is present.
Skipped (with reason): Escape-closes-window, URL deep-link, last-section persistence. All need a window reload that the shared test suite isn't set up to do cleanly.
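The narrowing, no-match, and clear behaviours exercised by the new tests can be illustrated with a small pure function. This is a minimal sketch, not the code in `settings-search.ts`: the `SidebarSection` shape, the `filterSections` name, and the sample data are all assumptions made for illustration, and the real filter is debounced.

```typescript
// Hypothetical section shape; the real registry in settings-registry.ts may differ.
interface SidebarSection {
  id: string;
  title: string;
  keywords: string[];
}

// Case-insensitive substring match over title and keywords.
// An empty or whitespace-only query means "no filter": return everything.
function filterSections(sections: SidebarSection[], query: string): SidebarSection[] {
  const needle = query.trim().toLowerCase();
  if (needle === "") return sections;
  return sections.filter(
    (s) =>
      s.title.toLowerCase().includes(needle) ||
      s.keywords.some((k) => k.toLowerCase().includes(needle)),
  );
}

// Illustrative data only.
const demoSections: SidebarSection[] = [
  { id: "appearance", title: "Appearance", keywords: ["theme", "accent color"] },
  { id: "network", title: "Network", keywords: ["smb", "mdns"] },
  { id: "indexing", title: "Indexing", keywords: ["drive index"] },
];

const narrowedSections = filterSections(demoSections, "accent");
const noMatchSections = filterSections(demoSections, "zzzyyyxxxnomatch");
const restoredSections = filterSections(demoSections, "");
```

The three results correspond to the three observable sidebar states: one visible row, zero rows, and the full list after the query is cleared.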
Source surface: src-tauri/src/file_system/git/{virtual_listing,path,tree}.rs. Frontend pane orchestration is
generic; the portal lives entirely in the volume hooks.
Behaviors covered (before): 2 active tests (portal root entries; branches/main tree) + 2 skipped (cross-volume copy; portal toggle), both documented.
Gaps identified:
- `tags/<tag>` navigation (exercises `resolve_ref_commit`, including annotated-tag peel and the dot-bearing ref parse in `classify`).
- `commits/` listing (exercises `list_commits` end-to-end via the volume hook; M3-era code path).
- Friendly error rendering inside the portal (would need an injected gix error; `error-pane.spec.ts` already covers FriendlyError for filesystem errors and a Rust test covers it for git).
Tests added (2):
- `navigates tags/v1.0.0 and sees the tree at the tagged commit`: covers the tag-resolving branch and dot-in-ref parser path.
- `navigates commits/ and shows the single HEAD commit by short SHA`: covers `list_commits` integration via the listing pipeline; regex-checks for a 7+ hex name to avoid pinning to a specific SHA across git versions.
Skipped (with reason): Friendly git error rendering. Already covered by Rust unit tests + the broader friendly-error
path is exercised by error-pane.spec.ts.
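The "7+ hex name" check in the `commits/` test can be sketched as below. The exact pattern used in the spec is not shown in these notes, so the regex and the `looksLikeShortSha` helper are assumptions that illustrate the idea of matching the shape of an abbreviated commit id rather than a specific value.

```typescript
// Matches an abbreviated commit id: at least 7 lowercase hex characters and nothing else.
const SHORT_SHA = /^[0-9a-f]{7,}$/;

// Hypothetical helper name, used here only for illustration.
function looksLikeShortSha(entryName: string): boolean {
  return SHORT_SHA.test(entryName);
}
```

Matching on shape keeps the assertion stable when the fixture repository is regenerated or when git picks a longer abbreviation.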
Source surface: src/routes/viewer/+page.svelte plus the composables viewer-search.svelte.ts,
viewer-scroll.svelte.ts, viewer-line-heights.svelte.ts. Backend: src-tauri/src/file_viewer/.
Behaviors covered (before):
- Render container, line elements, file name in status bar, line count, file size, backend badge, Ctrl+F opens search, search finds matches, Escape closes search, missing-path error.
Gaps identified:
- No-match search state (UI says "No matches"). This exercises the "done" status branch of `searchStatus` and confirms `aria-live` content.
- W toggles word wrap (cross-state setting + CSS class flip).
- Enter advances to next match (already covered indirectly because the `findMatches` test pulls a match count, but no test confirms navigation).
- F3 from file list opens viewer (cross-component; opens a NEW Tauri window, outside the test's single-window scope; deferred).
- Line heights variant testing (FullLoad pretext path): deep internal, deferred.
Tests added (1):
- `shows "No matches" status for a query with no hits`: fills with `Z` × 40 (the fixture is `A` × 1024, so it cannot match), polls the `.match-count` aria-live region for "No matches". Resets the query in cleanup.
Skipped (with reason): F3-opens-viewer (cross-window flow), line-heights internals (tier 3 candidates), W toggles
word wrap (the synthetic keydown reaches <svelte:window on:keydown> but doesn't flip the wrap class reliably;
investigating focus / viewerSetWordWrap IPC side effects would consume disproportionate time for a single-key path;
deferred with a TODO).
Source surface: src-tauri/src/file_system/index/ (renamed to `indexing` in the codebase). Frontend reads via
get_dir_stats.
Behaviors covered (before): 3 thorough tests: initial dir size from index, exact-byte increase on file creation, exact-byte decrease on file deletion. UI-side numeric size in Full view also checked.
Gaps identified:
- All key flows are already covered. The Scanning... → numeric transition is implicitly covered. Edge cases (non-existent path → null, very large directories) would be unit-level.
Tests added: 0. Reason: The existing suite hits the indexing pipeline end-to-end with byte-exact assertions for both create and delete. Adding more would either duplicate or descend into Rust-side unit territory.
Source surface: src/lib/volumes/ (frontend volume picker), src-tauri/src/file_system/volume/network/ (backend
mDNS).
Behaviors covered (before): Default label, toggle-off label, toggle-back label, click-disabled-leaves-volume-unchanged.
Gaps identified:
- Settings deep-link to Network section when clicking the disabled entry: `settings-window.ts` emits a `navigate-to-section` event. Already exists implicitly in code, and the test author explicitly notes inspecting the settings window is awkward via `evaluate()`.
- mDNS-actually-stops behavior: unobservable from the UI side.
Tests added: 0. Reason: Existing tests cover the user-observable UX cleanly. Backend mDNS-stop is unit-level. The deep-link assertion would require spawning the settings window from the test, which is structurally fragile.
Source surface: src/lib/file-explorer/pane/ErrorPane.svelte +
src-tauri/src/file_system/listing/friendly_error.rs (error classification).
Behaviors covered (before): ETIMEDOUT (transient with retry), retry-clears-error-state, EACCES (NeedsAction without retry), accessibility (role/heading).
Gaps identified:
- Folder path display (user must see WHICH directory failed).
- `<details>` technical-details disclosure default-collapsed + click-to-expand.
- Retry info text rendering after multiple clicks (deep UX; gated by hitting retry repeatedly within seconds).
- `x-apple.systempreferences:` link handling: unit-testable; production-impactful but doesn't load-bear here.
Tests added (1):
- `shows the offending folder path and a collapsed technical details disclosure`: injects ETIMEDOUT, asserts `.folder-path` ends with `/left/sub-dir`, then verifies `<details>` starts without the `open` attribute and gains it after clicking the summary.
Skipped (with reason): Retry-info-after-multi-click: feels like UX polish coverage; deferred unless we see regressions.
Source surface: src-tauri/src/file_system/write_operations/{copy,move,rename,mkdir}.rs. Frontend:
src/lib/file-operations/**, src/lib/file-explorer/views/**.
Behaviors covered (before) (8): F5 copy, F6 move, F2 rename, F7 mkdir, view mode toggle, hidden files toggle, command palette, empty directory.
Gaps identified:
- Local rename to an existing name (rejection → `rename-conflict` dialog). MTP rename rejection IS tested in `mtp.spec.ts`; local equivalent isn't.
- ⌘A / Ctrl+A select-all in pane (combined with F5 for multi-file copy).
- Cancel button on transfer dialog (only Escape tested).
Tests added: 0.
Reason: The existing spec covers the success path for every write operation end-to-end with byte-level assertions, plus negative cases for the structural flows (empty dir, view toggle). The rename-conflict dialog gap is real but adding it now risks duplicating the structurally-identical MTP rejection test for marginal coverage. The flow lives at the rename UI component level and is a good candidate for a tier-3 jsdom test rather than another full E2E round-trip.
Source surface: src-tauri/src/file_system/write_operations/{copy,move}.rs plus transfer-conflict-policy UI.
Behaviors covered (before) (17 across three files): Overwrite All, Skip All, per-file decisions, Rename, Rename All, Layout A nested conflicts, Layout B multi-item merges, mid-operation rollback, sequential conflicts, symlinks, type mismatches (file↔dir).
Gaps identified:
- Same-volume copy with both conflict and non-conflict mixed (already covered by Layout A/B).
- ⌃Z/⌘Z to undo a completed transfer (the app doesn't have undo today).
Tests added: 0. Reason: The conflict-policy matrix is saturated by the three files together. Adding more cases dilutes signal.
Source surface: src/routes/(main)/+page.svelte and the global keyboard dispatch (command-dispatch.ts,
command-registry.ts).
Behaviors covered (before) (14): Render, dual pane, file entries, arrow nav, Tab pane switch, Space toggle, click cursor move, click pane focus, Enter into dir, Backspace to parent, F7 mkdir dialog open/cancel, F7 mkdir create, F5 copy dialog open/escape, F6 move dialog open/escape.
Gaps identified:
- F8 opens delete confirmation (vs. ⇧F8 which is permanent delete).
- Cancel button on the mkdir dialog (was only tested via Escape).
- Cancel button on transfer dialogs.
- ⌘A select-all keyboard flow.
Tests added (2):
- `Cancel button closes the new folder dialog without creating anything`: exercises the `.btn-secondary` path through `ModalDialog`, asserts no folder was created (file-entry count unchanged).
- `opens the delete confirmation dialog with F8`: F8 opens the `delete-confirmation` dialog (the recycle-bin path, not ⇧F8); Escape closes it and leaves the file under cursor in place.
Skipped (with reason): ⌘A and Cancel-button on the transfer dialog. The transfer-dialog Cancel button is wired
through the same path as Escape (both call the same closeDialog), so the additional test would duplicate signal.
Source surface: src/lib/test-a11y.ts runner + axe-core rules across each dialog snapshot in light + dark.
Behaviors covered (before) (20): Main explorer, every major dialog (Copy/Delete/Move/About/License/Command palette/Search/Settings/File viewer) in both light and dark modes.
Gaps identified:
- Error pane in axe (`error-pane.spec.ts` covers ARIA explicitly).
- Network volume picker open state: narrower coverage.
Tests added: 0. Reason: Each frame is already audited in both modes; structural a11y for individual components
lives at tier 3 (*.a11y.test.ts). Adding more axe snapshots without a clear missing-component would inflate the suite.
Source surface: src-tauri/src/file_system/watch/ + the frontend watcher subscription in file-explorer.
Behaviors covered (before) (11): External create (file + dir), delete, rename, modify-size, batch (25), 600-threshold (Linux only), watched-dir deletion, dual-pane sync, in-app-copy dedup, hidden-file filtering.
Gaps identified:
- Permissions-change watching: out of e2e scope.
- Watcher behavior under symlink resolution.
Tests added: 0. Reason: The spec already covers the full CRUD matrix plus the structural edge cases (threshold,
dedup, hidden-file filtering, watched-dir deletion). The remaining gaps are too low-level for an E2E round-trip and
would belong in notify-level Rust tests.
Behaviors covered (before) (26 total): Volume picker, browse, free space, copy bidirectional, move within and across, delete (single, multi, recursive), mkdir, rename, rename rejection, read-only enforcement, Cmd+C/X/V rejection toasts, 50 MB transfer in both directions, external add detection, MTP-to-local and same-volume conflict matrix (overwrite/skip).
Gaps identified:
- MTP rename to dotfile (filesystem reserved-name handling): valid but niche.
- MTP filename Unicode round-trip: covered indirectly by the SMB unicode tests (skipped on macOS).
Tests added: 0. Reason: 26 tests cover the user-observable surface end-to-end. Adding marginal cases would test virtual-device internals rather than user flows.
One-off investigation: ran cargo-mutants (Rust) and
stryker-mutator (TypeScript) on two focused slices to find under-tested code paths in
already-unit-tested files. Both tools worked; neither is wired into CI. Total: 5 unit tests added (3 vitest, 2 nextest).
- cargo-mutants v27.0.0: works out of the box after `cargo install --locked cargo-mutants`. Gotcha: the default test command runs the full crate's nextest suite, and four `indexing::reconciler` tests fail when run from cargo-mutants's reflinked tmp directory (they create tempdirs in CWD to avoid `/private/tmp/` matching `EXCLUDED_PREFIXES`, but `/var/folders/...cargo-mutants-...tmp/` also matches). Workaround: scope tests to the slice via positional args, e.g. `cargo mutants --file '**/eta.rs' --timeout 60 --no-shuffle --test-tool nextest -- --lib file_system::write_operations::eta`.
- stryker-mutator v9.6.1 (`@stryker-mutator/core`, `@stryker-mutator/vitest-runner`): worked after three config tweaks. Required `plugins: ["@stryker-mutator/vitest-runner"]` to load the runner; `inPlace: true` to avoid the `oxc` transformer failing on the sandbox copy ("Tsconfig not found" for unrelated `.a11y.test.ts` files); and leaving the vitest config alone (no `dir`, no `related: false`): `perTest` coverage analysis picks the right test files automatically. Took ~12 s total for one mutated file (45 mutants, 4 workers).
Rust slice: apps/desktop/src-tauri/src/file_system/write_operations/eta.rs (EWMA-based ETA estimator for write
operations, already had 10 unit tests).
| Metric | Count |
|---|---|
| Total | 58 |
| Caught | 34 |
| Missed | 19 |
| Unviable | 5 |
| Timeout | 0 |
Mutation score: 34 / (34 + 19) ≈ 64 %. Full run took ~9 min (after the 73 s baseline build), ~7–8 s per mutant.
TypeScript slice: apps/desktop/src/lib/file-operations/scan-throughput.ts (front-end rate calculator for the
scan-preview UI, 6 unit tests).
| Metric | Count |
|---|---|
| Total | 45 |
| Killed | 31 |
| Survived | 13 |
| Timeout | 1 |
| Compile errors | 0 |
Mutation score: 31 / 45 ≈ 69 %. Whole run took ~12 s.
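Both scores above follow the same arithmetic: caught mutants divided by all mutants that built, with unviable ones excluded. A minimal sketch of that calculation follows; the `MutationTally` shape and function name are illustrative, not part of either tool. Stryker's own report counts timeouts as detected, so the figure it prints may be slightly higher than the one computed here.

```typescript
interface MutationTally {
  caught: number;   // "Caught" (cargo-mutants) or "Killed" (stryker)
  missed: number;   // "Missed" / "Survived"
  timeout: number;
  unviable: number; // mutants that failed to build; excluded from the score
}

// Score as computed in this report: caught / (caught + missed + timeout).
function mutationScore(t: MutationTally): number {
  const viable = t.caught + t.missed + t.timeout;
  return viable === 0 ? 0 : t.caught / viable;
}

const etaTally: MutationTally = { caught: 34, missed: 19, timeout: 0, unviable: 5 };
const scanTally: MutationTally = { caught: 31, missed: 13, timeout: 1, unviable: 0 };

const etaPercent = Math.round(mutationScore(etaTally) * 100);   // 34 / 53
const scanPercent = Math.round(mutationScore(scanTally) * 100); // 31 / 45
```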
All passed first clean run and again after ./scripts/check.sh.
scan-throughput.test.ts (vitest), targeting stryker survivors in dropStale:
- `keeps the right window of samples when many arrive over a long span`. Four samples with non-linear progression; kills the `cutoff = nowMs + windowMs` mutant (line 68) and the `length > 2 → length >= 2 / <= 2` boundary mutants on line 70 that the previous 3-sample window test couldn't differentiate (algebraically equal rates).
- `always keeps at least two samples even after a long pause`. Pushes one sample 60 s after a small starting pair; kills the `while (true && ts < cutoff)` and `length <= 2 && ts < cutoff` mutants that would empty the buffer and return null.
- `treats the cutoff timestamp as inclusive (strict less-than)`. Sample exactly on the cutoff boundary; kills the `< cutoff → <= cutoff / >= cutoff` mutants on line 70.
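The invariants these three tests pin down (cutoff is `now - window`, at least two samples survive, a sample exactly on the cutoff is kept) can be sketched as follows. This is a reconstruction from the mutant descriptions above, not the contents of `scan-throughput.ts`; the class name, field names, and `ratePerSecond` are assumptions.

```typescript
interface ThroughputSample {
  ts: number;    // timestamp in ms
  count: number; // cumulative items scanned at ts
}

class SlidingRate {
  private samples: ThroughputSample[] = [];
  private readonly windowMs: number;

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  push(ts: number, count: number): void {
    this.samples.push({ ts, count });
    this.dropStale(ts);
  }

  // Drop samples strictly older than the cutoff, but never go below two:
  // a rate needs two points. A sample exactly on the cutoff is kept.
  private dropStale(nowMs: number): void {
    const cutoff = nowMs - this.windowMs;
    while (this.samples.length > 2 && this.samples[0].ts < cutoff) {
      this.samples.shift();
    }
  }

  // Items per second across the retained window, or null without a usable pair.
  ratePerSecond(): number | null {
    if (this.samples.length < 2) return null;
    const first = this.samples[0];
    const last = this.samples[this.samples.length - 1];
    const dtMs = last.ts - first.ts;
    return dtMs > 0 ? ((last.count - first.count) / dtMs) * 1000 : null;
  }

  size(): number {
    return this.samples.length;
  }
}
```

With a 5 s window, pushing samples at 0 ms, 100 ms, and 60 100 ms leaves two samples rather than one, so a rate is still available after the pause.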
eta.rs mod tests (nextest), targeting cargo-mutants survivors in EtaEstimator::update:
- `rate_division_uses_dt_not_a_constant`. Drives the estimator with `dt = 2.0 s` so `delta / dt` and `delta * dt` differ by 4×; existing tests all used 1 s steps where the two are numerically identical. Kills `/ → *` and `/ → %` on lines 152, 153.
- `first_post_seed_sample_initializes_rate_directly`. Asserts exact rate after exactly two `update()` calls: the `samples == 0` branch sets the rate directly to the instantaneous rate; existing 3-sample tests masked the mutant `samples != 0` because the EWMA caught up. Kills the `== 0 → != 0` mutant on line 159 and the `1.0 - (-dt / TAU).exp()` alpha arithmetic mutants on line 157 (those changes throw `alpha` off by a large factor, which shows up immediately after one post-seed sample).
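To show why a 2 s step separates `delta / dt` from `delta * dt`, and why the first post-seed sample is the place to catch the `samples == 0` mutant, here is a TypeScript transliteration of the update rule as described above. It is a sketch under assumptions: the class and field names are invented, and `TAU_SECONDS` is a placeholder rather than the constant used in `eta.rs`.

```typescript
const TAU_SECONDS = 5; // assumed smoothing time constant, not the value in eta.rs

class EwmaRate {
  private lastValue: number | null = null;
  private lastTime = 0;
  private rate = 0;
  private samples = 0; // number of rate samples folded in so far

  // value: cumulative bytes; timeSec: monotonic time in seconds.
  update(value: number, timeSec: number): void {
    if (this.lastValue === null) {
      // Seed call: remember the starting point, no rate yet.
      this.lastValue = value;
      this.lastTime = timeSec;
      return;
    }
    const dt = timeSec - this.lastTime;
    if (dt <= 0) return; // ignore a clock that did not advance
    const instantaneous = (value - this.lastValue) / dt;
    if (this.samples === 0) {
      // First post-seed sample: take the instantaneous rate as-is.
      this.rate = instantaneous;
    } else {
      const alpha = 1 - Math.exp(-dt / TAU_SECONDS);
      this.rate = alpha * instantaneous + (1 - alpha) * this.rate;
    }
    this.samples += 1;
    this.lastValue = value;
    this.lastTime = timeSec;
  }

  bytesPerSecond(): number {
    return this.rate;
  }
}
```

Feeding 200 bytes over 2 s gives 100 B/s; with the `/ → *` mutant the same input would give 400, which a 1 s step cannot distinguish.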
Stopped well short of the 10-test budget. The remaining cargo-mutants survivors land in `eta_from_axes` (a
saturating two-axis ETA selector). High-value follow-ups, in order:
- `eta.rs:223 > → ==/</>=` in `eta_from_axes` (3 mutants): `bytes_remaining > 0` guard.
- `eta.rs:224 / → %`, `eta.rs:231 / → %`: axis-rate ÷ axis-remaining divisions.
- `eta.rs:225, 232 == → !=`: short-circuit early-return checks on rates.
- `eta.rs:230 > → >=`: files axis boundary check.
- `eta.rs:247 > → ==/</>=` (3 mutants), `eta.rs:247:64 > → >=`: final-pick comparison.
- `eta.rs:190 && → ||` in `compute_stats`: guard combining the two axes.
- Stryker survivors that the new tests do not kill, e.g. `scan-throughput.ts:31 constructor body → {}` (windowMs never read; easy fix: a test that constructs with two different `windowMs` and asserts different drop behavior).
The > 0 ? fps : 0 mutants on scan-throughput.ts:62, 63 (turning > into >=) are equivalent mutants: when
fps == 0, both branches return 0, so no test can differentiate them. Skip.
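The equivalence is easy to confirm by brute force. The snippet below restates the guard and its mutant as two standalone functions (names are illustrative) and compares them over a handful of probe values, including the boundary.

```typescript
// Original guard and the `>` → `>=` mutant, restated as standalone functions.
const clampOriginal = (fps: number): number => (fps > 0 ? fps : 0);
const clampMutant = (fps: number): number => (fps >= 0 ? fps : 0);

// The two only take different branches at fps === 0, where both yield 0.
const probeValues = [-5, -0.001, 0, 0.001, 1, 1e9, Number.NaN, Number.POSITIVE_INFINITY];
const mutantIsEquivalent = probeValues.every((fps) => clampOriginal(fps) === clampMutant(fps));
```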
Don't wire either tool into the main ./scripts/check.sh pipeline. Both are too slow and too noisy for a per-commit
gate. Concretely:
- cargo-mutants: ~9 min for one 600-line file with the cmdr workspace's deps. Running the full `src-tauri/` crate (~150 source files) would be hours, and many modules would need per-slice test filters to dodge the CWD-sensitive reconciler tests. Worth keeping the binary install in dev's mise toolchain and running ad-hoc on hot spots when adding non-trivial numeric / state-machine code (eta, write-op state machines, conflict resolution, indexing, etc.).
- stryker: ~12 s per file is fast, but the config has sharp edges (`oxc` sandbox issues, plugin discovery) and there's no obvious gain from a CI-blocking gate. Same recommendation as cargo-mutants: ad-hoc on numeric / branching FE utilities (scan-throughput, eventually `font-metrics`, `accent-color`, etc.).
Concrete next steps if pushing this further:
- `cargo mutants --in-diff` (only mutate lines changed by a PR) as an optional GitHub Actions workflow that runs on labelled PRs, not as a check. Output stays advisory.
- Skip the workspace-setup churn by committing a `.cargo/mutants.toml` with `additional_cargo_test_args = ["--lib", "file_system::write_operations::eta"]`-style slice-scoped configs per hot module.
- Don't add stryker config to the repo. The 3-tweak setup is small enough to redo when needed.
A second pass walked five more hot-spot modules. cargo-mutants --list is fast (no build), so I used it to enumerate
mutants per file, then read each module and added tests targeting the structurally surviving ones, skipping a full
cargo mutants run because the baseline build alone is ~10–15 min per file. Trade-off: the new tests aren't proven
mutation-killers in the strict sense, but they directly cover the mutated lines and behavior, which is what the killer
tests in Step 7 ended up doing anyway.
Total: 50 new unit tests across 5 modules, ~0.15 s combined runtime. All 1 699 lib tests still pass.
- `file_system/write_operations/state.rs` (50 mutants, +30 tests): from zero existing tests. Covered the `OperationIntent` state machine (`from_u8`, `load_intent`, `is_cancelled`), `cancel_write_operation` transitions (Running→{RollingBack,Stopped}, RollingBack→Stopped, Stopped terminal, conflict-sender drop), `cancel_all_write_operations`, `resolve_write_conflict`, status-cache CRUD (`register` / `update` / `unregister` / `list_active_operations` / `get_operation_status`, including the bytes-vs-files percent axis and the `.min(100.0)` clamp), `FileInfo` sort keys, and `CopyTransaction` commit / rollback / Drop. Highest-leverage module of the pass: the state machine and status cache back every cancel click and every progress query, with zero coverage before.
- `file_system/write_operations/copy_strategy.rs` (16 mutants, +5 tests): the `is_apfs` and `is_same_apfs_volume` helpers were only covered indirectly through `copy_file_with_strategy`. Direct positive/negative tests on macOS now pin the device-id comparison, the parent-fallback when the destination doesn't exist, and the `f_fstypename == "apfs"` check.
- `file_system/watcher.rs` (35 mutants, +6 tests): `is_entry_modified` watches five axes (size, mtime, perms, is_directory, is_symlink); the existing tests only varied size, so every `||` chain mutant survived. New axis-by-axis tests plus one negative anchor (owner/group changes must NOT trigger a modify diff) and one structural pin for `compute_diff`'s index semantics (remove uses OLD index; add/modify use NEW).
- `file_system/write_operations/chunked_copy.rs` (16 mutants, +3 tests): existing tests covered byte fidelity and basic permissions but never checked the metadata-preservation side effects. New tests stamp a fixed mtime via `filetime` and roundtrip a user xattr (macOS): kills `copy_timestamps → Ok(())`, `copy_xattrs → Ok(())`, `copy_metadata → Ok(())`. Plus a multi-chunk byte-total assertion that kills the `total_bytes += bytes_read` arithmetic mutants the existing progress test couldn't differentiate.
- `indexing/store.rs` (~150 mutants overall, +6 tests on `platform_case_compare` / `normalize_for_comparison`): the SQLite collation backing path resolution. Only test before was a single macOS happy-path that couldn't catch the `→ Ordering::Equal` mutant (since happy-path comparisons are equal anyway). New tests pin distinct ordering for distinct names, case-insensitivity, NFC↔NFD normalization equivalence on macOS, and binary comparison off macOS.
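The axis-by-axis approach used for `is_entry_modified` is worth illustrating, because it is what kills `||`-chain mutants: each test case differs from the baseline in exactly one compared field. The sketch below mirrors the idea in TypeScript; the `EntryMeta` shape and function name are assumptions, not a port of `watcher.rs`.

```typescript
// Hypothetical mirror of the five compared axes; owner/group exist but are deliberately ignored.
interface EntryMeta {
  size: number;
  mtimeMs: number;
  permissions: number;
  isDirectory: boolean;
  isSymlink: boolean;
  owner: string;
  group: string;
}

function isEntryModified(a: EntryMeta, b: EntryMeta): boolean {
  return (
    a.size !== b.size ||
    a.mtimeMs !== b.mtimeMs ||
    a.permissions !== b.permissions ||
    a.isDirectory !== b.isDirectory ||
    a.isSymlink !== b.isSymlink
  );
}

const baseEntry: EntryMeta = {
  size: 1,
  mtimeMs: 1000,
  permissions: 0o644,
  isDirectory: false,
  isSymlink: false,
  owner: "alice",
  group: "staff",
};

// One variant per axis: each differs from the base in exactly one compared field,
// so removing any single clause from the || chain makes one variant go undetected.
const axisVariants: EntryMeta[] = [
  { ...baseEntry, size: 2 },
  { ...baseEntry, mtimeMs: 2000 },
  { ...baseEntry, permissions: 0o600 },
  { ...baseEntry, isDirectory: true },
  { ...baseEntry, isSymlink: true },
];

const everyAxisDetected = axisVariants.every((v) => isEntryModified(baseEntry, v));
// Negative anchor: ownership changes alone must not count as a modification.
const ownerChangeIgnored = !isEntryModified(baseEntry, { ...baseEntry, owner: "root", group: "wheel" });
```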
Modules examined but skipped:
- `file_viewer/{line_index,session,byte_seek,full_load}.rs`: already 50+ tests including UTF-16 column tracking, cancellation, sparse-index checkpoints, multi-byte content. The 240+ mutants here are mostly inside private session state and search-pos accounting where additional tests would duplicate existing-test logical coverage.
- `file_system/write_operations/{delete,trash,helpers}.rs`: top-level `*_with_progress` functions depend on `tauri::AppHandle` for event emission; meaningful tests would require a mock-emitter refactor that exceeds the "minimal refactor for testability" budget. `move_to_trash_sync` (the pure-Rust core) is already covered.
No bugs surfaced in this pass: every survivor was a real coverage gap, not buggy live code. The state-machine guard in
cancel_write_operation is correct: Running→{RollingBack,Stopped}, RollingBack→Stopped, Stopped terminal, exactly as
the doc-comment promises.
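The transition table described here is small enough to write out. The following is a TypeScript sketch of the guard, not the Rust in `state.rs`: the type and function names are invented, and treating a repeated `RollingBack` request as a rejection is an assumption, since these notes do not say how that case is handled.

```typescript
type WriteIntent = "Running" | "RollingBack" | "Stopped";
type CancelRequest = "RollingBack" | "Stopped";

// Returns the next state, or null when the request is rejected.
function nextIntent(current: WriteIntent, requested: CancelRequest): WriteIntent | null {
  switch (current) {
    case "Running":
      // Running may go to either cancel state.
      return requested;
    case "RollingBack":
      // A rollback in progress can only be escalated to Stopped.
      // Assumption: a repeated RollingBack request is rejected rather than ignored.
      return requested === "Stopped" ? "Stopped" : null;
    case "Stopped":
      // Terminal: nothing leaves Stopped.
      return null;
  }
}
```

Testing through a function like this, rather than writing the state directly, is what exercises the rejection branches.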
proptest and quickcheck aren't in use anywhere in apps/desktop/src-tauri/. The full investigation lives at
/tmp/cmdr-property-testing-report.md; this section is the short takeaway.
The most algorithmic, pure spots in the crate are already covered by tight example tests: eta::EtaEstimator (12 tests,
two named mutant-survivor targets), listing::sorting (32 tests), validation (13 tests). Adding proptest there gives
diminishing returns.
The clear net-positive proptest targets, in order:
- `indexing::aggregator::topological_sort_bottom_up`: 1 example test for a function with non-trivial tree invariants. Cycle and duplicate-ID behavior isn't asserted today.
- `search::query::glob_to_regex`: 4 example tests; infinite input space; output feeds a regex engine that panics on malformed input. "Output is always valid regex" is a one-line property and a real safety net.
- `search::query::split_scope_segments`: 10 example tests for a parser with nested escape/quote rules. Round-trip and segment-count properties are cheap.
- `indexing::store::platform_case_compare` (macOS): comparator-law properties (reflexivity, antisymmetry, transitivity) plus NFC≡NFD equivalence. Highest user impact because miscompare corrupts the search index.
Verdict: worth adding proptest as a dev-dependency for these four targets specifically, ~half a day of work. Not worth
a project-wide convention. Don't introduce it for ETA, sorting, or validation: example tests already cover the
interesting cases.
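To make the "output is always valid regex" property concrete, here is a hand-rolled version in TypeScript. It is only an illustration of the property's shape: the Rust crate would use proptest, and `globToRegexSource` below is a hypothetical converter written for this example, not a port of `search::query::glob_to_regex`.

```typescript
// Hypothetical converter: `*` matches any run, `?` matches one character,
// everything else is treated as a literal and escaped.
function globToRegexSource(glob: string): string {
  let out = "^";
  for (const ch of glob) {
    if (ch === "*") out += ".*";
    else if (ch === "?") out += ".";
    else out += ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  }
  return out + "$";
}

// Small deterministic generator so failures are reproducible from the seed.
function lcg(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

// Alphabet biased toward regex metacharacters, where a converter is most likely to break.
const GLOB_ALPHABET = "ab*?.[](){}+^$|\\/- ";

function randomGlob(next: () => number): string {
  const len = Math.floor(next() * 12);
  let s = "";
  for (let i = 0; i < len; i++) {
    s += GLOB_ALPHABET.charAt(Math.floor(next() * GLOB_ALPHABET.length));
  }
  return s;
}

// Property: for every generated glob, the converter's output compiles.
// Returns the first counterexample, or null if none was found.
function findInvalidRegexCase(runs: number, seed: number): string | null {
  const next = lcg(seed);
  for (let i = 0; i < runs; i++) {
    const glob = randomGlob(next);
    try {
      new RegExp(globToRegexSource(glob));
    } catch {
      return glob;
    }
  }
  return null;
}
```

A converter that forgot to escape `[` or `(` would fail this check within a few runs, which is the kind of bug four example tests can easily miss.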
How well are the 193 #[tauri::command] entry points (visible via bindings.ts) tested at the IPC layer (i.e., a
test actually calls the command function or mocks/invokes it by name)? Full report: /tmp/cmdr-ipc-coverage-report.md.
Counts (commit 742939e9):
- Well covered (happy + error path): 16 / 193 (8%)
- Happy path only: 11 / 193 (6%)
- Untested at the IPC layer: 166 / 193 (86%)
- Score `(well + happy/2) / total`: 0.11
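The score weights a happy-path-only command as half a well-covered one. Worked through with the counts above (the function name is illustrative):

```typescript
// Weighted IPC coverage: full credit for well-covered commands, half for happy-path-only.
function ipcCoverageScore(well: number, happyOnly: number, total: number): number {
  return (well + happyOnly / 2) / total;
}

const ipcScore = ipcCoverageScore(16, 11, 193); // (16 + 5.5) / 193
const untestedCommands = 193 - 16 - 11;
```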
Caveat that softens the headline: most commands are thin pass-throughs to *_core / ops_* helpers (AGENTS.md: "Tauri
commands are pass-throughs"), and the helpers ARE broadly tested. The 86% measures the contract boundary, not business
logic. The bindings-fresh CI check and the no-raw-tauri-invoke ESLint rule mitigate most parameter-shape drift; what
they don't catch is permission-config drift or silent rename mismatches at runtime.
Biggest gaps by feature: viewer (9 commands, 0 IPC tests), MTP (~10 commands, 0 IPC tests), licensing (~10 commands, 0
IPC tests), settings/UI mutators (most untested). The write_ops surface (create_directory, create_file,
rename_file, move_to_trash) accounts for most of the "well covered" bucket because those _core tests happen to
call the command itself.
Verdict: weak at the IPC surface, strong underneath. If we want to raise contract coverage meaningfully, the
productive move is a vitest mockIPC layer that asserts each commands.foo(...) call returns a typed shape, not
Rust-side per-command tests.
Full report: /tmp/cmdr-state-machine-report.md (read-only scan, branch e2e-speedup @ 742939e9).
Surveyed 13 genuine state machines (backend + frontend; excludes derived / progress-only enums). About 60 transitions total, roughly 35 untested (~58%).
Coverage is uneven:
- Strong: `SmbVolume::ConnectionState` (Direct ⇄ Disconnected, idempotency, single-flight reconnect), `OperationIntent` (atomic level), `SmbReconnectManager` FE, AI notification FE, MTP FE, updater FE, error-reporter `auto_dispatcher` debounce.
- Weak: `IndexPhase` (Disabled/Initializing/Running/ShuttingDown: no direct test of any transition or the start/stop race), `ActivityPhase` (six-state telemetry pipeline, no test), `DiscoveryState` (network mDNS: three transitions, no test), `network-store` `ShareState` + `CredentialStatus` FE (a11y tests only), `ConnectToServerDialog` FE.
- Tested at wrong layer: `cancel_write_operation`'s validation guard (`state.rs:306`) is bypassed by all tests, which set the atomic directly. The `RollingBack → Stopped` valid-transition assertion and the rejection of terminal-state writes are not exercised through the public API.
Top untested transitions worth adding tests for:
- `cancel_write_operation` public function (validation guard + `conflict_resolution_tx` drop).
- `IndexPhase::Initializing → Disabled` race when stop runs during `resume_or_scan`.
- `IndexPhase::Initializing → Running` happy path.
- `OperationIntent::RollingBack → Stopped` through public cancel (not just direct atomic store).
- `SearchStatus::Running → Cancelled` in file viewer.
- `DiscoveryState` transitions (`Idle → Searching → Active → Idle`).
Side finding: SmbVolume::ConnectionState::OsMount is a defined variant that is never written to the atomic. The smb2
hot-path branch handling OsMount (smb.rs:658, smb.rs:449) is dead code on the current implementation. Either wire the
transition or drop the variant.
Overall verdict: medium-strong. The transition-aware machines that matter for data safety are tested; the
orchestration-level lifecycles (IndexPhase, ActivityPhase, DiscoveryState) are not.
17 state-transition tests added (and one bug fix surfaced while writing them):
- `SmbVolume::ConnectionState`: dropped the dead `OsMount` variant. The internal state machine is now exactly the binary shape it was already operating as (Direct ⇄ Disconnected). The outer `SmbConnectionState::OsMount` (attached by `enrich_smb_connection_state` for SMB shares with an OS mount but no Cmdr smb2 session) is unchanged.
- `SearchStatus`: fix + transition test. `search_cancel` was clearing `session.search`, which made the `Cancelled` status (set by the search thread on cancel) unobservable: poll returned `Idle`. Stopped nulling the state on cancel; the thread now writes `Cancelled` and poll surfaces it. New test pins `Running → Cancelled` and the reset-on-new-start contract.
- `DiscoveryState`: three transition tests (`Idle → Searching → Active → Idle`, `Searching → Idle` via drain, drain side effects). Factored `set_discovery_state` and `drain_discovered_hosts` out of the event-emitting public paths so the state machine fragment is testable without standing up a Tauri runtime.
- `ActivityPhase`: nine tests covering the full scan pipeline (`Idle → Replaying → Live`, `Scanning → Aggregating → Reconciling → Live`), the shutdown path (`* → Idle`), the duration-closing branch the timeline UX depends on, the 20-entry ring-buffer cap, `reset`, and `close_phase_with_stats` attaching to the current entry.
- `IndexPhase`: four tests: `Initializing → Disabled` via the public `stop_indexing` race path, the two catch-all no-op arms (`stop_indexing` from `Disabled`, `clear_index` from non-`Running`), and the pure `is_initializing_phase` classifier the post-`resume_or_scan` decision now goes through. `start_indexing`'s full happy path needs an `AppHandle` and `IndexManager` and remains untested at unit-test level. The stress tests cover the writer-layer machinery underneath.
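The `SearchStatus` fix is easiest to see in a reduced model. The sketch below is a TypeScript stand-in for the Rust session, with invented names (`ViewerSessionModel`, `workerTick`) and a single-threaded simulation of the search thread; it shows only why clearing the state on cancel hid the `Cancelled` status from the poller.

```typescript
type ViewerSearchStatus = "Idle" | "Running" | "Done" | "Cancelled";

interface SearchSlot {
  status: ViewerSearchStatus;
  cancelRequested: boolean;
}

class ViewerSessionModel {
  private search: SearchSlot | null = null;

  start(): void {
    // A new search replaces any previous state, including a stale Cancelled.
    this.search = { status: "Running", cancelRequested: false };
  }

  // Fixed behaviour: only raise the flag. The pre-fix version also cleared
  // the slot here, which made poll() fall back to "Idle".
  cancel(): void {
    if (this.search) this.search.cancelRequested = true;
  }

  // Stand-in for the search thread noticing the flag between chunks.
  workerTick(): void {
    if (this.search && this.search.status === "Running" && this.search.cancelRequested) {
      this.search.status = "Cancelled";
    }
  }

  poll(): ViewerSearchStatus {
    return this.search ? this.search.status : "Idle";
  }
}
```

With the slot left in place, a poll after cancel reports `Cancelled`, and the next `start()` resets it to `Running`.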
Honest verdict per machine:
- Easy: `SmbVolume::ConnectionState` (already well-tested, just cleanup), `ActivityPhase` (pure journal, fresh instance per test), `DiscoveryState` (already had a global cell, only needed an emit/state split).
- Awkward: `IndexPhase` (carries owned non-`Clone` data, transitions split across `start_indexing` / `stop_indexing` / `clear_index`, and the race fragment we cared about needs a real `IndexStore`).
- Impossible without a Tauri runtime: `start_indexing`'s `Disabled → Initializing → Running` happy path: needs an `AppHandle` to spawn the writer and the verifier. The post-scan decision was extracted to a pure helper so the state-machine fragment that matters (the race) is at least testable.
A vitest mockIPC harness (apps/desktop/src/lib/ipc/test-helpers.ts) plus 23 contract tests for the three
highest-priority command groups:
- Write operations (9 tests): `copy_files`, `move_files`, `delete_files`, `trash_files`, `cancel_write_operation`. Pins the payload shape (including the optional config object and the `volumeId` / `itemSizes` shapes) and one typed `WriteOperationError` variant on the error branch.
- File viewer (8 tests): `viewer_open`, `viewer_get_lines`, `viewer_search_start` / `_poll` / `_cancel`, `viewer_close`. Coverage report flagged this group as 9/9 untested at the IPC layer.
- SMB connection (6 tests): `connect_to_server`, `list_shares_on_host`, `mount_network_share`. The mount path has 6 positional args and AGENTS.md specifically calls out positional-soup as fragile.
What the harness catches: argument coercion, snake-case command name typos, payload-key shape drift, and the typed-error
discriminator round-tripping. What it doesn't catch: the real Tauri permission gate (mockIPC patches
__TAURI_INTERNALS__.invoke before the gate), business logic in *_core helpers (Rust unit tests own that), or
end-to-end behaviour (Playwright owns that).
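As a rough illustration of what "pinning the payload shape" means, here is a minimal, dependency-free sketch. It is not the harness in `test-helpers.ts` and does not use the Tauri mock API: `createFakeIpc`, `copyFilesSketch`, and the `sources` / `destination` payload keys are hypothetical names chosen for this example.

```typescript
type Payload = Record<string, unknown>
type Handler = (cmd: string, payload: Payload) => unknown

// Hypothetical stand-in for a patched invoke entry point; records every call.
function createFakeIpc(handler: Handler) {
  const calls: Array<{ cmd: string; payload: Payload }> = []
  async function invoke(cmd: string, payload: Payload = {}): Promise<unknown> {
    // Recorded synchronously, before any await, so a test can inspect it right away.
    calls.push({ cmd, payload })
    return handler(cmd, payload)
  }
  return { invoke, calls }
}

// Hypothetical frontend wrapper under test; the payload keys are illustrative.
async function copyFilesSketch(
  invoke: (cmd: string, payload: Payload) => Promise<unknown>,
  sources: string[],
  destination: string,
): Promise<unknown> {
  return invoke('copy_files', { sources, destination })
}

// Returns a list of contract violations for one recorded call (empty means the shape held).
function checkCall(
  call: { cmd: string; payload: Payload } | undefined,
  expectedCmd: string,
  expectedPayload: Payload,
): string[] {
  if (!call) return ['no call recorded']
  const problems: string[] = []
  if (call.cmd !== expectedCmd) problems.push(`command was ${call.cmd}`)
  if (JSON.stringify(call.payload) !== JSON.stringify(expectedPayload)) {
    problems.push(`payload was ${JSON.stringify(call.payload)}`)
  }
  return problems
}
```

A renamed command or a flipped payload key shows up as a non-empty problem list, which is the class of drift this layer is meant to catch. Comparing via `JSON.stringify` is key-order sensitive, so it only suits a sketch like this one.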
Honest verdict: modest value, mostly mechanical. The coverage report already concluded that the bindings-fresh
check + cmdr/no-raw-tauri-invoke ESLint rule cover most of the realistic drift surface. This layer adds a thin runtime
check on top: it verifies that the FE actually drives the binding (not just that the binding compiles), and it documents
the wire format in a way that survives a refactor. Worth doing for the write-side, viewer, and SMB groups because those
are the ones where a renamed Rust function or a flipped payload key would surface as a generic runtime failure with no
obvious cause. Not worth expanding to all 193 commands: diminishing returns kick in fast once the binding shapes are
pinned for the destructive / cross-window surfaces.
No bugs surfaced during this pass. Side effect of writing the tests: confirmed that the typed-error discriminator shapes
(type for WriteOperationError / MountError / ShareListError, code for LicenseActivationError) are consistent
on the wire: the FE branching on error.type / error.code will see the values the bindings declare.
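A short sketch of what branching on those discriminators looks like on the frontend side. Only the discriminator keys (`type` and `code`) come from the text above; the variant names and fields below are hypothetical and do not reflect the generated bindings.

```typescript
// Variant names are hypothetical; only the discriminator keys come from the text.
type WriteOperationErrorSketch =
  | { type: 'destinationExists'; path: string }
  | { type: 'cancelled' }

type LicenseActivationErrorSketch = { code: 'invalidKey' } | { code: 'networkError' }

// Exhaustive switch on `type`: the compiler narrows each branch to its variant.
function describeWriteError(error: WriteOperationErrorSketch): string {
  switch (error.type) {
    case 'destinationExists':
      return `Destination already exists: ${error.path}`
    case 'cancelled':
      return 'Operation cancelled'
  }
}

// Same idea, keyed on `code` instead of `type`.
function describeLicenseError(error: LicenseActivationErrorSketch): string {
  switch (error.code) {
    case 'invalidKey':
      return 'License key was rejected'
    case 'networkError':
      return 'Could not reach the license server'
  }
}

// Reads whichever discriminator a rejected call carries, or null for untyped failures.
function discriminatorOf(error: unknown): string | null {
  if (typeof error !== 'object' || error === null) return null
  const record = error as Record<string, unknown>
  if (typeof record['type'] === 'string') return `type:${record['type']}`
  if (typeof record['code'] === 'string') return `code:${record['code']}`
  return null
}
```

If the wire value and the declared variant ever disagree, the `switch` silently falls through at runtime even though it type-checks, which is why pinning the discriminator in a contract test is useful.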
Branch e2e-speedup, 47 commits, ready for fast-forward to main.
| Category | Tests | Bugs surfaced |
|---|---|---|
| E2E coverage extension | 9 | 1: Cancel-copy rollback (Rust Ok(()) arm + Svelte settle-window race) |
| Mutation testing (Rust+Svelte) | 55 | 0 |
| State-machine transitions | 17 | 1: file_viewer SearchStatus::Cancelled unobservable to FE |
| Property-based (proptest) | 12 | 0 |
| IPC contract (mockIPC) | 23 | 0 |
| Total new unit / IPC tests | 107 | |
Plus dead code removal: SmbVolume::ConnectionState::OsMount variant dropped (state machine collapsed to its real
binary Direct ⇄ Disconnected shape).
- Rust unit tests: 1649 → 1728 (+79, mutation 50 + state-machine 17 + proptest 12).
- Svelte unit tests: 1783 → 1812 (+29, IPC contract layer + scan-throughput mutants from Step 7).
- E2E Playwright: 122 → 131 active tests (+9 from coverage walk).
- E2E checker total: 13m 12s baseline → 4m 18s in the final slow pass (−67%).
- E2E Playwright wall-clock: 10m 12s baseline → ~1m 48s longest shard (−82%).
- Fast checker total: ~2m 30s baseline → 3m 13s (+43s). The regression is the new IPC contract tests (+30s of Svelte vitest) and the +79 Rust tests (+1s). Net cost is well below what an equivalent E2E spec would add.
- file_viewer search-cancel was unobservable to the FE. `search_cancel` nulled `session.search` immediately, so the
  spawned thread's `SearchStatus::Cancelled` write was clobbered before the FE could see it. The FE couldn't
  distinguish "search completed naturally with zero matches" from "search was cancelled mid-flight." Fix: stop nulling
  on cancel; let the thread write `Cancelled` and the next `search_start` replaces it.
- Cancel-copy mid-operation rollback was lost on fast filesystems. The Rust `Ok(())` arm in `copy_files_with_progress`
  didn't check `OperationIntent` before committing the transaction, so a click during the < 1 µs window between the
  last `is_cancelled` poll and loop exit landed as a no-op. Plus the Svelte `TransferProgressDialog` left the Rollback
  button enabled during the `MIN_DISPLAY_MS = 400 ms` settle window after `write-complete`, so clicks during settle
  were silent no-ops. Both fixed in Step 6d.
- `SmbVolume::ConnectionState::OsMount` variant: never written to the atomic, two unreachable `match` arms gone. The
  OS-mount fallback the UI renders lives at the outer `SmbConnectionState` enriched by `enrich_smb_connection_state` in
  `commands/volumes.rs`, not on this internal atomic.
- Mutation testing (cargo-mutants + stryker): zero bugs, 55 tests added. Tools work but are too slow / noisy for CI
  gating. Worth ad-hoc runs on numeric / state-machine modules. Don't wire into `check.sh`.
- State-machine coverage: 17 tests, 1 real bug (file_viewer search-cancel). Highest signal-to-noise of the test-quality
  push. The bug was a real silent UX failure, surfaced by writing the transition test.
- Property-based (proptest): 12 tests, 0 bugs. `proptest` added as a dev-dep, scoped to four targets
  (`topological_sort_bottom_up`, `glob_to_regex`, `split_scope_segments`, `platform_case_compare`). Worth keeping for
  those specific algorithmic spots; not worth a project-wide convention.
- IPC contract tests (vitest mockIPC): 23 tests, 0 bugs. Modest value: `bindings-fresh` + `no-raw-tauri-invoke` already
  cover most drift. Worth doing for destructive / cross-window surfaces (write ops, viewer, SMB); diminishing returns
  past that.
- E2E coverage extension: 9 tests, 1 real bug (cancel-copy rollback). Surfaced by walking the slowest test (32.7 s) in
  the post-Step-1 report.
- Linux SMB flakes: `50-share host shows correct share count` and `unicode shares render correctly`: flake under GVFS
  race in Docker. Pre-existing, David is aware. Not addressed in this branch.
- Step 6a parallel-load keystroke-dispatch flakes: rarely surface on warm runs; Step 6e converted the worst offenders to
  `dispatchMenuCommand`. Three remaining keyboard-pathway tests can flake under heavy parallel load (~1-in-N runs). The
  Step 6a fix-suggestions list (data-app-ready route-change reset, focus re-issue after click) is a candidate for a
  follow-up but wasn't load-bearing here.
- `./scripts/check.sh` (fast pass): green in 3m 13s, 1728 Rust + 1812 Svelte tests pass.
- `./scripts/check.sh --only-slow`: green except the two pre-existing Linux SMB flakes. E2E Playwright 131/131 in
  4m 18s across 3 shards; rust-tests-linux 1699/1699; eslint-typecheck 453/453.
The viewer and settings UIs run in their own Tauri WebviewWindows in production (labels viewer-<ts> and settings).
Before the multi-window migration, the e2e tests bypassed this by routing the main window to /viewer?path=... and
/settings, which exercised the page components but not the cross-window plumbing (label uniqueness, capability
restrictions, focus/close lifecycle).
The migration uses tauri-plugin-playwright 0.3.0's new multi-window targeting:
- `tauriPage.waitForWindow(predicate, { timeout? })`: polls `listWindows()` every 100 ms, returns a `TauriPage` scoped
  to the matching window.
- `tauriPage.window(label)`: fork a scoped page from an existing one (cheap, shared socket).
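The polling behaviour can be modelled roughly as follows. This is a sketch under stated assumptions, not the plugin's implementation: `pickNewWindow`, `waitForNewWindow`, the injected `listLabels` function, and the 5000 ms default timeout are all hypothetical. Only the 100 ms interval and the "wait for a new `viewer-*` label" idea come from this document.

```typescript
// Returns the first label that matches `predicate` and was not open before the trigger.
function pickNewWindow(
  before: ReadonlySet<string>,
  current: readonly string[],
  predicate: (label: string) => boolean,
): string | undefined {
  return current.find((label) => !before.has(label) && predicate(label))
}

// Hypothetical stand-in for the plugin's window listing, injected so the sketch runs without Tauri.
type ListLabels = () => Promise<string[]>

async function waitForNewWindow(
  listLabels: ListLabels,
  before: ReadonlySet<string>,
  predicate: (label: string) => boolean,
  timeoutMs = 5000, // assumed default, not taken from the plugin
  intervalMs = 100, // the poll interval described above
): Promise<string> {
  const deadline = Date.now() + timeoutMs
  for (;;) {
    const found = pickNewWindow(before, await listLabels(), predicate)
    if (found !== undefined) return found
    if (Date.now() >= deadline) {
      throw new Error(`no matching window within ${timeoutMs} ms`)
    }
    // Sleep one interval before listing windows again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs))
  }
}
```

Snapshotting the labels before firing the trigger is what lets a test tell a freshly opened viewer apart from one left over by an earlier test.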
Canonical test pattern:
```ts
// 1. Open via prod trigger (same path the menu / shortcut / MCP uses).
const viewer = await openViewerWindow(tauriPage, filePath)
// ^^^ helper in helpers.ts; emits `open-file-viewer` and waits for a new viewer-* label

// 2. Wait for the new window's content.
await viewer.waitForSelector('.viewer-container', 15000)
await viewer.waitForSelector('.file-content', 10000)

// 3. Run all interactions through the scoped page.
await viewer.fill('.search-input', 'AAA')

// 4. Close via Escape + listWindows() poll for label disappearance.
await closeScopedWindow(tauriPage, viewer, viewer.targetWindow!)
```

For settings: `openSettingsWindowViaProd(tauriPage)` (label `settings`).
Dependency override: while the upstream @srsholmes/tauri-playwright 0.3.0 + matching plugin crate aren't
published, the repo points at the vdavid/tauri-playwright fork on the multi-window branch:
- Rust: `apps/desktop/src-tauri/Cargo.toml` → git ref against the fork's `multi-window` branch. Cargo's git source
  resolves the workspace member (`packages/plugin/`) automatically; the resolved commit hash is pinned in `Cargo.lock`.
- npm: `apps/desktop/package.json` → `file:.../packages/test/srsholmes-tauri-playwright-0.3.0.tgz`. We can't use a
  GitHub ref here because the fork's `packages/test/package.json` has no `prepare` script, so pnpm doesn't run the
  `tsup` build after fetching from git, leaving `dist/` missing and the package broken at import time. The tarball is
  pre-built and lives in the fork's tree. TODO: drop this once `@srsholmes/tauri-playwright@0.3.0` ships to npm.
Revert both to crates.io / npm registry refs (with the appropriate 0.3.x version) once the fork is merged upstream and
published.
Capability extension: the auto-generated playwright.json capability (in src-tauri/build.rs) now targets
["main", "settings", "viewer-*"] instead of just ["main"]. Without this, the plugin's pw_result IPC callback would
be rejected by Tauri's permission system when evaluating into a viewer or settings window. The eval itself would land in
the webview, but the result callback would never get back to the test runner.
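To make the effect of the `viewer-*` entry concrete, here is a small sketch of glob-style label matching. It only illustrates the matching semantics: Tauri's permission system does the real matching, and `labelPatternToRegExp` and `capabilityCoversWindow` are hypothetical helpers written for this example.

```typescript
// Converts a window-label pattern with `*` wildcards into an anchored RegExp.
function labelPatternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split('*')
    // Escape regex metacharacters in the literal parts between wildcards.
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*')
  return new RegExp(`^${escaped}$`)
}

// True when at least one pattern in the capability's window list matches the label.
function capabilityCoversWindow(patterns: readonly string[], label: string): boolean {
  return patterns.some((pattern) => labelPatternToRegExp(pattern).test(label))
}

const beforeMigration = ['main']
const afterMigration = ['main', 'settings', 'viewer-*']
```

Under this model, a label such as `viewer-1700000000000` is covered by `afterMigration` but not by `beforeMigration`, which corresponds to the rejected `pw_result` callback described above.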