Release v2.10.0 to main#931

Merged
erikdarlingdata merged 34 commits into main from dev
May 4, 2026

@erikdarlingdata
Owner

Summary

Release v2.10.0 — bug-fix release.

No schema changes since v2.9.0 — no upgrade folder needed. See CHANGELOG.md for full details.

Validation

  • dotnet build -c Debug — 0 errors
  • Installer.Tests — 46/46 passing
  • sql2022 upgrade 2.9.0 → 2.10.0 — clean (54/54 scripts), data preserved
  • sql2022 idempotent re-run — clean
  • collection_log post-upgrade: 1164 SUCCESS, 0 ERROR

Test plan

  • Tag v2.10.0 after merge
  • Create GitHub Release to trigger the build workflow

🤖 Generated with Claude Code

erikdarlingdata and others added 30 commits April 29, 2026 09:28
Move-only refactor; no behavior changes. Three files split:
- Controls/ServerTab.xaml.cs (5824 → 1571) into 8 partials:
  Refresh, Charts, Pickers, Plans, DrillDown, Comparison, CopyExport, Filters
- MainWindow.xaml.cs (2352 → 1967): plan viewer extracted to MainWindow.PlanViewer.cs
- Controls/PlanViewerControl.xaml.cs (2376 → 209) into 4 partials:
  Rendering, Properties, Tooltips, Interaction

Build clean: 0 errors, 186 warnings (all pre-existing CA1873).
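
A minimal sketch of what a move-only partial-class split looks like, for readers unfamiliar with the pattern. The member names (`_refreshCount`, `Refresh`, `DescribeCharts`) are hypothetical and are not taken from the real ServerTab; only the `partial` mechanics are the point.

```csharp
using System;

// File: ServerTab.cs (hypothetical core file that remains after the split)
public partial class ServerTab
{
    // State stays in one place; every partial sees the same private fields.
    private int _refreshCount;

    public void Run()
    {
        Refresh();
        Console.WriteLine(DescribeCharts());
    }
}

// File: ServerTab.Refresh.cs (hypothetical)
public partial class ServerTab
{
    private void Refresh() => _refreshCount++;
}

// File: ServerTab.Charts.cs (hypothetical)
public partial class ServerTab
{
    private string DescribeCharts() =>
        $"charts rendered after {_refreshCount} refresh(es)";
}

public static class Program
{
    public static void Main() => new ServerTab().Run();
}
```

The compiler merges all `partial` declarations into a single type, so moving methods between files changes neither behavior nor accessibility.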

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move-only refactor; no behavior changes. Five files split:
- Services/DatabaseService.QueryPerformance.cs (4941 → 1198) into 6 sub-partials:
  Snapshots, Stats, Blocking, History, Trends, Mcp
- ServerTab.xaml.cs (3871 → 533) into 9 partials:
  Refresh, Charts, Plans, TimeRange, Filters, DrillDown, Slicers, CopyExport, Alerts
- Controls/QueryPerformanceContent.xaml.cs (2869 → 1279) into 6 partials:
  Filters, Plans, Slicers, Comparison, Heatmap, CopyExport
- Controls/PlanViewerControl.xaml.cs (2539 → 368) into 4 partials:
  Rendering, Properties, Tooltips, Interaction
- MainWindow.xaml.cs (2523 → 2112): plan viewer extracted to MainWindow.PlanViewer.cs

Build clean: 0 errors, 0 new warnings (pre-existing CA1806 warnings are now
attached to ServerTab.Plans.cs since those methods moved there).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…plits

Split Lite + Dashboard god-class files into partial classes
Move-only refactor; no behavior changes. Five files split:

Services (partials of partials):
- Dashboard/Services/DatabaseService.FinOps.cs (2924 → 24) into 7 sub-partials:
  Inventory, Workload, Storage, Queries, IndexAnalysis, Recommendations, Models
- Lite/Services/LocalDataService.FinOps.cs (2702 → 525) into 7 sub-partials:
  ServerProperties, Inventory, StorageGrowth, Utilization, Workload,
  IndexAnalysis, Recommendations

Content code-behinds (per-tab partials):
- Dashboard/Controls/SystemEventsContent.xaml.cs (2263 → 363) into 10 partials:
  FilterPopup, SystemHealth, SevereErrors, IOIssues, SchedulerIssues,
  MemoryConditions, CPUTasks, MemoryBroker, MemoryNodeOOM, CopyExport
- Dashboard/Controls/ResourceMetricsContent.xaml.cs (1881 → 406) into 8 partials:
  LatchStats, SpinlockStats, TempdbStats, SessionStats, FileIoLatency,
  PerfmonCounters, WaitStatsDetail, CopyExport
- Dashboard/Controls/MemoryContent.xaml.cs (1356 → 256) into 6 partials:
  MemoryStats, MemoryGrants, MemoryClerks, PlanCache, MemoryPressure,
  CopyExport

Build: 0 errors. Smoke-tested: 30-min Lite watch with 0 error events;
collection_health snapshot 18,390 runs across 24 collectors, 100% SUCCESS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The fix in 562269f added IsTrayToolTipCrash() to OnDispatcherUnhandledException
in both apps, but the same race in Hardcodet.NotifyIcon.Wpf can also escape to
OnUnhandledException (AppDomain) — observed today when exiting via the tray
context menu. The Dispatcher's exception hooks tear down before the tray
library finishes processing late tooltip-hide messages, so the exception
arrives at AppDomain with no suppression and the user sees a Fatal Error
dialog despite the underlying crash being harmless.

Mirror the same IsTrayToolTipCrash() check in OnUnhandledException for both
Lite and Dashboard. Logs a Warn and returns without showing the dialog.
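
The mirrored check can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: the body of `IsTrayToolTipCrash` is not shown in this PR, so the stack-trace match below is a hypothetical stand-in, and `ShowFatalErrorDialog` is a placeholder that writes to the console.

```csharp
using System;

public static class CrashFilter
{
    // Hypothetical predicate: the real IsTrayToolTipCrash() is not shown here.
    // Assumes the tray library's frames appear in the stack trace.
    public static bool IsTrayToolTipCrash(Exception ex) =>
        ex?.StackTrace?.Contains("Hardcodet.Wpf.TaskbarNotification",
                                 StringComparison.Ordinal) == true;

    // AppDomain-level handler, the last stop for exceptions that escape the Dispatcher.
    public static void OnUnhandledException(object sender, UnhandledExceptionEventArgs e)
    {
        var ex = e.ExceptionObject as Exception;

        if (IsTrayToolTipCrash(ex))
        {
            // Log a warning and return without showing the dialog.
            Console.Error.WriteLine($"WARN: suppressed tray tooltip crash: {ex.Message}");
            return;
        }

        ShowFatalErrorDialog(ex);
    }

    // Placeholder for the real Fatal Error dialog.
    private static void ShowFatalErrorDialog(Exception ex) =>
        Console.Error.WriteLine($"FATAL: {ex}");
}

public static class Program
{
    public static void Main()
    {
        AppDomain.CurrentDomain.UnhandledException += CrashFilter.OnUnhandledException;
    }
}
```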

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…inops-splits

Split FinOps services and content code-behinds into partial classes
Two workflows ("Build" and "CI") were both running on PR push to dev with
identically-named "build" jobs, doubling restore+compile work and showing
as duplicate "build" rows in `gh pr checks`.

Move the Lite.Tests build + dotnet test step into build.yml's existing job
and delete ci.yml. Also extend build.yml's push trigger to include dev so
post-merge runs still happen.

Verified locally: dotnet test Lite.Tests/Lite.Tests.csproj -c Release passes
257/257.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…orkflows

Consolidate CI workflows: fold ci.yml into build.yml
…shot

Both the Memory right-sizing check (#3) and the VM right-sizing memory
prescription (#12) read instantaneous values, which produced misleading
"buffer pool uses 0%" findings on servers that were either freshly restarted
or under genuine memory pressure where plan cache / workspace memory
dominate.

- Check #3 now reads P95 of collect.memory_stats.total_memory_mb over 7 days
  instead of TOP(1) buffer_pool_mb at the latest collection.
- Check #12 replaces the live perfmon "Database Cache Memory (KB)" read
  (data-cache slice only, instantaneous) with the same 7-day P95 of
  total_memory_mb. CPU side already uses 7-day P95.
- Both checks require >= 500 samples (~8 hours at 1/min) before firing.
- Recommendation text now says "P95 SQL memory" rather than "buffer pool"
  to reflect what is actually being measured.
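
To illustrate the idea of "P95 with a sample-count guard" outside SQL, here is a small C# sketch. It is not the query used by the checks; the interpolation method (linear, between closest ranks) and the names `P95OrNull` and `minSamples` are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MemorySizing
{
    // Returns the 95th percentile using linear interpolation between the
    // closest ranks, or null when fewer than minSamples readings exist.
    public static double? P95OrNull(IEnumerable<double> totalMemoryMb, int minSamples)
    {
        var sorted = totalMemoryMb.OrderBy(v => v).ToArray();
        if (sorted.Length == 0 || sorted.Length < minSamples)
        {
            return null; // not enough history: the check should not fire
        }

        double rank = 0.95 * (sorted.Length - 1);
        int lower = (int)Math.Floor(rank);
        int upper = (int)Math.Ceiling(rank);
        double fraction = rank - lower;
        return sorted[lower] + (sorted[upper] - sorted[lower]) * fraction;
    }
}

public static class Program
{
    public static void Main()
    {
        // 600 synthetic readings, enough to pass a 500-sample guard.
        var readings = Enumerable.Range(1, 600).Select(i => (double)i);
        Console.WriteLine(MemorySizing.P95OrNull(readings, 500));

        // A single reading right after a restart is rejected by the guard.
        Console.WriteLine(MemorySizing.P95OrNull(new[] { 0.0 }, 500) is null);
    }
}
```

Unlike a single latest-row read, one zero reading after a restart cannot drive the result: it is either rejected by the guard or outweighed by the rest of the window.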

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…y-recommendation

Fix #917 — base FinOps memory recommendation on 7-day P95
WPF Popup with PlacementTarget = chart can wedge when the user navigates
away from a TabItem mid-hover: TabControl unloads the parent without firing
MouseLeave on the chart, so _popup.IsOpen stays true with a stale anchor.
On return, OnMouseMove sets IsOpen = true but it is already true — the
assignment is a no-op and the popup never re-anchors or appears.

Memory tab is the most visible victim because it has 6 charts inside a
nested TabControl, multiplying the chance of a wedged popup, but the bug
is general to ChartHoverHelper.

- Force _popup.IsOpen = false on chart Loaded / Unloaded / IsVisibleChanged.
- In OnMouseMove, toggle IsOpen off then on so WPF re-evaluates placement
  even when the popup believes it is already open.
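
The two bullets above can be sketched as a small WPF helper. This is a hypothetical class (`HoverPopupGuard`), not the project's ChartHoverHelper, and it compiles only inside a WPF project.

```csharp
using System.Windows;
using System.Windows.Controls.Primitives;
using System.Windows.Input;

// Hypothetical helper illustrating the fix; not the real ChartHoverHelper.
public sealed class HoverPopupGuard
{
    private readonly Popup _popup;

    public HoverPopupGuard(FrameworkElement chart, Popup popup)
    {
        _popup = popup;
        _popup.PlacementTarget = chart;

        // Tab switches can unload the chart without raising MouseLeave,
        // so close the popup on every lifecycle change as well.
        chart.Loaded += (_, _) => _popup.IsOpen = false;
        chart.Unloaded += (_, _) => _popup.IsOpen = false;
        chart.IsVisibleChanged += (_, _) => _popup.IsOpen = false;
        chart.MouseLeave += (_, _) => _popup.IsOpen = false;

        chart.MouseMove += OnMouseMove;
    }

    private void OnMouseMove(object sender, MouseEventArgs e)
    {
        // Setting IsOpen = true when it is already true is a no-op.
        // Toggling off then on makes WPF re-evaluate placement.
        _popup.IsOpen = false;
        _popup.IsOpen = true;
    }
}
```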

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ooltip

Fix #916 — Memory tab tooltip stops working after returning to tab
… a snapshot

Mirrors the Dashboard fix in 7cc2265 for the Lite collector. Both the
Memory right-sizing check (#3) and the VM right-sizing memory prescription
(#12) read util.BufferPoolMb — a single-sample reading of perfmon
"Database Cache Memory", which is only the data-cache slice of the buffer
pool and could trigger right after a service restart or on servers where
plan cache / workspace memory dominates.

- Both checks now query DuckDB for the 7-day P95 of total_server_memory_mb
  (perfmon "Total Server Memory" — the full set of memory SQL has
  committed) from v_memory_stats.
- Both require >= 500 samples (~8 hours at 1/min) before firing.
- Recommendation text now says "P95 SQL memory" rather than "buffer pool"
  to reflect what is actually being measured.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…memory-recommendation

Fix #917 (Lite) — base FinOps memory recommendation on 7-day P95
… tab

Mirrors Dashboard fix in 4e58e8a. WPF Popup with PlacementTarget = chart
can wedge when the user navigates away from a TabItem mid-hover: WPF
unloads the parent without firing MouseLeave on the chart, so
_popup.IsOpen stays true with a stale anchor. On return, OnMouseMove
sets IsOpen = true but it is already true — the assignment is a no-op
and the popup never re-anchors or appears.

Lite has the same nested TabControl pattern (parent Memory tab hosting
Memory Overview / Memory Clerks / Memory Grants / Memory Pressure Events
sub-tabs) so the bug surfaces the same way.

- Force _popup.IsOpen = false on chart Loaded / Unloaded / IsVisibleChanged.
- In OnMouseMove, toggle IsOpen off then on so WPF re-evaluates placement
  even when the popup believes it is already open.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tab-tooltip

Fix #916 (Lite) — Memory tab tooltip stops working after returning to tab
… Lite)

Sweeping for repeats of the WPF Popup wedge bug found CorrelatedCrosshairManager
has the same shape as ChartHoverHelper — multiple charts feed a single tooltip
popup, IsOpen = true is set on every mouse move with no re-anchor toggle, and
no Loaded/Unloaded/IsVisibleChanged subscriptions on the chart lanes.

CorrelatedTimelineLanesControl is hosted inside Resource Metrics → Server
Trends (a TabItem inside a TabItem), so the same wedge applies: WPF unloads
the parent on tab switch without firing MouseLeave, leaving _tooltip.IsOpen
stuck at true with a stale anchor.

Both Dashboard and Lite copies updated in lockstep per the file's own SYNC
WARNING:

- Subscribe to chart Loaded / Unloaded / IsVisibleChanged in AddLane and
  force _tooltip.IsOpen = false on each event.
- In OnMouseMove, toggle IsOpen off then on so WPF re-evaluates placement
  even when the popup believes it is already open.
- Unhook the new event handlers in Dispose.
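
A rough sketch of the subscribe-in-AddLane, unhook-in-Dispose shape described above. The class `LaneTooltipManager` is hypothetical and much simpler than CorrelatedCrosshairManager; it assumes a WPF project. Named handler methods are used so that `-=` in `Dispose` removes exactly what `AddLane` added.

```csharp
using System;
using System.Collections.Generic;
using System.Windows;
using System.Windows.Controls.Primitives;
using System.Windows.Input;

// Hypothetical manager: several chart lanes share one tooltip popup.
public sealed class LaneTooltipManager : IDisposable
{
    private readonly Popup _tooltip = new Popup();
    private readonly List<FrameworkElement> _lanes = new List<FrameworkElement>();

    public void AddLane(FrameworkElement chart)
    {
        _lanes.Add(chart);
        chart.Loaded += OnLaneLoadedOrUnloaded;
        chart.Unloaded += OnLaneLoadedOrUnloaded;
        chart.IsVisibleChanged += OnLaneVisibleChanged;
        chart.MouseMove += OnMouseMove;
    }

    private void OnLaneLoadedOrUnloaded(object sender, RoutedEventArgs e) =>
        _tooltip.IsOpen = false;

    private void OnLaneVisibleChanged(object sender, DependencyPropertyChangedEventArgs e) =>
        _tooltip.IsOpen = false;

    private void OnMouseMove(object sender, MouseEventArgs e)
    {
        // Re-anchor to whichever lane the mouse is over, then force WPF
        // to re-evaluate placement by toggling IsOpen.
        _tooltip.PlacementTarget = (UIElement)sender;
        _tooltip.IsOpen = false;
        _tooltip.IsOpen = true;
    }

    public void Dispose()
    {
        foreach (var chart in _lanes)
        {
            chart.Loaded -= OnLaneLoadedOrUnloaded;
            chart.Unloaded -= OnLaneLoadedOrUnloaded;
            chart.IsVisibleChanged -= OnLaneVisibleChanged;
            chart.MouseMove -= OnMouseMove;
        }

        _lanes.Clear();
        _tooltip.IsOpen = false;
    }
}
```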

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rosshair-tooltip

Apply #916 popup-wedge fix to CorrelatedCrosshairManager
…ests

PRs #918 and #920 added a sample-count guard (>= 500) to the new 7-day P95
memory recommendation logic. 500 was overly conservative and broke the
Lite FinOps test OverProvisionedEnterprise_MemoryRightSizingFires, which
seeds a single memory_stats row.

The real protection added by those PRs was switching from TOP 1 to P95;
the sample minimum is just a sanity check against degenerate single-point
inputs. ~16 samples is enough to compute a meaningful P95 and matches the
shape of SeedCpuUtilizationAsync's 16-row fixture, so tests can fire the
recommendation without artificial inflation.

- Lower threshold from 500 to 16 in both Dashboard and Lite (checks #3
  and #12) so the value reflects the actual ask: "more than one reading"
  rather than "8+ hours of data."
- Update Lite's SeedMemoryStatsAsync to insert 16 rows across the test
  period (matching SeedCpuUtilizationAsync's pattern). This makes
  OverProvisionedEnterprise_MemoryRightSizingFires pass again and keeps
  CleanServer_NoDuckDbRecommendations green (still no rows seeded for
  the clean scenario → P95 returns NULL → no recommendation).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…st-threshold

Fix CI: relax FinOps memory sample-count guard, seed enough rows in tests
…ests

The CI test step was averaging ~7 minutes for 257 tests because xUnit's
default parallelization was implicit and slow classes (FactCollectorTests,
ScenarioTests) blocked the pipeline. With explicit parallelization config,
local runs of the full suite dropped from what felt like sequential
execution to ~2m21s.

Changes:

- Add Lite.Tests/xunit.runner.json with explicit parallelization settings:
  parallelizeAssembly=true, parallelizeTestCollections=true,
  maxParallelThreads=-1 (no limit on thread count; 0 is the value that means
  one thread per logical core), parallelAlgorithm=aggressive.
  CopyToOutputDirectory wired up in the csproj.

- Serialize BaselineProviderTests and AnomalyDetectorTests into a shared
  collection (BaselineProviderCollection, DisableParallelization=true).
  Both classes mutate the static BaselineProvider.CacheTtl field, so they
  must run sequentially relative to each other; without this, parallel
  runs would race on the static state.

- Convert FinOpsTests to IClassFixture<FinOpsDuckDbFixture>, sharing one
  DuckDB across the class. Every seeder in TestDataSeeder.SeedFinOps*
  already calls ClearTestDataAsync first, so cross-test pollution is
  prevented without rewriting test code. Saves the schema-init cost on
  each test (modest, but free).

This is a pilot for the IClassFixture approach. The agent's analysis flagged
that converting other classes (FactCollectorTests etc.) would require
adding ClearTestDataAsync calls to many tests, so leaving those alone for
now. Parallel execution across classes is the bigger win and is fully
covered by the runner.json settings.
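
For reference, serializing two test classes that share static state might look like the sketch below. It assumes the xunit package is referenced. `BaselineProviderStub` is a hypothetical stand-in for the real BaselineProvider, and the test bodies are invented for illustration.

```csharp
using System;
using Xunit;

// Hypothetical stand-in for the static state both test classes mutate.
public static class BaselineProviderStub
{
    public static TimeSpan CacheTtl = TimeSpan.FromMinutes(5);
}

// Classes tagged with this collection never run in parallel with each other.
// DisableParallelization = true also keeps the collection from running
// alongside other collections.
[CollectionDefinition("BaselineProviderCollection", DisableParallelization = true)]
public class BaselineProviderCollection
{
}

[Collection("BaselineProviderCollection")]
public class BaselineProviderTests
{
    [Fact]
    public void CacheTtl_CanBeShortened()
    {
        var original = BaselineProviderStub.CacheTtl;
        try
        {
            BaselineProviderStub.CacheTtl = TimeSpan.FromSeconds(1);
            Assert.Equal(TimeSpan.FromSeconds(1), BaselineProviderStub.CacheTtl);
        }
        finally
        {
            BaselineProviderStub.CacheTtl = original; // restore shared state
        }
    }
}

[Collection("BaselineProviderCollection")]
public class AnomalyDetectorTests
{
    [Fact]
    public void CacheTtl_HasExpectedDefault()
    {
        // Safe only because no other class mutates CacheTtl concurrently.
        Assert.Equal(TimeSpan.FromMinutes(5), BaselineProviderStub.CacheTtl);
    }
}
```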

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Speed up Lite tests: enable parallelization + share DuckDB in FinOpsTests
The Restore dependencies step on every CI run takes ~1 minute because
nothing caches NuGet packages between runs. setup-dotnet@v4 has a
built-in cache option that uses packages.lock.json as the cache key —
on a hit, restore drops to ~5-10s.

- Add cache: true and cache-dependency-path to actions/setup-dotnet in
  build.yml and nightly.yml.
- Generate packages.lock.json for Dashboard, Lite, Installer,
  Installer.Core, and Lite.Tests via dotnet restore --use-lock-file.
- Switch the restore commands to --locked-mode so CI fails fast if
  someone forgets to commit an updated lock file (instead of silently
  resolving transitive packages and bypassing the cache key).

The lock files double as a supply-chain check — package versions are now
pinned across the graph, not just the direct refs in csproj.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI: enable NuGet caching to cut restore from ~60s
…ts-parallel"

This reverts commit 987dc61, reversing
changes made to 76d825f.
Revert PR #924: Lite tests parallelization (slowed CI)
Heavy Lite test classes (AnomalyDetector, FactCollector*,
BaselineProvider, InferenceEngine, Scenario, AnalysisService) only run
when paths under Lite/Analysis, Lite/Services, Lite/DuckDb, Lite/Models,
or Lite.Tests change. Those tests dominate the build at ~5-6 of the 8
test minutes, but the logic underneath rarely changes.

Cheap Lite tests (HealthCalculator, FactScorer, FinOps, FindingStore,
Dismiss*, AlertHistorySource, DuckDbSchema) always run — together they
finish in well under a minute.

Also wires Installer.Tests into CI for the first time, gated on changes
to Installer/, Installer.Core/, Installer.Tests/, install/, or upgrades/.
Generated packages.lock.json so the project participates in --locked-mode
restore + NuGet caching.

Releases force-run everything regardless of paths.
VersionDetectionTests, IdempotencyTests, and AdversarialTests require a
running SQL Server instance via TestDatabaseHelper, which the GitHub
runner doesn't provide. Filter to only the pure-logic classes
(UpgradeOrderingTests, FileFilteringTests) which already passed in the
prior run.
CI: split tests into fast and analysis-heavy lanes
dorny/paths-filter@v3 defaults to comparing against the default branch
on push events to non-default branches, which made every push to dev
match the lite_analysis filter (dev is ~1200 commits ahead of main).

Set base explicitly to github.event.before for push events; leave it
empty on pull_request so the action keeps its PR-base default.
erikdarlingdata and others added 4 commits May 3, 2026 10:29
…ter-base

CI: fix paths-filter to compare against previous commit on push
* Docs: per-database grants for FinOps Index Analysis (#915)

Adds a Permissions subsection covering the per-user-database mapping +
VIEW DATABASE STATE + VIEW DEFINITION grants that sp_IndexCleanup needs
to avoid hanging at 100% CPU when the dashboard/Lite login lacks access
to a target database. Includes a single-DB block and an sp_MSforeachdb
helper, plus the engine-bug explanation (infinite recompile loop, not
sys.dm_db_partition_stats per se).

Adds a matching troubleshooting entry naming the hang symptom and
linking to the new section.

* Docs: add SELECT on sys.sql_expression_dependencies grant (#915)

Earlier permissions block was incomplete. VIEW DATABASE STATE +
VIEW DEFINITION are necessary but not sufficient: sp_IndexCleanup
also queries sys.sql_expression_dependencies via three-part name
when scanning for computed columns / check constraints with UDF
references, and SELECT on that catalog view defaults to db_owner
only. VIEW DEFINITION does not include it.

OP confirmed in the issue thread that the previous grant set still
yielded Msg 229 on a real workload database. Reproduced locally on
SQL2019 with a database containing a UDF-bound computed column and
check constraint; adding GRANT SELECT ON sys.sql_expression_dependencies
clears it. Run completes in <1s and returns the expected UDF rows.

Updates:
- Single-DB and sp_MSforeachdb blocks now include the third grant.
- Symptoms section split into the two distinct failure modes (hang
  vs. Msg 229) so readers can identify which grant they are missing.
- Troubleshooting bullet covers both symptoms.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bug-fix release covering #916 (Memory tab tooltip survives tab switch
in Dashboard + Lite, with same popup-wedge fix extended to
CorrelatedCrosshairManager) and #917 (FinOps memory recommendation
now bases sizing on a 7-day P95 instead of a single snapshot in both
apps). Also adds README documentation for the per-database grants
(VIEW DATABASE STATE, VIEW DEFINITION, and SELECT on
sys.sql_expression_dependencies) required by FinOps Index Analysis (#915).

No schema changes since v2.9.0 — no upgrade folder needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@erikdarlingdata erikdarlingdata merged commit d429e44 into main May 4, 2026
4 checks passed

Development

Successfully merging this pull request may close these issues.

[BUG] Finops memory reccomendation might be off
[BUG] Tooltip in Memory-charts stops working after a while