Proposal: richer element selector grammar for winapp ui
The gap
Today winapp ui resolves a selector string in essentially two modes:
- AutomationId exact match (invoke '<id>', set-value '<id>', get-property '<id>', etc.): works when the target has a stable AutomationProperties.AutomationId.
- Substring search across Name + AutomationId (search '<text>' --json): returns up to ~4 matches with no scoping refinement.
Reality of any real WinUI 3 / WPF / Win32 app: many elements have NO AutomationId (layout thumbnails, generated ListItems, context-menu items, ComboBoxItem entries, …). For those the only options today are inspect → text or JSON output → client-side regex / Where-Object filter → re-issue a follow-up command with the extracted id.
That dance is verbose, fragile (stale-tree race between inspect and the follow-up; runtimeId isn't stable across UIA tree refreshes), and impossible when the inner element simply has no stable identity of its own. Every author re-invents the same client-side filtering boilerplate.
Direction (not a locked-in design)
Add a small CSS/jQuery-flavored selector vocabulary to winapp ui selector strings. The shape I'd propose, in priority order:
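- Attribute predicates: [Name="Save"], [AutomationId^="itm-"], [Name*="ettings"], regex variant [AutomationId~=/^itm-.+/], plus :not(...). Exact-match predicates map onto UIA PropertyCondition + AndCondition (and :not onto NotCondition); prefix and regex matching would likely need a client-side pass over the candidates. (P0: highest payoff, additive.)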
- Bare ControlType filter: winapp ui search 'Button' -w $h, 'ListItem', 'ComboBoxItem', …; * for any. One PropertyCondition, trivial to implement. (P0.)
- Hierarchy combinators: descendant (whitespace) and direct-child (>); #Id as a shortcut for [AutomationId="Id"]. E.g. '#ItemsList > ListItem[AutomationId^="itm-"]'. Backed by TreeWalker. Sibling combinators (+, ~) can land in a v2. (P1: unlocks selection in regions without stable AutomationIds, which is most of WinUI 3 today.)
- Element-anchored scope: a -e <ElementHandle> flag so a found element can become the root for a subsequent command, instead of always re-rooting at the window. Handle could be a stateless RuntimeId-encoded token (same approach Selenium uses for WebElement). (P1.)
- Pseudo-classes for state: :enabled, :disabled, :visible (!IsOffscreen), :focused, :checked. Pairs especially well with wait-for, which today only checks presence, e.g. winapp ui wait-for 'Button[Name="Save"]:enabled' --timeout 5000. (P2.)
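To make the P0 attribute predicates concrete, here is a minimal Python sketch of how the bracket syntax could be parsed and evaluated. It is an illustration only, not the winapp implementation: the function names parse_predicates and matches are hypothetical, elements are modeled as plain dicts keyed by UIA property name, and :not(...), type names, and pseudo-classes are left out.

```python
import re

# One attribute predicate: [Prop op "text"] or [Prop~=/regex/].
# Operators follow the proposal: = (exact), ^= (prefix), *= (substring), ~= (regex).
_PRED = re.compile(
    r'\[\s*(?P<prop>\w+)\s*(?P<op>\^=|\*=|~=|=)\s*'
    r'(?:"(?P<text>[^"]*)"|/(?P<regex>[^/]*)/)\s*\]'
)

def parse_predicates(selector):
    """Return (prop, op, value) triples for every [...] predicate in selector."""
    return [
        (m["prop"], m["op"], m["text"] if m["text"] is not None else m["regex"])
        for m in _PRED.finditer(selector)
    ]

def matches(element, predicates):
    """True when the element dict satisfies all predicates (AND semantics)."""
    for prop, op, value in predicates:
        actual = element.get(prop) or ""  # a missing property behaves like ""
        if op == "=" and actual != value:
            return False
        if op == "^=" and not actual.startswith(value):
            return False
        if op == "*=" and value not in actual:
            return False
        if op == "~=" and re.search(value, actual) is None:
            return False
    return True

# Hypothetical usage: keep only generated list items whose id starts with "itm-".
items = [{"Name": "Row 1", "AutomationId": "itm-1"}, {"Name": "Header"}]
selected = [e for e in items if matches(e, parse_predicates('[AutomationId^="itm-"]'))]
```

Treating a missing property as an empty string is a design choice of this sketch; it lets elements with no AutomationId fail prefix and regex predicates quietly instead of raising.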
Before / after
# Today — find an enabled "Copy" button inside MainPanel
$ins = winapp ui inspect 'MainPanel' -w $h --depth 3 --json | ConvertFrom-Json
$btn = $ins.children | Where-Object { $_.controlType -eq 'Button' -and $_.name -eq 'Copy' -and $_.properties.IsEnabled }
if (-not $btn) { throw "no enabled Copy button" }
winapp ui invoke $btn.automationId -w $h
# After
winapp ui invoke '#MainPanel Button[Name="Copy"]:enabled' -w $h
Half the code, no JSON poking, no client-side filter, no stale-tree race window, and it works even when the inner element has no AutomationId of its own.
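For reference, the semantics of the "after" selector can be sketched in a few lines of Python over an inspect-style tree. This is an assumption-laden illustration: the dict shape (children, controlType, name, automationId, properties.IsEnabled) is inferred from the "today" snippet above, select and descendants are hypothetical names, and the selector is given pre-parsed as (combinator, predicate) steps rather than as a string.

```python
def descendants(node):
    """Yield every element below node, depth first."""
    for child in node.get("children", []):
        yield child
        yield from descendants(child)

def select(root, steps):
    """Resolve (combinator, predicate) steps against the tree under root.

    combinator is " " (descendant) or ">" (direct child); predicate is a
    callable that takes an element dict and returns a truthy value on match.
    """
    current = [root]
    for combinator, predicate in steps:
        found, seen = [], set()
        for scope in current:
            pool = scope.get("children", []) if combinator == ">" else descendants(scope)
            for el in pool:
                # Nested scopes can reach the same element twice; keep it once.
                if predicate(el) and id(el) not in seen:
                    seen.add(id(el))
                    found.append(el)
        current = found
    return current

# Hypothetical tree: the Copy button has no AutomationId of its own.
window = {"controlType": "Window", "name": "Demo", "automationId": "", "children": [
    {"controlType": "Pane", "name": "", "automationId": "MainPanel", "children": [
        {"controlType": "ToolBar", "name": "", "automationId": "", "children": [
            {"controlType": "Button", "name": "Copy", "automationId": "",
             "properties": {"IsEnabled": True}, "children": []},
            {"controlType": "Button", "name": "Paste", "automationId": "",
             "properties": {"IsEnabled": False}, "children": []},
        ]},
    ]},
]}

# Pre-parsed form of '#MainPanel Button[Name="Copy"]:enabled'
steps = [
    (" ", lambda e: e.get("automationId") == "MainPanel"),
    (" ", lambda e: e.get("controlType") == "Button" and e.get("name") == "Copy"
                    and e.get("properties", {}).get("IsEnabled", False)),
]
hits = select(window, steps)
```

Swapping the second combinator for ">" returns nothing for this tree, because the button sits one level deeper inside the ToolBar; that difference is exactly what separates the descendant and direct-child combinators.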
Hard limits / honest scope
- XPath axes (ancestor::, following::), :has() / :contains(), selector caching / compilation, and JS-style callback predicates are explicitly out of scope; they can be added later if demand emerges.
- Type names should be UIA LocalizedControlType (case-insensitive), not Win32 ClassName, to stay stable across tech stacks.
Open questions worth deciding up front
- Does the existing CLI already do any selector-string parsing beyond bare AutomationId? Any existing grammar should be honored.
- When a selector matches N elements, does search --json keep its current shape, or evolve to a richer per-element record (with hierarchical-path info for debugging)?
- Should wait-for accept the new grammar from day 1, or stay scoped to bare-AutomationId presence and adopt selectors later?
- Element-handle lifecycle for proposal 4: stateless RuntimeId-encoded token vs server-side cache? Recommend stateless.
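As a rough illustration of the stateless option, a handle could be nothing more than the element's RuntimeId (an array of 32-bit integers in UIA) packed into a URL-safe string. The Python sketch below assumes that encoding; encode_handle and decode_handle are hypothetical names, and the token format is not something the CLI defines today.

```python
import base64
import struct

def encode_handle(runtime_id):
    """Pack a RuntimeId (sequence of 32-bit ints) into a URL-safe token."""
    raw = struct.pack(f"<{len(runtime_id)}i", *runtime_id)
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")

def decode_handle(token):
    """Recover the RuntimeId list from a token produced by encode_handle."""
    padded = token + "=" * (-len(token) % 4)  # restore the stripped padding
    raw = base64.urlsafe_b64decode(padded)
    if len(raw) % 4:
        raise ValueError("malformed element handle")
    return list(struct.unpack(f"<{len(raw) // 4}i", raw))

# Hypothetical round trip: the token is what a caller would pass to -e.
token = encode_handle([42, 1312, 4, -7])
runtime_id = decode_handle(token)
```

Because RuntimeId is not stable across UIA tree refreshes (as noted earlier), a command receiving such a token would still need to fail cleanly when the decoded id no longer resolves to a live element.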
Happy to write up the full grammar, parser test corpus, per-proposal implementation notes, and priority/ROI matrix as a follow-up doc / PR once the direction is acknowledged.