fix(skill/sandboxes): routing, disambiguation, portal, ask-first-on-ambiguous #1733
Open
paulyuk wants to merge 1 commit into
…mbiguous

Addresses 4 routing/selection bug clusters surfaced by the ACA Sandboxes Vally eval suite (Run 3, see coreai-microsoft/adc-devx#214 baselines):

1. Dynamic Sessions disambiguation (CRITICAL). Skill currently activates and recommends ACA Sandboxes when the user asks for code-interpreter / untrusted-code-from-LLM-agent / ephemeral-seconds workloads. Adds an explicit 'Do NOT activate for' block listing Container Apps Dynamic Sessions with the cues (code interpreter, execute LLM-generated code, untrusted code from my agent, session pool, ephemeral seconds), and a side-by-side comparison table making the lifetime / isolation / audience differences explicit.
2. Over-triggering on bare 'sandbox' / 'microVM' / 'VM' keywords. Currently the SKILL.md trigger list includes broad terms like 'create sandbox', 'microVM', 'code interpreter', which fire on AKS sandbox namespaces, Cosmos sandbox containers, Windows Sandbox, etc. Tightens trigger requirements to include ACA / dev-loop / microVM-in-Azure context, removes 'code interpreter' as a hard trigger (relocated to Dynamic Sessions guidance), and adds explicit 'Do NOT activate for' entries for the common false positives.
3. Ask-first-on-ambiguous-prompts. Currently the skill jumps to provisioning on prompts like 'sandbox' / 'I need an ephemeral VM' / 'set up a sandbox for my coding agent' / 'what should I use to run my AI agent'. Adds a 'When the prompt is ambiguous, ASK ONE clarifying question before provisioning' branch in the description with 3 concrete patterns (single-word / lifetime-ambiguous / agent-runtime-ambiguous) and a target-product enumeration for each.
4. Portal capability correction. Skill currently says nothing about the portal at https://containerapps.azure.com/sandbox-groups, leading the model to claim 'the portal has no sandbox management surface', which is the opposite of the truth. Adds a Portal row to the Get-started table and references it in the comparison table.

Also:
- Auth canonical entry point updated from 'az login' to 'aca auth login' (delegates to az login). Matches the install.md PR.
- 'Code interpreter' relabeled in Scenarios as 'Developer code-interpreter loop' to distinguish from the managed Dynamic Sessions product.

No references/ behaviour changes in this PR; canonical-command-shown-verbatim work (the 5th bug cluster) deferred to a later references/scenarios.md PR so this one stays reviewable.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Addresses 4 routing / selection bug clusters in plugin/skills/sandboxes/SKILL.md surfaced by the ACA Sandboxes Vally eval suite (Run 3, see coreai-microsoft/adc-devx#214 baselines).

What changes in SKILL.md
1. Dynamic Sessions disambiguation (CRITICAL)
Adds an explicit 'Do NOT activate for' section listing Container Apps Dynamic Sessions with the cues ('code interpreter', 'execute LLM-generated code', 'untrusted code from my agent', 'session pool', 'ephemeral seconds'). Adds a side-by-side comparison table (ACA Sandboxes vs Dynamic Sessions vs Container App) making the lifetime / isolation / audience differences explicit. Today the skill activates and recommends ACA Sandboxes for these workloads, which is wrong.

2. Over-triggering on bare keywords
Tightens the Triggers list: bare 'sandbox' / 'microVM' / 'VM' no longer match; trigger phrases now require ACA / dev-loop / microVM-in-Azure context ('create ACA sandbox', 'ACA microVM', etc.). Adds 'Do NOT activate for' entries covering the common false positives: AKS / Kubernetes sandbox namespaces, Cosmos DB sandbox containers, Windows Sandbox, Linux namespace sandboxes, Salesforce/Playwright sandboxes, generic Azure VM requests.

3. Ask-first-on-ambiguous-prompts branch
Adds a 'When the prompt is ambiguous, ASK ONE clarifying question before provisioning' section in the description with 3 concrete patterns:
- Single-word ('sandbox', 'microVM', 'VM') → ask which product family.
- Lifetime-ambiguous ('ephemeral VM', 'VM for testing') → ask expected lifetime; seconds → Dynamic Sessions, hours–days → ACA Sandboxes, long-lived workstation → Dev Box / Azure VM.
- Agent-runtime-ambiguous ('set up a sandbox for my coding agent', 'what should I use to run my AI agent?') → ask whether the agent needs its OWN dev env (ACA Sandboxes) or to execute end-user/LLM-generated code (Dynamic Sessions).

Today the skill jumps straight to provisioning on these prompts; the eval suite caught this on all 6 syn-* stimuli.

4. Portal capability correction
Adds a Portal row to the Get-started table pointing at https://containerapps.azure.com/sandbox-groups, plus a Portal column in the comparison table. Today the skill claims "the Azure portal has no sandbox management surface", which is the opposite of the truth and contradicts the docs.

Bonus
- Auth canonical entry point updated from 'az login' to 'aca auth login' (matches the install.md PR #1732, "fix(skill/sandboxes): use canonical aka.ms install URLs and aca auth login"; 'aca auth login' delegates to 'az login').
- 'Code interpreter' relabeled in the Scenarios list as 'Developer code-interpreter loop' to distinguish from the managed Dynamic Sessions product.

Eval coverage
When this lands, expected improvements in the Vally suite:
- neg-dynamic-session-query (CRITICAL): should pass (skill no longer activates)
- compare-sandbox-vs-dynamic-session (CRITICAL): should pass (skill activates with table)
- compare-aca-vs-portal: should pass (skill correctly cites portal)
- neg-list-k8s-pods, neg-cosmos-query: should pass (skill no longer over-triggers)
- syn-ai-agent-runtime, syn-coding-agent-sandbox, syn-ephemeral-vm, syn-microvm-vague, syn-sandbox-single-word, syn-vm-for-dev: should pass (skill asks clarifying question first)

Out of scope (separate PR)
- Canonical-command-shown-verbatim work (the 5th bug cluster: pos-* stimuli where the skill executes commands without showing the canonical 'aca …' form first) deferred to a later references/scenarios.md PR so this one stays reviewable.

Related
- coreai-microsoft/adc-devx#214

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
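
Reviewer aid: the ask-first branch from section 3 is prose in SKILL.md, not code, but its decision logic can be sketched in Python to make the expected routing easier to check against the syn-* stimuli. Everything below is a hypothetical illustration: the function names, the cue lists, the substring matching, and the question wording are assumptions, not the skill's actual text or any shipped API. Only the product mappings come from the description above.

```python
from typing import Optional

# Hypothetical cue lists, taken from the example prompts in section 3.
SINGLE_WORD = {"sandbox", "microvm", "vm"}
AGENT_CUES = ("coding agent", "ai agent")
LIFETIME_CUES = ("ephemeral vm", "vm for testing")

# Lifetime answer -> target product, as enumerated in section 3.
PRODUCT_BY_LIFETIME = {
    "seconds": "Container Apps Dynamic Sessions",
    "hours-days": "ACA Sandboxes",
    "long-lived": "Dev Box / Azure VM",
}


def clarifying_question(prompt: str) -> Optional[str]:
    """Return the one question to ask first, or None if the prompt is not ambiguous."""
    text = prompt.strip().lower().rstrip("?.!")
    if text in SINGLE_WORD:
        return "Which product family do you mean?"
    if any(cue in text for cue in AGENT_CUES):
        return ("Does the agent need its own dev environment, "
                "or does it execute end-user/LLM-generated code?")
    if any(cue in text for cue in LIFETIME_CUES):
        return "How long should it live: seconds, hours to days, or long-lived?"
    return None


def product_for_lifetime(lifetime: str) -> str:
    """Lifetime-ambiguous pattern: map the user's answer to a product."""
    return PRODUCT_BY_LIFETIME[lifetime]


def product_for_agent(needs_own_dev_env: bool) -> str:
    """Agent-runtime-ambiguous pattern: own dev env vs. running end-user/LLM code."""
    if needs_own_dev_env:
        return "ACA Sandboxes"
    return "Container Apps Dynamic Sessions"
```

Under these assumptions, a bare 'sandbox' or 'I need an ephemeral VM' yields a question, while a prompt that already carries ACA context, such as 'create ACA sandbox', returns None and proceeds without a clarifying question.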