
Release v0.3.1 #209

Merged
placerda merged 29 commits into main from release/v0.3.1
May 29, 2026

Conversation

@placerda
Contributor

Release v0.3.1

Automated release branch created from develop.

What happened

  • Branch release/v0.3.1 created from develop
  • CHANGELOG.md updated: versioned section [0.3.1] added
  • Plugin versions synced to 0.3.1 (package.json, plugin.json, marketplace.json)
  • Staging pipeline triggered automatically (build → TestPyPI + VSIX pre-release → verify)

Next steps

  1. Wait for the Staging pipeline to pass
  2. Review and approve this PR
  3. Merge to main
  4. Tag and push: git tag v0.3.1 && git push origin v0.3.1
  5. Approve the PyPI publish and VSIX stable publish in the Release workflow
  6. Sync develop: git checkout develop && git merge main && git push origin develop

Checklist

  • Staging pipeline passes (build + TestPyPI + VSIX pre-release + verify)
  • CHANGELOG entries reviewed
  • PR approved and merged to main
  • Tag v0.3.1 pushed
  • PyPI publish approved
  • VSIX stable publish approved
  • develop synced from main

placerda and others added 29 commits May 28, 2026 21:53
Adds an optional `prompt_agent_bootstrap` block to `agentops.yaml`
that lets the prompt-agent deploy workflow create the first version of
an agent in a dev / qa / prod Foundry project that does not yet contain
one. When the stage step looks up the seed agent and gets a 404, it
reads the model deployment (required) plus optional description,
model_parameters, and tools from prompt_agent_bootstrap, combines them
with prompt_file, and creates the first version automatically.

The deployment artifact records the new `action: bootstrapped` for
that first run; subsequent deploys follow the normal reuse /
next-version flow.

This eliminates the previous per-environment manual seeding step that
forced users to recreate the same prompt agent in every Foundry
project before CI could promote prompts to it. The first PR / deploy
into an empty dev project now succeeds out of the box; only the
sandbox project needs a manual seed (so authors still have a playground
to iterate in).

Authentication (401 / 403) and other non-404 errors continue to
propagate — the bootstrap path only triggers on a genuine `agent does
not exist` 404 via a strict _is_not_found_error helper.
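
The bootstrap-on-404 decision described above can be sketched roughly as follows. This is a minimal illustration, not the code in `prompt_deploy`: the `AgentServiceError` class, the `resolve_action` function, the `"model"` key, and the `"reuse-or-next-version"` label are all hypothetical stand-ins; only `action: bootstrapped` and the 404-only rule come from the commit message.

```python
class AgentServiceError(Exception):
    """Hypothetical error type that carries an HTTP status code."""

    def __init__(self, status_code: int, message: str = "") -> None:
        super().__init__(message)
        self.status_code = status_code


def is_not_found_error(exc: Exception) -> bool:
    """True only for a genuine 404; 401/403 and anything else return False."""
    return getattr(exc, "status_code", None) == 404


def resolve_action(lookup_seed, bootstrap_config):
    """Pick the deploy action for one environment.

    lookup_seed: callable that returns the seed agent or raises.
    bootstrap_config: dict parsed from prompt_agent_bootstrap, or None.
    """
    try:
        lookup_seed()
    except Exception as exc:
        if not is_not_found_error(exc):
            raise  # auth and other non-404 errors propagate unchanged
        # Assumed key name: the source only says the model deployment is required.
        if not bootstrap_config or "model" not in bootstrap_config:
            raise ValueError(
                "seed agent not found and no usable prompt_agent_bootstrap block"
            ) from exc
        return "bootstrapped"
    return "reuse-or-next-version"  # placeholder label for the normal flow
```

Keeping the not-found check in one small helper makes it easy to unit-test that a 401 or 403 never falls into the bootstrap branch.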

`agentops workflow analyze` now emits a warning when a prompt-agent
workspace is missing prompt_agent_bootstrap to surface the recommended
configuration before operators hit it in CI.

Documentation:
- tutorial-prompt-agent-quickstart.md: rewrote section 4 (sandbox-only
  seed), section 9 (yaml example + bootstrap callout +
  project_endpoint warning), section 11 (stage-on-empty flow), and
  section 13 (action: bootstrapped on first deploy artifact).
- tutorial-end-to-end.md: added a multi-env bootstrap callout pointing
  at the prompt-agent quickstart for the full journey.
- agentops-config skill: documented the new optional block.
- agentops-workflow skill: documented the bootstrap-on-empty branch.
- CHANGELOG Unreleased: Added entry under Added.

Tests: 16 new unit tests across test_prompt_deploy.py (bootstrap path,
missing config raises, auth errors propagate, ignores bootstrap when
seed exists), test_agentops_config.py (schema validation), and
test_workflow_analysis.py (missing-bootstrap warning, silent when
present). Full suite: 784 passed, 3 skipped.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ent.json field names (#189)

The prompt-agent quickstart tutorial assumed Foundry assigns version
:1 when you save+publish a prompt agent for the first time in the
portal. In practice, Foundry numbers the unpublished draft :1 and
assigns :2 on first publish, so the seed pointer in agentops.yaml
should be travel-agent:2, not :1.

Also correct the example foundry-agent.json: the artifact written by
prompt_deploy uses flat field names (source_agent, candidate_agent,
prompt_sha256, git_sha, workflow_url), not the nested agentops.* shape
the tutorial previously showed.

Add a callout explaining the SDK vs. portal numbering asymmetry:
because CI uses the SDK and starts at :1 in an empty project, the
bootstrap may fire on the first one or two deploys per environment
before the env catches up to the seed value. After that, normal
reuse/create flow takes over. This is fine because prompt_sha256 +
git_sha are the durable cross-environment identity, not the per-
project version numbers.
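
To illustrate why the per-project version numbers do not need to line up, here is a small sketch that compares two deployment artifacts by `prompt_sha256` and `git_sha` only. The helper names and the choice to hash the UTF-8 bytes of the prompt text are assumptions; the real tool may normalize or hash the file differently.

```python
import hashlib


def prompt_sha256(prompt_text: str) -> str:
    """Hash the prompt text (assumed: UTF-8 bytes, no normalization)."""
    return hashlib.sha256(prompt_text.encode("utf-8")).hexdigest()


def same_release(artifact_a: dict, artifact_b: dict) -> bool:
    """Treat two artifacts as the same release when both identity fields
    are present and equal, regardless of candidate_agent version numbers."""
    keys = ("prompt_sha256", "git_sha")
    return all(
        artifact_a.get(k) and artifact_a.get(k) == artifact_b.get(k) for k in keys
    )


# Example values are made up for illustration.
digest = prompt_sha256("You are a travel agent.")
dev = {"candidate_agent": "travel-agent:1", "prompt_sha256": digest, "git_sha": "abc123"}
qa = {"candidate_agent": "travel-agent:3", "prompt_sha256": digest, "git_sha": "abc123"}
print(same_release(dev, qa))
```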

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ertion

The verify-testpypi smoke test in staging.yml and release.yml runs
'agentops init' then asserts 'test -f .azure/config.json'. In CI
stdin is not a TTY so init skips the interactive wizard, which means
the azd bootstrap (the only path that creates .azure/config.json
without explicit flags) never runs and the assertion always fails.

Add --no-prompt --azd-env testenv to both smoke tests so init
unconditionally bootstraps a deterministic azd env directory.
Verified locally: init --no-prompt --azd-env testenv produces
.azure/config.json (defaultEnvironment=testenv) and
.azure/testenv/.env without any prompts; rc=0.

Failure surfaced by staging run 26608637027 against release/v0.3.0
after the OIDC publisher gap for agentops-accelerator was fixed and
publish-testpypi started succeeding.
…190)

Section 8 told readers to add APPLICATIONINSIGHTS_CONNECTION_STRING to
.azure/dev/.env but never said where the string comes from. Reword to:

- Lead with the fact that the manual step is optional - AgentOps
  auto-discovers the connection string through the Azure AI Projects
  SDK when the dev Foundry project has App Insights connected.
- Document three discovery paths for the manual case: Foundry portal
  (Tracing tab), Azure Portal (App Insights Overview), and az CLI.
- Promote the section 7 callout to set the same expectation earlier.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
)

PO running through the tutorial pointed out that section 8's optional
App Insights subsection reads as if the resource might not exist,
when in fact step 3 (both Path A and Path B) already instructs the
reader to attach App Insights to the dev project. Reframe accordingly:

- Section 7 callout now leads with 'should already be wired from
  step 3' and gives a 10-second portal check to confirm.
- Section 8 optional subsection now starts with 'you can skip this
  subsection', shows a 2-row decision table for the verification
  outcome, and only walks through the override case (a dedicated
  observability resource) at the end. Adds a 'just click Connect
  Application Insights in the Foundry portal' fallback for the rare
  case the resource was not created in step 3.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…smatch

In staging.yml the agentops-accelerator package is built with
setuptools-scm and published to TestPyPI as 0.2.3.devN, while the
tombstone (agentops-toolkit) hardcodes 'agentops-accelerator>=0.3.0'
in its pyproject.toml. A standard pip install of the tombstone then
fails because TestPyPI carries no >=0.3.0 candidate for the main
package — the version is only minted when release.yml runs against
the real v0.3.0 tag.

Restructure the verify job into two install steps:
  1. agentops-toolkit==<tombstone_version> --no-deps from TestPyPI
     (installs the wheel for its metadata only; --no-deps preserves
     'pip show Requires:' content so the redirect assertion still
     holds, verified empirically by Critic)
  2. agentops-accelerator --pre from TestPyPI (satisfies the runtime
     dep so 'import agentops' resolves; --pre is required so pip
     accepts the dev versions present in staging)

Both pre-existing assertions are preserved unchanged:
  - pip show agentops-toolkit lists agentops-accelerator under Requires
  - python -c 'import agentops' returns successfully

release.yml is intentionally NOT modified: its verify-tombstone-testpypi
works correctly because publish-testpypi (in release.yml) publishes
agentops-accelerator at the real 0.3.0 tag version, so the tombstone
dep resolves cleanly without --no-deps.
…tops-toolkit

The VSIX tombstone scaffolded in 2f128d9 mistakenly identified itself as
AgentOpsToolkit.agentops-skills — an extension name that has never existed
on the VS Code Marketplace. The actual deprecated extension is
AgentOpsToolkit.agentops-toolkit (live since v0.1.4, latest v0.1.8). It
also pointed users to AgentOpsAccelerator.agentops-skills as the migration
target, but the new extension's real identity (per
plugins/agentops/package.json) is AgentOpsAccelerator.agentops-accelerator.

Without this fix, publishing the v0.3.0 tombstone would create a brand-new
extension nobody has installed, and the ~5 versions of the legacy extension
already in users' VS Code installs would never receive a deprecation prompt.

Changes:
  tombstones/vscode/package.json   - rename 'agentops-skills' to
                                     'agentops-toolkit' so the published
                                     identity is AgentOpsToolkit.agentops-toolkit
                                     (republishes onto the existing listing,
                                     triggers in-place update for current users)
  tombstones/vscode/src/extension.ts
                                   - NEW_EXTENSION_ID points at
                                     AgentOpsAccelerator.agentops-accelerator
                                   - DEPRECATION_MESSAGE interpolates the
                                     constant instead of hardcoding the literal
                                     (prevents future drift)
  tombstones/vscode/README.md      - corrected legacy ID + migration target
  tombstones/vscode/CHANGELOG.md   - corrected legacy ID + migration target
  tombstones/vscode/CDN_DEPRECATION_REQUEST.md
                                   - corrected all subject lines, body IDs,
                                     and pre-flight checklist IDs
  CHANGELOG.md                     - corrected the [0.3.0] VS Code
                                     Marketplace tombstone bullet (flows
                                     into GitHub release body and PyPI
                                     tombstone long description)
  docs/verifying-tombstones.md     - corrected all 10 publisher-prefixed
                                     identifiers in verification commands
                                     (vsce show, --install-extension,
                                     marketplace URLs)

STORAGE_KEY ('agentops-toolkit.deprecation-prompt-shown') is unchanged —
already aligned with the legacy namespace. Version 0.3.0 > existing 0.1.8
satisfies VSCE monotonicity. Legacy versions were declarative-only (no
commands/keybindings/settings/views — only contributes.chatSkills), so
replacing them with the tombstone manifest carries no user-binding break
risk.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The two staging VSIX publish steps (main extension at line 348 and tombstone
at line 425) carried 'continue-on-error: true' which silently swallowed
every failure mode — including an expired VSCE_PAT. This made staging look
green when nothing was actually published to the Marketplace, hiding the
auth issue for an extended period and producing zero pre-release versions
for either AgentOpsAccelerator.agentops-accelerator or
AgentOpsToolkit.agentops-toolkit.

Replaces the blanket tolerance with the same conditional pattern release.yml
already uses (lines 469-489): capture combined stdout+stderr, swallow the
exit code ONLY when the output contains 'already exists' (re-run of an
already-published pre-release version, the only benign failure mode), and
propagate every other exit code so auth, network, validation, and
marketplace errors fail the job and surface in CI.
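
The conditional tolerance pattern can be expressed as a short Python sketch. The actual workflow steps are shell inside YAML; this rendering only shows the logic, and the function name `run_publish` is hypothetical.

```python
import subprocess
import sys


def run_publish(cmd: list[str]) -> int:
    """Run a publish command and return the exit code the CI step should use.

    A non-zero exit is swallowed only when the combined output contains
    'already exists'; every other failure is returned unchanged.
    """
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # capture stdout and stderr together
        text=True,
    )
    print(proc.stdout, end="")  # keep the tool's output visible in the log
    if proc.returncode != 0 and "already exists" in proc.stdout:
        return 0  # benign re-run of an already-published version
    return proc.returncode


if __name__ == "__main__":
    # Usage: python publish_wrapper.py <publish command and arguments>
    if len(sys.argv) > 1:
        sys.exit(run_publish(sys.argv[1:]))
```

Matching on a single known-benign message keeps the tolerance narrow, so an expired token or a network error still fails the job.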

Also moves the PAT into an env: block as VSCE_PAT (out of the interpolated
shell line) to match release.yml's pattern and avoid leaking the secret
into command history or logs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Four agent modules imported PyYAML (`import yaml`) but `pyproject.toml` only declares `ruamel.yaml`. On clean CI runners (no PyYAML in the venv) this raised `ModuleNotFoundError: yaml` at `agentops doctor` time, breaking the doctor step of the generated deploy workflow.

Refactored to the existing `ruamel.yaml` dependency (same pattern as `src/agentops/utils/yaml.py`):

- `agent/checks/opex_workspace.py`

- `agent/checks/spec_conformance.py`

- `agent/llm_assist/_bundle_rule.py`

- `agent/cockpit.py` (preserved the lazy-import semantics inside `_resolve_agent_identity`)

Behaviour preserved: `YAML(typ='safe').load(...)` returns plain dicts/lists just like `yaml.safe_load(...)`. No new runtime dependency.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…result shape (#195)

The cloud-eval parser was returning value=null for every metric in real Foundry runs even when graders completed successfully, causing the PR / deploy gate to fire 'Threshold status: FAILED' with all thresholds showing actual=missing on the very first tutorial pass.

Root cause: _metric_from_result only probed {score|value|result|passed} at the top level. The real azure_ai_evaluator shape (verified against Azure/azure-sdk-for-python fixture evaluation_util_convert_expected_output.json) emits {type, name, metric, score, label, reason, threshold, passed, sample, status}, and some custom prompt-based graders nest the score under sample.score / details.score.

Fix: widen the probe to (score, value, result, metric_value, rating, grader_score, numeric_value), then passed (bool), then label ('pass'/'fail'), then descend into sample/details. Treat score: 0 as a legitimate value (was being lost). When still nothing found, record a structured error pointing at the new raw-items artifact.
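
A minimal sketch of the widened probe order follows. It is not the code in `_metric_from_result`: the function name `extract_score` and the mapping of `passed` and `label` to 1.0 / 0.0 are assumptions made for illustration. The key list and the probe order are taken from the description above.

```python
from typing import Any, Optional

SCORE_KEYS = (
    "score", "value", "result", "metric_value",
    "rating", "grader_score", "numeric_value",
)


def _is_number(value: Any) -> bool:
    """Accept ints and floats but not bools (bool is a subclass of int)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def extract_score(result: dict) -> Optional[float]:
    """Return a numeric score, or None when nothing usable is found."""
    # 1. Numeric keys at this level. 0 is a legitimate score, so test the
    #    type rather than truthiness.
    for key in SCORE_KEYS:
        if _is_number(result.get(key)):
            return float(result[key])
    # 2. Boolean 'passed' (assumed mapping: True -> 1.0, False -> 0.0).
    passed = result.get("passed")
    if isinstance(passed, bool):
        return 1.0 if passed else 0.0
    # 3. Textual label 'pass' / 'fail'.
    label = result.get("label")
    if isinstance(label, str) and label.lower() in ("pass", "fail"):
        return 1.0 if label.lower() == "pass" else 0.0
    # 4. Descend into nested containers.
    for nested_key in ("sample", "details"):
        nested = result.get(nested_key)
        if isinstance(nested, dict):
            found = extract_score(nested)
            if found is not None:
                return found
    return None
```

Checking the type instead of using `if value:` is what keeps `score: 0` from being treated as missing.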

Also: always persist the raw Foundry output_items as cloud_output_items.json next to results.json so future parser regressions are debuggable from the artifact bundle alone, and emit an explicit progress warning when a cloud run yields zero usable scores despite returning rows.

Tests: +5 new tests covering the real Foundry shape, score=0 boundary, label-only fallback, nested sample.score, and the diagnostic error path. Full suite: 789 passed, 3 skipped.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The agentops-toolkit -> agentops-accelerator deprecation tombstones
were shipped one-shot for v0.3.0 / v0.3.1 (PyPI manual upload, VSIX
via marketplace). Going forward, the release pipeline must not touch
any tombstone work.

Workflow deletions:
- release.yml: build-pypi-tombstone, publish-tombstone-testpypi,
  verify-tombstone-testpypi, publish-tombstone-pypi,
  publish-tombstone-vsix
- staging.yml: build-pypi-tombstone, publish-tombstone-testpypi,
  verify-tombstone-testpypi, publish-tombstone-vsix-prerelease
- cut-release.yml: 3 tombstone version-sync sub-blocks + git-add
  paths + PR-body bullets

github-release simplification:
- needs: [publish-pypi, publish-vsix] (was: + 2 tombstone jobs)
- if: drop dead 'always() &&' guard (both deps are required, so
  always() never changes behavior). Verified equivalent on all four
  outcome combinations: both-success, partial-success, cancelled,
  upstream-failure.

Repo hygiene:
- Delete scripts/verify_tombstones.py (already-done one-shot harness;
  would now fail with 'asset not found' on its 6-asset expectation
  because the pipeline ships only 3 assets).
- Delete docs/verifying-tombstones.md (references deleted CI jobs).
- Delete tombstones/pypi/ and tombstones/vscode/{LICENSE,CHANGELOG.md,
  icon.png,package.json,README.md,tsconfig.json,.vscodeignore,src} —
  orphaned source no workflow consumes anymore.
- Keep tombstones/vscode/CDN_DEPRECATION_REQUEST.md as the template
  for the still-pending Microsoft CDN deprecation request.

Line-ending consistency:
- Add .gitattributes pinning *.yml / *.yaml / *.sh / *.md / *.py to
  LF so Windows clones with core.autocrlf=true don't repeatedly flip
  workflow files between CRLF and LF.
- Normalize _build.yml and ci.yml (previously CRLF) to LF so the
  entire .github/workflows/ tree uses one convention.

Tombstone references remaining (intentional): CHANGELOG.md historical
v0.3.0 entries; tombstones/vscode/CDN_DEPRECATION_REQUEST.md.

Refs: #181

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The cloud-eval parser fix in the previous release added a `0 usable
scores` warning that points operators at
`.agentops/results/latest/cloud_output_items.json` for triage. That
file was being written by the orchestrator but never made it into the
GitHub Actions / Azure DevOps artifact bundle, so the very people who
needed it most (anyone hitting an unrecognized Foundry grader shape in
CI) could not actually inspect it without re-running the eval locally.

Add the raw dump to both `__EVAL_ARTIFACT_PATHS__` upload lists in
`services/cicd.py`.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tone-pipeline-jobs

# Conflicts:
#	CHANGELOG.md
chore(ci): retire one-shot tombstone publish jobs
The publish-dev job has been failing on every push to develop with
'OpenID Connect token retrieval failed: missing or insufficient OIDC
token permissions'.

Root cause:
- Job passed password: ${{ secrets.TEST_PYPI_TOKEN }} to
  pypa/gh-action-pypi-publish
- TEST_PYPI_TOKEN secret was never configured in the staging environment
- The action fell back to OIDC Trusted Publishing, which failed because
  the job had no 'permissions: id-token: write' declaration

Fix:
- Add job-level 'permissions: id-token: write' to publish-dev
- Remove the password: line so the action uses OIDC unambiguously
- Document the new Trusted Publisher setup steps in the workflow header,
  mirroring the pattern proven in staging.yml and release.yml

Pre-merge operational prerequisites:
1. Register a Trusted Publisher on TestPyPI for this workflow:
   https://test.pypi.org/manage/project/agentops-accelerator/settings/publishing/
   owner=Azure, repo=agentops, workflow=ci.yml, environment=staging
   (separate entry from the workflow=staging.yml / workflow=release.yml
   publishers — TestPyPI matches workflow filename exactly).
2. Confirm the 'staging' GitHub Environment allows develop under
   'Deployment branches and tags'.

Without these, the first post-merge run will fail with 'not a trusted
publisher' or environment-policy errors, not the current OIDC error.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
fix(ci): migrate publish-dev to OIDC Trusted Publishing
)

When a Foundry azure_ai_evaluator grader fails to execute (e.g., the evaluator service principal lacks Cognitive Services OpenAI User on the model deployment), the per-metric score returns null and the real cause is buried in result.sample.error.message. Without surfacing it, operators see only actual=missing in the threshold table and have to dig into cloud_output_items.json to find the RBAC failure.

The parser now extracts sample.error.message (and top-level error dicts), prefixing the error code when present. The orchestrator's 0-usable-scores warning quotes the first grader error so CI logs carry the actionable cause.
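
As a rough sketch of that extraction, assuming the error is a dict with `message` and an optional `code` (the `"code: message"` output format and the function name are hypothetical):

```python
from typing import Optional


def extract_grader_error(result: dict) -> Optional[str]:
    """Return the grader's error text, prefixed with its code when present."""
    candidates = []
    sample = result.get("sample")
    if isinstance(sample, dict):
        candidates.append(sample.get("error"))  # sample.error.message
    candidates.append(result.get("error"))      # top-level error dict
    for err in candidates:
        if isinstance(err, dict) and err.get("message"):
            code = err.get("code")
            message = str(err["message"])
            return f"{code}: {message}" if code else message
    return None


# Illustrative item shaped like the failure described above.
item = {
    "score": None,
    "sample": {"error": {"code": "PermissionDenied", "message": "principal lacks access"}},
}
print(extract_grader_error(item))
```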

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
# Conflicts:
#	CHANGELOG.md
#	src/agentops/pipeline/cloud_results.py
#	src/agentops/pipeline/orchestrator.py
#	tests/unit/test_cloud_results.py
…le (#203)

Foundry `azure_ai_evaluator` graders impersonate the OIDC principal
to call OpenAI; without `Cognitive Services OpenAI User` on the
underlying AI Services account the graders fail with a 401
PermissionDenied and every cloud eval metric returns null. Verified
end-to-end on placerda/agentops-prompt-quickstart: after granting the
role, the first PR run goes green from scratch.

- agentops-workflow SKILL.md: pre-dispatch checks now list both Foundry
  User (Foundry project) AND Cognitive Services OpenAI User (AI
  Services account), with role ids and az role assignment create
  commands for each.
- tutorial-prompt-agent-quickstart.md: step 12's Copilot prompt and the
  workflow-skill walkthrough list both roles.
- tutorial-end-to-end.md: both workflow-skill prompts list both roles.
- docs/ci-github-actions.md: prerequisite section lists both roles with
  the OpenAI graders' failure mode spelled out.
- plugins/agentops/skills/agentops-workflow/SKILL.md: synced from src/.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…licit (#205)

When the user runs `agentops init --azd-env <name>` to bootstrap a fresh env (e.g. `--azd-env dev` while the active env is `sandbox`), the wizard previously pre-filled `AZURE_AI_FOUNDRY_PROJECT_ENDPOINT` from whatever source it could find first: process environment, legacy top-level `project_endpoint:` in `agentops.yaml`, or a *different* `.azure/<env>/.env` file. This silently leaked the sandbox endpoint into the dev env and caused multi-env tutorial users to push the wrong URL to GitHub.

Now `discover_defaults` tracks the endpoint provenance, and when the wizard is called with an explicit `target_env_name` it refuses to pre-fill the endpoint default if the source is not the targeted env's own .env file. The user gets a short note explaining where the suspect default came from and is prompted with no default. Values picked up from the targeted env's own `.azure/<env>/.env` are still honored. Bare `agentops init` (no `--azd-env`) keeps its existing best-effort behavior.
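
The provenance rule can be sketched as below. The `EndpointDefault` type, the `endpoint_prefill` function, and the source labels such as `"azd-env:dev"` are hypothetical; only the behavior (honor the targeted env's own file, refuse other sources when `target_env_name` is explicit, keep best-effort behavior otherwise) comes from the description above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class EndpointDefault:
    value: str
    source: str  # assumed labels: "process-env", "agentops.yaml", "azd-env:<name>"


def endpoint_prefill(
    found: Optional[EndpointDefault],
    target_env_name: Optional[str],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (default_to_offer, note_for_user)."""
    if found is None:
        return None, None
    if target_env_name is None:
        # Bare `agentops init`: keep best-effort behavior.
        return found.value, None
    if found.source == f"azd-env:{target_env_name}":
        # Value comes from the targeted env's own .env file.
        return found.value, None
    note = (
        f"Ignoring endpoint found in {found.source}: "
        f"it does not belong to env '{target_env_name}'."
    )
    return None, note


sandbox = EndpointDefault("https://sandbox.example.invalid", "azd-env:sandbox")
print(endpoint_prefill(sandbox, "dev"))
```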

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The three documentation entries that were labeled "Prompt Agent quickstart", "Hosted Agent quickstart", and "End-to-end workshop" now read as "Foundry Prompt Agent tutorial", "Hosted or HTTP Agent tutorial", and "End-to-end tutorial" across README.md, plugins/agentops/README.md, AGENTS.md, docs/concepts.md, docs/doctor-explained.md, the agentops-workflow skill (both synced copies), and the H1s plus cross-references inside each tutorial doc.

The README description for the end-to-end tutorial now also states explicitly that it extends either of the type-specific tutorials (sandbox -> dev -> qa -> prod plus Foundry red-team scans plus trace-to-regression promotion) so the difference between the three is obvious at a glance.

The "quickstart" framing no longer fits doc bodies that grew past 1000 lines covering multi-environment promotion, regression injection, Doctor evidence, and Cockpit. Tutorial filenames are intentionally preserved (tutorial-*-quickstart.md) to keep inbound links and bookmarks stable.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… steps (#207)

When users run the agentops-workflow skill in the CI-wiring step of either
the prompt-agent tutorial (step 12B) or the end-to-end tutorial, the skill
already commits the workspace, pushes main to GitHub, and triggers a first
verification run of agentops-pr.yml. Step 13 of tutorial-prompt-agent-quickstart.md
and the baseline-run paragraph in tutorial-end-to-end.md now open with an
explicit 'if you used the workflow skill above, this is already done' callout
and reframe the manual git add/commit/push and gh workflow run as a fallback
for users who skipped the skill or wired CI by hand.

The deliberate baseline-PR step that follows (open feature branch, open PR,
merge once green) is unchanged: it must still go through a real pull request,
which the skill does not do for you, so that the rolling Doctor history is seeded.

The hosted-agent tutorial is untouched: its post-skill step (step 11, Open
Cockpit) does not repeat any of the skill's setup actions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two changes:

1. tutorial-prompt-agent-quickstart.md step 13 callout said 'step 12B',
   but step 12 has no A/B split anymore since the wording rename.
   Changed to 'step 12'.
2. tutorial-end-to-end.md step 6 callout said 'workflow skill above',
   which is technically correct but vague. Tightened to
   'workflow skill in step 5 above' so readers can scroll back without
   guessing.

Also fixed the matching CHANGELOG entry from 'step 12B' to 'step 12'
and added 'step 5' for the end-to-end reference. Audited all other
step-number references across the three tutorials and confirmed they
are consistent.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@placerda placerda merged commit 07e0c28 into main May 29, 2026
5 checks passed
@placerda placerda deleted the release/v0.3.1 branch May 29, 2026 20:24

3 participants