Release v0.3.1 #209
Merged
Adds an optional `prompt_agent_bootstrap` block to `agentops.yaml` that lets the prompt-agent deploy workflow create the first version of an agent in a dev / qa / prod Foundry project that does not yet contain one. When the stage step looks up the seed agent and gets a 404, it reads the model deployment (required) plus optional `description`, `model_parameters`, and `tools` from `prompt_agent_bootstrap`, combines them with `prompt_file`, and creates the first version automatically. The deployment artifact records the new `action: bootstrapped` for that first run; subsequent deploys follow the normal reuse / next-version flow.
This eliminates the previous per-environment manual seeding step that forced users to recreate the same prompt agent in every Foundry project before CI could promote prompts to it. The first PR / deploy into an empty dev project now succeeds out of the box; only the sandbox project needs a manual seed (so authors still have a playground to iterate in).
Authentication (401 / 403) and other non-404 errors continue to propagate: the bootstrap path only triggers on a genuine `agent does not exist` 404, detected by a strict `_is_not_found_error` helper.
`agentops workflow analyze` now emits a warning when a prompt-agent workspace is missing `prompt_agent_bootstrap`, so the recommended configuration surfaces before operators hit the gap in CI.
Documentation:
- tutorial-prompt-agent-quickstart.md: rewrote section 4 (sandbox-only seed), section 9 (yaml example + bootstrap callout + project_endpoint warning), section 11 (stage-on-empty flow), and section 13 (action: bootstrapped on first deploy artifact).
- tutorial-end-to-end.md: added a multi-env bootstrap callout pointing at the prompt-agent quickstart for the full journey.
- agentops-config skill: documented the new optional block.
- agentops-workflow skill: documented the bootstrap-on-empty branch.
- CHANGELOG Unreleased: new entry under Added.
Tests: 16 new unit tests across test_prompt_deploy.py (bootstrap path, missing config raises, auth errors propagate, ignores bootstrap when seed exists), test_agentops_config.py (schema validation), and test_workflow_analysis.py (missing-bootstrap warning, silent when present). Full suite: 784 passed, 3 skipped.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
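
The bootstrap-on-404 branch described above can be pictured roughly as follows. This is a minimal sketch, not the code in `prompt_deploy`: `ServiceError`, `resolve_or_bootstrap`, the `"model"` config key, the `"instructions"` field, and the `"seed-found"` action label are hypothetical names chosen for illustration. Only the `action: bootstrapped` value, the optional `description` / `model_parameters` / `tools` fields, and the rule that non-404 errors propagate come from the commit message.

```python
from typing import Any, Callable, Dict, Optional


class ServiceError(Exception):
    """Hypothetical stand-in for an SDK HTTP error that carries a status code."""

    def __init__(self, status_code: int, message: str = "") -> None:
        super().__init__(message)
        self.status_code = status_code


def is_not_found_error(exc: BaseException) -> bool:
    # Strict check: only a real HTTP 404 counts as "agent does not exist".
    return getattr(exc, "status_code", None) == 404


def resolve_or_bootstrap(
    get_seed: Callable[[], Dict[str, Any]],
    create_version: Callable[[Dict[str, Any]], Dict[str, Any]],
    bootstrap: Optional[Dict[str, Any]],
    prompt_text: str,
) -> Dict[str, Any]:
    try:
        # "seed-found" is a placeholder label for the normal reuse path.
        return {"action": "seed-found", "agent": get_seed()}
    except Exception as exc:
        if not is_not_found_error(exc):
            raise  # 401 / 403 and every other error propagate unchanged

    # Seed lookup returned 404: the bootstrap block must name a model deployment.
    # The "model" key name is an assumption, not the documented schema.
    if not bootstrap or not bootstrap.get("model"):
        raise ValueError(
            "seed agent not found and prompt_agent_bootstrap has no model deployment"
        )

    spec: Dict[str, Any] = {"model": bootstrap["model"], "instructions": prompt_text}
    for key in ("description", "model_parameters", "tools"):
        if key in bootstrap:
            spec[key] = bootstrap[key]  # optional fields are copied only when set
    return {"action": "bootstrapped", "agent": create_version(spec)}
```

Keeping the 404 test in its own small predicate makes the "auth errors propagate" case easy to cover with a unit test, which matches the test list in the commit.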
…ent.json field names (#189)
The prompt-agent quickstart tutorial assumed Foundry assigns version :1 when you save+publish a prompt agent for the first time in the portal. In practice, Foundry numbers the unpublished draft :1 and assigns :2 on first publish, so the seed pointer in agentops.yaml should be travel-agent:2, not :1.
Also correct the example foundry-agent.json: the artifact written by prompt_deploy uses flat field names (source_agent, candidate_agent, prompt_sha256, git_sha, workflow_url), not the nested agentops.* shape the tutorial previously showed.
Add a callout explaining the SDK vs. portal numbering asymmetry: because CI uses the SDK and starts at :1 in an empty project, the bootstrap may fire on the first one or two deploys per environment before the env catches up to the seed value. After that, normal reuse/create flow takes over. This is fine because prompt_sha256 + git_sha are the durable cross-environment identity, not the per-project version numbers.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ertion The verify-testpypi smoke test in staging.yml and release.yml runs 'agentops init' then asserts 'test -f .azure/config.json'. In CI stdin is not a TTY so init skips the interactive wizard, which means the azd bootstrap (the only path that creates .azure/config.json without explicit flags) never runs and the assertion always fails. Add --no-prompt --azd-env testenv to both smoke tests so init unconditionally bootstraps a deterministic azd env directory. Verified locally: init --no-prompt --azd-env testenv produces .azure/config.json (defaultEnvironment=testenv) and .azure/testenv/.env without any prompts; rc=0. Failure surfaced by staging run 26608637027 against release/v0.3.0 after the OIDC publisher gap for agentops-accelerator was fixed and publish-testpypi started succeeding.
…190)
Section 8 told readers to add APPLICATIONINSIGHTS_CONNECTION_STRING to .azure/dev/.env but never said where the string comes from. Reword to:
- Lead with the fact that the manual step is optional: AgentOps auto-discovers the connection string through the Azure AI Projects SDK when the dev Foundry project has App Insights connected.
- Document three discovery paths for the manual case: Foundry portal (Tracing tab), Azure Portal (App Insights Overview), and az CLI.
- Promote the section 7 callout to set the same expectation earlier.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
)
PO running through the tutorial pointed out that section 8's optional App Insights subsection reads as if the resource might not exist, when in fact step 3 (both Path A and Path B) already instructs the reader to attach App Insights to the dev project. Reframe accordingly:
- Section 7 callout now leads with 'should already be wired from step 3' and gives a 10-second portal check to confirm.
- Section 8 optional subsection now starts with 'you can skip this subsection', shows a 2-row decision table for the verification outcome, and only walks through the override case (a dedicated observability resource) at the end.
Adds a 'just click Connect Application Insights in the Foundry portal' fallback for the rare case the resource was not created in step 3.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…smatch
In staging.yml the agentops-accelerator package is built with
setuptools-scm and published to TestPyPI as 0.2.3.devN, while the
tombstone (agentops-toolkit) hardcodes 'agentops-accelerator>=0.3.0'
in its pyproject.toml. A standard pip install of the tombstone then
fails because TestPyPI carries no >=0.3.0 candidate for the main
package — the version is only minted when release.yml runs against
the real v0.3.0 tag.
Restructure the verify job into two install steps:
1. agentops-toolkit==<tombstone_version> --no-deps from TestPyPI
(installs the wheel for its metadata only; --no-deps preserves
'pip show Requires:' content so the redirect assertion still
holds, verified empirically by Critic)
2. agentops-accelerator --pre from TestPyPI (satisfies the runtime
dep so 'import agentops' resolves; --pre is required so pip
accepts the dev versions present in staging)
Both pre-existing assertions are preserved unchanged:
- pip show agentops-toolkit lists agentops-accelerator under Requires
- python -c 'import agentops' returns successfully
release.yml is intentionally NOT modified: its verify-tombstone-testpypi
works correctly because publish-testpypi (in release.yml) publishes
agentops-accelerator at the real 0.3.0 tag version, so the tombstone
dep resolves cleanly without --no-deps.
…tops-toolkit
The VSIX tombstone scaffolded in 2f128d9 mistakenly identified itself as AgentOpsToolkit.agentops-skills, an extension name that has never existed on the VS Code Marketplace. The actual deprecated extension is AgentOpsToolkit.agentops-toolkit (live since v0.1.4, latest v0.1.8). It also pointed users to AgentOpsAccelerator.agentops-skills as the migration target, but the new extension's real identity (per plugins/agentops/package.json) is AgentOpsAccelerator.agentops-accelerator.
Without this fix, publishing the v0.3.0 tombstone would create a brand-new extension nobody has installed, and the ~5 versions of the legacy extension already in users' VS Code installs would never receive a deprecation prompt.
Changes:
tombstones/vscode/package.json
- rename 'agentops-skills' to 'agentops-toolkit' so the published identity is AgentOpsToolkit.agentops-toolkit (republishes onto the existing listing, triggers in-place update for current users)
tombstones/vscode/src/extension.ts
- NEW_EXTENSION_ID points at AgentOpsAccelerator.agentops-accelerator
- DEPRECATION_MESSAGE interpolates the constant instead of hardcoding the literal (prevents future drift)
tombstones/vscode/README.md
- corrected legacy ID + migration target
tombstones/vscode/CHANGELOG.md
- corrected legacy ID + migration target
tombstones/vscode/CDN_DEPRECATION_REQUEST.md
- corrected all subject lines, body IDs, and pre-flight checklist IDs
CHANGELOG.md
- corrected the [0.3.0] VS Code Marketplace tombstone bullet (flows into GitHub release body and PyPI tombstone long description)
docs/verifying-tombstones.md
- corrected all 10 publisher-prefixed identifiers in verification commands (vsce show, --install-extension, marketplace URLs)
STORAGE_KEY ('agentops-toolkit.deprecation-prompt-shown') is unchanged; it is already aligned with the legacy namespace. Version 0.3.0 > existing 0.1.8 satisfies VSCE monotonicity.
Legacy versions were declarative-only (no commands/keybindings/settings/views, only contributes.chatSkills), so replacing them with the tombstone manifest carries no user-binding break risk.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The two staging VSIX publish steps (main extension at line 348 and tombstone at line 425) carried 'continue-on-error: true' which silently swallowed every failure mode — including an expired VSCE_PAT. This made staging look green when nothing was actually published to the Marketplace, hiding the auth issue for an extended period and producing zero pre-release versions for either AgentOpsAccelerator.agentops-accelerator or AgentOpsToolkit.agentops-toolkit. Replaces the blanket tolerance with the same conditional pattern release.yml already uses (lines 469-489): capture combined stdout+stderr, swallow the exit code ONLY when the output contains 'already exists' (re-run of an already-published pre-release version, the only benign failure mode), and propagate every other exit code so auth, network, validation, and marketplace errors fail the job and surface in CI. Also moves the PAT into an env: block as VSCE_PAT (out of the interpolated shell line) to match release.yml's pattern and avoid leaking the secret into command history or logs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
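
The conditional tolerance described above (swallow the exit code only when the output contains 'already exists') is implemented as shell inside the workflow. The same decision logic can be sketched in Python for clarity; `run_tolerating_already_exists` is a hypothetical helper name, and nothing here is taken from staging.yml or release.yml.

```python
import subprocess
import sys
from typing import Sequence


def run_tolerating_already_exists(cmd: Sequence[str]) -> int:
    """Run cmd and return the exit code the CI step should report."""
    # Merge stderr into stdout so one string holds everything the tool printed.
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    print(proc.stdout, end="")  # keep the tool output visible in the job log

    if proc.returncode != 0 and "already exists" in proc.stdout:
        # Re-run of an already-published version: the only benign failure.
        return 0
    # Auth, network, validation, and marketplace errors keep their exit code.
    return proc.returncode


if __name__ == "__main__":
    sys.exit(run_tolerating_already_exists(sys.argv[1:]))
```

Matching on a specific message rather than blanket-ignoring failures is what lets an expired token fail the job while an idempotent re-run stays green.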
Four agent modules imported PyYAML (`import yaml`) but `pyproject.toml` only declares `ruamel.yaml`. On clean CI runners (no PyYAML in the venv) this raised `ModuleNotFoundError: yaml` at `agentops doctor` time, breaking the doctor step of the generated deploy workflow.
Refactored to the existing `ruamel.yaml` dependency (same pattern as `src/agentops/utils/yaml.py`):
- `agent/checks/opex_workspace.py`
- `agent/checks/spec_conformance.py`
- `agent/llm_assist/_bundle_rule.py`
- `agent/cockpit.py` (preserved the lazy-import semantics inside `_resolve_agent_identity`)
Behaviour preserved: `YAML(typ='safe').load(...)` returns plain dicts/lists just like `yaml.safe_load(...)`. No new runtime dependency.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…result shape (#195)
The cloud-eval parser was returning value=null for every metric in real Foundry runs even when graders completed successfully, causing the PR / deploy gate to fire 'Threshold status: FAILED' with all thresholds showing actual=missing on the very first tutorial pass.
Root cause: _metric_from_result only probed {score|value|result|passed} at the top level. The real azure_ai_evaluator shape (verified against Azure/azure-sdk-for-python fixture evaluation_util_convert_expected_output.json) emits {type, name, metric, score, label, reason, threshold, passed, sample, status}, and some custom prompt-based graders nest the score under sample.score / details.score.
Fix: widen the probe to (score, value, result, metric_value, rating, grader_score, numeric_value), then passed (bool), then label ('pass'/'fail'), then descend into sample/details. Treat score: 0 as a legitimate value (was being lost). When still nothing found, record a structured error pointing at the new raw-items artifact.
Also: always persist the raw Foundry output_items as cloud_output_items.json next to results.json so future parser regressions are debuggable from the artifact bundle alone, and emit an explicit progress warning when a cloud run yields zero usable scores despite returning rows.
Tests: +5 new tests covering the real Foundry shape, score=0 boundary, label-only fallback, nested sample.score, and the diagnostic error path. Full suite: 789 passed, 3 skipped.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
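
The widened probe order from the Fix paragraph can be sketched as below. This is an illustration under assumptions, not the actual `_metric_from_result`: the function name `probe_metric` is hypothetical, and mapping `passed` / `label` to 1.0 and 0.0 is a guess at how a boolean outcome might be turned into a number.

```python
from collections.abc import Mapping
from typing import Any, Optional

# Key order taken from the commit message.
SCORE_KEYS = (
    "score", "value", "result", "metric_value",
    "rating", "grader_score", "numeric_value",
)
NESTED_KEYS = ("sample", "details")


def _is_number(value: Any) -> bool:
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def probe_metric(result: Mapping) -> Optional[float]:
    # 1. Numeric score keys. The check is "is a number", not truthiness,
    #    so a legitimate score of 0 is kept.
    for key in SCORE_KEYS:
        value = result.get(key)
        if _is_number(value):
            return float(value)

    # 2. Boolean pass/fail (1.0 / 0.0 mapping is an assumption).
    passed = result.get("passed")
    if isinstance(passed, bool):
        return 1.0 if passed else 0.0

    # 3. Label-only fallback.
    label = result.get("label")
    if isinstance(label, str) and label.lower() in ("pass", "fail"):
        return 1.0 if label.lower() == "pass" else 0.0

    # 4. Descend into nested containers used by some custom graders.
    for key in NESTED_KEYS:
        nested = result.get(key)
        if isinstance(nested, Mapping):
            found = probe_metric(nested)
            if found is not None:
                return found

    return None  # caller records a structured error for this case
```

The `score: 0` regression mentioned in the commit is the classic result of writing `if value:` instead of testing the type, which is why the numeric check is isolated in `_is_number`.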
…ones) # Conflicts: # CHANGELOG.md
The agentops-toolkit -> agentops-accelerator deprecation tombstones
were shipped one-shot for v0.3.0 / v0.3.1 (PyPI manual upload, VSIX
via marketplace). Going forward, the release pipeline must not touch
any tombstone work.
Workflow deletions:
- release.yml: build-pypi-tombstone, publish-tombstone-testpypi,
verify-tombstone-testpypi, publish-tombstone-pypi,
publish-tombstone-vsix
- staging.yml: build-pypi-tombstone, publish-tombstone-testpypi,
verify-tombstone-testpypi, publish-tombstone-vsix-prerelease
- cut-release.yml: 3 tombstone version-sync sub-blocks + git-add
paths + PR-body bullets
github-release simplification:
- needs: [publish-pypi, publish-vsix] (was: + 2 tombstone jobs)
- if: drop dead 'always() &&' guard (both deps are required, so
always() never changes behavior). Verified equivalent on all four
outcome combinations: both-success, partial-success, cancelled,
upstream-failure.
Repo hygiene:
- Delete scripts/verify_tombstones.py (already-done one-shot harness;
would now fail with 'asset not found' on its 6-asset expectation
because the pipeline ships only 3 assets).
- Delete docs/verifying-tombstones.md (references deleted CI jobs).
- Delete tombstones/pypi/ and tombstones/vscode/{LICENSE,CHANGELOG.md,
icon.png,package.json,README.md,tsconfig.json,.vscodeignore,src} —
orphaned source no workflow consumes anymore.
- Keep tombstones/vscode/CDN_DEPRECATION_REQUEST.md as the template
for the still-pending Microsoft CDN deprecation request.
Line-ending consistency:
- Add .gitattributes pinning *.yml / *.yaml / *.sh / *.md / *.py to
LF so Windows clones with core.autocrlf=true don't repeatedly flip
workflow files between CRLF and LF.
- Normalize _build.yml and ci.yml (previously CRLF) to LF so the
entire .github/workflows/ tree uses one convention.
Tombstone references remaining (intentional): CHANGELOG.md historical
v0.3.0 entries; tombstones/vscode/CDN_DEPRECATION_REQUEST.md.
Refs: #181
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The cloud-eval parser fix in the previous release added a `0 usable scores` warning that points operators at `.agentops/results/latest/cloud_output_items.json` for triage. That file was being written by the orchestrator but never made it into the GitHub Actions / Azure DevOps artifact bundle, so the very people who needed it most (anyone hitting an unrecognized Foundry grader shape in CI) could not actually inspect it without re-running the eval locally. Add the raw dump to both `__EVAL_ARTIFACT_PATHS__` upload lists in `services/cicd.py`. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tone-pipeline-jobs # Conflicts: # CHANGELOG.md
chore(ci): retire one-shot tombstone publish jobs
The publish-dev job has been failing on every push to develop with
'OpenID Connect token retrieval failed: missing or insufficient OIDC
token permissions'.
Root cause:
- Job passed password: ${{ secrets.TEST_PYPI_TOKEN }} to
pypa/gh-action-pypi-publish
- TEST_PYPI_TOKEN secret was never configured in the staging environment
- The action fell back to OIDC Trusted Publishing, which failed because
the job had no 'permissions: id-token: write' declaration
Fix:
- Add job-level 'permissions: id-token: write' to publish-dev
- Remove the password: line so the action uses OIDC unambiguously
- Document the new Trusted Publisher setup steps in the workflow header,
mirroring the pattern proven in staging.yml and release.yml
Pre-merge operational prerequisites:
1. Register a Trusted Publisher on TestPyPI for this workflow:
https://test.pypi.org/manage/project/agentops-accelerator/settings/publishing/
owner=Azure, repo=agentops, workflow=ci.yml, environment=staging
(separate entry from the workflow=staging.yml / workflow=release.yml
publishers — TestPyPI matches workflow filename exactly).
2. Confirm the 'staging' GitHub Environment allows develop under
'Deployment branches and tags'.
Without these, the first post-merge run will fail with 'not a trusted
publisher' or environment-policy errors, not the current OIDC error.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
fix(ci): migrate publish-dev to OIDC Trusted Publishing
) When a Foundry azure_ai_evaluator grader fails to execute (e.g., the evaluator service principal lacks Cognitive Services OpenAI User on the model deployment), the per-metric score returns null and the real cause is buried in result.sample.error.message. Without surfacing it, operators see only actual=missing in the threshold table and have to dig into cloud_output_items.json to find the RBAC failure. The parser now extracts sample.error.message (and top-level error dicts), prefixing the error code when present. The orchestrator's 0-usable-scores warning quotes the first grader error so CI logs carry the actionable cause. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
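
A minimal sketch of the error extraction described above, assuming error payloads are dicts with `message` and an optional `code`. The helper names and the `"code: message"` prefix format are hypothetical; the commit only states that the code is prefixed when present.

```python
from collections.abc import Mapping
from typing import Any, Optional


def _format_error(err: Any) -> Optional[str]:
    # Only dict-shaped errors with a non-empty message are considered.
    if not isinstance(err, Mapping):
        return None
    message = err.get("message")
    if not message:
        return None
    code = err.get("code")
    # Prefix the error code when the grader supplied one.
    return f"{code}: {message}" if code else str(message)


def extract_grader_error(result: Mapping) -> Optional[str]:
    # Prefer the nested sample.error, then fall back to a top-level error dict.
    sample = result.get("sample")
    if isinstance(sample, Mapping):
        found = _format_error(sample.get("error"))
        if found:
            return found
    return _format_error(result.get("error"))
```

With a helper like this, the zero-usable-scores warning can quote the first non-empty result so the RBAC failure shows up in the CI log instead of only in the raw dump.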
# Conflicts: # CHANGELOG.md # src/agentops/pipeline/cloud_results.py # src/agentops/pipeline/orchestrator.py # tests/unit/test_cloud_results.py
…le (#203)
Foundry `azure_ai_evaluator` graders impersonate the OIDC principal to call OpenAI; without `Cognitive Services OpenAI User` on the underlying AI Services account the graders fail with a 401 PermissionDenied and every cloud eval metric returns null. Verified end-to-end on placerda/agentops-prompt-quickstart: after granting the role, the first PR run goes green from scratch.
- agentops-workflow SKILL.md: pre-dispatch checks now list both Foundry User (Foundry project) AND Cognitive Services OpenAI User (AI Services account), with role ids and az role assignment create commands for each.
- tutorial-prompt-agent-quickstart.md: step 12's Copilot prompt and the workflow-skill walkthrough list both roles.
- tutorial-end-to-end.md: both workflow-skill prompts list both roles.
- docs/ci-github-actions.md: prerequisite section lists both roles with the OpenAI graders' failure mode spelled out.
- plugins/agentops/skills/agentops-workflow/SKILL.md: synced from src/.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
# Conflicts: # CHANGELOG.md
…licit (#205) When the user runs `agentops init --azd-env <name>` to bootstrap a fresh env (e.g. `--azd-env dev` while the active env is `sandbox`), the wizard previously pre-filled `AZURE_AI_FOUNDRY_PROJECT_ENDPOINT` from whatever source it could find first: process environment, legacy top-level `project_endpoint:` in `agentops.yaml`, or a *different* `.azure/<env>/.env` file. This silently leaked the sandbox endpoint into the dev env and caused the multi-env tutorial users to push the wrong URL to GitHub. Now `discover_defaults` tracks the endpoint provenance, and when the wizard is called with an explicit `target_env_name` it refuses to pre-fill the endpoint default if the source is not the targeted env's own .env file. The user gets a short note explaining where the suspect default came from and is prompted with no default. Values picked up from the targeted env's own `.azure/<env>/.env` are still honored. Bare `agentops init` (no `--azd-env`) keeps its existing best-effort behavior. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
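
The provenance rule above can be illustrated with a small sketch. `EndpointDefault`, `endpoint_prefill`, and the source labels (`"process-env"`, `"legacy-yaml"`, `"azd-env:<name>"`) are hypothetical; the real `discover_defaults` may track provenance differently.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class EndpointDefault:
    value: str
    source: str  # e.g. "process-env", "legacy-yaml", or "azd-env:<name>"


def endpoint_prefill(
    found: Optional[EndpointDefault],
    target_env_name: Optional[str],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (default_to_prefill, note_for_user)."""
    if found is None:
        return None, None
    if target_env_name is None:
        # Bare `agentops init`: keep the best-effort behaviour.
        return found.value, None
    if found.source == f"azd-env:{target_env_name}":
        # The value came from the targeted env's own .env file, so honour it.
        return found.value, None
    # Any other source could leak another environment's endpoint.
    note = (
        f"Ignoring endpoint default from {found.source}: "
        f"it does not belong to env '{target_env_name}'."
    )
    return None, note
```

For example, a value discovered in the sandbox env while running with `--azd-env dev` yields no default plus a note, which corresponds to the "prompted with no default" behaviour in the commit.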
The three documentation entries that were labeled "Prompt Agent quickstart", "Hosted Agent quickstart", and "End-to-end workshop" now read as "Foundry Prompt Agent tutorial", "Hosted or HTTP Agent tutorial", and "End-to-end tutorial" across README.md, plugins/agentops/README.md, AGENTS.md, docs/concepts.md, docs/doctor-explained.md, the agentops-workflow skill (both synced copies), and the H1s plus cross-references inside each tutorial doc. The README description for the end-to-end tutorial now also states explicitly that it extends either of the type-specific tutorials (sandbox -> dev -> qa -> prod plus Foundry red-team scans plus trace-to-regression promotion) so the difference between the three is obvious at a glance. The "quickstart" framing no longer fits doc bodies that grew past 1000 lines covering multi-environment promotion, regression injection, Doctor evidence, and Cockpit. Tutorial filenames are intentionally preserved (tutorial-*-quickstart.md) to keep inbound links and bookmarks stable. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… steps (#207) When users run the agentops-workflow skill in the CI-wiring step of either the prompt-agent tutorial (step 12B) or the end-to-end tutorial, the skill already commits the workspace, pushes main to GitHub, and triggers a first verification run of agentops-pr.yml. Step 13 of tutorial-prompt-agent-quickstart.md and the baseline-run paragraph in tutorial-end-to-end.md now open with an explicit 'if you used the workflow skill above, this is already done' callout and reframe the manual git add/commit/push and gh workflow run as a fallback for users who skipped the skill or wired CI by hand. The deliberate baseline-PR step that follows (open feature branch, open PR, merge once green) is unchanged: it must still go through a real pull request, which the skill does not do for you, so that the rolling Doctor history is seeded. The hosted-agent tutorial is untouched: its post-skill step (step 11, Open Cockpit) does not repeat any of the skill's setup actions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two changes:
1. tutorial-prompt-agent-quickstart.md step 13 callout said 'step 12B', but step 12 has no A/B split anymore since the wording rename. Changed to 'step 12'.
2. tutorial-end-to-end.md step 6 callout said 'workflow skill above', which is technically correct but vague. Tightened to 'workflow skill in step 5 above' so readers can scroll back without guessing.
Also fixed the matching CHANGELOG entry from 'step 12B' to 'step 12' and added 'step 5' for the end-to-end reference. Audited all other step-number references across the three tutorials and confirmed they are consistent.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Release v0.3.1
Automated release branch created from develop.
What happened
- release/v0.3.1 created from develop
- CHANGELOG.md updated: versioned section [0.3.1] added
- 0.3.1 (package.json, plugin.json, marketplace.json)
Next steps
- main
- git tag v0.3.1 && git push origin v0.3.1
- git checkout develop && git merge main && git push origin develop
Checklist
- v0.3.1 pushed