Merged
71 commits
9e76b3f
feat(core): add flat agentops.yaml schema and evaluator inference
placerda Apr 27, 2026
f0d11c7
feat(pipeline): add 1.0 evaluation orchestrator with 4 backends
placerda Apr 27, 2026
a1c6a31
feat(cli,init,docs): wire flat schema into CLI and docs
placerda Apr 27, 2026
7f0e435
docs(readme): remove legacy multi-file quickstart section
placerda Apr 27, 2026
ff62b79
docs: drop unimplemented commands and stale planning docs from README
placerda Apr 27, 2026
a427ff4
feat(pipeline): opt-in publish to Foundry Evaluations panel
placerda Apr 27, 2026
167cf44
feat(e2e): add offline e2e demo script and GH Actions workflow
placerda Apr 27, 2026
5e90e52
fix(e2e): install azure-ai-evaluation in CI; check endpoint before SD…
placerda Apr 27, 2026
57bfefa
refactor(cli): remove planned-stub commands and dead modules
placerda Apr 27, 2026
2231af9
refactor(core): drop legacy schema, runner, backends, and templates
placerda Apr 27, 2026
b0249aa
refactor(skills): rewrite skills for flat 1.0 schema; drop monitor/tr…
placerda Apr 27, 2026
7687915
feat(mcp): add 'agentops mcp serve' MCP stdio server
placerda Apr 27, 2026
d96c138
docs(changelog): refresh 1.0 revamp section
placerda Apr 27, 2026
5f56188
ci(e2e): switch to manual-only trigger (workflow_dispatch)
placerda Apr 27, 2026
50e4a66
ci(e2e): add live Azure scenarios via OIDC + dynamic provisioning
placerda Apr 28, 2026
b8e6acf
test(e2e): lock in render_config schema contract
placerda Apr 28, 2026
989698b
docs(e2e): replace specific RG example with placeholder
placerda Apr 28, 2026
1ac31b2
fix(e2e): bootstrap.bicep API version and project endpoint output
placerda Apr 28, 2026
06a0ad4
fix(e2e): use 'agentops eval run' subcommand and skip foundry-hosted …
Copilot Apr 28, 2026
ff62b6f
fix(e2e): pass AZURE_OPENAI_ENDPOINT/DEPLOYMENT to eval steps
Copilot Apr 28, 2026
e1ddb76
fix(evaluators): match _latency sentinel name for runtime loader
Copilot Apr 28, 2026
961c838
fix(e2e): correct echo response_field to json.message; bump default m…
Copilot Apr 28, 2026
fc17823
fix(e2e): permissive thresholds, login on http-aca, latency-only for …
Copilot Apr 28, 2026
d1027ac
ci(e2e): tie live jobs to GitHub Environment 'e2e' for OIDC
Copilot Apr 28, 2026
2962d3b
feat(e2e): transient hosted agent with tools + per-scenario transcripts
Copilot Apr 28, 2026
197e9fd
fix(eval): drop unsupported TaskCompletionEvaluator; transcript grace…
placerda Apr 28, 2026
317824b
fix(foundry): tolerate tool-only responses; drop task_completion thre…
placerda Apr 28, 2026
26542df
fix(foundry): synthesize text from tool_call summaries when no message
placerda Apr 28, 2026
c84db07
fix(e2e): relax coherence/fluency/similarity thresholds for tool-only…
placerda Apr 28, 2026
b7cc02e
feat(e2e): markdown transcripts, force node24, document hosted-agent …
placerda Apr 28, 2026
b9b7811
ci(e2e): replace azure/login@v2 with bash composite action (no Node20)
placerda Apr 28, 2026
e18b8e0
ci(e2e): teardown also needs actions/checkout for local composite action
placerda Apr 28, 2026
8492bb7
ci(e2e): bump teardown checkout to v6 to silence Node20 warning
placerda Apr 28, 2026
7d04c61
e2e: replace echo http-aca with Microsoft Agent Framework hello-agent…
placerda Apr 28, 2026
14a040a
e2e(agent-app): drop pydantic pin (let agent-framework pick compatibl…
placerda Apr 28, 2026
fa88434
e2e(http-aca): drop explicit evaluators list (let auto-inference hand…
placerda Apr 28, 2026
a5b5fa3
e2e: long-lived UAMI for hello-agent (avoids 401 from AAD propagation)
placerda Apr 28, 2026
020a021
e2e: align artifact names with job names
placerda Apr 28, 2026
50063b4
e2e: consolidated run summary aggregating every job
placerda Apr 28, 2026
b1b6b43
e2e: fix consolidated summary metrics + bump download-artifact to v8
placerda Apr 28, 2026
d904a65
e2e: drop remaining hardcoded model name from HEADER text
placerda Apr 28, 2026
223f809
e2e(http-aca): add tool calling to hello-agent + tools dataset
placerda Apr 28, 2026
ed8124d
e2e(http-aca): add permissive thresholds for all auto-inferred evalua…
placerda Apr 28, 2026
c03b3f6
chore: gitignore tmp/ (was accidentally committed)
placerda Apr 28, 2026
b050b08
templates(workflows): bump to Node24 actions + silence azure/login wa…
placerda Apr 28, 2026
201828e
evaluators: pass tool_call trace to TaskAdherence/IntentResolution
placerda Apr 28, 2026
b0e31e4
e2e: retry transient 5xx/429 in hosted-agent bootstrap and invocations
placerda Apr 28, 2026
156d9b0
evaluators: pass tool_definitions to TaskAdherenceEvaluator
placerda Apr 28, 2026
f7e8ecf
results: capture evaluator reason field per row metric
placerda Apr 28, 2026
56d3eed
debug: dump raw payload for task_adherence/intent_resolution
placerda Apr 28, 2026
1c6530c
evaluators: fix task_adherence threshold for binary 0/1 SDK scoring
placerda Apr 28, 2026
8064fae
watchdog: add WAF-AI security posture audit (4th, opt-in category)
placerda Apr 28, 2026
287143a
scripts: one-shot bootstrap for E2E pipeline against a new Azure tenant
placerda Apr 29, 2026
048bb48
docs: drop legacy --flat flag from init examples
placerda Apr 29, 2026
44b0809
docs: rewrite end-to-end tutorial around a tool-calling support agent
placerda Apr 29, 2026
7aafa0f
docs(tutorial-end-to-end): add PowerShell env-var snippets
placerda Apr 29, 2026
e584711
docs(tutorial-end-to-end): make PowerShell the default shell
placerda Apr 29, 2026
754a73d
scripts(create_support_agent): preflight token + 'az login' hint
placerda Apr 29, 2026
0818e33
scripts(create_support_agent): print registered tools after creation
placerda Apr 29, 2026
6f44709
docs(tutorial-end-to-end): include azure-ai-evaluation in install step
placerda Apr 29, 2026
e016e7d
docs(tutorial-end-to-end): clarify AOAI endpoint shape and pin api-ve…
placerda Apr 29, 2026
37100ee
feat(runtime): default AZURE_OPENAI_API_VERSION to a New-Foundry-comp…
placerda Apr 29, 2026
f53025e
docs: end-to-end tutorial polish + flat-schema fixes; CLI output clea…
placerda May 7, 2026
afbb0f9
ci: run PR gate inside dev environment so OIDC resolves vars+subject
placerda May 7, 2026
28fd2fa
ci: workflow templates use agentops.yaml (1.0 schema), drop .agentops…
placerda May 7, 2026
cd8f3b5
ci(templates): install agentops-toolkit from develop until 1.0 ships …
placerda May 7, 2026
6136ef7
merge: develop into feature/revamp-1.0
placerda May 7, 2026
fa3c95d
test(cicd): PR template now runs inside environment: dev (OIDC fix)
placerda May 7, 2026
8392891
fix: ruff E701/F401 + Click 8.2 CliRunner.mix_stderr removal
placerda May 7, 2026
7f99d1b
test: skip integration tests when azure-ai-evaluation is missing; add…
placerda May 7, 2026
31b9af8
fix: clean mypy errors (pydantic plugin + 6 typing fixes)
placerda May 7, 2026
76 changes: 76 additions & 0 deletions .github/actions/azure-oidc-login/action.yml
@@ -0,0 +1,76 @@
name: Azure OIDC login (composite, no node20 deps)
description: |
  Drop-in replacement for azure/login@v2 that performs the OIDC federated
  token exchange entirely in bash, so it does not pull any Node.js 20
  JavaScript actions and keeps the Actions run free of "Node.js 20 is
  deprecated" annotations.

  After this step runs, the az CLI is authenticated and AZURE_* environment
  variables are exported for downstream tools (azure-identity etc).

inputs:
  client-id:
    description: "Microsoft Entra application (client) ID"
    required: true
  tenant-id:
    description: "Microsoft Entra tenant ID"
    required: true
  subscription-id:
    description: "Azure subscription ID"
    required: true
  audience:
    description: "Federated identity audience"
    required: false
    default: "api://AzureADTokenExchange"

runs:
  using: composite
  steps:
    - name: Federated OIDC login (bash)
      shell: bash
      env:
        AZURE_CLIENT_ID: ${{ inputs.client-id }}
        AZURE_TENANT_ID: ${{ inputs.tenant-id }}
        AZURE_SUBSCRIPTION_ID: ${{ inputs.subscription-id }}
        OIDC_AUDIENCE: ${{ inputs.audience }}
      run: |
        set -euo pipefail

        : "${ACTIONS_ID_TOKEN_REQUEST_TOKEN:?id-token permission missing on the job}"
        : "${ACTIONS_ID_TOKEN_REQUEST_URL:?id-token permission missing on the job}"

        echo "::group::Requesting OIDC ID token from GitHub"
        ID_TOKEN_JSON=$(curl -sS \
          -H "Authorization: bearer ${ACTIONS_ID_TOKEN_REQUEST_TOKEN}" \
          -H "Accept: application/json" \
          "${ACTIONS_ID_TOKEN_REQUEST_URL}&audience=${OIDC_AUDIENCE}")
        ID_TOKEN=$(printf '%s' "$ID_TOKEN_JSON" | python3 -c 'import sys,json;print(json.load(sys.stdin).get("value") or "")')
        if [[ -z "${ID_TOKEN}" || "${ID_TOKEN}" == "null" ]]; then
          echo "Failed to obtain GitHub OIDC ID token. Response was:" >&2
          echo "$ID_TOKEN_JSON" >&2
          exit 1
        fi
        # The federated token is short-lived and a secret; mask it before
        # any later command can write it to the log.
        echo "::add-mask::${ID_TOKEN}"
        echo "::endgroup::"

        echo "::group::az login --federated-token"
        az login \
          --service-principal \
          --username "${AZURE_CLIENT_ID}" \
          --tenant "${AZURE_TENANT_ID}" \
          --federated-token "${ID_TOKEN}" \
          --allow-no-subscriptions \
          --output none
        az account set --subscription "${AZURE_SUBSCRIPTION_ID}"
        echo "::endgroup::"

        # Export the same env vars azure/login@v2 sets for DefaultAzureCredential
        # and other downstream Azure SDKs.
        {
          echo "AZURE_CLIENT_ID=${AZURE_CLIENT_ID}"
          echo "AZURE_TENANT_ID=${AZURE_TENANT_ID}"
          echo "AZURE_SUBSCRIPTION_ID=${AZURE_SUBSCRIPTION_ID}"
          echo "AZURE_FEDERATED_TOKEN=${ID_TOKEN}"
        } >> "${GITHUB_ENV}"
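
From the caller's side, using the composite action above might look like the sketch below. It assumes the workflow lives in the same repository, so the action is referenced by local path after a checkout, and that the job runs in the `e2e` GitHub Environment named in the commit list. The workflow name, the job name, and the `vars.AZURE_*` variable names are illustrative assumptions, not taken from this PR.

```yaml
# Hypothetical caller workflow; anything marked "assumed" is not from the PR.
name: e2e-login-sketch              # assumed workflow name
on:
  workflow_dispatch:

permissions:
  id-token: write   # exposes ACTIONS_ID_TOKEN_REQUEST_* to the job
  contents: read    # needed by actions/checkout

jobs:
  login-check:                      # assumed job name
    runs-on: ubuntu-latest
    environment: e2e
    steps:
      # A local composite action can only be resolved after the repo is checked out.
      - uses: actions/checkout@v6
      - uses: ./.github/actions/azure-oidc-login
        with:
          client-id: ${{ vars.AZURE_CLIENT_ID }}              # assumed variable name
          tenant-id: ${{ vars.AZURE_TENANT_ID }}              # assumed variable name
          subscription-id: ${{ vars.AZURE_SUBSCRIPTION_ID }}  # assumed variable name
      # Shows the az CLI session created by the action.
      - run: az account show --output table
        shell: bash
```

If the job lacks `id-token: write`, the two `ACTIONS_ID_TOKEN_REQUEST_*` guards at the top of the action's script fail before any network call is made.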
2 changes: 1 addition & 1 deletion .github/skills/release-management/SKILL.md
@@ -1,6 +1,6 @@
---
name: release-management
-description: Guide maintainers and contributors through branching, versioning, changelog updates, and publishing agentops-toolkit. Trigger when users ask about branching strategy, creating a release, version tagging, publishing to PyPI, updating the changelog, cutting a release, opening a PR, or syncing a fork. Common phrases: "cut a release", "how do I publish", "create release branch", "tag a version", "update changelog", "release process", "bump version", "what branch should I use", "feature branch", "prepare release".
+description: 'Guide maintainers and contributors through branching, versioning, changelog updates, and publishing agentops-toolkit. Trigger when users ask about branching strategy, creating a release, version tagging, publishing to PyPI, updating the changelog, cutting a release, opening a PR, or syncing a fork. Common phrases include "cut a release", "how do I publish", "create release branch", "tag a version", "update changelog", "release process", "bump version", "what branch should I use", "feature branch", "prepare release".'
---

# Release Management
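
The one-line front-matter change in this file is likely a YAML parsing fix: a plain (unquoted) scalar cannot contain `: ` (a colon followed by a space), so the old `Common phrases: "cut a release"` text could not be read as a single string. A minimal illustration, using a shortened, hypothetical description rather than the real one:

```yaml
# Plain scalar (shown commented out): the inner ": " is treated as a
# mapping indicator, so YAML parsers reject the line.
# description: Guide maintainers. Common phrases: "cut a release"

# Single-quoted scalar: ": " and the double quotes are literal characters.
description: 'Guide maintainers. Common phrases: "cut a release"'
```

The committed version both quotes the value and rewords "Common phrases:" to "Common phrases include", either of which removes the ambiguity on its own.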