feat: improve eval workflow and add skills sync (#87) by placerda · Pull Request #103 · Azure/agentops

placerda · 2026-04-21T13:24:44Z

Summary

Generalizable eval workflow improvements based on real-world RAG evaluation testing, plus implementation of #87 (single source of truth for skills).

Changes

Phase 1 — Single Source of Truth for Skills (closes #87)

Added scripts/sync-skills.sh and scripts/sync-skills.ps1 to copy skills from src/agentops/templates/skills/ to plugins/agentops/skills/
Added tests/unit/test_skills_sync.py — CI test that fails if the two directories diverge
Updated CONTRIBUTING.md with single-source-of-truth rule and sync instructions
Synced all 8 plugin skills from canonical source
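
A divergence check of the kind described above could look roughly like the following. This is a minimal sketch, not the contents of tests/unit/test_skills_sync.py: the helper name diverging_files and the use of filecmp are assumptions made for illustration.

```python
import filecmp
from pathlib import Path


def diverging_files(source: Path, target: Path) -> list[str]:
    """Return relative paths that exist on only one side or differ in content.

    Hypothetical helper: the real CI test may compare the trees differently.
    """
    src_files = {p.relative_to(source).as_posix() for p in source.rglob("*") if p.is_file()}
    dst_files = {p.relative_to(target).as_posix() for p in target.rglob("*") if p.is_file()}

    # Files present in exactly one of the two trees.
    problems = set(src_files ^ dst_files)

    # Files present in both trees: compare contents byte by byte.
    for rel in src_files & dst_files:
        if not filecmp.cmp(source / rel, target / rel, shallow=False):
            problems.add(rel)

    return sorted(problems)
```

A test would then assert that diverging_files(Path("src/agentops/templates/skills"), Path("plugins/agentops/skills")) returns an empty list, and print the offending paths otherwise.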

Phase 2 — Cross-Platform Subprocess Pattern

Updated agentops-eval and agentops-dataset skills with shutil.which() + shell=(sys.platform == "win32") pattern for subprocess calls
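
The pattern named above can be sketched as follows. The wrapper name run_cli is hypothetical; only shutil.which() and the shell=(sys.platform == "win32") expression come from the PR description. The likely motivation is that many CLIs are installed on Windows as .cmd or .bat shims, which need the shell to launch.

```python
import shutil
import subprocess
import sys


def run_cli(name: str, *args: str) -> subprocess.CompletedProcess:
    """Resolve an executable on PATH and run it on any platform.

    Hypothetical wrapper for illustration; the skills may structure this differently.
    """
    exe = shutil.which(name)
    if exe is None:
        raise FileNotFoundError(f"{name!r} was not found on PATH")

    return subprocess.run(
        [exe, *args],
        capture_output=True,
        text=True,
        # Use the shell only on Windows, where the target may be a batch shim.
        shell=(sys.platform == "win32"),
    )
```

Resolving the path first gives a clear error when the tool is missing instead of an opaque failure from the shell.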

Phase 3 — Generic Auth Carrythrough

Replaced hardcoded auth in skills with generic AGENT_AUTH_HEADER/AGENT_AUTH_TOKEN env vars
Updated callable_adapter.py template to conditionally apply auth headers

Phase 4 — azd Environment Validation

Added azd environment validation substep in agentops-eval and agentops-config skills
Checks: azd env list, resource group existence, stale environment warnings

Phase 5 — Optional Unit Test Generation

Added optional question in agentops-eval Step 1 offering unit test generation
Added full "Unit Test Generation" guidance section with mock patterns
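
As an illustration of what a mock pattern in generated unit tests might look like, here is a minimal sketch using unittest.mock. The function ask_agent, the /chat path, and the answer field are all hypothetical and are not taken from the skill's guidance.

```python
from unittest.mock import MagicMock


def ask_agent(client, question: str) -> str:
    """Hypothetical code under test: post a question and return the answer text."""
    response = client.post("/chat", json={"question": question})
    response.raise_for_status()
    return response.json()["answer"]


def test_ask_agent_returns_answer():
    # Replace the HTTP client with a mock so no network call is made.
    client = MagicMock()
    client.post.return_value.json.return_value = {"answer": "42"}

    assert ask_agent(client, "What is the answer?") == "42"

    # Verify the request the code under test actually sent.
    client.post.assert_called_once_with("/chat", json={"question": "What is the answer?"})
```

Mocking at the client boundary keeps such tests fast and independent of a deployed agent endpoint.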

Phase 6 — Enhanced Smoke Test Diagnostics

Added checks for: empty responses, response length, format mismatches (JSON vs SSE), UUID prefixes, HTML error pages
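
The checks listed above could be combined into a single diagnostic function along these lines. This is a rough sketch, not the skill's actual logic: the function name, the min_length threshold, and the interpretation of "UUID prefixes" as a response body that starts with a UUID are assumptions.

```python
import json
import re

# Assumption: a "UUID prefix" means the body begins with a canonical UUID string.
UUID_PREFIX = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)


def diagnose_response(body: str, expect: str = "json", min_length: int = 10) -> list[str]:
    """Return human-readable findings for a smoke-test response body."""
    text = body.strip()
    if not text:
        return ["empty response"]

    findings = []
    if len(text) < min_length:
        findings.append(f"response is only {len(text)} characters long")

    lowered = text[:200].lower()
    if lowered.startswith("<!doctype html") or lowered.startswith("<html"):
        findings.append("HTML page returned, possibly an error page")

    # Server-sent events carry lines that start with "data:" or "event:".
    looks_sse = any(line.startswith(("data:", "event:")) for line in text.splitlines())
    try:
        json.loads(text)
        looks_json = True
    except json.JSONDecodeError:
        looks_json = False

    if expect == "json" and looks_sse and not looks_json:
        findings.append("expected JSON but received SSE")
    if expect == "sse" and looks_json:
        findings.append("expected SSE but received JSON")

    if UUID_PREFIX.match(text):
        findings.append("response starts with a UUID")

    return findings
```

An empty list means none of the heuristics fired; it does not prove the response is correct.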

Test Results

239 passed, 1 skipped in 12.99s

- Add sync-skills scripts (bash + PowerShell) and CI test to enforce single source of truth for skills between src/agentops/templates/skills/ and plugins/agentops/skills/ (closes #87)
- Add cross-platform subprocess pattern in agentops-eval and agentops-dataset skills (shutil.which + shell detection)
- Genericize auth carrythrough: AGENT_AUTH_HEADER/AGENT_AUTH_TOKEN env vars in callable_adapter.py template and agentops-eval skill
- Add azd environment validation step in agentops-eval and agentops-config skills
- Add optional unit test generation question and guidance section in agentops-eval skill
- Enhance smoke test diagnostics with empty response, format mismatch, UUID prefix, and HTML error detection
- Update CONTRIBUTING.md with skills single-source-of-truth rule
- Sync plugins/agentops/skills/ from canonical src/ templates

placerda requested a review from Dongbumlee April 22, 2026 13:44

Dongbumlee merged commit f4d5abe into develop Apr 22, 2026
12 checks passed

Dongbumlee merged 1 commit into develop from feature/eval-workflow-improvements
