feat: improve eval workflow and add skills sync (#87) #103

Merged
Dongbumlee merged 1 commit into develop from feature/eval-workflow-improvements
Apr 22, 2026

@placerda
Contributor

Summary

Generalizable eval workflow improvements based on real-world RAG evaluation testing, plus implementation of #87 (single source of truth for skills).

Changes

Phase 1 — Single Source of Truth for Skills (closes #87)

  • Added scripts/sync-skills.sh and scripts/sync-skills.ps1 to copy skills from src/agentops/templates/skills/ to plugins/agentops/skills/
  • Added tests/unit/test_skills_sync.py — CI test that fails if the two directories diverge
  • Updated CONTRIBUTING.md with single-source-of-truth rule and sync instructions
  • Synced all 8 plugin skills from canonical source
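
The divergence check can be pictured roughly as follows. This is a minimal sketch, not the contents of tests/unit/test_skills_sync.py: the helper name `diverging_paths` is hypothetical, and the real test may compare the directories differently.

```python
import filecmp
from pathlib import Path
from typing import List


def diverging_paths(canonical: Path, mirror: Path) -> List[str]:
    """Return a sorted list of differences; an empty list means the trees match."""
    filecmp.clear_cache()  # avoid stale results when files were rewritten in place
    problems: List[str] = []

    def walk(cmp: filecmp.dircmp, prefix: str) -> None:
        # Entries that exist on only one side.
        problems.extend(f"{prefix}{n} (only in canonical)" for n in cmp.left_only)
        problems.extend(f"{prefix}{n} (only in mirror)" for n in cmp.right_only)
        # dircmp compares files shallowly by default, so re-check contents byte for byte.
        _, mismatch, errors = filecmp.cmpfiles(
            cmp.left, cmp.right, cmp.common_files, shallow=False
        )
        problems.extend(f"{prefix}{n} (content differs)" for n in mismatch + errors)
        # Recurse into directories present on both sides.
        for name, sub in cmp.subdirs.items():
            walk(sub, f"{prefix}{name}/")

    walk(filecmp.dircmp(str(canonical), str(mirror)), "")
    return sorted(problems)
```

A CI test could then assert that `diverging_paths(Path("src/agentops/templates/skills"), Path("plugins/agentops/skills"))` returns an empty list and print the returned entries on failure.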

Phase 2 — Cross-Platform Subprocess Pattern

  • Updated the agentops-eval and agentops-dataset skills to use the shutil.which() + shell=(sys.platform == "win32") pattern for subprocess calls
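
A minimal sketch of that pattern, assuming a hypothetical wrapper named `run_cli` (the skills may inline the calls instead):

```python
import shutil
import subprocess
import sys


def run_cli(name: str, *args: str) -> subprocess.CompletedProcess:
    """Resolve an executable on PATH and run it the same way on every platform."""
    # shutil.which honours PATHEXT on Windows, so .cmd/.bat shims are found too.
    exe = shutil.which(name)
    if exe is None:
        raise FileNotFoundError(f"{name!r} was not found on PATH")
    return subprocess.run(
        [exe, *args],
        # On Windows, run through the shell so batch-file shims launch correctly.
        shell=(sys.platform == "win32"),
        capture_output=True,
        text=True,
        check=False,
    )
```

Resolving the path first gives a clear error when the tool is missing, instead of a platform-specific failure from `subprocess.run`.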

Phase 3 — Generic Auth Carrythrough

  • Replaced hardcoded auth in skills with generic AGENT_AUTH_HEADER/AGENT_AUTH_TOKEN env vars
  • Updated callable_adapter.py template to conditionally apply auth headers
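
Conditional auth could look something like the sketch below. It assumes AGENT_AUTH_HEADER holds the header name and AGENT_AUTH_TOKEN the header value, which this description does not spell out; `build_headers` is a hypothetical helper, not the code in callable_adapter.py.

```python
import os
from typing import Dict, Mapping, Optional


def build_headers(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build request headers, adding auth only when both env vars are set."""
    env = os.environ if env is None else env
    headers = {"Content-Type": "application/json"}
    # Assumption: AGENT_AUTH_HEADER is the header name, AGENT_AUTH_TOKEN its value.
    name = env.get("AGENT_AUTH_HEADER", "").strip()
    token = env.get("AGENT_AUTH_TOKEN", "").strip()
    if name and token:
        headers[name] = token
    return headers
```

With neither variable set, the request goes out unauthenticated, so the same template works for agents that need no auth.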

Phase 4 — azd Environment Validation

  • Added azd environment validation substep in agentops-eval and agentops-config skills
  • Checks: azd env list, resource group existence, stale environment warnings
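
One way to script these checks is sketched below. It rests on assumptions: the skills may phrase this as manual steps, the resource group check here uses the Azure CLI (`az group exists`), and an environment is treated as stale when its resource group is gone. The names `validate_azd_env` and `_run` are hypothetical.

```python
import json
import shutil
import subprocess
import sys
from typing import Callable, List

Runner = Callable[[List[str]], str]


def _run(cmd: List[str]) -> str:
    """Run a CLI command and return its stdout (raises if the command fails)."""
    exe = shutil.which(cmd[0])
    if exe is None:
        raise FileNotFoundError(f"{cmd[0]!r} was not found on PATH")
    done = subprocess.run(
        [exe, *cmd[1:]],
        shell=(sys.platform == "win32"),
        capture_output=True,
        text=True,
        check=True,
    )
    return done.stdout


def validate_azd_env(resource_group: str, run: Runner = _run) -> List[str]:
    """Return human-readable warnings; an empty list means the checks passed."""
    warnings: List[str] = []
    # Check 1: at least one azd environment exists.
    envs = json.loads(run(["azd", "env", "list", "--output", "json"]))
    if not envs:
        warnings.append("no azd environments found; create one with `azd env new`")
    # Check 2: the resource group still exists (assumed staleness signal).
    exists = run(["az", "group", "exists", "--name", resource_group]).strip().lower()
    if exists != "true":
        warnings.append(
            f"resource group {resource_group!r} does not exist; "
            "the azd environment may be stale"
        )
    return warnings
```

The `run` parameter is injectable so the logic can be exercised without either CLI installed.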

Phase 5 — Optional Unit Test Generation

  • Added optional question in agentops-eval Step 1 offering unit test generation
  • Added full "Unit Test Generation" guidance section with mock patterns
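
As an illustration of the kind of mock pattern such guidance might cover, here is a sketch that patches the HTTP call so the test never touches a live endpoint. The `ask_agent` function, the URL, and the `answer` field are hypothetical stand-ins, not taken from the skill.

```python
import json
import unittest
import urllib.request
from unittest import mock


def ask_agent(url: str, question: str) -> str:
    """Hypothetical client: POST a question and return the 'answer' field."""
    req = urllib.request.Request(
        url,
        data=json.dumps({"question": question}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["answer"]


class AskAgentTest(unittest.TestCase):
    def test_returns_answer_field(self):
        # Fake the context manager returned by urlopen, including its read().
        fake = mock.MagicMock()
        fake.__enter__.return_value.read.return_value = b'{"answer": "42"}'
        with mock.patch("urllib.request.urlopen", return_value=fake) as urlopen:
            self.assertEqual(ask_agent("http://agent.invalid/chat", "q"), "42")
        urlopen.assert_called_once()
```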

Phase 6 — Enhanced Smoke Test Diagnostics

  • Added checks for: empty responses, response length, format mismatches (JSON vs SSE), UUID prefixes, HTML error pages
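
These checks could be expressed roughly as below. The function name, the default length threshold, and the reading of "UUID prefixes" as a response body that begins with a UUID are all assumptions made for illustration.

```python
import json
import re
from typing import List

# Matches a canonical 8-4-4-4-12 hex UUID at the very start of the text.
UUID_PREFIX = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)


def diagnose_response(body: str, expected_format: str = "json",
                      min_length: int = 20) -> List[str]:
    """Return a list of findings for a smoke-test response body."""
    text = body.strip()
    if not text:
        return ["empty response"]
    findings: List[str] = []
    if len(text) < min_length:
        findings.append(f"response is only {len(text)} characters long")
    head = text[:200].lower()
    if head.startswith("<!doctype html") or head.startswith("<html"):
        findings.append("HTML page returned (possibly an error or login page)")
    # SSE bodies consist of 'data:' / 'event:' lines rather than one JSON document.
    looks_sse = any(line.startswith(("data:", "event:")) for line in text.splitlines())
    try:
        json.loads(text)
        looks_json = True
    except ValueError:
        looks_json = False
    if expected_format == "json" and looks_sse and not looks_json:
        findings.append("expected JSON but the body looks like server-sent events (SSE)")
    if expected_format == "sse" and looks_json:
        findings.append("expected SSE but the body is a single JSON document")
    if UUID_PREFIX.match(text):
        findings.append("response starts with a UUID")
    return findings
```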

Test Results

239 passed, 1 skipped in 12.99s

- Add sync-skills scripts (bash + PowerShell) and CI test to enforce single source of truth for skills between src/agentops/templates/skills/ and plugins/agentops/skills/ (closes #87)

- Add cross-platform subprocess pattern in agentops-eval and agentops-dataset skills (shutil.which + shell detection)

- Genericize auth carrythrough: AGENT_AUTH_HEADER/AGENT_AUTH_TOKEN env vars in callable_adapter.py template and agentops-eval skill

- Add azd environment validation step in agentops-eval and agentops-config skills

- Add optional unit test generation question and guidance section in agentops-eval skill

- Enhance smoke test diagnostics with empty response, format mismatch, UUID prefix, and HTML error detection

- Update CONTRIBUTING.md with skills single-source-of-truth rule

- Sync plugins/agentops/skills/ from canonical src/ templates
@placerda placerda requested a review from Dongbumlee April 22, 2026 13:44
@Dongbumlee Dongbumlee merged commit f4d5abe into develop Apr 22, 2026
12 checks passed