feat: add preflight checks, --dry-run flag, and credential timeout resilience #105

Closed
placerda wants to merge 1 commit into develop from feature/preflight-checks-and-credential-resilience

@placerda
Contributor

Summary

Generalizable improvements to agentops eval run based on a real-world RAG evaluation testing session where the workflow failed multiple times on issues that could be detected upfront. Addresses the friction points without adding project-specific code.

Changes

Phase 1 — Pre-flight check system

New src/agentops/services/preflight.py module with 4 checks that run before backend execution in runner.py:

  1. SDK imports: azure-identity and azure-ai-evaluation. Fails with pip install hints.
  2. Environment variables: AZURE_OPENAI_ENDPOINT/AZURE_OPENAI_DEPLOYMENT for AI-assisted evaluators; AZURE_AI_FOUNDRY_PROJECT_ENDPOINT for safety evaluators.
  3. Credential warm-up: calls DefaultAzureCredential.get_token() once so subsequent evaluator calls use the MSAL cache instead of each cold-starting az.cmd.
  4. Endpoint reachability: lightweight HTTP HEAD (10s timeout) on remote-mode endpoints.

All checks run together and collect errors — failures are reported as a single numbered list so users can fix everything in one pass.
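
For illustration, the collect-then-report pattern described above could look roughly like the sketch below. It is not the code from preflight.py: the `PreflightReport` fields and the helper names (`check_sdk_imports`, `check_env_vars`, `run_preflight`) are assumptions made for this example, and only the SDK-import and environment-variable checks are shown.

```python
import importlib.util
import os
from dataclasses import dataclass, field


@dataclass
class PreflightReport:
    """Hypothetical report object; the real PreflightReport may differ."""

    errors: list = field(default_factory=list)

    @property
    def ok(self) -> bool:
        return not self.errors

    def format(self) -> str:
        if self.ok:
            return "Pre-flight checks passed."
        lines = [f"{len(self.errors)} pre-flight check(s) failed:"]
        lines += [f"  {i}. {msg}" for i, msg in enumerate(self.errors, start=1)]
        return "\n".join(lines)


def _is_importable(module: str) -> bool:
    # find_spec raises ModuleNotFoundError when a parent package is missing,
    # e.g. "azure.identity" when no "azure" namespace is installed.
    try:
        return importlib.util.find_spec(module) is not None
    except (ImportError, ValueError):
        return False


def check_sdk_imports(modules: dict) -> list:
    """`modules` maps an importable module name to its pip package name."""
    return [
        f"Cannot import '{module}'. Install it with: pip install {package}"
        for module, package in modules.items()
        if not _is_importable(module)
    ]


def check_env_vars(names: list, env=None) -> list:
    """Report every variable that is unset or empty."""
    env = os.environ if env is None else env
    return [f"Environment variable {name} is not set." for name in names if not env.get(name)]


def run_preflight(checks: list) -> PreflightReport:
    """Run every check and collect all errors (no fail-fast)."""
    report = PreflightReport()
    for check in checks:
        report.errors.extend(check())
    return report


if __name__ == "__main__":
    report = run_preflight([
        lambda: check_sdk_imports({
            "azure.identity": "azure-identity",
            "azure.ai.evaluation": "azure-ai-evaluation",
        }),
        lambda: check_env_vars(["AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_DEPLOYMENT"]),
    ])
    print(report.format())
```

Each check returns a list of messages rather than raising, which is what lets the runner report every problem in one numbered list.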

Phase 2 — --dry-run / -n flag on eval run

Runs pre-flight checks only. Exit 0 if all pass, 1 if any fail. Useful for CI gating and fast feedback loops.
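
A minimal sketch of the exit-code contract, using argparse purely for illustration (the actual CLI framework used by agentops is not shown in this PR). The `main` signature and the injected `run_preflight` / `run_backend` callables are hypothetical, and aborting a normal run when pre-flight fails is an assumption.

```python
import argparse
import sys


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="agentops eval run")
    parser.add_argument(
        "-n", "--dry-run",
        action="store_true",
        help="run pre-flight checks only, then exit",
    )
    return parser


def main(argv, run_preflight, run_backend) -> int:
    """Return the process exit code.

    run_preflight: callable returning a list of error strings (hypothetical).
    run_backend: callable returning the backend's exit code (hypothetical).
    """
    args = build_parser().parse_args(argv)

    errors = run_preflight()
    for i, message in enumerate(errors, start=1):
        print(f"  {i}. {message}", file=sys.stderr)
    if errors:
        return 1  # any failed check fails the command

    if args.dry_run:
        return 0  # checks passed; skip backend execution entirely

    return run_backend()
```

In a CI pipeline, the dry-run invocation can act as a cheap gate before the full evaluation job is scheduled.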

Phase 3 — Credential timeout resilience

Added process_timeout=30 to all 3 DefaultAzureCredential sites:

  • eval_engine.py::_default_credential()
  • foundry_backend.py::_acquire_token()
  • foundry_backend.py::_invoke_model_direct()

The default 10s is insufficient for Windows az.cmd cold starts and was producing intermittent AzureCliCredential: Failed to invoke the Azure CLI errors.
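
One way to keep the three call sites consistent is a shared factory, sketched below. The helper name `make_credential` and the injectable `credential_cls` parameter are assumptions for this example; the PR itself sets the keyword at each site. `process_timeout` is the DefaultAzureCredential keyword that controls the timeout for subprocess-based developer credentials such as AzureCliCredential.

```python
PROCESS_TIMEOUT_SECONDS = 30  # value from this PR; the library default is 10


def make_credential(credential_cls=None, process_timeout=PROCESS_TIMEOUT_SECONDS):
    """Build a credential with a longer subprocess timeout.

    credential_cls is injectable so the factory can be unit-tested without
    azure-identity installed (hypothetical design, not from the PR).
    """
    if credential_cls is None:
        # Imported lazily so that a missing SDK surfaces as a pre-flight
        # error instead of an import failure at module load time.
        from azure.identity import DefaultAzureCredential

        credential_cls = DefaultAzureCredential
    return credential_cls(process_timeout=process_timeout)
```

Centralizing the value in one constant would also make it easier to tune later if 30 seconds proves too short or too long on a given platform.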

Phase 4 — Tests

New tests/unit/test_preflight.py with 14 tests covering:

  • PreflightReport formatting
  • Evaluator classification helpers
  • Local-only bundles skip Azure checks
  • Missing env vars reported
  • Missing SDK reported with pip install hint
  • Endpoint unreachable detected
  • Multiple errors collected at once
  • HTTP endpoint success path

Test Results

253 passed, 1 skipped in 11.93s

All existing 239 tests continue to pass; 14 new preflight tests added.

…silience

Detect common configuration issues before backend execution so they surface fast with actionable error messages.

- Add services/preflight.py with 4 checks: SDK imports, env vars, credential warm-up, endpoint reachability. All checks run and collect errors together (not fail-fast).

- Add --dry-run/-n flag on 'agentops eval run' for CI gating and fast feedback.

- Raise AzureCliCredential process_timeout to 30s in eval_engine.py and foundry_backend.py (2 sites). Fixes Windows az.cmd cold start timeouts.

- Add 14 unit tests for the preflight module.
@placerda
Contributor Author

Cancelling for now; we'll prioritize #107.

@placerda closed this Apr 27, 2026