Add venv support for custom evals #65

Merged
krisztianfekete merged 4 commits into main from feature/venv-support
Mar 27, 2026

krisztianfekete commented Mar 26, 2026

This PR adds automatic venv management for evaluators that ship their own dependencies.

  • When an evaluator includes a requirements.txt alongside its entrypoint, agentevals now creates an isolated cached venv under ~/.cache/agentevals/venvs/, installs the evaluator SDK and the declared dependencies, and runs the evaluator subprocess using that venv's Python interpreter.
  • Venvs are keyed by evaluator path and invalidated when requirements.txt changes. agentevals prefers uv when it is available and falls back to venv + pip otherwise.
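
The caching and tool-selection behavior described above can be sketched roughly as follows. This is an illustrative sketch, not the code from this PR: the function names (`venv_cache_key`, `venv_python`, `build_setup_commands`), the SHA-256 key scheme, and the 16-character key length are all assumptions, and the evaluator SDK install step is left out because its package name is not stated here.

```python
import hashlib
import shutil
import sys
from pathlib import Path

# Cache location named in the PR description.
CACHE_ROOT = Path.home() / ".cache" / "agentevals" / "venvs"


def venv_cache_key(evaluator_dir: Path) -> str:
    """Hypothetical key: hash of the evaluator path plus requirements.txt bytes.

    Editing requirements.txt changes the key, so a stale venv is never reused.
    """
    digest = hashlib.sha256()
    digest.update(str(evaluator_dir.resolve()).encode("utf-8"))
    digest.update(b"\0")  # separator between path and file contents
    digest.update((evaluator_dir / "requirements.txt").read_bytes())
    return digest.hexdigest()[:16]


def venv_python(venv_dir: Path) -> Path:
    """Return the interpreter path inside a venv for the current platform."""
    if sys.platform == "win32":
        return venv_dir / "Scripts" / "python.exe"
    return venv_dir / "bin" / "python"


def build_setup_commands(venv_dir: Path, requirements: Path, have_uv=None):
    """Return the commands that create the venv and install dependencies.

    Prefers uv when it is on PATH, otherwise falls back to venv + pip.
    The evaluator SDK install is omitted (package name not given in the PR).
    """
    if have_uv is None:
        have_uv = shutil.which("uv") is not None
    py = str(venv_python(venv_dir))
    if have_uv:
        return [
            ["uv", "venv", str(venv_dir)],
            ["uv", "pip", "install", "--python", py, "-r", str(requirements)],
        ]
    return [
        [sys.executable, "-m", "venv", str(venv_dir)],
        [py, "-m", "pip", "install", "-r", str(requirements)],
    ]
```

A caller would compute `CACHE_ROOT / venv_cache_key(evaluator_dir)`, run the setup commands (for example with `subprocess.run(cmd, check=True)`) only when that directory does not exist yet, and then launch the evaluator with `venv_python(...)`.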

Companion PR: agentevals-dev/evaluators#5, which adds the first custom evaluator with third-party dependencies.

Fixes #58

krisztianfekete requested a review from Copilot March 26, 2026 15:23
krisztianfekete marked this pull request as ready for review March 26, 2026 15:24

Copilot AI left a comment

Pull request overview

Adds optional virtualenv creation/caching for Python custom evaluators that ship a requirements.txt, so evaluator subprocesses can run with isolated dependencies.

Changes:

  • Introduce venv management utilities (ensure_venv / ensure_venv_async) that create cached environments and install deps (+ evaluator SDK).
  • Extend custom evaluator subprocess execution to optionally run Python evaluators using a venv interpreter.
  • Enhance evaluator sources to also fetch/copy requirements.txt alongside evaluator code; update example config to include a deps-heavy evaluator.
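
As a rough illustration of the second change, threading an optional venv interpreter into the evaluator's subprocess command might look like the sketch below. The function name `build_evaluator_command` and its signature are hypothetical and are not taken from `custom_evaluators.py`.

```python
import sys
from pathlib import Path
from typing import List, Optional


def build_evaluator_command(
    entrypoint: Path, venv_python: Optional[Path] = None
) -> List[str]:
    """Build the argv for a Python evaluator subprocess.

    When a venv interpreter is supplied, the evaluator runs with that
    interpreter (and therefore its isolated dependencies); otherwise it
    runs with the interpreter of the current process.
    """
    interpreter = str(venv_python) if venv_python is not None else sys.executable
    return [interpreter, str(entrypoint)]
```

Keeping the interpreter optional means evaluators without a requirements.txt follow the same code path and simply never receive a venv interpreter.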

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/agentevals/evaluator/venv.py | New venv creation/install + caching logic for Python evaluators. |
| src/agentevals/evaluator/sources.py | Fetch/copy requirements.txt alongside evaluator entrypoints. |
| src/agentevals/custom_evaluators.py | Thread venv interpreter path through runtime command construction and executor factory. |
| examples/custom_evaluators/eval_config.yaml | Add example evaluator entry intended to exercise dependency installation. |

krisztianfekete requested a review from peterj March 27, 2026 09:40
krisztianfekete requested a review from peterj March 27, 2026 10:29
krisztianfekete merged commit b117a0d into main Mar 27, 2026
4 checks passed
krisztianfekete deleted the feature/venv-support branch March 27, 2026 10:33
Development

Successfully merging this pull request may close these issues.

support for custom evaluators that have additional dependencies

3 participants