Open
added 3 commits
November 14, 2025 04:19
This reverts commit 15dd08d.
for _ in range(num_generations):
    evaluation_rows.append(
        EvaluationRow(
            messages=[{"role": "user", "content": prompt}],
Bug: Standardize Message Object Initialization
messages is initialized with a dict {"role": "user", "content": prompt} instead of a Message object. The EvaluationRow.messages field expects a list of Message objects. This should be messages=[Message(role="user", content=prompt)] after importing Message from eval_protocol.models.
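A minimal sketch of the suggested change follows. It uses stand-in dataclasses named Message and EvaluationRow so that it runs on its own; these are simplified assumptions, not the real classes from eval_protocol.models, whose full field set is not shown in this diff. The helper build_rows is hypothetical and exists only to wrap the loop under review.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Message:
    """Stand-in for eval_protocol.models.Message (simplified assumption)."""
    role: str
    content: str


@dataclass
class EvaluationRow:
    """Stand-in for EvaluationRow; only the messages field is modeled."""
    messages: List[Message] = field(default_factory=list)


def build_rows(prompt: str, num_generations: int) -> List[EvaluationRow]:
    """Create one row per generation, each holding a typed user Message."""
    evaluation_rows: List[EvaluationRow] = []
    for _ in range(num_generations):
        evaluation_rows.append(
            # Typed Message object instead of a raw {"role": ..., "content": ...} dict.
            EvaluationRow(messages=[Message(role="user", content=prompt)])
        )
    return evaluation_rows


rows = build_rows("What is 2 + 2?", num_generations=3)
```

In the real code the only change would be the messages= argument plus the import of Message; the rest of the loop stays as it is.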
Description
Please include a summary of the change and which issue is fixed or feature is implemented. Please also include relevant motivation and context. List any dependencies that are required for this change.
Fixes # (issue)
Implements # (issue)
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration.
Test Configuration:
Checklist:
black ., isort ., flake8 .)
Screenshots (if applicable)
If applicable, add screenshots to help showcase your changes.
Additional context
Add any other context about the PR here.
Note
Introduces a generic OpenEnv rollout processor with TRL and vLLM integrations, adds timestamps + debug to the SQLite row store, improves logs server initialization/broadcast diagnostics, and adds OpenEnv integration tests.
Add OpenEnvRolloutProcessor to run rollouts against any OpenEnv HTTPEnvClient (eval_protocol/pytest/openenv_rollout_processor.py).
Add create_openenv_rollout_func and vLLM helper create_openenv_vllm_rollout_func for GRPO pipelines with split modes (eval_protocol/pytest/integrations/openenv_trl.py, openenv_trl_vllm.py).
Add updated_at column with best-effort migration, set on upsert/create, order reads by updated_at desc, and add SQL/result debug logs (eval_protocol/dataset_logger/sqlite_evaluation_row_store.py).
[...] EP_LOGS_INIT_LIMIT, and add richer debug for init/broadcast (eval_protocol/utils/logs_server.py).
[...] (tests/pytest/conftest.py).
[...] (tests/pytest/data/*, tests/pytest/test_openenv_*).
Written by Cursor Bugbot for commit 66ac02b. This will update automatically on new commits.
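For readers unfamiliar with the row-store change summarized above, the updated_at pattern (best-effort migration, timestamp set on upsert, reads ordered newest first) can be sketched with the standard-library sqlite3 module. This is an illustration only, not the code in sqlite_evaluation_row_store.py: the table name rows, its columns, and the helper names are all assumptions, and the upsert syntax needs SQLite 3.24 or later.

```python
import sqlite3
import time


def ensure_updated_at(conn):
    """Best-effort migration: add updated_at if the table predates it."""
    try:
        conn.execute("ALTER TABLE rows ADD COLUMN updated_at REAL")
    except sqlite3.OperationalError:
        pass  # column already exists, nothing to migrate


def upsert_row(conn, row_id, payload, now=None):
    """Insert or update a row, refreshing updated_at either way."""
    now = time.time() if now is None else now
    conn.execute(
        "INSERT INTO rows (id, payload, updated_at) VALUES (?, ?, ?) "
        "ON CONFLICT(id) DO UPDATE SET payload = excluded.payload, "
        "updated_at = excluded.updated_at",
        (row_id, payload, now),
    )


def read_rows(conn, limit=None):
    """Return (id, payload) pairs, most recently updated first."""
    sql = "SELECT id, payload FROM rows ORDER BY updated_at DESC"
    params = ()
    if limit is not None:
        sql += " LIMIT ?"
        params = (limit,)
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rows (id TEXT PRIMARY KEY, payload TEXT)")
ensure_updated_at(conn)
ensure_updated_at(conn)  # second call is a no-op
upsert_row(conn, "a", "first", now=1.0)
upsert_row(conn, "b", "second", now=2.0)
upsert_row(conn, "a", "first, edited", now=3.0)
latest = read_rows(conn)
```

Because the upsert refreshes updated_at, an edited row moves back to the top of the read order, which is what a log viewer showing recent activity would want.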