Add GEVal logprob artifacts for OpenAI and Fireworks #376

Merged: benjibc merged 3 commits into main from codex/enable-g-eval-for-eval-protocol on Jan 5, 2026

@benjibc commented Dec 16, 2025

Summary

  • point the local filesystem dataset logger at the shared directory utils so it can be imported without sqlite
  • update the GEval logprob example to run with a filesystem logger and Fireworks/OpenAI completion parameters
  • capture JSONL artifacts from running the example with OpenAI and Fireworks models
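
The example's scoring logic is not shown in this description. As background on why logprobs matter here: G-Eval-style scoring generally weights each candidate score by the probability of its token instead of taking the single sampled score. Below is a minimal sketch of that idea, assuming the top candidates for the score position arrive as a token-to-logprob mapping. The function name `weighted_score` and the sample values are hypothetical and are not taken from the example file.

```python
import math


def weighted_score(top_logprobs: dict[str, float], valid_scores: range) -> float:
    """Probability-weighted score over the candidate score tokens.

    Hypothetical helper: renormalizes over the candidates that parse as a
    valid score, since a top-k list rarely covers the full distribution.
    """
    probs: dict[int, float] = {}
    for token, logprob in top_logprobs.items():
        token = token.strip()
        if token.isdecimal() and int(token) in valid_scores:
            probs[int(token)] = probs.get(int(token), 0.0) + math.exp(logprob)
    total = sum(probs.values())
    if total == 0.0:
        raise ValueError("no valid score token among the candidates")
    return sum(score * p for score, p in probs.items()) / total


# Made-up candidates for the position where the judge emits its score.
example = {"4": math.log(0.6), "5": math.log(0.3), "3": math.log(0.1)}
score = weighted_score(example, range(1, 6))
```

With `top_logprobs` set to 3, as in the OpenAI command in the Testing section, only three candidates are visible per position, which is why the sketch renormalizes before averaging.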

Testing

  • HOME=/workspace/python-sdk/.ep_home_openai EP_COMPLETION_PARAMS='[{"model":"gpt-3.5-turbo","logprobs":true,"top_logprobs":3}]' python -m pytest examples/deepeval/test_geval_with_logprobs.py -k test_geval_with_logprobs -vv --disable-warnings --maxfail=1
  • HOME=/workspace/python-sdk/.ep_home_fireworks EP_COMPLETION_PARAMS='[{"model":"accounts/fireworks/models/qwen3-8b","logprobs":true,"api_base":"https://api.fireworks.ai/inference/v1","custom_llm_provider":"fireworks_ai"}]' python -m pytest examples/deepeval/test_geval_with_logprobs.py -k test_geval_with_logprobs -vv --disable-warnings --maxfail=1
  • pre-commit run --files eval_protocol/dataset_logger/local_fs_dataset_logger_adapter.py examples/deepeval/test_geval_with_logprobs.py
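
The two pytest commands above differ mainly in the value of EP_COMPLETION_PARAMS, which is a JSON list of completion-parameter objects. As a rough sketch of how such a value can be decoded and sanity-checked before a run: the helper name `parse_completion_params` is hypothetical, and this is not necessarily how eval_protocol reads the variable.

```python
import json


def parse_completion_params(raw: str) -> list[dict]:
    """Decode a JSON list of completion-parameter objects (hypothetical helper)."""
    params = json.loads(raw)
    if not isinstance(params, list):
        raise ValueError("expected a JSON list of parameter objects")
    for entry in params:
        if not isinstance(entry, dict) or "model" not in entry:
            raise ValueError("each entry must be an object with a 'model' key")
    return params


# Same value as the OpenAI test command in this PR; in a real run it would
# come from the EP_COMPLETION_PARAMS environment variable.
OPENAI_PARAMS = '[{"model":"gpt-3.5-turbo","logprobs":true,"top_logprobs":3}]'

params = parse_completion_params(OPENAI_PARAMS)
```

Checking the shape up front surfaces shell-quoting mistakes in the environment variable before any model call is made.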


Note

Introduces first-class support for capturing and persisting provider logprobs, plus a runnable GEval example and artifacts.

  • Adds Message.logprobs (excluded from request payload) and updates model validation/snapshots
  • SingleTurnRolloutProcessor now serializes provider-specific logprobs and stores them on the assistant Message
  • New examples/deepeval/test_geval_with_logprobs.py using LocalFSDatasetLoggerAdapter; adds JSONL artifacts for OpenAI/Fireworks
  • Points filesystem logger to shared eval_protocol.directory_utils
  • Tests: new unit test verifies logprobs capture; snapshot updates for added logprobs field
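
The first two bullets describe a field that is persisted with the message but kept out of the request sent to the provider. The sketch below illustrates that split with a plain dataclass. It is a hypothetical stand-in, not the SDK's actual Message model, and the method names and sample logprob values are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class Message:
    """Hypothetical stand-in for the SDK's Message model."""

    role: str
    content: str
    logprobs: Optional[Any] = None  # provider-specific payload, stored as-is

    def to_request_payload(self) -> dict:
        """Fields sent to the provider; logprobs is deliberately left out."""
        return {"role": self.role, "content": self.content}

    def to_jsonl_line(self) -> str:
        """Full record, including logprobs, as one line of a JSONL artifact."""
        return json.dumps(asdict(self), sort_keys=True)


msg = Message(
    role="assistant",
    content="4",
    logprobs={"content": [{"token": "4", "logprob": -0.01}]},  # made-up values
)
```

Keeping the field out of the request payload means a stored assistant message can be replayed in a later turn without sending logprob data back to the provider.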

Written by Cursor Bugbot for commit fca86e1.

@benjibc merged commit c260b5a into main on Jan 5, 2026
17 checks passed
@benjibc deleted the codex/enable-g-eval-for-eval-protocol branch on January 5, 2026 21:11