Conversation
apundhir added a commit that referenced this pull request on Apr 11, 2026
* fix: conftest.py prevents .env API_KEY from breaking legacy tests
Adding ENFORCE_API_KEY=true to .env caused 4 legacy tests to fail with
403 instead of expected 422/200. Root cause: pydantic-settings reads .env
directly, bypassing os.environ patches.
Fix: conftest.py autouse fixture sets env_file=None on AppSettings during
tests and clears the lru_cache, so each test gets fresh settings from
os.environ only. Security tests that explicitly test enforcement are
unaffected — they use their own mock.patch contexts.
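The core of that fixture logic can be sketched as follows. The `AppSettings` stand-in and the `fresh_settings_for_test` name are illustrative (the real project uses a pydantic-settings class and a pytest autouse fixture, omitted here so the sketch runs standalone):

```python
import functools
import os

class AppSettings:
    """Stand-in for the real pydantic-settings class (assumption)."""
    def __init__(self, _env_file=".env"):
        # pydantic-settings reads _env_file itself; with _env_file=None
        # only os.environ is consulted, so monkeypatching works again.
        self._env_file = _env_file
        self.enforce_api_key = os.environ.get("ENFORCE_API_KEY", "false") == "true"

@functools.lru_cache(maxsize=None)
def get_settings(_env_file=".env"):
    return AppSettings(_env_file=_env_file)

# Body of the autouse fixture (pytest decorator omitted in this sketch):
def fresh_settings_for_test():
    get_settings.cache_clear()           # drop any .env-backed cached instance
    return get_settings(_env_file=None)  # rebuild from os.environ only
```

Clearing the `lru_cache` is the important half: without it, the first `.env`-backed instance would be served to every later test regardless of `env_file`.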
* fix: route /v1/evaluate through EvaluationRunner, not raw RAGAS
Root cause: /v1/evaluate was calling rr.run_evaluation() directly,
which immediately requires GEMINI_API_KEY even for deterministic metrics
like source_attribution_accuracy.
Fix: route through EvaluationRunner which correctly dispatches:
- Deterministic metrics (source_attribution_accuracy) → no LLM needed
- RAGAS metrics (faithfulness, answer_relevancy, etc.) → calls Gemini/OpenAI
- Retrieval metrics (precision@k, etc.) → no LLM needed
Updated response shape: metrics are at top level (not nested under "result").
Updated 4 tests to patch harness.runner.run_evaluation (correct mock path)
and assert against new response shape.
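The dispatch described above amounts to something like this sketch. The metric names come from the commit message; the set names, placeholder scores, and function shape are illustrative, not the project's actual code:

```python
# Illustrative metric groups; only the metric names come from the commit.
DETERMINISTIC = {"source_attribution_accuracy"}
RETRIEVAL = {"precision@k", "recall@k"}

def run_evaluation(sample, metrics, llm_scorer=None):
    """Dispatch each metric; only RAGAS metrics need an LLM client."""
    results = {}
    for m in metrics:
        if m in DETERMINISTIC or m in RETRIEVAL:
            results[m] = 1.0  # placeholder: computed locally, no API key needed
        else:
            if llm_scorer is None:
                raise RuntimeError("GEMINI_API_KEY required for RAGAS metrics")
            results[m] = llm_scorer(sample, m)
    # metrics returned at top level, matching the new response shape
    return results
```

The point is that a request asking only for deterministic or retrieval metrics never touches the LLM path, so no API key is demanded up front.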
* fix: ragas_runner reads GEMINI_API_KEY from app settings, not os.getenv
os.getenv('GEMINI_API_KEY') returns None when the key is set via .env file
loaded by pydantic-settings — pydantic-settings reads .env into Python
attributes but does NOT export them to os.environ.
Fix: use get_settings().gemini_api_key (reads .env via pydantic-settings)
with os.getenv as fallback for environments where the var is already exported.
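The lookup order is roughly this (the helper name is hypothetical; the real code reads `get_settings().gemini_api_key`):

```python
import os
from typing import Optional

def resolve_gemini_key(settings_key: Optional[str]) -> Optional[str]:
    """Prefer the pydantic-settings value (which sees .env); fall back to
    os.environ for deployments that export the variable directly."""
    return settings_key or os.getenv("GEMINI_API_KEY")
```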
* fix: update Gemini model to gemini-2.0-flash, raise_exceptions=True in RAGAS
- gemini-1.5-flash returns 404 NOT_FOUND on the v1beta API — updated to gemini-2.0-flash
- raise_exceptions=False was silently returning NaN scores; changed to True
so actual Gemini/RAGAS errors surface as real error messages
- ragas_runner now reads gemini_model from app settings (defaults to gemini-2.0-flash)
- Updated .env.example default model name
* fix: robust RAGAS result handling — NaN→None, to_pandas fallback, better LLM config
- Handle NaN scores: replace with None so JSON serialises cleanly (null not NaN)
- Add try/except around result.to_pandas() — fall back to scores dict if it fails
- Clean up LLM provider config: gemini via LangChainWrapper, openai via ChatOpenAI
- Log the actual exception when to_pandas() fails for debugging
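The NaN handling above reduces to something like this (helper name illustrative):

```python
import json
import math

def clean_scores(scores):
    """Map NaN to None so json.dumps emits valid null rather than bare NaN."""
    return {
        k: None if isinstance(v, float) and math.isnan(v) else v
        for k, v in scores.items()
    }

# Without the cleanup, json.dumps emits bare NaN, which strict JSON
# parsers (including most browsers) reject.
payload = json.dumps(clean_scores({"faithfulness": float("nan"), "precision@k": 0.8}))
```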
---------

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>