
Fix/test api key enforcement #13

Merged
apundhir merged 6 commits into main from
fix/test-api-key-enforcement
Apr 11, 2026

Conversation

@apundhir
Collaborator

Summary

Changes

Testing

  • Tests pass
  • Manual testing completed
  • No breaking changes (or migration path documented)

Related Issues

Checklist

  • Code follows project style guidelines
  • Documentation updated (if applicable)
  • No secrets or credentials in this PR

apundhir and others added 6 commits April 11, 2026 20:08
fix: conftest.py prevents .env API_KEY from breaking legacy tests

Adding ENFORCE_API_KEY=true to .env caused 4 legacy tests to fail with
403 instead of expected 422/200. Root cause: pydantic-settings reads .env
directly, bypassing os.environ patches.

Fix: conftest.py autouse fixture sets env_file=None on AppSettings during
tests and clears the lru_cache, so each test gets fresh settings from
os.environ only. Security tests that explicitly test enforcement are
unaffected — they use their own mock.patch contexts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
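A minimal, self-contained sketch of the caching problem this fixture manages. In the real project, AppSettings is a pydantic-settings BaseSettings that reads .env; the stand-in below reads only os.environ so the lru_cache mechanics (the part the autouse fixture has to reset) are visible without dependencies. All names here are illustrative, not the project's actual code.

```python
# Stand-in for the project's cached settings accessor. The real AppSettings
# is a pydantic-settings class with env_file=".env"; this one reads only
# os.environ, which is what the test fixture forces.
import os
from functools import lru_cache


class AppSettings:
    """Stand-in for the project's pydantic-settings AppSettings."""

    def __init__(self):
        self.enforce_api_key = os.environ.get("ENFORCE_API_KEY", "false") == "true"


@lru_cache
def get_settings() -> AppSettings:
    return AppSettings()


# conftest.py then uses an autouse fixture along these lines (sketch):
#
# @pytest.fixture(autouse=True)
# def _fresh_settings(monkeypatch):
#     monkeypatch.setitem(AppSettings.model_config, "env_file", None)
#     get_settings.cache_clear()   # rebuild settings from os.environ per test
#     yield
#     get_settings.cache_clear()   # don't leak patched settings across tests
```

Without the cache_clear() calls, the first test to touch get_settings() pins its settings for every later test, which is exactly why patching os.environ alone was not enough.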
fix: route /v1/evaluate through EvaluationRunner, not raw RAGAS

Root cause: /v1/evaluate was calling rr.run_evaluation() directly,
which immediately requires GEMINI_API_KEY even for deterministic metrics
like source_attribution_accuracy.

Fix: route through EvaluationRunner which correctly dispatches:
- Deterministic metrics (source_attribution_accuracy) → no LLM needed
- RAGAS metrics (faithfulness, answer_relevancy, etc.) → calls Gemini/OpenAI
- Retrieval metrics (precision@k, etc.) → no LLM needed

Updated response shape: metrics are at top level (not nested under "result").
Updated 4 tests to patch harness.runner.run_evaluation (correct mock path)
and assert against new response shape.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
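The dispatch behaviour described above can be sketched as follows. EvaluationRunner and the metric names come from the commit message; the implementation and the metric sets are an illustrative stand-in, not the project's actual code.

```python
# Illustrative dispatcher: only LLM-judged RAGAS metrics need an API key;
# deterministic and retrieval metrics are pure computation over the dataset.
DETERMINISTIC = {"source_attribution_accuracy"}
RETRIEVAL = {"precision_at_k", "recall_at_k"}
RAGAS_LLM = {"faithfulness", "answer_relevancy"}


class EvaluationRunner:
    def __init__(self, llm_api_key=None):
        self.llm_api_key = llm_api_key

    def run_evaluation(self, metrics, dataset):
        results = {}
        for m in metrics:
            if m in DETERMINISTIC or m in RETRIEVAL:
                # No LLM involved: never touches GEMINI_API_KEY.
                results[m] = self._compute_local(m, dataset)
            elif m in RAGAS_LLM:
                # Only this branch requires a configured LLM key.
                if not self.llm_api_key:
                    raise RuntimeError(f"{m} requires GEMINI_API_KEY")
                results[m] = self._compute_ragas(m, dataset)
            else:
                raise ValueError(f"unknown metric: {m}")
        return results  # metrics at top level, not nested under "result"

    def _compute_local(self, metric, dataset):
        return 1.0  # placeholder score for the sketch

    def _compute_ragas(self, metric, dataset):
        return 0.9  # placeholder; real path calls Gemini/OpenAI via RAGAS
```

Calling the endpoint for source_attribution_accuracy alone now works with no key configured, which was the observable bug before this change.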
fix: ragas_runner reads GEMINI_API_KEY from app settings, not os.getenv

os.getenv('GEMINI_API_KEY') returns None when the key is set via .env file
loaded by pydantic-settings — pydantic-settings reads .env into Python
attributes but does NOT export them to os.environ.

Fix: use get_settings().gemini_api_key (reads .env via pydantic-settings)
with os.getenv as fallback for environments where the var is already exported.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
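The lookup order described above, as a small sketch: prefer the pydantic-settings value (which sees .env) and fall back to os.getenv for environments where the variable really is exported. The helper name is illustrative; `gemini_api_key` is the attribute named in the commit message.

```python
import os
from typing import Optional


def resolve_gemini_api_key(settings) -> Optional[str]:
    # `settings` is expected to be the result of get_settings(), i.e. an
    # object with a gemini_api_key attribute populated from .env by
    # pydantic-settings. os.getenv covers shells that export the var directly.
    return getattr(settings, "gemini_api_key", None) or os.getenv("GEMINI_API_KEY")
```

Note the asymmetry the commit relies on: pydantic-settings reads os.environ as one of its sources, but never writes to it, so checking only os.getenv misses .env-sourced keys.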
fix: update Gemini model to gemini-2.0-flash, raise_exceptions=True in RAGAS

- gemini-1.5-flash returns 404 NOT_FOUND on the v1beta API — updated to gemini-2.0-flash
- raise_exceptions=False was silently returning NaN scores; changed to True
  so actual Gemini/RAGAS errors surface as real error messages
- ragas_runner now reads gemini_model from app settings (defaults to gemini-2.0-flash)
- Updated .env.example default model name

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
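A self-contained illustration of why raise_exceptions=False was hiding the real problem: swallowed errors come back as NaN scores, indistinguishable from a genuinely bad evaluation. The stub below mimics that behaviour; it is not the RAGAS implementation.

```python
import math


def score_sample(call_llm, *, raise_exceptions: bool) -> float:
    try:
        return call_llm()
    except Exception:
        if raise_exceptions:
            raise  # surface the actual Gemini/RAGAS error to the caller
        return float("nan")  # old behaviour: silent NaN score


def broken_llm():
    # Simulates the kind of error the stale model name produced.
    raise RuntimeError("404 NOT_FOUND: models/gemini-1.5-flash")
```

With the flag off, a wrong model name scores every sample NaN and looks like a metrics bug; with it on, the 404 lands in the logs where it points at the real cause.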
fix: robust RAGAS result handling — NaN→None, to_pandas fallback, better LLM config

- Handle NaN scores: replace with None so JSON serialises cleanly (null not NaN)
- Add try/except around result.to_pandas() — fall back to scores dict if it fails
- Clean up LLM provider config: gemini via LangChainWrapper, openai via ChatOpenAI
- Log the actual exception when to_pandas() fails for debugging

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
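The NaN handling and to_pandas() fallback described above could be sketched like this. sanitize_scores and scores_payload are illustrative names; `result` stands in for a RAGAS EvaluationResult-like object with a to_pandas() method and a scores mapping.

```python
import logging
import math


def sanitize_scores(scores: dict) -> dict:
    # NaN is not valid JSON; map it to None so it serialises as null.
    return {k: (None if isinstance(v, float) and math.isnan(v) else v)
            for k, v in scores.items()}


def scores_payload(result):
    try:
        return result.to_pandas().to_dict(orient="records")
    except Exception as exc:
        # Log the actual failure instead of masking it, then fall back
        # to the raw scores dict.
        logging.warning("to_pandas() failed: %s", exc)
        return sanitize_scores(dict(result.scores))
```

The sanitisation matters because json.dumps emits a bare `NaN` token for NaN floats, which strict JSON parsers on the client side reject.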
@apundhir apundhir merged commit c331cf2 into main Apr 11, 2026
3 checks passed
@apundhir apundhir deleted the fix/test-api-key-enforcement branch April 11, 2026 17:22
apundhir added a commit that referenced this pull request Apr 11, 2026
* fix: conftest.py prevents .env API_KEY from breaking legacy tests
* fix: route /v1/evaluate through EvaluationRunner, not raw RAGAS
* fix: ragas_runner reads GEMINI_API_KEY from app settings, not os.getenv
* fix: update Gemini model to gemini-2.0-flash, raise_exceptions=True in RAGAS
* fix: robust RAGAS result handling — NaN→None, to_pandas fallback, better LLM config
