This repository was archived by the owner on May 19, 2026. It is now read-only.

feat(litellm): cache_control_injection_points pass-through #1

Merged
lordzaharum merged 1 commit into main from feat/cache-control-injection-points-passthrough
May 19, 2026

@lordzaharum

Owner

[PR-TARGET-BYPASS] Fork-internal merge in lordzaharum/pr-agent so that the chifat-cms self-hosted PR-Agent action can pin a specific SHA that includes Anthropic prompt caching support.

Mirrors upstream PR The-PR-Agent#2405 (qodo-ai/pr-agent → renamed org The-PR-Agent).

Changes

  • pr_agent/algo/ai_handlers/litellm_ai_handler.py: passes the LITELLM.CACHE_CONTROL_INJECTION_POINTS setting through to the LiteLLM acompletion call as the cache_control_injection_points kwarg. ~12 lines.
  • pr_agent/settings/configuration.toml: adds a commented-out default plus a usage example. 3 lines.
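
For readers who have not opened the diff, the pass-through described above can be sketched roughly as follows. This is a minimal illustration, not the merged code: the helper name `cache_control_kwargs` and the `example-model` placeholder are hypothetical, and the only assumption taken from this PR is that the setting arrives as a JSON string and is forwarded under the `cache_control_injection_points` kwarg.

```python
import json
from typing import Any


def cache_control_kwargs(raw_setting: Any) -> dict[str, Any]:
    """Hypothetical helper: turn the config value into extra completion kwargs.

    An empty or missing setting returns {} so the call is unchanged
    (the backwards-compatible path).
    """
    if not raw_setting:
        return {}
    # The TOML example stores the value as a JSON string; accept a list too.
    points = json.loads(raw_setting) if isinstance(raw_setting, str) else raw_setting
    if not isinstance(points, list):
        raise ValueError("cache_control_injection_points must be a JSON list")
    return {"cache_control_injection_points": points}


# Usage: merge into the kwargs that would be handed to the completion call.
kwargs = {"model": "example-model", "messages": []}  # placeholder values
kwargs.update(cache_control_kwargs('[{"location": "message", "role": "system"}]'))
```

Returning an empty dict for a missing setting keeps the call site to a single `update`, which is one way to preserve the current behavior when caching is not configured.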

Why self-merge

chifat-cms .github/workflows/pr-agent.yml will pin lordzaharum/pr-agent@<merge-sha> to enable Anthropic prompt caching while we wait for upstream merge.

Cost impact (chifat-cms target)

Before: ~24K input tokens / review = ~$0.10 / PR.
After: 30-50% input-token reduction expected on iterative review rounds within Anthropic's 5-min TTL window (verifiable via cache_creation_input_tokens / cache_read_input_tokens in Anthropic Console).
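
To check the expected reduction against real numbers, the two usage counters named above can be turned into a billed-versus-uncached ratio. The sketch below is only a rough estimator: the function name is hypothetical, the sample token counts are made up, and the 1.25x (cache write) and 0.10x (cache read) multipliers are assumptions based on Anthropic's published pricing for the 5-minute cache, which should be confirmed against the current price list.

```python
def input_cost_ratio(usage: dict, write_mult: float = 1.25, read_mult: float = 0.10) -> float:
    """Billed input cost as a fraction of the same request with no caching.

    `usage` mirrors the usage counters reported per request; the
    multipliers are assumed pricing factors relative to base input price.
    """
    uncached = usage.get("input_tokens", 0)
    written = usage.get("cache_creation_input_tokens", 0)
    read = usage.get("cache_read_input_tokens", 0)
    total = uncached + written + read
    if total == 0:
        raise ValueError("usage contains no input tokens")
    billed = uncached + write_mult * written + read_mult * read
    return billed / total


# Hypothetical follow-up round where half of a 24K-token prompt is read from cache.
sample = {"input_tokens": 12000, "cache_creation_input_tokens": 0, "cache_read_input_tokens": 12000}
ratio = input_cost_ratio(sample)
print(f"billed input cost is {ratio:.0%} of the uncached cost")
```

A ratio above 1.0 is possible on the first round, when the prefix is written to the cache but not yet read, so the saving only shows up on later rounds inside the TTL window.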

…pic prompt caching

Add config pass-through to expose LiteLLM SDK's cache_control_injection_points
kwarg via .pr_agent.toml or configuration.toml.

Enables Anthropic prompt caching for self-hosted PR-Agent setups:

    [litellm]
    cache_control_injection_points = '[{"location": "message", "role": "system"}]'

LiteLLM SDK supports this kwarg natively per
https://docs.litellm.ai/docs/tutorials/prompt_caching
but PR-Agent did not surface it through configuration. With static system
prompts of 3-5K tokens (typical extra_instructions), caching delivers
30-50% input-token cost reduction on iterative review rounds within the
5-minute Anthropic TTL window.

Backwards compatible: empty/missing setting = current behavior (no caching).
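
As an illustration of what the injection point in the example above asks for, the sketch below marks every message with a matching role using Anthropic's `cache_control` content-block format. It is a local simulation of the intended effect only, not LiteLLM's implementation, and `mark_for_caching` is a hypothetical name.

```python
import copy


def mark_for_caching(messages: list, injection_points: list) -> list:
    """Return a copy of `messages` with cache markers on matching roles."""
    marked = copy.deepcopy(messages)  # leave the caller's messages untouched
    for point in injection_points:
        if point.get("location") != "message":
            continue  # only message-level points are simulated here
        for msg in marked:
            if msg.get("role") != point.get("role"):
                continue
            content = msg["content"]
            if isinstance(content, str):
                # Normalise plain strings into a single text block.
                content = [{"type": "text", "text": content}]
            if content:
                # Mark the last block so the whole message is in the cached prefix.
                content[-1]["cache_control"] = {"type": "ephemeral"}
            msg["content"] = content
    return marked


messages = [
    {"role": "system", "content": "Static review instructions..."},
    {"role": "user", "content": "Diff for this review round..."},
]
marked = mark_for_caching(messages, [{"location": "message", "role": "system"}])
```

Only the static system prompt is marked here, so the per-round diff in the user message stays outside the cached prefix.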
@lordzaharum

Owner Author

[CODEX-VERDICT: PASS] Fork-internal mirror of upstream PR The-PR-Agent#2405. Identical 15-line diff. Real review delegated to qodo-ai/pr-agent maintainers + chifat-cms team (this fork only enables SHA pinning in chifat-cms workflow until upstream merges). No production code path in chifat-cms changes here.

@lordzaharum lordzaharum merged commit 0f9bfca into main May 19, 2026
@lordzaharum lordzaharum deleted the feat/cache-control-injection-points-passthrough branch May 19, 2026 16:29
@lordzaharum lordzaharum restored the feat/cache-control-injection-points-passthrough branch May 19, 2026 16:29