This repository was archived by the owner on May 19, 2026. It is now read-only.
feat(litellm): cache_control_injection_points pass-through #1
Merged
lordzaharum merged 1 commit on May 19, 2026
…pic prompt caching
Add config pass-through to expose LiteLLM SDK's cache_control_injection_points
kwarg via .pr_agent.toml or configuration.toml.
Enables Anthropic prompt caching for self-hosted PR-Agent setups:
[litellm]
cache_control_injection_points = '[{"location": "message", "role": "system"}]'
LiteLLM SDK supports this kwarg natively per
https://docs.litellm.ai/docs/tutorials/prompt_caching
but PR-Agent did not surface it through configuration. With static system
prompts of 3-5K tokens (typical extra_instructions), caching delivers
30-50% input-token cost reduction on iterative review rounds within the
5-minute Anthropic TTL window.
Backwards compatible: empty/missing setting = current behavior (no caching).
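The pass-through described above can be sketched roughly as follows. This is a minimal illustration, not the merged diff: the helper name `build_cache_kwargs` is hypothetical, and the sketch assumes the setting arrives as the JSON string shown in the TOML example (or as an already-parsed list).

```python
import json


def build_cache_kwargs(raw_setting):
    """Return extra kwargs for the LiteLLM completion call.

    Hypothetical helper: an empty or missing setting yields no extra
    kwargs, which keeps the current behavior (no caching).
    """
    if not raw_setting:
        return {}
    # The TOML example stores the value as a JSON string; accept a list too.
    points = json.loads(raw_setting) if isinstance(raw_setting, str) else raw_setting
    if not isinstance(points, list):
        raise ValueError("cache_control_injection_points must be a list of objects")
    return {"cache_control_injection_points": points}
```

The returned dict would then be merged into the kwargs passed to the completion call, so that an unset option leaves the call unchanged.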
[CODEX-VERDICT: PASS] Fork-internal mirror of upstream PR The-PR-Agent#2405. Identical 15-line diff. Real review delegated to qodo-ai/pr-agent maintainers + chifat-cms team (this fork only enables SHA pinning in chifat-cms workflow until upstream merges). No production code path in chifat-cms changes here.
[PR-TARGET-BYPASS] fork-internal merge (lordzaharum/pr-agent) to enable chifat-cms self-hosted PR-Agent action to pin a specific SHA for Anthropic prompt caching support.
Mirrors upstream PR The-PR-Agent#2405 (qodo-ai/pr-agent → renamed org The-PR-Agent).
Changes
- pr_agent/algo/ai_handlers/litellm_ai_handler.py: pass-through of the LITELLM.CACHE_CONTROL_INJECTION_POINTS setting to the LiteLLM acompletion kwarg. ~12 lines.
- pr_agent/settings/configuration.toml: commented-out default + usage example. 3 lines.
Why self-merge
chifat-cms .github/workflows/pr-agent.yml will pin lordzaharum/pr-agent@<merge-sha> to enable Anthropic prompt caching while we wait for upstream merge.
Cost impact (chifat-cms target)
Before: ~24K input tokens / review = ~$0.10 / PR.
After: 30-50% input-token reduction expected on iterative review rounds within Anthropic's 5-min TTL window (verifiable via cache_creation_input_tokens / cache_read_input_tokens in Anthropic Console).
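The expected saving can be estimated with a small calculation. The sketch below is an illustration under stated assumptions, not a measurement: the function name is hypothetical, every round after the first is assumed to land inside the TTL window and hit the cache, and the default multipliers (cache writes billed at 1.25x and cache reads at 0.10x the base input price) should be checked against current Anthropic pricing.

```python
def relative_input_cost(total_tokens, cached_tokens, rounds,
                        write_mult=1.25, read_mult=0.10):
    """Input cost of `rounds` reviews with caching, relative to no caching.

    Hypothetical helper. Returns a ratio: 1.0 means no change,
    0.7 means a 30% input-cost reduction.
    """
    if rounds < 1 or not 0 <= cached_tokens <= total_tokens:
        raise ValueError("need rounds >= 1 and 0 <= cached_tokens <= total_tokens")
    uncached = total_tokens - cached_tokens
    # First round writes the cache and pays the write premium.
    first_round = uncached + cached_tokens * write_mult
    # Later rounds read the cached prefix at the discounted rate.
    later_round = uncached + cached_tokens * read_mult
    with_cache = first_round + (rounds - 1) * later_round
    return with_cache / (rounds * total_tokens)
```

Under these assumptions a single round costs slightly more than without caching because of the write premium, and the reduction on later rounds scales with the share of the ~24K input tokens that is actually served from cache.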