High-performance LLM query cache with semantic search. Reduce API costs by 80% and latency from 8.5 s to 1 ms using Redis + Qdrant vector DB. Multi-provider support (OpenAI, Anthropic).
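The project's own code isn't reproduced here, but the semantic-caching pattern it describes can be sketched as follows. This is a minimal illustration under assumptions, not the repository's API: the `llm_cache` collection name, the 0.90 similarity threshold, and the model names are invented for the example, with responses stored in Redis and prompt vectors in Qdrant.

```python
import hashlib
import uuid

import redis
from openai import OpenAI
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

# Assumed names and thresholds, for illustration only.
COLLECTION = "llm_cache"
SIMILARITY_THRESHOLD = 0.90            # cosine score above which prompts are treated as equivalent
EMBED_MODEL = "text-embedding-3-small"  # 1536-dimensional embeddings
CHAT_MODEL = "gpt-4o-mini"

oai = OpenAI()
kv = redis.Redis(decode_responses=True)
vdb = QdrantClient(":memory:")          # point at a real Qdrant server in production
vdb.create_collection(COLLECTION, vectors_config=VectorParams(size=1536, distance=Distance.COSINE))


def _embed(text: str) -> list[float]:
    return oai.embeddings.create(model=EMBED_MODEL, input=text).data[0].embedding


def cached_completion(prompt: str) -> str:
    vector = _embed(prompt)

    # Semantic lookup: nearest previously seen prompt in Qdrant.
    hits = vdb.query_points(COLLECTION, query=vector, limit=1).points
    if hits and hits[0].score >= SIMILARITY_THRESHOLD:
        cached = kv.get(hits[0].payload["redis_key"])
        if cached is not None:
            return cached               # cache hit: no LLM call

    # Cache miss: call the model once, store the response in Redis and the
    # prompt vector in Qdrant so future similar prompts can reuse it.
    answer = oai.chat.completions.create(
        model=CHAT_MODEL, messages=[{"role": "user", "content": prompt}]
    ).choices[0].message.content

    key = "llm:" + hashlib.sha256(prompt.encode()).hexdigest()
    kv.set(key, answer)
    vdb.upsert(COLLECTION, points=[
        PointStruct(id=str(uuid.uuid4()), vector=vector, payload={"redis_key": key})
    ])
    return answer
```

Exact repeats can additionally be short-circuited by checking the Redis hash key before embedding, which skips even the embedding call.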
Tools, libraries, papers, and patterns for reducing the cost of running large language models in production.
Cut LLM agent token costs by 93%. An execution cache for LangChain, CrewAI, and AutoGen: repeat runs finish in 2.66 ms instead of 20 seconds and consume zero tokens.
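The repository's actual LangChain/CrewAI/AutoGen integration isn't shown here; as a rough sketch of the general execution-caching idea, the decorator below memoizes an agent step on a fingerprint of its inputs, so a repeat run returns the stored result without calling the model. The function names and the in-memory dict are assumptions for the example.

```python
import functools
import hashlib
import json

_cache: dict[str, str] = {}  # in-memory store; a Redis or on-disk cache works the same way


def execution_cache(fn):
    """Memoize an agent step on a fingerprint of its arguments.

    A repeat run with identical inputs returns the stored result immediately
    and spends zero tokens, because the wrapped LLM call never fires.
    """

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        fingerprint = hashlib.sha256(
            json.dumps({"fn": fn.__name__, "args": args, "kwargs": kwargs},
                       sort_keys=True, default=str).encode()
        ).hexdigest()
        if fingerprint in _cache:
            return _cache[fingerprint]       # cache hit: no LLM call
        result = fn(*args, **kwargs)         # cache miss: run the step once
        _cache[fingerprint] = result
        return result

    return wrapper


@execution_cache
def research_step(question: str) -> str:
    # Placeholder for an agent step that would normally call an LLM.
    return f"(expensive LLM answer to: {question})"
```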
Agent skill for caching structured document briefings — summarize once, reuse everywhere. Reduces redundant LLM calls with fingerprint-based caching.
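Fingerprint-based briefing caching can be sketched roughly as below; the directory name, file format, and `summarize` callable are assumptions, not the skill's actual interface. The point is that each briefing is keyed on a content hash of the document, so the summarization call happens once per distinct document and every later request reuses the stored briefing.

```python
import hashlib
import json
from pathlib import Path

BRIEFING_DIR = Path("briefing_cache")  # assumed location for stored briefings
BRIEFING_DIR.mkdir(exist_ok=True)


def document_fingerprint(text: str) -> str:
    """Stable identity for a document: same bytes, same fingerprint."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def get_briefing(document: str, summarize) -> str:
    """Return a cached briefing for this exact document if one exists,
    otherwise summarize once (one LLM call) and persist the result."""
    path = BRIEFING_DIR / f"{document_fingerprint(document)}.json"
    if path.exists():
        return json.loads(path.read_text())["briefing"]  # reuse: no LLM call
    briefing = summarize(document)                       # single summarization call
    path.write_text(json.dumps({"briefing": briefing}))
    return briefing
```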