refactor: split caching_unified.py into modular caching/ package by dr-gareth-roberts · Pull Request #16 · dr-gareth-roberts/insideLLMs

dr-gareth-roberts · 2026-01-30T12:09:41Z

Description

Splits the 3,488-line caching_unified.py monolith into a well-organized caching/ package with 12 focused modules:

Module	Contents
`types.py`	Enums (CacheStrategy, CacheStatus, CacheScope), dataclasses (CacheConfig, CacheEntry, CacheStats, CacheLookupResult), CacheEntryMixin
`base.py`	BaseCacheABC abstract base class, key generation functions
`backends/memory.py`	InMemoryCache implementation
`backends/disk.py`	DiskCache (SQLite-backed) implementation
`backends/strategy.py`	StrategyCache with LRU/LFU/FIFO/TTL/SIZE eviction
`prompt_cache.py`	PromptCache for LLM response caching with similarity matching
`model_wrapper.py`	CachedModel wrapper for transparent caching
`utilities.py`	CacheWarmer, MemoizedFunction, memoize decorator
`namespace.py`	CacheNamespace for organizing multiple caches
`deduplicator.py`	ResponseDeduplicator for identifying duplicate responses
`async_adapter.py`	AsyncCacheAdapter for async/await compatibility
`factory.py`	Convenience functions, global cache management, cached decorator

Backward compatibility: caching_unified.py is now a re-export shim - all existing imports continue to work unchanged.

Related Issue

Part of large file splitting initiative identified in codebase deep dive.

Type of Change

Bug fix (non-breaking change that fixes an issue)
New feature (non-breaking change that adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation update
Refactoring (no functional changes)
Tests (adding or updating tests)

Checklist

I have read the CONTRIBUTING guidelines
My code follows the project's code style (ran ruff check and ruff format)
I have added/updated tests for my changes
All new and existing tests pass (pytest)
I have updated documentation as needed
My changes generate no new warnings

Testing

make lint passes (ruff check)
91/95 caching tests pass
4 async tests fail on both main and this branch (pre-existing issue: missing pytest-asyncio plugin)

Verified backward compatibility:

from insideLLMs.caching_unified import *  # Works
from insideLLMs.caching import *  # Works

Human Review Checklist

Verify ResponseDeduplicator API matches original (add(), get_unique_responses(), get_duplicate_count(), clear() methods with similarity_threshold parameter)
Verify AsyncCacheAdapter.set() returns CacheEntry (not None)
Confirm all 40+ public symbols are properly re-exported from caching_unified.py
Check for potential circular imports in the new package structure

Additional Notes

Pre-existing issue: 4 async tests fail due to missing pytest-asyncio - not related to this refactoring
New code can import directly from insideLLMs.caching for cleaner imports

Link to Devin run: https://app.devin.ai/sessions/4ea4b20132954ccbb473719daedd219e
Requested by: Dr Gareth Roberts (@dr-gareth-roberts)

Summary by CodeRabbit

New Features
- Introduced comprehensive caching subsystem with multiple storage backends (in-memory, disk-based, strategy-driven)
- Added async-compatible cache adapter
- Implemented configurable eviction strategies (LRU, LFU, FIFO, TTL, SIZE)
- Added semantic similarity search for cached prompts
- Introduced cache warming utility for precomputation
- Added response deduplication with configurable thresholds
- Included memoization decorator for function caching
- Built-in cache statistics and monitoring
- Thread-safe operations across all components

_{✏️ Tip: You can customize this high-level summary in your review settings.}

Split the 3,488-line caching_unified.py into a well-organized caching/ package: New module structure: - caching/types.py: Enums (CacheStrategy, CacheStatus, CacheScope), dataclasses (CacheConfig, CacheEntry, CacheStats, CacheLookupResult), and CacheEntryMixin - caching/base.py: BaseCacheABC abstract base class and key generation functions - caching/backends/memory.py: InMemoryCache implementation - caching/backends/disk.py: DiskCache (SQLite-backed) implementation - caching/backends/strategy.py: StrategyCache with configurable eviction strategies - caching/prompt_cache.py: PromptCache for LLM response caching - caching/model_wrapper.py: CachedModel wrapper for transparent caching - caching/utilities.py: CacheWarmer, MemoizedFunction, memoize decorator - caching/namespace.py: CacheNamespace for organizing multiple caches - caching/deduplicator.py: ResponseDeduplicator for identifying duplicate responses - caching/async_adapter.py: AsyncCacheAdapter for async/await compatibility - caching/factory.py: Convenience functions (create_cache, create_prompt_cache, etc.), global cache management, and cached decorator Backward compatibility: - caching_unified.py is now a re-export shim that imports and re-exports all 40+ public symbols - All existing imports continue to work unchanged - New code can import from insideLLMs.caching directly Benefits: - Improved code organization and maintainability - Easier navigation and understanding of caching functionality - Clear separation of concerns between different cache types - Reduced cognitive load when working with specific cache features

bolt-new-by-stackblitz · 2026-01-30T12:09:45Z

Run & review this pull request in StackBlitz Codeflow.

devin-ai-integration · 2026-01-30T12:09:45Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

Disable automatic comment and CI monitoring

sourcery-ai

Sorry @dr-gareth-roberts, your pull request is larger than the review limit of 150000 diff characters

dosubot · 2026-01-30T12:09:58Z

Related Documentation

Checked 15 published document(s) in 1 knowledge base(s). No updates required.

^{How did I do? Any feedback?}

coderabbitai · 2026-01-30T12:10:04Z

📝 Walkthrough

Walkthrough

Introduces a comprehensive new caching subsystem for insideLLMs with multiple cache backends (in-memory, disk-based, strategy-driven), factory functions for cache creation and management, specialized features like prompt caching and response deduplication, and utilities for cache warming and function memoization, all unified through a single package initialization.

Changes

Cohort / File(s)	Summary
Package Initialization `insideLLMs/caching/__init__.py`	Consolidates public API exports from all caching modules, re-exporting cache backends, factories, utilities, type definitions, and specialized caches (prompt cache, deduplicator, model wrapper, async adapter) via comprehensive `__all__` list for unified imports.
Type Definitions `insideLLMs/caching/types.py`	Introduces core data structures: `CacheStrategy`, `CacheStatus`, `CacheScope` enums; `CacheConfig`, `CacheEntry`, `CacheStats`, `CacheLookupResult` dataclasses; and `CacheEntryMixin` for shared cache entry behavior with serialization support.
Base Abstractions `insideLLMs/caching/base.py`	Establishes `BaseCacheABC` generic interface with standard cache operations (get, set, delete, clear, stats), and provides deterministic key generation utilities (`generate_cache_key`, `generate_model_cache_key`) supporting multiple hash algorithms.
Cache Backends `insideLLMs/caching/backends/strategy.py`, `insideLLMs/caching/backends/memory.py`, `insideLLMs/caching/backends/disk.py`, `insideLLMs/caching/backends/__init__.py`	Implements three cache backends: `StrategyCache` (multi-strategy eviction support), `InMemoryCache` (thread-safe LRU with TTL), and `DiskCache` (SQLite-persistent with export/import), plus package initialization for unified backend exports.
Factory & Global Management `insideLLMs/caching/factory.py`	Provides factory functions (`create_cache`, `create_prompt_cache`, `create_namespace`, `create_cache_warmer`), cache key generation, global default cache management, and `cached` decorator for lightweight function caching.
Specialized Caching `insideLLMs/caching/prompt_cache.py`, `insideLLMs/caching/deduplicator.py`	Introduces `PromptCache` with semantic similarity-based lookups and prompt-to-key mapping, and `ResponseDeduplicator` for detecting duplicate/similar responses using configurable Jaccard-based similarity thresholds.
Utilities & Wrappers `insideLLMs/caching/utilities.py`, `insideLLMs/caching/model_wrapper.py`, `insideLLMs/caching/async_adapter.py`, `insideLLMs/caching/namespace.py`	Adds `CacheWarmer` (prompt precomputation with priority queuing), `MemoizedFunction` (function call memoization with statistics), `memoize` decorator, `CachedModel` (model response caching), `AsyncCacheAdapter` (async syntax wrapper), and `CacheNamespace` (multi-cache management with lazy initialization).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~70 minutes

Possibly related issues

refactor: Consolidate caching implementations into single module #11: This PR fully addresses the objective of consolidating caching implementations and creating a unified package export structure through the comprehensive insideLLMs.caching subsystem with centralized __all__ exports.

Poem

🐰 A warren of caches, nested with care,
In-memory, disk-bound, strategies fair,
With prompts that find kin through similarity's call,
And factories crafting each cache, one and all,
Now bundled together, no imports to spare!

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'refactor: split caching_unified.py into modular caching/ package' clearly and accurately describes the main change—refactoring a monolithic module into a modular package structure.
Docstring Coverage	✅ Passed	Docstring coverage is 85.71% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch devin/1769773941-refactor-caching

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

kiloconnect

Reviewing the new caching system implementation.

kiloconnect · 2026-01-30T12:12:20Z

insideLLMs/caching/backends/disk.py

+            self._stats.evictions += cursor.rowcount
+
+        conn.commit()
+        conn.execute("VACUUM")


CRITICAL: Performance issue - VACUUM is extremely expensive.

Running VACUUM blocks the database and copies the entire file. Doing this during cache eviction (which happens during set) will cause massive latency spikes, especially for large caches. Consider using PRAGMA auto_vacuum = INCREMENTAL or removing this automatic vacuum.

kiloconnect · 2026-01-30T12:12:20Z

insideLLMs/caching/backends/disk.py

+    def import_from_file(self, path: Union[str, Path]) -> int:
+        """Import cache from a JSON file."""
+        with open(path) as f:
+            entries = json.load(f)


CRITICAL: Memory safety - json.load reads entire file into memory.

For a disk cache designed to handle data larger than memory, loading the entire export file into RAM will cause OOM crashes. Use a streaming parser or a line-based format (JSONL) for exports.

kiloconnect · 2026-01-30T12:12:20Z

insideLLMs/caching/base.py

+        key_parts.append(f"model:{model}")
+
+    if params:
+        sorted_params = json.dumps(params, sort_keys=True)


WARNING: Runtime error - json.dumps fails on non-serializable objects.

If params contains any non-JSON-serializable types (sets, custom objects, etc.), this will raise a TypeError. Use default=str to handle arbitrary objects safely.

Suggested change

sorted_params = json.dumps(params, sort_keys=True)

sorted_params = json.dumps(params, sort_keys=True, default=str)

kiloconnect · 2026-01-30T12:12:21Z

insideLLMs/caching/base.py

+    for k, v in sorted(kwargs.items()):
+        key_parts[k] = v
+
+    key_str = json.dumps(key_parts, sort_keys=True, ensure_ascii=True)


WARNING: Runtime error - json.dumps fails on non-serializable objects.

Same issue as above. kwargs values are included in key_parts and serialized.

Suggested change

key_str = json.dumps(key_parts, sort_keys=True, ensure_ascii=True)

key_str = json.dumps(key_parts, sort_keys=True, ensure_ascii=True, default=str)

guardrails · 2026-01-30T12:12:26Z

⚠️ We detected 4 security issues in this pull request:

Insecure Use of Crypto (4)

Severity

Details

Docs

Low

Title: Insecure hash functions

insideLLMs/insideLLMs/caching/base.py

Line 158 in 904edfc

return hashlib.md5(key_string.encode()).hexdigest()

📚

Low

Title: Insecure hash functions

insideLLMs/insideLLMs/caching/base.py

Line 160 in 904edfc

return hashlib.sha1(key_string.encode()).hexdigest()

📚

Low

Title: Insecure hash functions

insideLLMs/insideLLMs/caching/prompt_cache.py

Line 147 in 904edfc

prompt_hash = hashlib.md5(prompt.encode()).hexdigest()

📚

Low

Title: Insecure hash functions

insideLLMs/insideLLMs/caching/prompt_cache.py

Line 164 in 904edfc

prompt_hash = hashlib.md5(prompt.encode()).hexdigest()

📚

More info on how to fix Insecure Use of Crypto in Python.

👉 Go to the dashboard for detailed results.

📥 Happy? Share your feedback with us.

kiloconnect · 2026-01-30T12:12:38Z

Code Review Summary

Status: 4 Issues Found | Recommendation: Address before merge

Overview

Severity	Count
CRITICAL	2
WARNING	2
SUGGESTION	0

Issue Details (click to expand)

CRITICAL

File	Line	Issue
`insideLLMs/caching/backends/disk.py`	256	Performance issue - `VACUUM` is extremely expensive
`insideLLMs/caching/backends/disk.py`	283	Memory safety - `json.load` reads entire file into memory

WARNING

File	Line	Issue
`insideLLMs/caching/base.py`	148	Runtime error - `json.dumps` fails on non-serializable objects
`insideLLMs/caching/base.py`	249	Runtime error - `json.dumps` fails on non-serializable objects

Other Observations (not in diff)

Issues found in unchanged code or general observations:

File	Line	Issue
`insideLLMs/caching/namespace.py`	144	`get_prompt_cache` silently overwrites existing cache if it's not a `PromptCache`, causing data loss.
`insideLLMs/caching/backends/disk.py`	196	`set` method crashes if `value` is not JSON-serializable. While documented, defensive coding would be safer.

Files Reviewed (15 files)

insideLLMs/caching/__init__.py
insideLLMs/caching/async_adapter.py
insideLLMs/caching/backends/__init__.py
insideLLMs/caching/backends/disk.py - 3 issues
insideLLMs/caching/backends/memory.py
insideLLMs/caching/backends/strategy.py
insideLLMs/caching/base.py - 2 issues
insideLLMs/caching/deduplicator.py
insideLLMs/caching/factory.py
insideLLMs/caching/model_wrapper.py
insideLLMs/caching/namespace.py - 1 issue
insideLLMs/caching/prompt_cache.py
insideLLMs/caching/types.py
insideLLMs/caching/utilities.py
insideLLMs/caching_unified.py

devin-ai-integration

Devin Review found 1 potential issue.

View issue and 5 additional flags in Devin Review.

devin-ai-integration · 2026-01-30T12:12:39Z

insideLLMs/caching/namespace.py

+            if name not in self._caches or not isinstance(self._caches[name], PromptCache):
+                self._caches[name] = PromptCache(config or self.default_config)


🔴 CacheNamespace.get_prompt_cache silently replaces existing non-PromptCache, causing data loss

The get_prompt_cache method silently replaces an existing cache with a new PromptCache if the existing cache is not a PromptCache, destroying all cached data.

Click to expand

How the bug is triggered

If a user first creates a regular cache with get_cache("foo"), populates it with data, and then later calls get_prompt_cache("foo"), the existing cache is silently replaced:

namespace = CacheNamespace() cache = namespace.get_cache("shared") cache.set("important_data", "value") # Data stored # Later in code, perhaps in a different module: prompt_cache = namespace.get_prompt_cache("shared") # Silently replaces! # All data in the original cache is now LOST

The condition at line 144 if name not in self._caches or not isinstance(self._caches[name], PromptCache) causes the replacement when the cache exists but isn't a PromptCache.

Actual vs Expected

Actual: Existing cache is silently replaced with a new empty PromptCache

Expected: Either raise an error indicating type mismatch, or return the existing cache if compatible, or at minimum log a warning

Impact

Silent data loss in multi-module applications where different parts of the codebase might access the same named cache with different methods.

Recommendation: Either raise a ValueError when trying to get a PromptCache for a name that already has a non-PromptCache, or only check if name not in self._caches (creating new caches but returning existing ones regardless of type if PromptCache is a subclass of StrategyCache).

Was this helpful? React with 👍 or 👎 to provide feedback.

codecov · 2026-01-30T12:13:02Z

❌ 1 Tests Failed:

Tests completed	Failed	Passed	Skipped
5049	1	5048	107

View the top 1 failed test(s) by shortest run time

tests/test_caching_unified.py::TestStrategyCache::test_instantiation_with_strategy

Stack Traces | 0.003s run time

tests/test_caching_unified.py:139: in test_instantiation_with_strategy
    cache = StrategyCache(strategy=CacheStrategy.LRU, max_size=100)
E   TypeError: StrategyCache.__init__() got an unexpected keyword argument 'strategy'

To view more test analytics, go to the Test Analytics Dashboard
_{📋 Got 3 mins? Take this short survey to help us improve Test Analytics.}

Copilot

Pull request overview

This refactor splits the previous 3,400+ line caching_unified.py into a modular insideLLMs.caching package, and turns caching_unified.py into a backward-compatibility shim that re-exports all public caching APIs from the new package.

Changes:

Introduces a new insideLLMs.caching package with focused modules for types, backends, utilities, prompt caching, async adapter, factory functions, namespaces, deduplication, and a model wrapper.
Replaces the implementation in caching_unified.py with a thin re-export shim that forwards all public symbols to the new package.
Aligns insideLLMs.caching.__all__ and insideLLMs.caching_unified.__all__ so existing imports continue to work while new code can use the cleaner insideLLMs.caching package.

Reviewed changes

Copilot reviewed 14 out of 15 changed files in this pull request and generated 1 comment.

Show a summary per file

File	Description
`insideLLMs/caching_unified.py`	Replaces the monolithic implementation with a compatibility shim that re-exports all caching symbols from the new `insideLLMs.caching` package.
`insideLLMs/caching/__init__.py`	Defines the new primary caching entry point, aggregating and exporting all public caching APIs (types, backends, utilities, factory functions, async adapter, etc.).
`insideLLMs/caching/base.py`	Extracts `BaseCacheABC` and key generation utilities (`generate_cache_key`, `generate_model_cache_key`) into a dedicated base module.
`insideLLMs/caching/types.py`	Houses all core enums and dataclasses (`CacheStrategy`, `CacheStatus`, `CacheScope`, `CacheConfig`, `CacheEntryMixin`, `CacheEntry`, `CacheStats`, `CacheLookupResult`) used across the caching system.
`insideLLMs/caching/backends/memory.py`	Provides the `InMemoryCache` backend (LRU + TTL), now as a dedicated backend module implementing `BaseCacheABC`.
`insideLLMs/caching/backends/disk.py`	Provides the `DiskCache` backend (SQLite, size-managed, TTL, export/import), separated into its own backend module.
`insideLLMs/caching/backends/strategy.py`	Implements `StrategyCache` (and `BaseCache` alias) using the new shared types module; note: its `__init__` signature has been simplified and currently breaks the previous `strategy`/`max_size` keyword API.
`insideLLMs/caching/backends/__init__.py`	Re-exports `InMemoryCache`, `DiskCache`, and `StrategyCache`/`BaseCache` from the backend submodules as a convenience.
`insideLLMs/caching/prompt_cache.py`	Extracts `PromptCache` into its own module, still extending `StrategyCache` and using semantic similarity via `word_overlap_similarity`.
`insideLLMs/caching/namespace.py`	Extracts `CacheNamespace` into a separate module for managing multiple named caches backed by `StrategyCache`/`PromptCache`.
`insideLLMs/caching/model_wrapper.py`	Extracts `CachedModel`, providing a thin caching wrapper around models with deterministic-only caching by default.
`insideLLMs/caching/utilities.py`	Extracts `CacheWarmer`, `MemoizedFunction`, and `memoize` into a utilities module, with `BaseCache` aliased to `StrategyCache`.
`insideLLMs/caching/factory.py`	Centralizes factory and convenience functions (`create_cache`, `create_prompt_cache`, `create_cache_warmer`, `create_namespace`, `get_cache_key`, `cached_response`, global default cache helpers, and `cached` decorator).
`insideLLMs/caching/deduplicator.py`	Extracts `ResponseDeduplicator` into its own module with the same API as before (`add`, `get_unique_responses`, `get_duplicate_count`, `clear`).
`insideLLMs/caching/async_adapter.py`	Extracts `AsyncCacheAdapter` into a separate module, wrapping `StrategyCache` (`BaseCache` alias) with async `get/set/delete/clear` methods.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-01-30T12:13:51Z

insideLLMs/caching/backends/strategy.py

+    def __init__(self, config: Optional[CacheConfig] = None):
+        self.config = config or CacheConfig()


The new StrategyCache.__init__ signature only accepts an optional CacheConfig parameter, but the previous implementation also supported strategy and max_size keyword overrides (e.g., StrategyCache(strategy=CacheStrategy.LRU, max_size=100)), which are still used in the test suite and likely by external callers. This is a breaking API change that will cause TypeError at call sites expecting the old signature. To preserve backward compatibility and keep the refactor non-functional, please reintroduce the strategy and max_size optional keyword parameters and apply them on top of the provided/configured CacheConfig as before.

Suggested change

def __init__(self, config: Optional[CacheConfig] = None):

self.config = config or CacheConfig()

def __init__(

self,

config: Optional[CacheConfig] = None,

strategy: Optional[CacheStrategy] = None,

max_size: Optional[int] = None,

):

"""

Initialize the strategy cache.

Parameters

----------

config :

Optional CacheConfig instance. If not provided, a default

CacheConfig is created.

strategy :

Optional override for the cache strategy. If provided, this

value takes precedence over the strategy defined in ``config``.

max_size :

Optional override for the maximum cache size. If provided, this

value takes precedence over the max_size defined in ``config``.

"""

self.config = config or CacheConfig()

# Backward-compatible keyword overrides

if strategy is not None:

self.config.strategy = strategy

if max_size is not None:

self.config.max_size = max_size

coderabbitai

Actionable comments posted: 10

🤖 Fix all issues with AI agents

In `@insideLLMs/caching/backends/disk.py`:
- Around line 274-277: Replace the non-deterministic json.dump call that writes
entries with one that produces canonical, stable output: when writing in the
block using open(path, "w") and json.dump(entries, f, indent=2), add
sort_keys=True and a stable separators argument (e.g., separators=(',', ': '))
so keys are consistently ordered and spacing is canonical across exports; keep
indent for readability if desired and preserve the same file write path and
context.
- Around line 160-164: The TTL-expiry branch in the disk cache (where it checks
row["expires_at"] and deletes the row via conn.execute("DELETE FROM cache WHERE
key = ?", (key,))) fails to increment the expiration metric; update that branch
to also increment self._stats.expirations (similarly to InMemoryCache) before
returning None so expired disk entries are counted; locate this logic in the
disk backend method that reads rows (uses conn, row, key) and add
self._stats.expirations += 1 right after conn.commit() and before returning.

In `@insideLLMs/caching/backends/memory.py`:
- Around line 119-123: In the expiration branch inside the memory cache get
logic (the block that checks entry.get("expires_at") and compares time.time() >
entry["expires_at"]), increment the CacheStats expirations counter in addition
to the existing misses increment: update self._stats.expirations (and keep
self._stats.misses) before deleting from self._cache so TTL expirations are
tracked; locate this logic around the expiration check that references entry,
self._cache, and self._stats to apply the change.

In `@insideLLMs/caching/base.py`:
- Around line 55-57: The current base method has() calls get(), causing
hits/misses to increment; change has() in the base class
(insideLLMs.caching.base.Cache) to be abstract (raise NotImplementedError or use
`@abstractmethod`) so concrete caches implement a stats-free existence check, and
update InMemoryCache.has, DiskCache.has and StrategyCache.has to perform a
non-mutating existence test (e.g., check the internal dict, key presence on
disk, or delegate to underlying caches) without touching hit/miss counters or
calling get().

In `@insideLLMs/caching/deduplicator.py`:
- Around line 104-107: ResponseDeduplicator is not thread-safe: add a
threading.RLock to the class (import threading and set self._lock =
threading.RLock() in __init__), and wrap any access/modification of
self._responses and self._duplicate_count (e.g., inside the add() method and any
getters/readers) with self._lock (use with self._lock: ...). This ensures all
mutations and reads of _responses and _duplicate_count in class
ResponseDeduplicator are synchronized.

In `@insideLLMs/caching/factory.py`:
- Around line 422-424: The cached decorator currently treats a returned None as
a cache miss by using "cached_result = _cache.get(key)" and "if cached_result is
not None", so change the lookup to distinguish absent keys (e.g. use a unique
sentinel object or check key membership with "if key in _cache") and return the
cached value when the key exists even if it is None; update the logic around
_cache.get(key) / cached_result in the cached decorator to use that sentinel or
membership check so legitimate None results are stored and reused.

In `@insideLLMs/caching/namespace.py`:
- Around line 137-146: The get_prompt_cache method currently overwrites any
existing entry in self._caches when the stored value is not a PromptCache,
risking silent data loss; change it to detect a type mismatch and raise a clear
error instead of replacing: inside get_prompt_cache (while holding self._lock)
if name exists in self._caches and not isinstance(self._caches[name],
PromptCache) raise a TypeError (or custom exception) explaining the name/type
conflict and suggesting using get_cache or deleting the existing cache; keep the
current creation path (self._caches[name] = PromptCache(config or
self.default_config)) only when the name is absent, and mirror analogous
behavior in get_cache if needed so both getters are symmetric.

In `@insideLLMs/caching/prompt_cache.py`:
- Around line 124-148: The _prompt_keys mapping in cache_response stores
prompt_hash -> cache key but is never cleaned on eviction; update the cache to
remove stale mappings either by (A) hooking into the cache eviction/expire path
(or the set/evict function) to delete any _prompt_keys entries whose value
equals the evicted key, or (B) making get_by_prompt defensive: compute
prompt_hash, look up key in _prompt_keys, then verify the key exists in the
cache via the cache lookup (e.g., call self.get(key) or self._has(key)); if the
key is missing, remove the prompt_hash from self._prompt_keys and return a cache
miss. Modify methods named cache_response, get_by_prompt, and the cache
eviction/eject path (or set/evict helpers) to implement one of these cleanup
strategies so _prompt_keys cannot grow with stale entries.

In `@insideLLMs/caching/types.py`:
- Around line 141-158: The docstring iteration example for CacheStatus is
inconsistent with the enum because CacheStatus also defines backward-compatible
values HIT and MISS; update the docstring for CacheStatus to either include
'hit' and 'miss' in the printed iteration example or add a note that HIT and
MISS are present for backward compatibility and will appear when iterating.
Locate the CacheStatus enum and its members (HIT, MISS, ACTIVE, EXPIRED,
EVICTED, WARMING) and adjust the example text accordingly so the docstring
matches the actual enum contents.

In `@insideLLMs/caching/utilities.py`:
- Around line 337-342: MemoizedFunction assumes cache.get(key) returns an object
with .hit and .value; update MemoizedFunction to handle both return types by
checking the result from self.cache.get(key): if it has attributes .hit/.value
(as returned by StrategyCache/CacheLookupResult) use them, otherwise treat a
truthy/non-None return as a hit and the return value as the cached value; adjust
the increment of self._cache_calls and the return accordingly. Reference
symbols: MemoizedFunction, key_generator, self.cache.get, _cache_calls,
CacheLookupResult, StrategyCache, InMemoryCache.get, BaseCacheABC.

🧹 Nitpick comments (10)

insideLLMs/caching/types.py (1)

31-40: Consider using timezone-aware datetimes for consistency.

datetime.now() returns naive (timezone-unaware) datetimes. In distributed systems or when serializing/deserializing across timezones, this can cause subtle bugs. Consider using datetime.now(timezone.utc) or datetime.utcnow() for consistency.

This affects both is_expired() and touch() methods.
insideLLMs/caching/base.py (1)
151-154: Non-primitive kwargs values may produce inconsistent cache keys.

The string interpolation f"{k}:{v}" for kwargs relies on __str__ which may be inconsistent for complex objects (e.g., dicts, lists). Consider JSON-serializing non-primitive values for determinism.
♻️ Proposed fix for deterministic kwargs handling
     # Add any additional kwargs (sorted for determinism)
     for k, v in sorted(kwargs.items()):
-        key_parts.append(f"{k}:{v}")
+        if isinstance(v, (dict, list)):
+            key_parts.append(f"{k}:{json.dumps(v, sort_keys=True)}")
+        else:
+            key_parts.append(f"{k}:{v}")
insideLLMs/caching/backends/strategy.py (1)

259-300: Eviction implementations for LFU, TTL, and SIZE have O(n) complexity.

The _evict_one() method iterates all entries for LFU (min access_count), TTL (soonest expiration), and SIZE (max size_bytes) strategies. For large caches, this could be a performance bottleneck.

This is acceptable for initial implementation, but consider using heap data structures for O(log n) eviction if performance becomes critical.

insideLLMs/caching/deduplicator.py (1)

109-122: O(n) duplicate detection may be slow for large datasets.

Each add() scans all existing responses linearly. For use cases with thousands of responses, consider using a hash-based approach for exact matches (threshold=1.0) as an optimization.

insideLLMs/caching/backends/disk.py (1)

230-257: VACUUM after every eviction batch may degrade performance.

VACUUM rewrites the entire database and can be slow for large caches. Running it in the eviction loop (which could execute multiple times in one set() call) may cause noticeable latency spikes.

Consider:

Running VACUUM only when significant space can be reclaimed

Running VACUUM in a background thread

Using PRAGMA auto_vacuum = INCREMENTAL instead
insideLLMs/caching/utilities.py (2)
155-201: Batch removal includes all items regardless of actual processing status.

The warm() method removes batch_size items from the queue (line 201) even though some items may have been skipped due to skip_existing=True. This means if you call warm(batch_size=10, skip_existing=True) and 5 items are skipped, you still lose all 10 items from the queue but only effectively warmed 5.

This might be intentional (treating "skipped" as "handled"), but it could be surprising if users expect only successfully-warmed items to be removed.
💡 Alternative: Only remove actually processed items
-        batch = self._warming_queue[:batch_size]
+        batch = self._warming_queue[:batch_size]
+        processed_count = 0

         for item in batch:
             # ... processing logic ...
+            processed_count += 1

-        self._warming_queue = self._warming_queue[batch_size:]
+        self._warming_queue = self._warming_queue[processed_count:]
331-331: Non-standard usage of functools.wraps.

wraps(func)(self) is an unusual pattern. wraps returns a decorator that updates __name__, __doc__, etc., but calling it on self doesn't properly assign these attributes to the instance in a way that makes self callable with the expected metadata.

Consider using a more explicit approach:
♻️ Proposed fix for wraps usage
-        wraps(func)(self)
+        self.__name__ = func.__name__
+        self.__doc__ = func.__doc__
+        self.__wrapped__ = func
insideLLMs/caching/factory.py (1)
376-398: Global default cache management lacks thread-safety.

The _default_cache global and its accessors (get_default_cache, set_default_cache, clear_default_cache) are not thread-safe. Concurrent calls to get_default_cache() could result in multiple InMemoryCache instances being created due to a race condition.
🔒 Proposed fix using a lock
+import threading
+
+_default_cache_lock = threading.Lock()
 _default_cache: Optional[BaseCacheABC] = None


 def get_default_cache() -> BaseCacheABC:
     """Get the global default cache instance."""
     global _default_cache
-    if _default_cache is None:
-        _default_cache = InMemoryCache()
+    if _default_cache is None:
+        with _default_cache_lock:
+            if _default_cache is None:  # Double-check locking
+                _default_cache = InMemoryCache()
     return _default_cache


 def set_default_cache(cache: BaseCacheABC) -> None:
     """Set the global default cache instance."""
     global _default_cache
-    _default_cache = cache
+    with _default_cache_lock:
+        _default_cache = cache
insideLLMs/caching/model_wrapper.py (1)
55-85: Potential UnboundLocalError if should_cache logic is modified.

cache_key and model_id are only defined inside the first if should_cache block (lines 57-65), but they're used again in the second if should_cache block (lines 78-83). If should_cache is True initially but somehow becomes False before line 78 (which can't happen in current code but could after refactoring), an UnboundLocalError would occur.

Consider restructuring to make the data flow clearer:
♻️ Clearer structure
     def generate(
         self,
         prompt: str,
         temperature: float = 0.0,
         max_tokens: Optional[int] = None,
         **kwargs: Any,
     ) -> ModelResponse:
         """Generate a response, using cache if available."""
         should_cache = not self._cache_only_deterministic or temperature == 0
+        cache_key = None
+        model_id = None

         if should_cache:
             model_id = getattr(self._model, "model_id", str(type(self._model).__name__))
             cache_key = generate_model_cache_key(
                 model_id=model_id,
                 prompt=prompt,
                 temperature=temperature,
                 max_tokens=max_tokens,
                 **kwargs,
             )

             cached = self._cache.get(cache_key)
             if cached is not None:
                 return self._deserialize_response(cached)

         response = self._model.generate(
             prompt,
             temperature=temperature,
             max_tokens=max_tokens,
             **kwargs,
         )

-        if should_cache:
+        if should_cache and cache_key is not None:
             self._cache.set(
                 cache_key,
                 self._serialize_response(response),
                 metadata={"model_id": model_id, "prompt_preview": prompt[:100]},
             )

         return response
insideLLMs/caching/__init__.py (1)

16-29: Docstring example assumes StrategyCache return type.

The quick start example shows result.hit and result.value, which is specific to StrategyCache's CacheLookupResult return type. If users substitute InMemoryCache (also exported), the example will fail since InMemoryCache.get() returns Optional[T] directly.

Consider clarifying this in the docstring or ensuring consistency across cache implementations.

coderabbitai · 2026-01-30T12:16:03Z

insideLLMs/caching/backends/disk.py

+        if row["expires_at"] is not None and time.time() > row["expires_at"]:
+            conn.execute("DELETE FROM cache WHERE key = ?", (key,))
+            conn.commit()
+            self._stats.misses += 1
+            return None


⚠️ Potential issue | 🟡 Minor

Missing expirations stat increment when TTL expires.

Same issue as InMemoryCache - expired entries are deleted but _stats.expirations is not incremented.

🔧 Proposed fix

if row["expires_at"] is not None and time.time() > row["expires_at"]: conn.execute("DELETE FROM cache WHERE key = ?", (key,)) conn.commit() self._stats.misses += 1 + self._stats.expirations += 1 return None

🤖 Prompt for AI Agents

In `@insideLLMs/caching/backends/disk.py` around lines 160 - 164, The TTL-expiry branch in the disk cache (where it checks row["expires_at"] and deletes the row via conn.execute("DELETE FROM cache WHERE key = ?", (key,))) fails to increment the expiration metric; update that branch to also increment self._stats.expirations (similarly to InMemoryCache) before returning None so expired disk entries are counted; locate this logic in the disk backend method that reads rows (uses conn, row, key) and add self._stats.expirations += 1 right after conn.commit() and before returning.

coderabbitai · 2026-01-30T12:16:03Z

insideLLMs/caching/backends/disk.py

+
+        with open(path, "w") as f:
+            json.dump(entries, f, indent=2)
+


⚠️ Potential issue | 🟡 Minor

Export may produce non-deterministic JSON output.

json.dump(entries, f, indent=2) doesn't use sort_keys=True. If entries contain dicts with varying key orders, exports of the same data could differ, complicating diffs and reproducibility.

As per coding guidelines: "Maintain deterministic JSON/JSONL output with stable key ordering and separators for canonical artifacts."

🔧 Proposed fix

with open(path, "w") as f: - json.dump(entries, f, indent=2) + json.dump(entries, f, indent=2, sort_keys=True)

🤖 Prompt for AI Agents

In `@insideLLMs/caching/backends/disk.py` around lines 274 - 277, Replace the non-deterministic json.dump call that writes entries with one that produces canonical, stable output: when writing in the block using open(path, "w") and json.dump(entries, f, indent=2), add sort_keys=True and a stable separators argument (e.g., separators=(',', ': ')) so keys are consistently ordered and spacing is canonical across exports; keep indent for readability if desired and preserve the same file write path and context.

coderabbitai · 2026-01-30T12:16:03Z

insideLLMs/caching/backends/memory.py

+            # Check expiration
+            if entry.get("expires_at") and time.time() > entry["expires_at"]:
+                del self._cache[key]
+                self._stats.misses += 1
+                return None


⚠️ Potential issue | 🟡 Minor

Missing expirations stat increment when TTL expires.

When an expired entry is found and deleted, only misses is incremented. The expirations field in CacheStats is never updated, making it impossible to distinguish between true misses and TTL expirations.

🔧 Proposed fix

# Check expiration if entry.get("expires_at") and time.time() > entry["expires_at"]: del self._cache[key] self._stats.misses += 1 + self._stats.expirations += 1 return None

🤖 Prompt for AI Agents

In `@insideLLMs/caching/backends/memory.py` around lines 119 - 123, In the expiration branch inside the memory cache get logic (the block that checks entry.get("expires_at") and compares time.time() > entry["expires_at"]), increment the CacheStats expirations counter in addition to the existing misses increment: update self._stats.expirations (and keep self._stats.misses) before deleting from self._cache so TTL expirations are tracked; locate this logic around the expiration check that references entry, self._cache, and self._stats to apply the change.

coderabbitai · 2026-01-30T12:16:03Z

insideLLMs/caching/base.py

+    def has(self, key: str) -> bool:
+        """Check if a key exists in the cache."""
+        return self.get(key) is not None


⚠️ Potential issue | 🟡 Minor

has() calling get() causes unintended stat side effects.

The has() implementation calls get(), which in concrete implementations (InMemoryCache, DiskCache, StrategyCache) increments hit/miss counters. This means checking if a key exists will affect cache statistics, which may not be the expected behavior.

Consider making has() abstract or documenting this behavior clearly so implementations can provide a stats-free existence check.

🤖 Prompt for AI Agents

In `@insideLLMs/caching/base.py` around lines 55 - 57, The current base method has() calls get(), causing hits/misses to increment; change has() in the base class (insideLLMs.caching.base.Cache) to be abstract (raise NotImplementedError or use `@abstractmethod`) so concrete caches implement a stats-free existence check, and update InMemoryCache.has, DiskCache.has and StrategyCache.has to perform a non-mutating existence test (e.g., check the internal dict, key presence on disk, or delegate to underlying caches) without touching hit/miss counters or calling get().

coderabbitai · 2026-01-30T12:16:03Z

insideLLMs/caching/deduplicator.py

+    def __init__(self, similarity_threshold: float = 1.0):
+        self.similarity_threshold = similarity_threshold
+        self._responses: list[tuple[str, str, Any]] = []
+        self._duplicate_count = 0


⚠️ Potential issue | 🟡 Minor

Missing thread safety for concurrent usage.

Unlike other cache classes in this PR that use threading.RLock, ResponseDeduplicator has no synchronization. If used from multiple threads (e.g., in an async web app), concurrent add() calls could cause race conditions on _responses and _duplicate_count.

🔒 Proposed fix to add thread safety

+import threading + class ResponseDeduplicator: ... def __init__(self, similarity_threshold: float = 1.0): self.similarity_threshold = similarity_threshold self._responses: list[tuple[str, str, Any]] = [] self._duplicate_count = 0 + self._lock = threading.RLock() def add( self, prompt: str, response: str, metadata: Optional[Any] = None, ) -> tuple[bool, Optional[int]]: """Add response, returning whether it's a duplicate.""" + with self._lock: - for i, (_, existing_response, _) in enumerate(self._responses): + for i, (_, existing_response, _) in enumerate(self._responses): # ... rest of method indented

🤖 Prompt for AI Agents

In `@insideLLMs/caching/deduplicator.py` around lines 104 - 107, ResponseDeduplicator is not thread-safe: add a threading.RLock to the class (import threading and set self._lock = threading.RLock() in __init__), and wrap any access/modification of self._responses and self._duplicate_count (e.g., inside the add() method and any getters/readers) with self._lock (use with self._lock: ...). This ensures all mutations and reads of _responses and _duplicate_count in class ResponseDeduplicator are synchronized.

coderabbitai · 2026-01-30T12:16:03Z

insideLLMs/caching/factory.py

+            cached_result = _cache.get(key)
+            if cached_result is not None:
+                return cached_result


⚠️ Potential issue | 🟡 Minor

cached decorator cannot distinguish between cached None values and cache misses.

The check if cached_result is not None (line 423) treats None as a cache miss. If the wrapped function legitimately returns None, that value won't be cached effectively—subsequent calls will re-execute the function.

This is a common memoization pitfall. Consider using a sentinel value or checking if the key exists in the cache.

💡 Suggested fix using a sentinel

+_CACHE_MISS = object() # Sentinel for cache miss def cached( cache: Optional[BaseCacheABC] = None, ttl: Optional[int] = None, key_fn: Optional[Callable[..., str]] = None, ) -> Callable: """Decorator to cache function results using simple cache.""" _cache = cache or InMemoryCache() def decorator(func: Callable) -> Callable: def wrapper(*args: Any, **kwargs: Any) -> Any: # ... key generation ... - cached_result = _cache.get(key) - if cached_result is not None: - return cached_result + cached_result = _cache.get(key) + # InMemoryCache returns None on miss, check has() for existence + if _cache.has(key): + return cached_result result = func(*args, **kwargs) _cache.set(key, result, ttl=ttl) return result wrapper.__wrapped__ = func return wrapper return decorator

🤖 Prompt for AI Agents

In `@insideLLMs/caching/factory.py` around lines 422 - 424, The cached decorator currently treats a returned None as a cache miss by using "cached_result = _cache.get(key)" and "if cached_result is not None", so change the lookup to distinguish absent keys (e.g. use a unique sentinel object or check key membership with "if key in _cache") and return the cached value when the key exists even if it is None; update the logic around _cache.get(key) / cached_result in the cached decorator to use that sentinel or membership check so legitimate None results are stored and reused.

coderabbitai · 2026-01-30T12:16:03Z

insideLLMs/caching/namespace.py

+    def get_prompt_cache(
+        self,
+        name: str,
+        config: Optional[CacheConfig] = None,
+    ) -> PromptCache:
+        """Get or create a named prompt cache."""
+        with self._lock:
+            if name not in self._caches or not isinstance(self._caches[name], PromptCache):
+                self._caches[name] = PromptCache(config or self.default_config)
+            return self._caches[name]


⚠️ Potential issue | 🟡 Minor

get_prompt_cache silently replaces existing cache, potentially losing data.

If a cache named "foo" was previously created via get_cache("foo"), calling get_prompt_cache("foo") will silently replace it with a new PromptCache, discarding all existing cached data. This asymmetric behavior (compared to get_cache which doesn't replace) could lead to surprising data loss.

Consider either:

Raising an error when the existing cache is the wrong type

Documenting this behavior prominently

Making get_cache behave symmetrically

💡 Alternative: Raise error on type mismatch

def get_prompt_cache( self, name: str, config: Optional[CacheConfig] = None, ) -> PromptCache: """Get or create a named prompt cache.""" with self._lock: - if name not in self._caches or not isinstance(self._caches[name], PromptCache): + if name not in self._caches: self._caches[name] = PromptCache(config or self.default_config) + elif not isinstance(self._caches[name], PromptCache): + raise TypeError( + f"Cache '{name}' exists but is not a PromptCache. " + "Use delete_cache() first if you want to replace it." + ) return self._caches[name]

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def get_prompt_cache(

self,

name: str,

config: Optional[CacheConfig] = None,

) -> PromptCache:

"""Get or create a named prompt cache."""

with self._lock:

if name not in self._caches or not isinstance(self._caches[name], PromptCache):

self._caches[name] = PromptCache(config or self.default_config)

return self._caches[name]

def get_prompt_cache(

self,

name: str,

config: Optional[CacheConfig] = None,

) -> PromptCache:

"""Get or create a named prompt cache."""

with self._lock:

if name not in self._caches:

self._caches[name] = PromptCache(config or self.default_config)

elif not isinstance(self._caches[name], PromptCache):

raise TypeError(

f"Cache '{name}' exists but is not a PromptCache. "

"Use delete_cache() first if you want to replace it."

)

return self._caches[name]

🤖 Prompt for AI Agents

In `@insideLLMs/caching/namespace.py` around lines 137 - 146, The get_prompt_cache method currently overwrites any existing entry in self._caches when the stored value is not a PromptCache, risking silent data loss; change it to detect a type mismatch and raise a clear error instead of replacing: inside get_prompt_cache (while holding self._lock) if name exists in self._caches and not isinstance(self._caches[name], PromptCache) raise a TypeError (or custom exception) explaining the name/type conflict and suggesting using get_cache or deleting the existing cache; keep the current creation path (self._caches[name] = PromptCache(config or self.default_config)) only when the name is absent, and mirror analogous behavior in get_cache if needed so both getters are symmetric.

coderabbitai · 2026-01-30T12:16:03Z

insideLLMs/caching/prompt_cache.py

+        self._prompt_keys: dict[str, str] = {}  # prompt hash -> cache key
+
+    def cache_response(
+        self,
+        prompt: str,
+        response: str,
+        model: Optional[str] = None,
+        params: Optional[dict] = None,
+        metadata: Optional[dict] = None,
+    ) -> CacheEntry:
+        """Cache an LLM response."""
+        key = generate_cache_key(prompt, model, params, self.config.hash_algorithm)
+
+        entry_metadata = {
+            "prompt": prompt,
+            "model": model,
+            "params": params,
+            **(metadata or {}),
+        }
+
+        entry = self.set(key, response, metadata=entry_metadata)
+
+        # Track prompt -> key mapping
+        prompt_hash = hashlib.md5(prompt.encode()).hexdigest()
+        self._prompt_keys[prompt_hash] = key


⚠️ Potential issue | 🟡 Minor

_prompt_keys mapping is not synchronized with cache evictions.

When entries are evicted from the cache (due to max_size limits or TTL expiration), the corresponding entries in _prompt_keys are not removed. Over time, this mapping will grow unbounded and contain stale references to keys no longer in the cache.

Consider cleaning up _prompt_keys when entries are evicted, or lazily removing stale entries during lookups.

💡 Suggestion: Clean up stale mappings in get_by_prompt

def get_by_prompt(self, prompt: str) -> CacheLookupResult: """Get cached response by prompt alone (ignoring model/params).""" prompt_hash = hashlib.md5(prompt.encode()).hexdigest() key = self._prompt_keys.get(prompt_hash) if key: - return self.get(key) + result = self.get(key) + if not result.hit: + # Clean up stale mapping + del self._prompt_keys[prompt_hash] + return result return CacheLookupResult(hit=False, value=None, key="")

🤖 Prompt for AI Agents

In `@insideLLMs/caching/prompt_cache.py` around lines 124 - 148, The _prompt_keys mapping in cache_response stores prompt_hash -> cache key but is never cleaned on eviction; update the cache to remove stale mappings either by (A) hooking into the cache eviction/expire path (or the set/evict function) to delete any _prompt_keys entries whose value equals the evicted key, or (B) making get_by_prompt defensive: compute prompt_hash, look up key in _prompt_keys, then verify the key exists in the cache via the cache lookup (e.g., call self.get(key) or self._has(key)); if the key is missing, remove the prompt_hash from self._prompt_keys and return a cache miss. Modify methods named cache_response, get_by_prompt, and the cache eviction/eject path (or set/evict helpers) to implement one of these cleanup strategies so _prompt_keys cannot grow with stale entries.

coderabbitai · 2026-01-30T12:16:03Z

insideLLMs/caching/types.py

+    Iterating over all statuses:
+
+        >>> for status in CacheStatus:
+        ...     print(status.value)
+        active
+        expired
+        evicted
+        warming
+    """
+
+    # Backward-compatible lookup statuses (used by older tests/consumers).
+    HIT = "hit"
+    MISS = "miss"
+
+    ACTIVE = "active"
+    EXPIRED = "expired"
+    EVICTED = "evicted"
+    WARMING = "warming"


⚠️ Potential issue | 🟡 Minor

Docstring example inconsistent with backward-compatible enum values.

The docstring's iteration example (lines 143-148) shows only active, expired, evicted, warming but the enum also includes HIT and MISS for backward compatibility. This could confuse users when they iterate and see 6 values instead of 4.

📝 Suggested docstring fix

Iterating over all statuses: >>> for status in CacheStatus: ... print(status.value) + hit + miss active expired evicted warming

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

Iterating over all statuses:

>>> for status in CacheStatus:

... print(status.value)

active

expired

evicted

warming

"""

# Backward-compatible lookup statuses (used by older tests/consumers).

HIT = "hit"

MISS = "miss"

ACTIVE = "active"

EXPIRED = "expired"

EVICTED = "evicted"

WARMING = "warming"

Iterating over all statuses:

>>> for status in CacheStatus:

... print(status.value)

hit

miss

active

expired

evicted

warming

"""

# Backward-compatible lookup statuses (used by older tests/consumers).

HIT = "hit"

MISS = "miss"

ACTIVE = "active"

EXPIRED = "expired"

EVICTED = "evicted"

WARMING = "warming"

🤖 Prompt for AI Agents

In `@insideLLMs/caching/types.py` around lines 141 - 158, The docstring iteration example for CacheStatus is inconsistent with the enum because CacheStatus also defines backward-compatible values HIT and MISS; update the docstring for CacheStatus to either include 'hit' and 'miss' in the printed iteration example or add a note that HIT and MISS are present for backward compatibility and will appear when iterating. Locate the CacheStatus enum and its members (HIT, MISS, ACTIVE, EXPIRED, EVICTED, WARMING) and adjust the example text accordingly so the docstring matches the actual enum contents.

coderabbitai · 2026-01-30T12:16:03Z

insideLLMs/caching/utilities.py

+        key = self.key_generator(*args, **kwargs)
+        result = self.cache.get(key)
+
+        if result.hit:
+            self._cache_calls += 1
+            return result.value


⚠️ Potential issue | 🟠 Major

Cache interface assumption may cause issues with different backends.

MemoizedFunction assumes cache.get(key) returns a CacheLookupResult with .hit and .value attributes. However, InMemoryCache.get() returns Optional[T] directly (per the relevant snippet from memory.py), not a CacheLookupResult.

This class will only work correctly with StrategyCache (which returns CacheLookupResult), but the type hint suggests any BaseCache (aliased to StrategyCache) should work. If someone passes an InMemoryCache or a cache implementing BaseCacheABC, this will fail.

🐛 Proposed fix to handle both interfaces

key = self.key_generator(*args, **kwargs) result = self.cache.get(key) - if result.hit: + # Handle both CacheLookupResult (StrategyCache) and Optional[T] (InMemoryCache) + if hasattr(result, 'hit'): + if result.hit: + self._cache_calls += 1 + return result.value + elif result is not None: self._cache_calls += 1 - return result.value + return result value = self.func(*args, **kwargs) self.cache.set(key, value) return value

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

key = self.key_generator(*args, **kwargs)

result = self.cache.get(key)

if result.hit:

self._cache_calls += 1

return result.value

key = self.key_generator(*args, **kwargs)

result = self.cache.get(key)

# Handle both CacheLookupResult (StrategyCache) and Optional[T] (InMemoryCache)

if hasattr(result, 'hit'):

if result.hit:

self._cache_calls += 1

return result.value

elif result is not None:

self._cache_calls += 1

return result

value = self.func(*args, **kwargs)

self.cache.set(key, value)

return value

🤖 Prompt for AI Agents

In `@insideLLMs/caching/utilities.py` around lines 337 - 342, MemoizedFunction assumes cache.get(key) returns an object with .hit and .value; update MemoizedFunction to handle both return types by checking the result from self.cache.get(key): if it has attributes .hit/.value (as returned by StrategyCache/CacheLookupResult) use them, otherwise treat a truthy/non-None return as a hit and the return value as the cached value; adjust the increment of self._cache_calls and the return accordingly. Reference symbols: MemoizedFunction, key_generator, self.cache.get, _cache_calls, CacheLookupResult, StrategyCache, InMemoryCache.get, BaseCacheABC.

claude · 2026-01-30T12:18:51Z

Code Review

I found 2 issues that need attention:

1. CLAUDE.md Documentation Update Required

File: CLAUDE.md, line 28

CLAUDE.md line 28 documents insideLLMs/caching_unified.py as one of the infra modules:

- `insideLLMs/caching_unified.py`, `insideLLMs/rate_limiting.py`, `insideLLMs/cost_tracking.py`: infra modules used by model adapters / pipelines (not all enforced by runner).

After this refactoring, caching_unified.py is now just a 114-line re-export shim, and the actual implementation lives in the insideLLMs/caching/ package with 12 modules. The architecture documentation should be updated to reflect this new structure.

Reference: CLAUDE.md line 27-29

Suggested fix:
Update line 28 to reference the new package structure:

- `insideLLMs/caching/` (re-exported via `insideLLMs/caching_unified.py` for backward compatibility), `insideLLMs/rate_limiting.py`, `insideLLMs/cost_tracking.py`: infra modules used by model adapters / pipelines (not all enforced by runner).

2. StrategyCache Backward Compatibility Break

File: insideLLMs/caching/backends/strategy.py, line 135

The new StrategyCache.__init__ only accepts a config parameter, but the original implementation in caching_unified.py had strategy and max_size as keyword-only parameters. This breaks backward compatibility.

Original signature (caching_unified.py lines 5390-5396):

def __init__(
    self,
    config: Optional[CacheConfig] = None,
    *,
    strategy: Optional[CacheStrategy] = None,
    max_size: Optional[int] = None,
):

New signature (strategy.py line 135):

def __init__(self, config: Optional[CacheConfig] = None):

Impact:
Code using these convenience parameters will fail with TypeError:

StrategyCache(strategy=CacheStrategy.LRU) ❌
StrategyCache(max_size=100) ❌

Reference: strategy.py line 133-137

Suggested fix:
Restore the keyword-only parameters to maintain backward compatibility:

def __init__(
    self,
    config: Optional[CacheConfig] = None,
    *,
    strategy: Optional[CacheStrategy] = None,
    max_size: Optional[int] = None,
):
    resolved = config or CacheConfig()
    if strategy is not None or max_size is not None:
        from dataclasses import replace
        resolved = replace(
            resolved,
            strategy=strategy if strategy is not None else resolved.strategy,
            max_size=max_size if max_size is not None else resolved.max_size,
        )
    self.config = resolved

Copilot AI review requested due to automatic review settings January 30, 2026 12:09

devin-ai-integration bot assigned dr-gareth-roberts Jan 30, 2026

sourcery-ai bot reviewed Jan 30, 2026

View reviewed changes

Copilot started reviewing on behalf of dr-gareth-roberts January 30, 2026 12:10 View session

kiloconnect bot reviewed Jan 30, 2026

View reviewed changes

devin-ai-integration bot reviewed Jan 30, 2026

View reviewed changes

Copilot AI reviewed Jan 30, 2026

View reviewed changes

coderabbitai bot reviewed Jan 30, 2026

View reviewed changes

	sorted_params = json.dumps(params, sort_keys=True)
	sorted_params = json.dumps(params, sort_keys=True, default=str)

	key_str = json.dumps(key_parts, sort_keys=True, ensure_ascii=True)
	key_str = json.dumps(key_parts, sort_keys=True, ensure_ascii=True, default=str)

		if name not in self._caches or not isinstance(self._caches[name], PromptCache):
		self._caches[name] = PromptCache(config or self.default_config)

		def __init__(self, config: Optional[CacheConfig] = None):
		self.config = config or CacheConfig()

-    def __init__(self, config: Optional[CacheConfig] = None):
-        self.config = config or CacheConfig()
+    def __init__(
+        self,
+        config: Optional[CacheConfig] = None,
+        strategy: Optional[CacheStrategy] = None,
+        max_size: Optional[int] = None,
+    ):
+        """
+        Initialize the strategy cache.
+        Parameters
+        ----------
+        config :
+            Optional CacheConfig instance. If not provided, a default
+            CacheConfig is created.
+        strategy :
+            Optional override for the cache strategy. If provided, this
+            value takes precedence over the strategy defined in ``config``.
+        max_size :
+            Optional override for the maximum cache size. If provided, this
+            value takes precedence over the max_size defined in ``config``.
+        """
+        self.config = config or CacheConfig()
+        # Backward-compatible keyword overrides
+        if strategy is not None:
+            self.config.strategy = strategy
+        if max_size is not None:
+            self.config.max_size = max_size

Conversation

dr-gareth-roberts commented Jan 30, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Related Issue

Type of Change

Checklist

Testing

Human Review Checklist

Additional Notes

Summary by CodeRabbit

Uh oh!

bolt-new-by-stackblitz bot commented Jan 30, 2026

Uh oh!

devin-ai-integration bot commented Jan 30, 2026

🤖 Devin AI Engineer

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

dosubot bot commented Jan 30, 2026

Uh oh!

coderabbitai bot commented Jan 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Poem

Uh oh!

kiloconnect bot left a comment

Choose a reason for hiding this comment

Uh oh!

kiloconnect bot Jan 30, 2026

Choose a reason for hiding this comment

Uh oh!

kiloconnect bot Jan 30, 2026

Choose a reason for hiding this comment

Uh oh!

kiloconnect bot Jan 30, 2026

Choose a reason for hiding this comment

Uh oh!

kiloconnect bot Jan 30, 2026

Choose a reason for hiding this comment

Uh oh!

guardrails bot commented Jan 30, 2026

Uh oh!

kiloconnect bot commented Jan 30, 2026

Code Review Summary

Overview

CRITICAL

WARNING

Uh oh!

devin-ai-integration bot left a comment

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot Jan 30, 2026

Choose a reason for hiding this comment

How the bug is triggered

Actual vs Expected

Impact

Uh oh!

codecov bot commented Jan 30, 2026

❌ 1 Tests Failed:

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Jan 30, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 30, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 30, 2026

dr-gareth-roberts commented Jan 30, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 30, 2026 •

edited

Loading