Litellm by airmonitor · Pull Request #57 · giancarloerra/SocratiCode

airmonitor · 2026-05-08T07:51:12Z

Summary

Adds LiteLLM Proxy Server as a first-class embedding provider. LiteLLM exposes an OpenAI-compatible /v1/embeddings endpoint and fans out to 100+ underlying providers (OpenAI, Anthropic, Cohere, Voyage, HuggingFace, Bedrock, Vertex AI, Ollama, ...), so a single SocratiCode install can target any of them without per-provider code paths. The implementation mirrors the LM Studio strategy but is a dedicated provider rather than a flag on provider-openai, because three behaviours diverge meaningfully from both OpenAI and LM Studio: mandatory authentication (master key / virtual key gates /v1/models), no sensible defaults for EMBEDDING_MODEL / EMBEDDING_DIMENSIONS (model aliases come from the operator's config.yaml), and provider-dependent acceptance of the dimensions parameter (Matryoshka-aware backends accept it, BGE / nomic / Cohere v3 reject — opt-in via LITELLM_SEND_DIMENSIONS=true).

Changes

New src/services/provider-litellm.ts (+303 LOC): LiteLLMEmbeddingProvider built on the OpenAI SDK with a custom baseURL (default http://localhost:4000/v1). Batch size 256 (between OpenAI's 512 and LM Studio's 64). ensureReady distinguishes proxy-unreachable / auth-rejected / alias-not-registered by duck-typing err.status (401/403) and lists up to 10 currently-registered aliases in the missing-alias error so the operator can sanity-check config.yaml from the log alone.
src/services/embedding-config.ts: extends EmbeddingProvider union with "litellm", adds litellmUrl, fail-fast validation for LITELLM_API_KEY + EMBEDDING_MODEL + EMBEDDING_DIMENSIONS (key checked first so a virtual-key user fixes the easy problem before touching proxy config), updates the Invalid EMBEDDING_PROVIDER error and the hasApiKey log expression.
src/services/embedding-provider.ts: factory case for litellm with dynamic import — keeps the OpenAI SDK out of startup for non-LiteLLM users.
encoding_format=float fix from bb141a0 ported verbatim — without it the OpenAI SDK 6.x base64-decode path corrupts backends that return plain JSON float arrays (many LiteLLM aliases do, including Ollama-routed and TEI-wrapped ones).
README.md: dedicated LiteLLM section, MCP host config example, env-var table entries clarifying which values are mandatory under litellm, new LiteLLM Configuration table.
tests/unit/embedding-config.test.ts: +9 cases (model/dim/key required, error-ordering, URL default + override, dimensions parsing, EMBEDDING_CONTEXT_LENGTH override for unknown aliases, auto-detection when alias matches a known model name) + updated full external config fixture and invalid-provider error message.
tests/unit/embedding-provider.test.ts: factory test for litellm, plus 4 cases against a deliberately-closed port (config rejects construction without API_KEY; ensureReady unreachable error format; healthCheck short-circuits on missing key without a network call; healthCheck reaches Not reachable without throwing).

Backward compatible. Provider is opt-in via EMBEDDING_PROVIDER=litellm. Existing ollama / openai / google / lmstudio paths are untouched.

Diffstat: 6 files changed, +638 / −14.

Type of change

Bug fix (non-breaking change that fixes an issue)
New feature (non-breaking change that adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Refactoring (no functional changes)
Test coverage improvement

Testing

Unit tests pass (npm run test:unit) — 64/64 on the touched suites (embedding-config, embedding-provider); also biome lint clean per the commit author.
Integration tests pass (npm run test:integration) — 153/154 with QDRANT_MODE=external against a remote Qdrant; 150/153 with the default Docker-managed Qdrant. Both runs were performed against this commit (1708510). Detail on the residual failures:
- The 1 failure on the external-Qdrant run is tests/integration/docker-ollama.test.ts > Qdrant container management > reports Qdrant as running after ensure. The test asserts that a Docker-managed Qdrant container is up; with QDRANT_MODE=external no such container is started, so the assertion is structurally incompatible with the config — not a regression introduced by this PR (this PR doesn't touch the Docker-Qdrant code path).
- The 3 failures on the managed-Qdrant run (tools.test.ts > impact-analysis tool handlers setup + graph tool handlers > codebase_graph_query + graph tool handlers > codebase_graph_remove) all surface as ECONNREFUSED 127.0.0.1:16333 from the @qdrant/js-client-rest client, while the parallel qdrant.test.ts / code-graph.test.ts / indexer.test.ts suites use Qdrant successfully in the same run. The same three tests pass cleanly against the external Qdrant. Looks like a pre-existing race / suite-ordering issue in the local managed-Qdrant test infrastructure (container teardown vs. cross-suite client reuse), not embeddings-related — also not introduced by this PR.
TypeScript compiles cleanly (npx tsc --noEmit) — per commit author's verification.
New tests added for new/changed functionality — 9 new cases in embedding-config.test.ts covering provider validation; 5 in embedding-provider.test.ts covering the factory and the closed-port error paths.

Checklist

My code follows the existing code style and conventions (mirrors the existing provider-lmstudio / provider-openai shape; OpenAI-SDK + custom baseURL pattern; ESM .js import extensions; SPDX header).
I have added/updated JSDoc comments where appropriate (public methods on LiteLLMEmbeddingProvider and the new fields on EmbeddingConfig).
I have updated documentation (README.md / DEVELOPER.md) if needed — README.md gains a dedicated LiteLLM section and a configuration table; env-var table updated for litellm-specific mandatory values.
I have addressed all CodeRabbit review comments (or marked as resolved with explanation) — to be confirmed once CodeRabbit runs against the opened PR.
I have read the Contributing Guide
I agree to the Contributor License Agreement

Related issues

Summary by CodeRabbit

New Features
- Added LiteLLM as a supported embedding provider, enabling access to 100+ providers via a single environment-variable selection.
Documentation
- Expanded configuration docs with LiteLLM setup, example JSON, required env vars, and optional dimension-forwarding guidance.
Tests
- Added unit tests covering LiteLLM configuration, provider selection, readiness, and health-check scenarios.
Bug Fixes / Validation
- Improved fail-fast validation and clearer error messages for missing LiteLLM-related configuration.

merging with upstream

LiteLLM Proxy Server (https://docs.litellm.ai/docs/simple_proxy) exposes an OpenAI-compatible /v1/embeddings endpoint and fans out to 100+ underlying providers (OpenAI, Anthropic, Cohere, Voyage, HuggingFace, Bedrock, Vertex AI, Ollama, ...). Mirroring the lmstudio strategy (PR giancarloerra#42 + commit bb141a0) but with three meaningful differences that justify a dedicated provider rather than a flag on provider-openai: - Authentication is mandatory. LiteLLM gates /v1/models with the master key or a virtual key, unlike LM Studio (no auth by default) and OpenAI (cloud key). LITELLM_API_KEY is checked at config-load time; the provider also duck-types 401/403 in ensureReady/healthCheck via err.status to surface a distinct "auth rejected" message vs. "proxy unreachable". - Model aliases come from the proxy's config.yaml, so EMBEDDING_MODEL and EMBEDDING_DIMENSIONS have no sensible defaults. Fail-fast in loadEmbeddingConfig with provider-specific error messages pointing at litellm_params.model in the proxy config and at the underlying alias's output dim. - Whether dimensions can be forwarded depends on the underlying provider: Matryoshka-aware models (text-embedding-3-*, voyage-3) accept it, non-Matryoshka backends (BGE, nomic, Cohere v3) reject. Made opt-in via LITELLM_SEND_DIMENSIONS=true rather than hardcoded like provider-openai does for text-embedding-3-*, since LiteLLM aliases are user-defined. Encoding-format=float fix from bb141a0 ports verbatim — the OpenAI SDK 6.x base64-decode path corrupts any backend that returns plain JSON float arrays (many LiteLLM aliases do, including Ollama-routed and tei-wrapped ones). Files: - src/services/provider-litellm.ts: new LiteLLMEmbeddingProvider with the same OpenAI-SDK + custom baseURL pattern. Default baseURL http://localhost:4000/v1 (LiteLLM's default port, /v1 prefix required). Batch size 256 — between OpenAI's 512 and LM Studio's 64, since the practical ceiling depends on whichever provider the alias resolves to. ensureReady distinguishes proxy-unreachable / auth-rejected / alias-not-registered. Lists up to 10 currently-registered models in the alias-missing error so the operator can sanity-check their config.yaml without leaving the log. - src/services/embedding-config.ts: extends EmbeddingProvider union with "litellm", adds litellmUrl to EmbeddingConfig, fail-fast validation for LITELLM_API_KEY + EMBEDDING_MODEL + EMBEDDING_DIMENSIONS (key first so a virtual-key user fixes the easy problem before touching the proxy config), updates Invalid EMBEDDING_PROVIDER message and hasApiKey log expression. - src/services/embedding-provider.ts: factory case for litellm with dynamic import to avoid loading the OpenAI SDK at startup for non-litellm users. - README.md: dedicated LiteLLM section, MCP host config example, env-var table entries for EMBEDDING_PROVIDER / EMBEDDING_MODEL / EMBEDDING_DIMENSIONS / EMBEDDING_CONTEXT_LENGTH (clarifying which require manual values for litellm), new LiteLLM Configuration table. - tests/unit/embedding-config.test.ts: 9 new cases (model + dim + key required, error-ordering, URL default + override, dimensions parsing, EMBEDDING_CONTEXT_LENGTH override for unknown aliases, auto-detection when alias matches a known model name) plus updated "full external config" expected object and updated invalid-provider error message. - tests/unit/embedding-provider.test.ts: factory test for litellm, plus 4 cases against a deliberately-closed port (config rejects construction without API_KEY, ensureReady unreachable error format, healthCheck short-circuits on missing key without a network call, healthCheck reaches "Not reachable" path without throwing). Backward compatible. The litellm provider is opt-in via EMBEDDING_PROVIDER=litellm. Existing ollama, openai, google, and lmstudio paths are untouched. Verified: 64/64 unit tests pass on the touched suites; biome lint clean; tsc --noEmit clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

coderabbitai · 2026-05-08T07:51:28Z

📝 Walkthrough

Walkthrough

Adds LiteLLM as a new embedding provider: extends config types and validation, wires provider into factory, implements LiteLLM provider (client caching, auth/reachability checks, batching/truncation, optional dimensions), updates README, and adds unit tests.

Changes

LiteLLM Embedding Provider

Layer / File(s)	Summary
Configuration Types & Validation `src/services/embedding-config.ts`	`EmbeddingProvider` type extended to include `"litellm"`. `EmbeddingConfig` interface gains `litellmUrl` and `sendDimensions` fields. Provider validation recognizes `litellm`.
Environment Validation & Config Loading `src/services/embedding-config.ts`	LiteLLM-specific validation requires `LITELLM_API_KEY`, `EMBEDDING_MODEL`, and `EMBEDDING_DIMENSIONS`. Resolver sets `litellmUrl` from env (defaults to `/v1`) and `sendDimensions` from `LITELLM_SEND_DIMENSIONS`. `hasApiKey` derivation extended to check `LITELLM_API_KEY`.
Provider Factory Wiring `src/services/embedding-provider.ts`	`getEmbeddingProvider` adds a `case "litellm"` branch that dynamically imports and instantiates `LiteLLMEmbeddingProvider`. Allowed-provider messaging updated to include `litellm`.
LiteLLM Provider Implementation `src/services/provider-litellm.ts`	Implements `LiteLLMEmbeddingProvider` with cached SDK client keyed by base URL+API key, exported `resetLiteLLMClient()`, `ensureReady()` that distinguishes auth vs reachability via `models.list()`, `embed()`/`embedSingle()` with truncation and batching, `healthCheck()` reporting, and `_embedBatch()` using `encoding_format: "float"` with optional `dimensions` forwarding.
Configuration Tests `tests/unit/embedding-config.test.ts`	Test setup clears LiteLLM env vars. Full-config expectation includes `litellmUrl`. Invalid provider error list includes `litellm`. Adds `litellm` test suite for defaults, required env validation, error precedence, example-dimensions in errors, and `EMBEDDING_CONTEXT_LENGTH` behavior.
Provider Factory & Instance Tests `tests/unit/embedding-provider.test.ts`	Factory tests include `litellm` instantiation. New `LiteLLMEmbeddingProvider` test suite covers construction-time env validation, `ensureReady()` unreachable-proxy behavior, and `healthCheck()` reporting for missing API key and unreachable proxy.
Documentation `README.md`	Updates features wording to mention LiteLLM, adds "LiteLLM (proxy gateway, 100+ providers)" configuration subsection with MCP JSON example and env-var docs, updates embedding-provider env table to include `litellm`, and adds a dedicated LiteLLM env-var reference section.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

giancarloerra/SocratiCode#42: Both PRs add a new embedding provider backend and make parallel changes across configuration, factory, provider implementation, tests, and README.
giancarloerra/SocratiCode#24: Both PRs modify src/services/embedding-config.ts for environment parsing and validation of embedding model/dimensions.

Poem

🐰 I hopped in with a tiny key,

LiteLLM waved, "Come play with me!"
Batches hum and tokens shrink,
Health checks chirp and logs all sync.
Config set, tests run free — embeddings bloom for you and me.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name	Status	Explanation	Resolution
Title check	❓ Inconclusive	The title 'Litellm' is vague and generic, using only a single word that names the feature without describing the change (addition of a new embedding provider).	Consider a more descriptive title like 'Add LiteLLM as embedding provider' or 'Support LiteLLM proxy gateway for embeddings' to clarify the main change.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description check	✅ Passed	The pull request description is comprehensive and complete, covering all required template sections with detailed context about the implementation, testing results, and backward compatibility.
Docstring Coverage	✅ Passed	Docstring coverage is 85.71% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

🧹 Nitpick comments (2)

src/services/provider-litellm.ts (2)

144-181: ⚡ Quick win

models.list().data is first-page only — replace with auto-pagination in both ensureReady and healthCheck

List methods in the OpenAI SDK are paginated; .data holds only the current page — use for await … of to automatically fetch more pages as needed. For a LiteLLM proxy that returns all registered aliases in a single response this is safe today, but a large enterprise deployment whose /v1/models response spans multiple pages could produce a false "alias not registered" error on every ensureReady() and healthCheck() call.

♻️ Proposed fix — use auto-pagination in both methods

ensureReady() (line 146):

-    let modelList: Awaited<ReturnType<typeof client.models.list>>;
-    try {
-      modelList = await client.models.list();
-    } catch (err) {
+    const allModelIds: string[] = [];
+    try {
+      for await (const m of await client.models.list()) {
+        allModelIds.push(m.id);
+      }
+    } catch (err) {
       // ... error-handling block unchanged ...
     }
-    const modelRegistered = modelList.data.some((m) => m.id === config.embeddingModel);
+    const modelRegistered = allModelIds.some((id) => id === config.embeddingModel);
     if (!modelRegistered) {
-      const known = modelList.data.map((m) => m.id).slice(0, 10).join(", ");
+      const known = allModelIds.slice(0, 10).join(", ");

healthCheck() (line 239):

-      const models = await client.models.list();
-      lines.push(`${icon(true)} LiteLLM: Reachable at ${config.litellmUrl}`);
-      const modelRegistered = models.data.some((m) => m.id === config.embeddingModel);
+      const allModelIds: string[] = [];
+      for await (const m of await client.models.list()) {
+        allModelIds.push(m.id);
+      }
+      lines.push(`${icon(true)} LiteLLM: Reachable at ${config.litellmUrl}`);
+      const modelRegistered = allModelIds.some((id) => id === config.embeddingModel);

Also applies to: 237-260

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/services/provider-litellm.ts` around lines 144 - 181, The code
incorrectly assumes client.models.list().data contains all models; update both
ensureReady and healthCheck to use the SDK's auto-pagination (for await … of) to
iterate all returned pages/items from client.models.list(), accumulating model
ids or checking membership across pages instead of only checking modelList.data;
reference the existing client.models.list call and config.embeddingModel when
implementing the iteration and replace the .data-based membership check (and any
slicing of modelList.data) with logic that gathers ids (or tests equality) from
all pages/items before throwing the "model not registered" error.

144-181: ⚡ Quick win

Future-proof model alias check to handle paginated LiteLLM proxies

The OpenAI SDK's models.list() returns paginated results; .data contains only the first page. While most LiteLLM deployments are non-paginated (all aliases returned at once), larger enterprise proxies could have models on later pages, causing a false "not registered" error.

Iterate through all pages using for await:

Suggested fix

In ensureReady() (lines 144–150):

-    let modelList: Awaited<ReturnType<typeof client.models.list>>;
-    try {
-      modelList = await client.models.list();
+    const allModelIds: string[] = [];
+    try {
+      for await (const m of await client.models.list()) {
+        allModelIds.push(m.id);
+      }
-    const modelRegistered = modelList.data.some((m) => m.id === config.embeddingModel);
+    const modelRegistered = allModelIds.some((id) => id === config.embeddingModel);
-      const known = modelList.data.map((m) => m.id).slice(0, 10).join(", ");
+      const known = allModelIds.slice(0, 10).join(", ");

In healthCheck() (line 239):

-      const models = await client.models.list();
-      const modelRegistered = models.data.some((m) => m.id === config.embeddingModel);
+      const allModelIds: string[] = [];
+      for await (const m of await client.models.list()) {
+        allModelIds.push(m.id);
+      }
+      const modelRegistered = allModelIds.some((id) => id === config.embeddingModel);

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/services/provider-litellm.ts` around lines 144 - 181, The model
registration check in ensureReady (and the similar check in healthCheck) only
inspects modelList.data from a single page returned by client.models.list(),
which can miss models on subsequent pages; replace the single-page logic by
iterating all pages from client.models.list() (use for await over
client.models.list() or the SDK's pager) to collect or search every model id
before setting modelRegistered and building the known list; update references in
ensureReady, healthCheck, client.models.list, modelRegistered, and
config.embeddingModel so the check and the "Currently registered models" message
reflect the merged results of all pages.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@src/services/provider-litellm.ts`:
- Around line 144-181: The code incorrectly assumes client.models.list().data
contains all models; update both ensureReady and healthCheck to use the SDK's
auto-pagination (for await … of) to iterate all returned pages/items from
client.models.list(), accumulating model ids or checking membership across pages
instead of only checking modelList.data; reference the existing
client.models.list call and config.embeddingModel when implementing the
iteration and replace the .data-based membership check (and any slicing of
modelList.data) with logic that gathers ids (or tests equality) from all
pages/items before throwing the "model not registered" error.
- Around line 144-181: The model registration check in ensureReady (and the
similar check in healthCheck) only inspects modelList.data from a single page
returned by client.models.list(), which can miss models on subsequent pages;
replace the single-page logic by iterating all pages from client.models.list()
(use for await over client.models.list() or the SDK's pager) to collect or
search every model id before setting modelRegistered and building the known
list; update references in ensureReady, healthCheck, client.models.list,
modelRegistered, and config.embeddingModel so the check and the "Currently
registered models" message reflect the merged results of all pages.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 514cf71c-320e-4089-9103-b23143a31b10

📥 Commits

Reviewing files that changed from the base of the PR and between aedfa85 and 1708510.

📒 Files selected for processing (6)

README.md
src/services/embedding-config.ts
src/services/embedding-provider.ts
src/services/provider-litellm.ts
tests/unit/embedding-config.test.ts
tests/unit/embedding-provider.test.ts

The OpenAI SDK's `client.models.list()` returns a `PagePromise` that implements `AsyncIterable<Model>` and auto-paginates on demand. The previous implementation read `modelList.data` directly, which only contains the first page. Today's LiteLLM proxy returns the entire `model_list` from `config.yaml` in a single response so the bug is latent, but a future LiteLLM build (or an upstream proxy in front of it) that paginates `/v1/models` would cause `ensureReady` and `healthCheck` to throw a spurious "alias not registered" error for any alias landing on a non-first page. Switch both checks to `for await (const m of client.models.list())` and accumulate ids into a single array. Equivalent to the SDK's documented async-iteration pattern; `PagePromise` is itself the iterable, so no extra `await` is needed before the loop. Inline comment explains why the iteration matters even though today's LiteLLM doesn't paginate, so the pattern survives future drive-by "simplifications". Surfaced by CodeRabbit on PR review of 1708510. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai

🧹 Nitpick comments (2)

src/services/provider-litellm.ts (2)
78-107: getClient() creates a new OpenAI instance on every API-key or URL change — consider the concurrent-call edge case

getClient() checks three variables to decide whether to reuse or replace the cached client. This is correct for sequential callers, but if two concurrent embed() calls reach the if (!litellmClient || ...) branch simultaneously (e.g., after resetLiteLLMClient()), both will construct a new OpenAI instance and the second write will silently win. In practice this is unlikely to matter (both instances are functionally identical for the same key/URL), but it means litellmApiKey could briefly hold a stale reference if env changes mid-flight. This is a design-level observation, not a correctness bug in the typical single-threaded Node.js event-loop model.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/services/provider-litellm.ts` around lines 78 - 107, getClient() can race
when two callers concurrently detect the cache miss and both construct an OpenAI
instance; change getClient to use a double-checked assignment or a short
critical section so only one constructed client is stored: compute the
apiKey/baseUrl as now, then if the cached litellmClient looks stale create a
local newClient, but before writing set litellmClient re-check (litellmClient,
litellmBaseUrl, litellmApiKey) and only assign the newClient if the cache is
still stale; alternatively implement a small Promise/mutex guard around the
cache-update so resetLiteLLMClient(), getClient(), and the variables
litellmClient, litellmBaseUrl, litellmApiKey are updated atomically.
275-306: 💤 Low value

response.data.sort() mutates the response object in place

Array.prototype.sort is destructive. While response.data is a local, not-shared reference here, a defensive spread avoids surprising any future code that retains a reference to response.
♻️ Proposed fix
-    const sorted = response.data.sort((a, b) => a.index - b.index);
+    const sorted = [...response.data].sort((a, b) => a.index - b.index);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/services/provider-litellm.ts` around lines 275 - 306, In _embedBatch,
avoid mutating response.data with Array.prototype.sort; instead make a shallow
copy (e.g., via [...response.data] or response.data.slice()) and sort that copy
before mapping so the original response object is not mutated; update the code
around the variable `response` and the `sorted` assignment to sort the copied
array and then return the embeddings from the sorted copy.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@src/services/provider-litellm.ts`:
- Around line 78-107: getClient() can race when two callers concurrently detect
the cache miss and both construct an OpenAI instance; change getClient to use a
double-checked assignment or a short critical section so only one constructed
client is stored: compute the apiKey/baseUrl as now, then if the cached
litellmClient looks stale create a local newClient, but before writing set
litellmClient re-check (litellmClient, litellmBaseUrl, litellmApiKey) and only
assign the newClient if the cache is still stale; alternatively implement a
small Promise/mutex guard around the cache-update so resetLiteLLMClient(),
getClient(), and the variables litellmClient, litellmBaseUrl, litellmApiKey are
updated atomically.
- Around line 275-306: In _embedBatch, avoid mutating response.data with
Array.prototype.sort; instead make a shallow copy (e.g., via [...response.data]
or response.data.slice()) and sort that copy before mapping so the original
response object is not mutated; update the code around the variable `response`
and the `sorted` assignment to sort the copied array and then return the
embeddings from the sorted copy.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b0a9e765-2e80-4591-aeac-89b9051fc06a

📥 Commits

Reviewing files that changed from the base of the PR and between 1708510 and 6c67965.

📒 Files selected for processing (1)

src/services/provider-litellm.ts

airmonitor · 2026-05-08T08:39:46Z

Addressed CodeRabbit's pagination concern on client.models.list() in commit 6c67965.

Concern (lines 144–181 of src/services/provider-litellm.ts): modelList.data only contains the first page of /v1/models. Today's LiteLLM proxy returns the full model_list from config.yaml in a single response, but a future LiteLLM build — or an upstream proxy in front of it — that paginates /v1/models would cause ensureReady() and healthCheck() to throw a spurious "alias not registered" error for any alias landing on a non-first page.

Fix: Both ensureReady() and healthCheck() now use the SDK's documented async-iteration pattern (for await (const m of client.models.list())), which transparently fetches subsequent pages on demand. Model ids are accumulated into a single array before the membership check, so the "Currently registered models" log line in the missing-alias error reflects the merged result across all pages. An inline comment explains why iteration matters even though today's LiteLLM doesn't paginate, so the pattern survives future drive-by simplifications.

Diff: 1 file changed, +19 / −6. No behavioural change against the current LiteLLM proxy; future-proofs against paginated /v1/models responses.

airmonitor and others added 4 commits May 4, 2026 12:05

Merge pull request #1 from giancarloerra/main

dbbeda3

merging with upstream

Merge pull request #2 from giancarloerra/main

7a42ea6

merging with upstream

Merge pull request #3 from giancarloerra/main

a081400

merging with upstream

coderabbitai Bot reviewed May 8, 2026

View reviewed changes

giancarloerra self-assigned this May 8, 2026

giancarloerra merged commit f87f529 into giancarloerra:main May 8, 2026
5 checks passed

coderabbitai Bot mentioned this pull request May 23, 2026

Deterministic Text Embedder Implementation #62

Closed

16 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Litellm#57

Litellm#57
giancarloerra merged 5 commits into
giancarloerra:mainfrom
airmonitor:litellm

airmonitor commented May 8, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 8, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 inconclusive)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

airmonitor commented May 8, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

airmonitor commented May 8, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Type of change

Testing

Checklist

Related issues

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 inconclusive)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

airmonitor commented May 8, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

airmonitor commented May 8, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 8, 2026 •

edited

Loading