Skip to content

Provider service hygiene: env-overridable URLs, safe-log credential redaction, drop redundant DB call#109

Open
Meganeuridae wants to merge 4 commits into
anima-research:mainfrom
Meganeuridae:chore/provider-hygiene-bundle
Open

Provider service hygiene: env-overridable URLs, safe-log credential redaction, drop redundant DB call#109
Meganeuridae wants to merge 4 commits into
anima-research:mainfrom
Meganeuridae:chore/provider-hygiene-bundle

Conversation

@Meganeuridae

Copy link
Copy Markdown

Three backend provider/service hygiene fixes in one PR, separate commits, all from REVIEW_FINDINGS Tier 2.

1. chore(providers): extract base URLs to env-overridable constants (P2-22)

OpenRouterService and GeminiService carried their API roots as literal in-class field assignments. Fine for the happy path but blocks legitimate overrides — tests against a mock server, internal reverse-proxies, region-local mirrors, the eventual Vertex AI compatibility shim for Gemini. Existing OPENAI_BASE_URL already follows this pattern.

Both services now read process.env.{OPENROUTER,GEMINI}_BASE_URL with the public root as fallback. env.example documents both new vars (commented; defaults are correct for almost every deployment). No behavior change unless an operator opts in.

2. feat(security): add safeErrorLog / redactSecrets and apply across providers (P2-9 + P2-24)

Provider error logs ship credentials in two recurring shapes:

  1. Catch-all console.error('… streaming error:', error) in each provider dumps the full Error/Axios object. Fetch-style errors embed the URL (with potential https://user:pass@… inline auth and ?api_key=… query params) in the message; Axios-style errors carry it in err.config.
  2. Explicit URL logs — openai-compatible.ts:99 (Request URL was: ${endpoint}) and inference.ts:331,661 (custom-model baseUrl interpolation) — directly emit user-configured custom-model URLs that may carry credentials in the URL itself.

New utils/safe-log.ts with two helpers:

  • redactSecrets(s) scrubs three non-overlapping shapes:

    • Inline URL auth user:pass@hostuser:[REDACTED]@host
    • Query secrets ?api_key=x&token=y?api_key=[REDACTED]&token=[REDACTED]
    • Header/JSON contexts Authorization: Bearer x, \"x-api-key\":\"sk-…\"[REDACTED]
    • The [REDACTED] marker's character class excludes [, so the three regexes can't re-match each other's output — avoids the [REDACTED] [REDACTED] double-redaction class of bug that an earlier draft had.
  • safeErrorLog(prefix, error, ...extra) is a drop-in console.error replacement that runs formatError (preserves name/message/stack) through redactSecrets before printing.

Applied at the highest-risk sites — catch-all streaming-error handlers in all five providers, plus the explicit URL-interpolation logs. Smoke-tested with the patterns in the doc block: inline auth, multi-secret URLs (?api_key=a&token=b&signature=c all redacted, non-secret &other=fine preserved), bearer headers, JSON header variants, and plain text — all behave as documented.

Not a general-purpose scrubber. Bounded to patterns we actually emit; new leak shapes should extend safe-log.ts directly.

3. chore(backend): drop redundant getUserModels() call (Greptile #96 follow-up)

Greptile flagged on PR #96: in the custom-model creation path, after await this.loadUser(userId), there's an await this.getUserModels(userId) whose return value is discarded. It existed as part of the now-removed 20-model cap enforcement loop. loadUser above already populates the in-memory model set; the call became a no-op that just costs one extra DB read per custom-model creation. Removed, with an inline comment pointing at the removed call's history so the next reader doesn't reintroduce it.

Scope / risk

Three commits, all backend, all additive or non-behavioral. Backend npm run build passes each intermediate state. No runtime change unless an operator sets the new env vars or hits a previously-leaky error path (in which case the log is now scrubbed).

🤖 Generated with Claude Code

Aster and others added 3 commits May 17, 2026 23:11
…ridable constants

Previously `OpenRouterService` and `GeminiService` carried their API
roots as literal in-class field assignments:

  private baseUrl = 'https://openrouter.ai/api/v1';
  this.baseUrl = 'https://generativelanguage.googleapis.com/v1beta';

That's fine for the happy path but blocks the legitimate use cases for
overriding: tests against a mock server, internal reverse-proxies that
add observability or auth shaping, region-local mirrors, and (for
Gemini specifically) the eventual Vertex AI compatibility shim where
the API root differs. The existing OpenAI-compatible service already
supports `OPENAI_BASE_URL` via the same pattern; this brings the two
similar providers in line.

Both services now read `process.env.{OPENROUTER,GEMINI}_BASE_URL` with
the public root as the fallback. Default behavior is unchanged.

`env.example` documents both new vars (commented, since the public
defaults are correct for almost every deployment).

Backend `npm run build` passes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…provider services

Provider error logs ship credentials. Two shapes recur:

1. Catch-all `console.error('… streaming error:', error)` in each provider
   dumps the full Error/Axios object. For fetch-style errors that
   embedded the URL in the message, this surfaces `https://user:pass@…`
   inline auth and `?api_key=…` query params in stdout. For Axios-style
   errors the `err.config` carries the URL and headers verbatim.
2. Explicit URL logging in openai-compatible.ts (`Request URL was:
   ${endpoint}`) and inference.ts (`baseUrl=${baseUrl}` two places)
   directly emits user-configured custom-model URLs that may carry
   credentials encoded in query strings or as inline auth.

Add `utils/safe-log.ts` with two helpers:

- `redactSecrets(s)`: scrubs three non-overlapping shapes from a string:
  * inline URL auth `user:pass@host` → `user:[REDACTED]@host`
  * query secrets `?api_key=x&token=y` → `?api_key=[REDACTED]&token=[REDACTED]`
  * header/JSON contexts `Authorization: Bearer x`, `"x-api-key":"sk-…"` → `[REDACTED]`
  The replacement leaves `[REDACTED]` markers whose character class
  excludes `[`, so the three regexes can't re-match each other's output
  (avoids the `[REDACTED] [REDACTED]` double-redaction class of bug).
- `safeErrorLog(prefix, error, ...extra)`: drop-in `console.error`
  replacement that runs `formatError` (preserves name/message/stack)
  through `redactSecrets` before printing.

Applied at the highest-risk sites — the catch-all streaming-error
handlers in each provider (anthropic, bedrock, gemini, openrouter,
openai-compatible) and the explicit URL-interpolation logs in
inference.ts and openai-compatible.ts. Smoke-tested with the patterns
in `redactSecrets`'s doc block; multi-secret URLs, header variants,
inline auth, and plain text all behave as documented.

Not a general-purpose scrubber — bounded to the patterns we emit. New
leak shapes should extend `safe-log.ts` directly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…l creation

Greptile anima-research#96 caught: in the custom-model creation path, after
`await this.loadUser(userId)`, there's an `await this.getUserModels(userId)`
whose return value is discarded. It existed as part of the now-removed
20-model cap enforcement loop, but after the cap was lifted in PR anima-research#96 the
call became a no-op — `loadUser` above already populates the in-memory
model set, so `getUserModels` re-runs the same load, maps the model
collection, and throws away the array.

One redundant DB read per custom-model creation. Removed, with an inline
comment pointing at the removed call's history so the next person reading
this section doesn't put the line back thinking it was load-bearing.

Backend `npm run build` passes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@greptile-apps

greptile-apps Bot commented May 18, 2026

Copy link
Copy Markdown

Greptile Summary

This PR applies three backend hygiene improvements: env-overridable base URLs for OpenRouter and Gemini, a new safe-log.ts utility (redactSecrets + safeErrorLog) applied across all five providers to scrub credentials from error logs, and removal of a redundant discarded getUserModels() DB call in the custom-model creation path.

  • safe-log.ts: HEADER_RE silently fails to redact the actual credential for non-Bearer auth schemes — the scheme word is consumed instead of the credential, leaving the real value in the log. QUERY_SECRET_RE also over-redacts on substring keyword matches (auth, sig within longer param names).
  • Env-overridable URLs: cleanly mirrors the existing OPENAI_BASE_URL pattern; constructors assign the env var with a fallback constant.
  • DB clean-up: discarded getUserModels() call is removed and replaced with an inline comment tracing its history.

Confidence Score: 3/5

The three-part change is mostly safe to merge, but the new HEADER_RE regex in safe-log.ts has a logic gap that silently fails for non-Bearer auth schemes — the regex redacts the scheme keyword instead of the actual credential, meaning any custom-model endpoint using Basic or Token auth will continue to leak the credential into logs despite the redaction being applied.

The base URL and DB clean-up changes are trivially correct. The safeErrorLog/redactSecrets utility is well-structured and handles the Bearer case that all current first-party providers use. However, the utility is also applied at the custom OpenAI-compatible endpoint path where user-configured values may carry any auth scheme, and the regex silently produces a misleading partial redaction for those cases rather than a clear failure.

deprecated-claude-app/backend/src/utils/safe-log.ts — specifically the HEADER_RE regex on line 37

Security Review

Incomplete credential redaction in safe-log.ts: HEADER_RE handles Authorization: Bearer … correctly but silently leaves the actual credential value in the log for other auth schemes (Basic, Token, ApiKey, etc.) — the scheme keyword is redacted while the credential itself passes through unchanged. Custom OpenAI-compatible endpoints configured with non-Bearer auth are the most likely real-world trigger.

Important Files Changed

Filename Overview
deprecated-claude-app/backend/src/utils/safe-log.ts New secret-redacting utility; HEADER_RE silently fails for non-Bearer auth schemes (leaving the actual credential in logs), and QUERY_SECRET_RE over-redacts on keyword substrings like auth and sig.
deprecated-claude-app/backend/src/services/gemini.ts Adds env-overridable base URL and applies redactSecrets/safeErrorLog at error paths; both changes are clean.
deprecated-claude-app/backend/src/services/openrouter.ts Adds env-overridable base URL and replaces streaming-error console.error with safeErrorLog; straightforward.
deprecated-claude-app/backend/src/services/openai-compatible.ts Applies redactSecrets to error text and endpoint URL before logging, and replaces streaming/model-list error logging with safeErrorLog; correct.
deprecated-claude-app/backend/src/services/anthropic.ts Swaps streaming-error console.error for safeErrorLog; minimal change, looks correct.
deprecated-claude-app/backend/src/services/bedrock.ts Swaps streaming-error console.error for safeErrorLog; minimal and clean.
deprecated-claude-app/backend/src/services/inference.ts Applies redactSecrets to two explicit baseUrl log sites; both are string values, change is safe.
deprecated-claude-app/backend/src/database/index.ts Removes the redundant discarded getUserModels() call and replaces it with an explanatory comment; correct clean-up.
deprecated-claude-app/backend/env.example Adds commented-out OPENROUTER_BASE_URL and GEMINI_BASE_URL entries with correct default values.
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
deprecated-claude-app/backend/src/utils/safe-log.ts:36-37
**`HEADER_RE` silently fails to redact non-`Bearer` auth schemes**

`(?:Bearer\s+)?` is optional and zero-width when the scheme isn't `Bearer`. For `Authorization: Basic dXNlcjpwYXNz` the cursor sits right at `Basic` after the `': '`, so `[A-Za-z0-9._\-+/=]{4,}` consumes the word `Basic` — leaving the actual credential in the log. The same failure mode applies to `Authorization: Token sk-…`, `Authorization: ApiKey sk-…`, and any other non-Bearer scheme. Custom OpenAI-compatible endpoints that use `Token` or `ApiKey` auth headers would silently leak through. Changing the group to `(?:[A-Za-z]+\s+)?` would consume any scheme word before the credential-matching class.

### Issue 2 of 2
deprecated-claude-app/backend/src/utils/safe-log.ts:29-30
**`QUERY_SECRET_RE` over-redacts on keyword substring matches**

`auth` and `sig` are matched without word-boundary anchors, so any parameter whose name merely contains those substrings — such as `author`, `authentication_level`, `signature_date`, `signalling_mode` — will have its value silently wiped even when the value is not sensitive. Over-redaction degrades log diagnostics for no security gain. Wrapping each bare keyword in `\b` word boundaries would restrict matches to whole-word occurrences only.

Reviews (1): Last reviewed commit: "chore(backend): drop redundant getUserMo..." | Re-trigger Greptile

Comment on lines +36 to +37
const HEADER_RE =
/((?:authorization|(?:x-)?api[_-]?key)['"]?\s*[:=]\s*['"]?(?:Bearer\s+)?)[A-Za-z0-9._\-+/=]{4,}/gi;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 security HEADER_RE silently fails to redact non-Bearer auth schemes

(?:Bearer\s+)? is optional and zero-width when the scheme isn't Bearer. For Authorization: Basic dXNlcjpwYXNz the cursor sits right at Basic after the ': ', so [A-Za-z0-9._\-+/=]{4,} consumes the word Basic — leaving the actual credential in the log. The same failure mode applies to Authorization: Token sk-…, Authorization: ApiKey sk-…, and any other non-Bearer scheme. Custom OpenAI-compatible endpoints that use Token or ApiKey auth headers would silently leak through. Changing the group to (?:[A-Za-z]+\s+)? would consume any scheme word before the credential-matching class.

Prompt To Fix With AI
This is a comment left during a code review.
Path: deprecated-claude-app/backend/src/utils/safe-log.ts
Line: 36-37

Comment:
**`HEADER_RE` silently fails to redact non-`Bearer` auth schemes**

`(?:Bearer\s+)?` is optional and zero-width when the scheme isn't `Bearer`. For `Authorization: Basic dXNlcjpwYXNz` the cursor sits right at `Basic` after the `': '`, so `[A-Za-z0-9._\-+/=]{4,}` consumes the word `Basic` — leaving the actual credential in the log. The same failure mode applies to `Authorization: Token sk-…`, `Authorization: ApiKey sk-…`, and any other non-Bearer scheme. Custom OpenAI-compatible endpoints that use `Token` or `ApiKey` auth headers would silently leak through. Changing the group to `(?:[A-Za-z]+\s+)?` would consume any scheme word before the credential-matching class.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Real leak — confirmed against Authorization: Basic … in a local smoke test, the credential survived as Greptile predicted. Fixed in f03e485 with the generalized (?:[A-Za-z]+\s+)? scheme group. Now consumes Bearer, Basic, Token, ApiKey, etc. as the scheme before the credential body. Verified Basic/Token/ApiKey all redact correctly without breaking the existing Bearer + no-scheme cases.

Comment on lines +29 to +30
const QUERY_SECRET_RE =
/([?&])([a-z0-9_.-]*(?:api[_-]?key|token|secret|password|auth|sig|signature)[a-z0-9_.-]*)=[^&#\s'"]*/gi;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 QUERY_SECRET_RE over-redacts on keyword substring matches

auth and sig are matched without word-boundary anchors, so any parameter whose name merely contains those substrings — such as author, authentication_level, signature_date, signalling_mode — will have its value silently wiped even when the value is not sensitive. Over-redaction degrades log diagnostics for no security gain. Wrapping each bare keyword in \b word boundaries would restrict matches to whole-word occurrences only.

Prompt To Fix With AI
This is a comment left during a code review.
Path: deprecated-claude-app/backend/src/utils/safe-log.ts
Line: 29-30

Comment:
**`QUERY_SECRET_RE` over-redacts on keyword substring matches**

`auth` and `sig` are matched without word-boundary anchors, so any parameter whose name merely contains those substrings — such as `author`, `authentication_level`, `signature_date`, `signalling_mode` — will have its value silently wiped even when the value is not sensitive. Over-redaction degrades log diagnostics for no security gain. Wrapping each bare keyword in `\b` word boundaries would restrict matches to whole-word occurrences only.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right tradeoff to flag — over-redaction silently degrades diagnostics. Fixed in f03e485 by splitting the keyword group: long unambiguous names (api_key, token, secret, password, signature) keep substring-match (so ?my_api_key_v2= still works), but the short ambiguous auth and sig now require exact whole-name match. ?author=Doe, ?authentication_level=full, ?signalling_mode=on all pass through unchanged; ?auth= and ?sig= still redact as intended. One known accepted edge: ?signature_date=... still redacts because signature is in the long list — preserving compound credential names was the better tradeoff.

…E schemes, narrow QUERY_SECRET_RE short keywords

Two Greptile catches on PR anima-research#109's safe-log utility:

P1 (security) — `HEADER_RE` silently failed on non-Bearer auth schemes.
The `(?:Bearer\s+)?` group was optional and matched empty when the
scheme wasn't literally `Bearer`. For `Authorization: Basic dXNlcjpwYXNz`
the cursor sat at `Basic`, the body class consumed the word `Basic`, and
the actual credential survived into the log. Same failure on `Token`,
`ApiKey`, and any other non-Bearer scheme — exactly the OpenAI-compatible
custom-endpoint case the utility was meant to protect. Generalized to
`(?:[A-Za-z]+\s+)?` so any scheme word gets consumed as scheme rather
than as the start of the credential body.

P2 (over-redaction) — `QUERY_SECRET_RE`'s `auth` and `sig` keywords had
no word-boundary anchor, so any param whose name *contained* those
substrings — `author`, `authentication_level`, `signalling_mode` —
got silently wiped even when the value was benign. Useful diagnostics
lost for no security benefit. Split the keyword group into two
alternatives: long unambiguous names (api_key, token, secret, password,
signature) can sit anywhere in a compound param name, but the short
ambiguous ones (auth, sig) must be the WHOLE param name (anchored by
the surrounding `[?&]` and `=`). `?auth=` still redacts; `?author=`
doesn't.

Smoke-tested both directions: Basic/Token/ApiKey schemes now redact;
Bearer + no-scheme still work; `?author=`, `?authentication_level=`,
`?signalling_mode=` preserved; `?auth=`, `?sig=`, compound `?my_api_key_v2=`
still redact.

Known accepted false-positive: `?signature_date=...` still redacts
because `signature` is in the long-keyword list (where compound names
like `?request_signature=...` SHOULD redact). The alternative — moving
`signature` to the exact-match list — would miss compound legitimate
credential names, which is the worse tradeoff.

Backend `npm run build` passes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant