fix: normalize OAuth discovery URLs to use host_name config#186
fix: normalize OAuth discovery URLs to use host_name config#186rachana-promantia wants to merge 28 commits into
Conversation
The repair patch from v2.4.1 hard-coded reads/writes of the skill_name column on both tabFAC Skill and tabSkill. Some HRMS releases ship the Skill doctype with autoname=field:skill_name and a dedicated skill_name column; newer releases drop that field and use name directly. On the latter, bench migrate fails with: pymysql.err.OperationalError: (1054, \"Unknown column 'skill_name' in 'SELECT'\") Probe both tables for the column at runtime and conditionally include it in the SELECT and INSERT, so the patch is a no-op safe across either HRMS schema.
fix: accept JSON-string roles in update_tool_role_access
The pre-insert required-field check ran against the user-supplied data dict before any defaults were applied, and the docfield filter excluded only fields with a static 'default'. Fields populated dynamically by frappe.new_doc() (company from User permissions, selling_price_list from Selling Settings, naming_series from autoname rules, etc.) carry no static default on the docfield, so the check rejected requests that should have succeeded — surfacing as false 'Missing required fields: company, selling_price_list, naming_series' errors when those values were already on the doc. Run the check after assigning the user-supplied values onto the doc, and check doc.get(fieldname) instead of data[fieldname]. Drops the 'not f.default' filter since the populated-doc check covers both static-default and dynamically-populated cases without producing false positives.
The v15 CI install step was failing because `bench get-app` invokes `yarn install --check-files` against the new package.json, and @commitlint/cli@20's transitive dep `global-directory@5` requires Node >= 20. Frappe v15's CI environment uses Node 18, so yarn aborted: error global-directory@5.0.0: The engine "node" is incompatible with this module. Expected version ">=20". Got "18.20.8" - package.json: pin @commitlint/cli and config-conventional to ^19.6.0 (last major still supporting Node 18). Add engines.node >=18. - .pre-commit-config.yaml: pin additional_dependencies to v19 to match. - Drop package-lock.json, ship yarn.lock instead. Bench runs yarn against every app, so a stray npm lockfile triggers a warning and invites resolution drift. - PRE_COMMIT_SETUP.md: update install instructions to yarn, document the Node 18 / v19 constraint so future bumps don't reintroduce it.
filter_sensitive_fields was extended to require a third positional
user_role argument. The two MCP-path callers (get_document,
list_documents) were updated; chatgpt_fetch was missed and crashed on
every call with:
TypeError: filter_sensitive_fields() missing 1 required positional
argument: 'user_role'
The ChatGPT fetch tool also bypassed FAC's standard auth pipeline,
calling doc.has_permission("read") directly instead of
validate_document_access. That weaker check skipped role-based doctype
gating and the restricted-doctype list, and was the structural reason
the user_role plumbing went missing in the first place — the tool
never resolved a role at all.
Switch to the same validate_document_access flow get_document uses,
take user_role from its result, and pass it through to
filter_sensitive_fields. Drop the catch-all except Exception that
masked the original TypeError as a misleading 'Error fetching document'
ValueError; let unexpected exceptions propagate to _safe_execute so
the underlying error surfaces in logs.
chore: add commitlint hook and pip-audit pre-commit setup
The output_data column on Assistant Audit Log was rendering token-count metrics (input_tokens, output_tokens, total_tokens, tokens_used) as "***REDACTED***" because BaseTool._sanitize_data and audit_trail._sanitize_arguments both used a substring blocklist that flagged any key containing the word "token" — designed to catch access_token / api_token, but accidentally catching the metric shape. This shows up as soon as a tool returns LLM token counts in its result dict. The dedicated audit columns (set via the row's own fields) are unaffected; the readable output_data JSON was the only collateral. Replaced the substring blocklist with a regex-backed _is_sensitive_key helper. The regex matches ``token`` only when preceded/followed by a non-letter, so: - Redacted: token, access_token, refresh_token, jwt_token, bearer_token - Preserved: input_tokens, output_tokens, total_tokens, tokens_used - Plus the unchanged always-sensitive list: password, secret, api_key, apikey, auth, bearer, credential, private_key audit_trail._sanitize_arguments now delegates to the same helper so the defensive sink and BaseTool agree on what's a credential. Tests added: TestSensitiveKeyMatcher in test_audit_log.py covering 21 key shapes plus None/int defensive paths. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fix(audit-log): preserve token-count fields in sanitized output
…hild-doctype calls Closes buildswithpaul#168 (item 1). Child-table fields can now be updated through the parent doc with patch or replace semantics, picked per-table from the input shape: any row carrying a 'name' triggers patch mode (matched rows are updated, unmatched rows are appended, '_delete: true' removes a row, untouched rows are preserved); no 'name' on any row triggers replace mode (clear and re-append, matching create_document semantics). Restricted-field filtering now also runs against the child doctype's SENSITIVE/ADMIN_ONLY sets. Also reject direct calls with a child-table doctype as the target (e.g. "Sales Order Item"). Saving a child row in isolation bypasses the parent's validate() pipeline, leaving derived fields like row 'amount' and parent 'total'/'total_qty'/'grand_total' stale. The tool now resolves the parent doc + table fieldname and returns a structured 'child_doctype_direct_update' error with a ready-to-retry suggestion. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fix(update_document): support child-table updates; reject direct child-doctype calls
…idate() Issue buildswithpaul#165 follow-up. The earlier fix moved the required-fields check from `data` to `doc.get(f)`, but the check still ran *before* `doc.insert()`. Many doctypes — Quotation in particular — populate `reqd` fields like `conversion_rate`, `price_list_currency`, and `plc_conversion_rate` inside the doctype controller's `set_missing_values()`, which is invoked from `validate()` and therefore not yet run at the point of the pre-flight check. The result was that valid payloads kept getting rejected with false positives like: Missing required fields: conversion_rate, price_list_currency, plc_conversion_rate This patch removes the pre-flight check entirely and lets Frappe's own `_validate_mandatory()` (which runs after `set_missing_values()`) be the authority. `frappe.MandatoryError` is caught and translated into the same structured response shape the pre-flight check used to produce, but with genuine missing-field names (parsed from the exception payload). A regression test against Quotation pins this: the previously-leaked fields must never appear in `missing_fields`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…thpaul#165 follow-up) fix(create_document): drop pre-flight reqd check, defer to Frappe validate() (buildswithpaul#165 follow-up)
…oryError handler
Regression caught in test review: `_, _, fields_part = error_msg.partition(": ")`
made `_` a function-local for the entire `execute()` body, so the later
`_("Document Creation Error")` call inside `frappe.log_error` raised
UnboundLocalError on paths that reached the generic `except Exception` branch
without first reaching the MandatoryError branch (e.g. a LinkValidationError
on an invalid Link reference).
Use `partition(...)[2]` instead of unpacking into named throwaways. Adds two
regression guards: one for the MandatoryError → structured-response path, one
for the generic exception path that proves `_(...)` is still callable.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fix(create_document): don't shadow translation function `_` in MandatoryError handler
`preview_template` is whitelisted to any logged-in user, and stored
`Prompt Template` content can be authored by Assistant users — but the
three Jinja construction sites used a plain `jinja2.Environment`, which
exposes `__class__` / `__mro__` / `__subclasses__` and the full Python
class graph. A template like `{{ ''.__class__.__mro__[1].__subclasses__() }}`
would have executed under the rendering user.
All three sites (PromptTemplateManager._jinja_env, validate_template_syntax,
preview_template) now use `jinja2.sandbox.SandboxedEnvironment`, which
raises `SecurityError` on unsafe attribute lookups. Added regression test
asserting the canonical SSTI probe is blocked through both the manager
render path and the whitelisted preview endpoint.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…de save `hooks.py` registered a `doc_events["*"]` handler for `on_update`, `on_submit`, and `on_cancel`. The three callbacks all gated on `should_log_document()`, which unconditionally returned False — so the hook fired on every document save in every app on the bench, called a function, ran a `startswith` check, and no-oped. Strictly performance overhead with zero behaviour. Removed the wildcard hook registration and the three dead callbacks (`log_document_change`, `log_document_submit`, `log_document_cancel`) plus `should_log_document`. The real audit-trail pipeline (`log_tool_execution`) is untouched. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… User" The "All" role grants read access to every session including website (guest) users. Prompt Category is internal assistant taxonomy; there's no reason for unauthenticated visitors to enumerate it. Changed the permissive read rule to "Assistant User", matching how the rest of the prompt subsystem is gated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… audit The Frappe Cloud marketplace audit's metadata regex flags any occurrence of "architecture" in the rendered README description. The README pointed at `docs/architecture/ARCHITECTURE.md` and `docs/architecture/MCP_STREAMABLEHTTP_GUIDE.md`. Renamed the directory to `docs/internals/`, renamed `ARCHITECTURE.md` to `INTERNALS.md`, relabelled the README link as "Internals", and updated every cross-reference in the docs tree so links don't 404. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds tesseract alongside paddleocr and ollama. Reintroduces the implementation removed in buildswithpaul#99 (commit 736b3fc).
- Use host_name from site config if available - Force https:// when host_name is configured - Fixes http -> https mismatch for production deployments - Fixes buildswithpaul#156
|
Thanks for picking this up — I hit the same issue on a customer's GKE deployment (Gateway API in front, plain HTTP to the Frappe nginx pod) and your fix would have resolved it. A couple of suggestions before this lands: 1. The PR includes unrelated changes from a stale
|
- Add _get_public_base_url() helper that respects host_name and force_https - Replace get_server_url() at all 5 call sites in oauth_discovery.py - Override Frappe-inherited URL keys in both openid_configuration() and authorization_server_metadata() to ensure issuer consistency - Drop trailing slash on issuer per RFC 8414 - Fixes buildswithpaul#156
4b0b595 to
190e961
Compare
Summary
OAuth discovery endpoint was returning http:// URLs even when host_name
was configured as https://. This breaks hosted OAuth clients like Claude
that reject non-HTTPS token endpoints.
Type of Change
Related Issues
Fixes #156
Checklist
developbranch (notmain)