Per-user MCP OAuth: backend status API, UI consent controls, worker-side token resolution #217

@AjayThorve

Background

NAT 1.6 ships the protocol-level building blocks for per-user MCP OAuth:
mcp_oauth2 (auth provider) and per_user_mcp_client (function group with
per-user token storage). AIQ 2.1 cannot yet drive these end-to-end because
three integration surfaces are missing.

This issue tracks the AIQ-side work needed to unblock per-user MCP OAuth so
that connectors like ServiceNow, Salesforce, GitHub, Google Workspace, etc.
can be wired up with row-level-security-preserving auth instead of shared
service-account credentials.

See docs/source/customization/mcp-tools.md:288-308 for the existing
planning note this expands.

Gaps in AIQ 2.1

1. Backend: per-MCP auth status on /v1/data_sources

GET /v1/data_sources does not surface per-MCP auth state. UI clients have
no way to know:

  • Whether a given MCP source is connected for the signed-in user.
  • The OAuth connect / reconnect URL to send the user through.
  • Required scopes.
  • Token expiry / refresh state.
  • Error state (revoked, scope mismatch, IdP unreachable).

Proposal: extend the data_sources response with a per_user_auth block
when the underlying function group is per_user_mcp_client. The status
field is one of "disconnected", "connected", "expired", or "error":

{
  "id": "servicenow_mcp",
  "requires_auth": true,
  "per_user_auth": {
    "status": "connected",
    "connect_url": "/v1/auth/mcp/servicenow_mcp/connect",
    "disconnect_url": "/v1/auth/mcp/servicenow_mcp/disconnect",
    "scopes": ["read:incident", "read:kb"],
    "expires_at": "2026-05-06T18:00:00Z",
    "error": null
  }
}
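
As a rough illustration of how the backend might derive this block, here is a minimal sketch. StoredToken, per_user_auth_block, and the precedence of the status checks are assumptions made for illustration; they are not existing AIQ or NAT APIs. Only the field names and URL shapes come from the proposal above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class StoredToken:
    """Hypothetical server-side record for one (user, MCP server) pair."""
    expires_at: datetime          # timezone-aware
    error: Optional[str] = None   # e.g. "revoked", "scope_mismatch"


def per_user_auth_block(source_id: str,
                        required_scopes: list[str],
                        token: Optional[StoredToken],
                        now: Optional[datetime] = None) -> dict:
    """Build the proposed per_user_auth block for one data source."""
    now = now or datetime.now(timezone.utc)
    # Assumed precedence: no token, then recorded error, then expiry.
    if token is None:
        status = "disconnected"
    elif token.error is not None:
        status = "error"
    elif token.expires_at <= now:
        status = "expired"
    else:
        status = "connected"

    expires_at = None
    if token is not None:
        # Normalize to UTC and render in the format used in the example above.
        expires_at = token.expires_at.astimezone(timezone.utc).strftime(
            "%Y-%m-%dT%H:%M:%SZ")

    return {
        "status": status,
        "connect_url": f"/v1/auth/mcp/{source_id}/connect",
        "disconnect_url": f"/v1/auth/mcp/{source_id}/disconnect",
        "scopes": list(required_scopes),
        "expires_at": expires_at,
        "error": token.error if token is not None else None,
    }
```

Passing now explicitly keeps the status derivation deterministic in tests; a real handler would presumably read the token record from whatever store the worker also uses.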

2. UI: per-source Connect / Reconnect controls

The data-source UI has no per-MCP "Connect" / "Reconnect" / "Disconnect"
controls. Once the backend exposes the status above, the UI needs:

  • A connect button that opens the OAuth consent flow (popup or redirect).
  • Reconnect on expired / error state.
  • Disabled-with-tooltip state when a user toggles on a source they haven't
    connected, blocking submission until they do.
  • Surfacing of token-refresh failures emitted mid-job (overlaps with
    #215, server-side token refresh for long-running async jobs).
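
The submission-blocking rule can be sketched independently of the UI framework. The snippet below is written in Python purely to show the logic; the actual UI would implement the same check in its own language. blocking_sources is a hypothetical helper, and it assumes the /v1/data_sources response shape proposed in section 1.

```python
def blocking_sources(data_sources: list[dict], enabled_ids: set[str]) -> list[str]:
    """Return ids of enabled sources that require auth but are not connected.

    A non-empty result means the submit control should stay disabled and the
    user should be prompted to connect or reconnect the listed sources.
    """
    blocked = []
    for src in data_sources:
        if src["id"] not in enabled_ids:
            continue  # sources the user has not toggled on never block
        if not src.get("requires_auth"):
            continue  # sources without per-user auth never block
        auth = src.get("per_user_auth")
        if auth is None or auth.get("status") != "connected":
            blocked.append(src["id"])
    return blocked
```

Treating "expired" and "error" the same as "disconnected" here is a design assumption: all three need user action before a job can use the source.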

3. Worker-side per-user MCP token resolution

Async deep-research jobs run in Dask workers and currently capture a single
AIQ bearer token at submit time (see frontends/aiq_api/src/aiq_api/jobs/submit.py
and runner.py). They cannot yet resolve per-user, per-MCP-server
tokens inside the worker.

per_user_mcp_client expects a per-user token store the function group can
consult during a tool call. The worker needs:

  • A reference to the user's session (not the frozen AIQ token).
  • An MCPTokenStore lookup keyed by (user_sub, mcp_server_id).
  • Refresh-on-demand against the IdP / MCP server's token endpoint when the
    cached access token is near expiry.
  • Structured failure events when a per-MCP token refresh fails so the UI can
    prompt reconnect mid-job.
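
To make the requirements above concrete, here is a minimal in-memory sketch of the lookup, refresh-on-demand, and structured-failure behaviour. Every name in it (TokenRecord, TokenRefreshError, InMemoryMCPTokenStore, the refresh_fn callback, the event fields) is hypothetical and chosen for illustration; a real implementation would be backed by Postgres and would call the IdP or MCP server's token endpoint inside refresh_fn.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class TokenRecord:
    access_token: str
    refresh_token: str
    expires_at: float  # Unix timestamp, seconds


class TokenRefreshError(Exception):
    """Raised when no usable token can be produced for a (user, server) pair."""

    def __init__(self, user_sub: str, mcp_server_id: str, reason: str):
        super().__init__(f"{mcp_server_id} for {user_sub}: {reason}")
        self.user_sub = user_sub
        self.mcp_server_id = mcp_server_id
        self.reason = reason

    def to_event(self) -> dict:
        # Hypothetical shape of the structured job event the UI would consume.
        return {
            "type": "mcp_token_refresh_failed",
            "user_sub": self.user_sub,
            "mcp_server_id": self.mcp_server_id,
            "reason": self.reason,
        }


class InMemoryMCPTokenStore:
    def __init__(self,
                 refresh_fn: Callable[[TokenRecord], TokenRecord],
                 skew_seconds: float = 60.0,
                 clock: Callable[[], float] = time.time):
        self._records: dict[tuple[str, str], TokenRecord] = {}
        self._refresh_fn = refresh_fn
        self._skew = skew_seconds
        self._clock = clock

    def put(self, user_sub: str, mcp_server_id: str, record: TokenRecord) -> None:
        self._records[(user_sub, mcp_server_id)] = record

    def get_access_token(self, user_sub: str, mcp_server_id: str) -> str:
        key = (user_sub, mcp_server_id)
        record = self._records.get(key)
        if record is None:
            raise TokenRefreshError(user_sub, mcp_server_id, "not_connected")
        # Reuse the cached token unless it expires within the skew window.
        if record.expires_at - self._clock() > self._skew:
            return record.access_token
        try:
            fresh = self._refresh_fn(record)
        except Exception as exc:
            raise TokenRefreshError(user_sub, mcp_server_id, str(exc)) from exc
        self._records[key] = fresh
        return fresh.access_token
```

A worker would catch TokenRefreshError around the tool call and publish to_event() on the job's event stream so the UI can prompt a reconnect. Concurrency control (two workers refreshing the same token at once) is deliberately left out of this sketch.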

This shares infrastructure with #215 (server-side token refresh for the
AIQ session token). Suggest landing #215's TokenStore interface first and
adding MCPTokenStore as a sibling implementation, both backed by Postgres
via NAT_JOB_STORE_DB_URL.

Acceptance criteria

  • GET /v1/data_sources returns a per_user_auth block for sources backed by per_user_mcp_client, with status, connect_url, disconnect_url, scopes, expires_at, and error fields.
  • Backend exposes connect / disconnect / callback endpoints that drive the mcp_oauth2 flow per MCP server.
  • UI renders Connect / Reconnect / Disconnect controls per MCP source and surfaces expired / error states.
  • UI blocks submission (with explanation) when a user enables a source they have not connected.
  • An MCPTokenStore is available to Dask workers and resolves (user_sub, mcp_server_id) to a fresh access token, refreshing on-demand.
  • Token-refresh failures inside a worker emit a structured job event the UI can act on.
  • An end-to-end example (e.g. ServiceNow or Salesforce MCP) is documented in docs/source/customization/mcp-tools.md.

Why this matters

Without this feature, MCP examples for connectors that carry user-scoped
data (ServiceNow, Salesforce, GitHub, Google Workspace, O365) can only be
shipped with a shared service account, which collapses every AIQ user onto
one downstream identity and defeats row-level security. This blocks
shipping the ServiceNow (SNOW) and Salesforce (SFDC) MCP examples, which
were deferred from 2.1 to 2.2 for this reason.

Out of scope

  • New IdP-specific MCP server adapters: examples and adapters for
    individual MCP servers (SNOW, SFDC, etc.) are tracked separately and
    depend on this feature landing.
  • CLI per-user MCP OAuth: CLI flows can keep using local refresh caches.
  • AIQ session-token refresh for long jobs: see #215 (server-side token
    refresh for long-running async jobs).
