Skip to content

feat: per-connector expiry overrides#4785

Open
0x2b3bfa0 wants to merge 10 commits into
dexidp:masterfrom
0x2b3bfa0:feat/per-connector-expiry
Open

feat: per-connector expiry overrides#4785
0x2b3bfa0 wants to merge 10 commits into
dexidp:masterfrom
0x2b3bfa0:feat/per-connector-expiry

Conversation

@0x2b3bfa0
Copy link
Copy Markdown
Contributor

@0x2b3bfa0 0x2b3bfa0 commented May 11, 2026

Closes #3557 ?

0x2b3bfa0 added 4 commits May 11, 2026 03:58
Adds an optional `expiry` block to each static connector in the YAML
config. Fields left unset inherit the top-level `expiry` configuration.

Supported overrides:
- `idTokens`
- `refreshTokens.absoluteLifetime`
- `refreshTokens.validIfNotUsedFor`
- `refreshTokens.reuseInterval`
- `refreshTokens.disableRotation`

Per-connector `idTokens` and `refreshTokens.absoluteLifetime` are
rejected at config load if they exceed the corresponding global value,
since signing key retention is derived from the global maximums.

`authRequests` and `deviceRequests` are intentionally left global:
`deviceRequests` is issued before the user picks a connector, and
`authRequests` have no meaningful connector-scoped semantics.

Addresses the per-connector variant of the need raised in dexidp#3557.

Signed-off-by: Helio Machado <0x2b3bfa0+git@googlemail.com>
- Re-home tests into the existing per-source-file test files:
  cmd/dex/serve_test.go (buildConnectorExpiryOverrides tests),
  cmd/dex/config_test.go (YAML roundtrip), and
  server/oauth2_test.go (resolver and newIDToken tests).
- Extract parseConnectorExpiry to flatten the override-vs-inherit logic.
- Drop the refresh-token absoluteLifetime cap: refresh tokens are
  validated against storage, not signing keys, so key retention does
  not constrain their lifetime. Scope the safety check to idTokens.
- Shorten error messages to match the project's terse style.
- Apply golangci-lint fmt (struct field alignment in server/server.go).

Signed-off-by: Helio Machado <0x2b3bfa0+git@googlemail.com>
- Drop tests that rewrapped stdlib behavior (invalid duration parse),
  tautological cases (empty input), and wall-clock lifetime probes
  that really test RefreshTokenPolicy.
- Keep only the validation-rule test and one consolidated case
  covering override/inherit for both idTokens and refresh tokens.
- Remove the YAML roundtrip tests: they exercise struct tags, not
  our own unmarshaler logic.

Signed-off-by: Helio Machado <0x2b3bfa0+git@googlemail.com>
- Reuse the existing server.value() helper in idTokensValidForConn
  instead of a hand-rolled zero-check.
- Drop the duplicate per-connector refresh-token summary log; thread
  the connector_id into NewRefreshTokenPolicy via logger.With so the
  existing per-field logs carry it.
- Trim WHAT-only docstrings and tighten the rationale on
  ConnectorRefreshToken and buildConnectorExpiryOverrides.

Signed-off-by: Helio Machado <0x2b3bfa0+git@googlemail.com>
@0x2b3bfa0 0x2b3bfa0 marked this pull request as ready for review May 11, 2026 00:43
@nabokihms
Copy link
Copy Markdown
Member

Hello @0x2b3bfa0 and thanks for providing a PR.

Two questions before reviewing:

  1. What will happen if a connector is saved via api with idTokenTTL: 1h, and then the globalIDTokenTTL changes to 30min in the config? PR does not address this case, but it is real. One way to solve it is to always use idTokenTTL = min(globalIDTokenTTL, connectorIDTokenTTL), but I don't like this much
  2. How will it work if we add a per-client expiration settings in the future?
  3. What is the real-world use case for the feature? Can you share it, please? I understand the per-client case when some applications cannot refresh tokens or have more trust than others.

@0x2b3bfa0
Copy link
Copy Markdown
Contributor Author

What is the real-world use case for the feature? Can you share it, please? I understand the per-client case when some applications cannot refresh tokens or have more trust than others.

Using the default Dex server in ArgoCD with two different connectors: one for humans (e.g. Microsoft Entra) and other for machines (e.g. GitHub Actions [1, 2])

Typically, human tokens are expected to expire in > 1 hour, while machine tokens are expected to expire in < 5 minutes. However, dex enforces an uniform token lifetime for all connectors, and having machine tokens that last e.g. 24 hours (default) is a substantial security risk.

Implementhing per-connector expiry configuration will allow me to provide both the convenience of long sessions for humans, and the security of short sessions for machines.

While waiting for this feature, I have chosed the compromise of 1 hour for both humans and machines, albeit it's too short for the former and too long for the latter. I could maybe shorten the global expiry and then use refresh tokens once argoproj/argo-cd#27777 is released, instead of relying on individual expiry configuration, but my previous attempts to find a workaround were unsuccessful.

@0x2b3bfa0
Copy link
Copy Markdown
Contributor Author

0x2b3bfa0 commented May 11, 2026

1. What will happen if a connector is saved via api with idTokenTTL: 1h, and then the globalIDTokenTTL changes to 30min in the config? PR does not address this case, but it is real. One way to solve it is to always use idTokenTTL = min(globalIDTokenTTL, connectorIDTokenTTL), but I don't like this much

2. How will it work if we add a per-client expiration settings in the future?

I'd say that expiration needs to be enforced, so client <= connector <= global is always true. Dynamic changes to configuration would fail if this rule isn't met (e.g. you can only lower the global expiry until it equals the connector expiry, not less); I hope this makes sense. 🫣

Apply the invariant "per-connector value must not exceed the global
value" to every time-based refresh-token field, not just idTokens.
A global value left unset means "no ceiling", so an override is
accepted. disableRotation stays exempt: it's a policy toggle rather
than a quantity, so "stricter" has no natural direction.

Rejecting violations at config load rather than silently clamping at
issuance keeps operator intent auditable: if an operator later drops
the global below a persisted per-connector value, the next startup
fails with a clear error. The same rule extends to future per-client
overrides with the connector (or global, if unset) as their ceiling.

Globals are parsed once into a small refreshCeilings struct and the
per-field checks run through a table, so adding a new ceiling-bound
field is one line in two places.

Signed-off-by: Helio Machado <0x2b3bfa0+git@googlemail.com>
0x2b3bfa0 added 4 commits May 13, 2026 03:43
Persist per-connector expiry overrides in the storage layer (via a new
Expiry field on storage.Connector) so the rule "per-connector value
must not exceed the global value" can be enforced at every write path,
not just static config load:

- Static YAML connectors: validated when loaded into storage and
  re-validated when the server reads them at startup. Invalid values
  refuse to start dex.
- gRPC CreateConnector / UpdateConnector: validate against the current
  global ceilings before persisting. Rejected writes leave both the
  storage and the in-memory override map untouched.
- gRPC DeleteConnector: drops the matching in-memory override.
- Restart with a stale persisted override (global lowered below it):
  startup re-validation surfaces the conflict as an error.

The validation rule and override-resolution logic live in a new
server/expiry.go and are exercised by every code path that can change
a connector's expiry, so the hierarchy is preserved without any silent
clamping. disableRotation remains exempt: it is a policy toggle, not
a quantity, so "stricter" has no natural direction.

Storage backends:
- storage.Connector gains an optional Expiry field with JSON tags.
- SQL: new nullable expiry column + migration.
- ent: typed JSON field, regenerated code.
- kubernetes / etcd / memory pick up the change through the existing
  JSON-based serialization.

API surface:
- proto: ConnectorExpiry / ConnectorRefreshExpiry / ConnectorExpiryUpdate
  messages, with optional bool DisableRotation for the inherit-vs-override
  distinction.
- ConnectorExpiryUpdate uses the standard "absent leaves alone, present
  with nil Value clears, present with Value sets" wrapper.

Conformance tests now roundtrip Expiry across every backend.

Signed-off-by: Helio Machado <0x2b3bfa0+git@googlemail.com>
Per-connector overrides may now only tighten rotation, not loosen it.
Concretely: when the global expiry.refreshTokens.disableRotation is
false (rotation enabled — the stricter policy, shorter replay window
after compromise), a connector setting disableRotation: true is
rejected. The reverse direction (global disabled, connector enables)
remains permitted as a tightening.

This closes the one gap left by the prior "global is the ceiling"
design: ValidateConnectorExpiry now covers every field of
ConnectorExpiry instead of leaving disableRotation as an exception.

Signed-off-by: Helio Machado <0x2b3bfa0+git@googlemail.com>
- Fix missing Expiry round-trip in the kubernetes storage backend mirror
  (would silently drop overrides on a real cluster; conformance tests
  skip without a live cluster).
- Honor ContinueOnConnectorFailure in the startup validation loop; a
  bad persisted override no longer fails startup when the flag is set.
- Rename ConnectorRefreshToken -> ConnectorRefreshExpiry for symmetry
  with the storage / proto types.
- Unexport ValidateConnectorExpiry and drop the ExpiryCeilings() method
  in favor of direct field access (same package).
- Inline single-call-site helpers (connectorExpiryToStorage,
  connectorExpiryToProto, marshalConnectorExpiry) matching the
  GrantTypes / inline-struct-copy idiom already used in the file.
- Drop log spam: NewRefreshTokenPolicy emits per-field Info logs that
  are useful for the global policy but spam at N connectors x 4 fields.
  Pass a discard logger from buildConnectorExpiryOverride; add a single
  audit line in the startup loop per connector that has an override.
- Trim WHAT-only docstrings, fix stale mu field comment, tighten
  the SQL nil-vs-null comment.
- Drop omitempty from cmd/dex Connector.Expiry tag for sibling
  consistency.

Signed-off-by: Helio Machado <0x2b3bfa0+git@googlemail.com>
RefreshTokenPolicy.CompletelyExpired and ExpiredBecauseUnused treat a
parsed duration of 0 as "expiration disabled". The previous validation
compared numerically (d > ceiling), which let "0s" pass under any
positive global ceiling — semantically meaning infinite lifetime,
strictly looser than the global.

A connector administrator (gRPC API write access or YAML config) could
install:

  expiry:
    refreshTokens:
      absoluteLifetime: "0s"
      validIfNotUsedFor: "0s"

and end up with refresh tokens that never expire on either axis, while
the global policy enforces a positive bound. Stolen tokens issued
through that connector would survive indefinitely.

Fix: extend checkCeiling with a zeroDisables flag; reject d == 0 in
the presence of a positive ceiling for absoluteLifetime and
validIfNotUsedFor. idTokens and reuseInterval are exempt:
- idTokens: zero falls back to the global via value() at runtime
- reuseInterval: zero means "no reuse window" — stricter, not looser

Regression tests cover the rejection and the reuseInterval exception.

Signed-off-by: Helio Machado <0x2b3bfa0+git@googlemail.com>
@0x2b3bfa0
Copy link
Copy Markdown
Contributor Author

What will happen if a connector is saved via api with idTokenTTL: 1h, and then the globalIDTokenTTL changes to 30min in the config? PR does not address this case, but it is real.

@nabokihms, I have updated the pull request with some LLM-assisted modifications addressing this case; I hope they're correct. 🫣

Fold server/expiry.go into server/server.go and server/expiry_test.go
into server/server_test.go. The validation, ceiling-check, and
override-build helpers are server-construction concerns called from
NewServer and from API write paths on *Server, so they belong next to
ConnectorExpiryOverride and upsertConnectorExpiryOverride rather than
in a separate file.

Collapse the test functions into table-driven form matching the
project's existing idiom (TestRefreshTokenExpirationScenarios /
TestRefreshTokenAuthTime in refreshhandlers_test.go):

- 11x TestValidateConnectorExpiry_* -> TestValidateConnectorExpiry
  with 11 t.Run subcases.
- 3x TestBuildConnectorExpiryOverride_* -> TestBuildConnectorExpiryOverride
  with 3 subcases.
- 4x TestBuildExpiryCeilings* in cmd/dex/serve_test.go -> one
  table-driven test with 4 subcases.
- TestUpsertConnectorExpiryOverride stays standalone (state-mutating).

Other minor cleanups:
- errMatch -> wantErrContains for naming consistency with the rest of
  the package (server_test.go, api_test.go, etc.).
- Subcase names use plain prose rather than period/equals punctuation.
- disable/enable bool locals renamed to disableRotation/enableRotation
  so the names don't invert meaning.
- Package-level discardLogger var replaces per-call
  slog.New(slog.DiscardHandler) in buildConnectorExpiryOverride.
- Trimmed WHAT comments and the "single source of truth" docstring
  over-claim on validateConnectorExpiry.

Signed-off-by: Helio Machado <0x2b3bfa0+git@googlemail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow for registered clients to set client-specific expiry

2 participants