feat: per-connector expiry overrides#4785
Conversation
Adds an optional `expiry` block to each static connector in the YAML config. Fields left unset inherit the top-level `expiry` configuration. Supported overrides: - `idTokens` - `refreshTokens.absoluteLifetime` - `refreshTokens.validIfNotUsedFor` - `refreshTokens.reuseInterval` - `refreshTokens.disableRotation` Per-connector `idTokens` and `refreshTokens.absoluteLifetime` are rejected at config load if they exceed the corresponding global value, since signing key retention is derived from the global maximums. `authRequests` and `deviceRequests` are intentionally left global: `deviceRequests` is issued before the user picks a connector, and `authRequests` have no meaningful connector-scoped semantics. Addresses the per-connector variant of the need raised in dexidp#3557. Signed-off-by: Helio Machado <0x2b3bfa0+git@googlemail.com>
- Re-home tests into the existing per-source-file test files: cmd/dex/serve_test.go (buildConnectorExpiryOverrides tests), cmd/dex/config_test.go (YAML roundtrip), and server/oauth2_test.go (resolver and newIDToken tests). - Extract parseConnectorExpiry to flatten the override-vs-inherit logic. - Drop the refresh-token absoluteLifetime cap: refresh tokens are validated against storage, not signing keys, so key retention does not constrain their lifetime. Scope the safety check to idTokens. - Shorten error messages to match the project's terse style. - Apply golangci-lint fmt (struct field alignment in server/server.go). Signed-off-by: Helio Machado <0x2b3bfa0+git@googlemail.com>
- Drop tests that rewrapped stdlib behavior (invalid duration parse), tautological cases (empty input), and wall-clock lifetime probes that really test RefreshTokenPolicy. - Keep only the validation-rule test and one consolidated case covering override/inherit for both idTokens and refresh tokens. - Remove the YAML roundtrip tests: they exercise struct tags, not our own unmarshaler logic. Signed-off-by: Helio Machado <0x2b3bfa0+git@googlemail.com>
- Reuse the existing server.value() helper in idTokensValidForConn instead of a hand-rolled zero-check. - Drop the duplicate per-connector refresh-token summary log; thread the connector_id into NewRefreshTokenPolicy via logger.With so the existing per-field logs carry it. - Trim WHAT-only docstrings and tighten the rationale on ConnectorRefreshToken and buildConnectorExpiryOverrides. Signed-off-by: Helio Machado <0x2b3bfa0+git@googlemail.com>
|
Hello @0x2b3bfa0 and thanks for providing a PR. Two questions before reviewing:
|
Using the default Dex server in ArgoCD with two different connectors: one for humans (e.g. Microsoft Entra) and other for machines (e.g. GitHub Actions [1, 2]) Typically, human tokens are expected to expire in > 1 hour, while machine tokens are expected to expire in < 5 minutes. However, Implementhing per-connector expiry configuration will allow me to provide both the convenience of long sessions for humans, and the security of short sessions for machines. While waiting for this feature, I have chosed the compromise of 1 hour for both humans and machines, albeit it's too short for the former and too long for the latter. I could maybe shorten the global expiry and then use refresh tokens once argoproj/argo-cd#27777 is released, instead of relying on individual expiry configuration, but my previous attempts to find a workaround were unsuccessful. |
I'd say that expiration needs to be enforced, so |
Apply the invariant "per-connector value must not exceed the global value" to every time-based refresh-token field, not just idTokens. A global value left unset means "no ceiling", so an override is accepted. disableRotation stays exempt: it's a policy toggle rather than a quantity, so "stricter" has no natural direction. Rejecting violations at config load rather than silently clamping at issuance keeps operator intent auditable: if an operator later drops the global below a persisted per-connector value, the next startup fails with a clear error. The same rule extends to future per-client overrides with the connector (or global, if unset) as their ceiling. Globals are parsed once into a small refreshCeilings struct and the per-field checks run through a table, so adding a new ceiling-bound field is one line in two places. Signed-off-by: Helio Machado <0x2b3bfa0+git@googlemail.com>
e860130 to
0b26d1c
Compare
Persist per-connector expiry overrides in the storage layer (via a new Expiry field on storage.Connector) so the rule "per-connector value must not exceed the global value" can be enforced at every write path, not just static config load: - Static YAML connectors: validated when loaded into storage and re-validated when the server reads them at startup. Invalid values refuse to start dex. - gRPC CreateConnector / UpdateConnector: validate against the current global ceilings before persisting. Rejected writes leave both the storage and the in-memory override map untouched. - gRPC DeleteConnector: drops the matching in-memory override. - Restart with a stale persisted override (global lowered below it): startup re-validation surfaces the conflict as an error. The validation rule and override-resolution logic live in a new server/expiry.go and are exercised by every code path that can change a connector's expiry, so the hierarchy is preserved without any silent clamping. disableRotation remains exempt: it is a policy toggle, not a quantity, so "stricter" has no natural direction. Storage backends: - storage.Connector gains an optional Expiry field with JSON tags. - SQL: new nullable expiry column + migration. - ent: typed JSON field, regenerated code. - kubernetes / etcd / memory pick up the change through the existing JSON-based serialization. API surface: - proto: ConnectorExpiry / ConnectorRefreshExpiry / ConnectorExpiryUpdate messages, with optional bool DisableRotation for the inherit-vs-override distinction. - ConnectorExpiryUpdate uses the standard "absent leaves alone, present with nil Value clears, present with Value sets" wrapper. Conformance tests now roundtrip Expiry across every backend. Signed-off-by: Helio Machado <0x2b3bfa0+git@googlemail.com>
Per-connector overrides may now only tighten rotation, not loosen it. Concretely: when the global expiry.refreshTokens.disableRotation is false (rotation enabled — the stricter policy, shorter replay window after compromise), a connector setting disableRotation: true is rejected. The reverse direction (global disabled, connector enables) remains permitted as a tightening. This closes the one gap left by the prior "global is the ceiling" design: ValidateConnectorExpiry now covers every field of ConnectorExpiry instead of leaving disableRotation as an exception. Signed-off-by: Helio Machado <0x2b3bfa0+git@googlemail.com>
- Fix missing Expiry round-trip in the kubernetes storage backend mirror (would silently drop overrides on a real cluster; conformance tests skip without a live cluster). - Honor ContinueOnConnectorFailure in the startup validation loop; a bad persisted override no longer fails startup when the flag is set. - Rename ConnectorRefreshToken -> ConnectorRefreshExpiry for symmetry with the storage / proto types. - Unexport ValidateConnectorExpiry and drop the ExpiryCeilings() method in favor of direct field access (same package). - Inline single-call-site helpers (connectorExpiryToStorage, connectorExpiryToProto, marshalConnectorExpiry) matching the GrantTypes / inline-struct-copy idiom already used in the file. - Drop log spam: NewRefreshTokenPolicy emits per-field Info logs that are useful for the global policy but spam at N connectors x 4 fields. Pass a discard logger from buildConnectorExpiryOverride; add a single audit line in the startup loop per connector that has an override. - Trim WHAT-only docstrings, fix stale mu field comment, tighten the SQL nil-vs-null comment. - Drop omitempty from cmd/dex Connector.Expiry tag for sibling consistency. Signed-off-by: Helio Machado <0x2b3bfa0+git@googlemail.com>
RefreshTokenPolicy.CompletelyExpired and ExpiredBecauseUnused treat a
parsed duration of 0 as "expiration disabled". The previous validation
compared numerically (d > ceiling), which let "0s" pass under any
positive global ceiling — semantically meaning infinite lifetime,
strictly looser than the global.
A connector administrator (gRPC API write access or YAML config) could
install:
expiry:
refreshTokens:
absoluteLifetime: "0s"
validIfNotUsedFor: "0s"
and end up with refresh tokens that never expire on either axis, while
the global policy enforces a positive bound. Stolen tokens issued
through that connector would survive indefinitely.
Fix: extend checkCeiling with a zeroDisables flag; reject d == 0 in
the presence of a positive ceiling for absoluteLifetime and
validIfNotUsedFor. idTokens and reuseInterval are exempt:
- idTokens: zero falls back to the global via value() at runtime
- reuseInterval: zero means "no reuse window" — stricter, not looser
Regression tests cover the rejection and the reuseInterval exception.
Signed-off-by: Helio Machado <0x2b3bfa0+git@googlemail.com>
@nabokihms, I have updated the pull request with some LLM-assisted modifications addressing this case; I hope they're correct. 🫣 |
Fold server/expiry.go into server/server.go and server/expiry_test.go into server/server_test.go. The validation, ceiling-check, and override-build helpers are server-construction concerns called from NewServer and from API write paths on *Server, so they belong next to ConnectorExpiryOverride and upsertConnectorExpiryOverride rather than in a separate file. Collapse the test functions into table-driven form matching the project's existing idiom (TestRefreshTokenExpirationScenarios / TestRefreshTokenAuthTime in refreshhandlers_test.go): - 11x TestValidateConnectorExpiry_* -> TestValidateConnectorExpiry with 11 t.Run subcases. - 3x TestBuildConnectorExpiryOverride_* -> TestBuildConnectorExpiryOverride with 3 subcases. - 4x TestBuildExpiryCeilings* in cmd/dex/serve_test.go -> one table-driven test with 4 subcases. - TestUpsertConnectorExpiryOverride stays standalone (state-mutating). Other minor cleanups: - errMatch -> wantErrContains for naming consistency with the rest of the package (server_test.go, api_test.go, etc.). - Subcase names use plain prose rather than period/equals punctuation. - disable/enable bool locals renamed to disableRotation/enableRotation so the names don't invert meaning. - Package-level discardLogger var replaces per-call slog.New(slog.DiscardHandler) in buildConnectorExpiryOverride. - Trimmed WHAT comments and the "single source of truth" docstring over-claim on validateConnectorExpiry. Signed-off-by: Helio Machado <0x2b3bfa0+git@googlemail.com>
expiryblock per static connector, overridingidTokensandrefreshTokens.*from the global expiry configidTokensvalues exceeding the global maximum, since the signer's key retention window is sized from it; see Allow for registered clients to set client-specific expiry #3557)Closes #3557 ?