fix(sdk)!: reclassify KAS 400 errors — distinguish tamper from misconfiguration#3166
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request refactors the error handling within the SDK to provide a more accurate classification of KAS-related errors. By introducing a dedicated error type for KAS client/configuration issues and adjusting the parentage of an existing error, it prevents false positives when checking for data tampering and improves the clarity of error propagation. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Error's path, once tangled, Now unwinds, a clearer view. Tamper's false alarm, gone. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request correctly reclassifies ErrRewrapBadRequest to be a KAS request error instead of a tamper error, which improves error handling clarity. The introduction of ErrKASRequestError is a good step towards better error categorization. I've suggested one minor improvement to also bring ErrRewrapForbidden into this new error hierarchy for consistency.
Benchmark results, click to expandBenchmark authorization.GetDecisions Results:
Benchmark authorization.v2.GetMultiResourceDecision Results:
Benchmark Statistics
Bulk Benchmark Results
TDF3 Benchmark Results:
|
X-Test Failure Report❌ js-pull-3166 |
Benchmark results, click to expandBenchmark authorization.GetDecisions Results:
Benchmark authorization.v2.GetMultiResourceDecision Results:
Benchmark Statistics
Bulk Benchmark Results
TDF3 Benchmark Results:
|
|
Companion xtest PR: opentdf/tests#421 That PR updates |
…421) ## Summary - Add `assert_kas_request_error` for policy binding failure tests (`test_tdf_with_unbound_policy`, `test_tdf_with_altered_policy_binding`) - These tests now correctly expect KAS request errors (400) rather than tamper/integrity errors - Backward compatible: accepts both new (`KAS request error`, `rewrap request 400`) and legacy (`tamper`, `InvalidFileError`) error strings - Remove unused `"wrap"` case from `assert_tamper_error` ## Context Companion to opentdf/platform#3166, which reclassifies `ErrRewrapBadRequest` away from `ErrTampered` and under a new `ErrKASRequestError`. Policy binding mismatches cause KAS to return 400 Bad Request — a request-level error, not a tamper/integrity issue. ## Test plan - [ ] Merge this PR first - [ ] Verify opentdf/platform#3166 xtest jobs pass after this lands 🤖 Generated with [Claude Code](https://claude.com/claude-code) Signed-off-by: Mary Dickson <mary.dickson@virtru.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
@elizabethhealy identified a case where this change can raise a false negative. Moving back to draft while investigating what's possible in KAS. |
Benchmark results, click to expandBenchmark authorization.GetDecisions Results:
Benchmark authorization.v2.GetMultiResourceDecision Results:
Benchmark Statistics
Bulk Benchmark Results
TDF3 Benchmark Results:
|
X-Test Failure Report |
Benchmark results, click to expandBenchmark authorization.GetDecisions Results:
Benchmark authorization.v2.GetMultiResourceDecision Results:
Benchmark Statistics
Bulk Benchmark Results
TDF3 Benchmark Results:
|
Benchmark results, click to expandBenchmark authorization.GetDecisions Results:
Benchmark authorization.v2.GetMultiResourceDecision Results:
Benchmark Statistics
Bulk Benchmark Results
TDF3 Benchmark Results:
|
…ic for tamper Instead of changing the gRPC status code for policy binding failures (which would leak information about secret-key computations), make non-secret KAS 400 errors descriptive (e.g. "unsupported key type", "key access object is nil") so the SDK can distinguish them from the generic "bad request" that policy binding failures produce. SDK logic: generic "bad request" from KAS → ErrRewrapBadRequest (under ErrTampered, potential integrity failure); specific message → ErrKASRequestError (misconfiguration, not tamper). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Mary Dickson <mary.dickson@virtru.com>
…KAS errors Document why policy binding and DEK decryption failures intentionally use generic "bad request" messages — to avoid leaking information about computations involving secret key material. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Mary Dickson <mary.dickson@virtru.com>
Policy body parse failures and binding encoding errors may indicate tamper — keep them as generic "bad request" so the SDK classifies them under ErrTampered. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Mary Dickson <mary.dickson@virtru.com>
Extract "bad request" into kasGenericBadRequest constant in the SDK so both sides of the contract are linked. Add KAS-side comment referencing the SDK constant. Expand test coverage with subtests for all descriptive KAS messages and a substring false-match edge case. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Mary Dickson <mary.dickson@virtru.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Mary Dickson <mary.dickson@virtru.com>
…r messages Address code review feedback: - Change kasGenericBadRequest from "bad request" to "desc = bad request" to anchor the match to the gRPC status description field, avoiding false positives from middleware or error wrapping - Replace dynamic err400(err.Error()) with fixed descriptive message to prevent information leakage and accidental tamper classification - Add all 10 descriptive KAS messages to the test table (was 5) - Add test for middleware-injected "bad request" without desc prefix - Update comments to accurately describe substring matching behavior Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Mary Dickson <mary.dickson@virtru.com>
When verifyRewrapRequests returned policyErr, tdf3Rewrap discarded the
per-KAO results (which contained generic "bad request" tamper signals)
and returned err400("invalid request") instead. The SDK classified
"invalid request" as ErrKASRequestError, silently losing the tamper
signal for corrupted policy bodies.
Fix: always store per-KAO results before continuing, regardless of the
error type from verifyRewrapRequests. This ensures the SDK receives the
per-KAO "bad request" entries and classifies them as ErrTampered.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Mary Dickson <mary.dickson@virtru.com>
A 400 (misconfiguration) and a 403 (authorization denied) have different failure modes and different remediation paths. Nesting ErrRewrapForbidden under ErrKASRequestError repeats the same lumping problem this PR set out to fix — callers catching ErrKASRequestError to retry with different config would incorrectly retry on 403s too. The ticket (DSPX-2606) explicitly specified ErrKASRequestError should be "separate from ErrRewrapForbidden (authorization)." Restore that separation: ErrRewrapForbidden is now a standalone sentinel again. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Mary Dickson <mary.dickson@virtru.com>
8ec8182
12ee687 to
8ec8182
Compare
Invalidated by push of 8ec8182
Benchmark results, click to expandBenchmark authorization.GetDecisions Results:
Benchmark authorization.v2.GetMultiResourceDecision Results:
Benchmark Statistics
Bulk Benchmark Results
TDF3 Benchmark Results:
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@service/kas/access/rewrap.go`:
- Around line 796-800: The base64 decode error path around
base64.StdEncoding.Decode(policyBinding, []byte(policyBindingB64Encoded))
currently reports a generic err400("bad request"); change that to return a
descriptive client-facing error (e.g. err400("invalid policy binding encoding"))
so the response matches the logged message and clarifies the failure; update the
call to failedKAORewrap(results, kao, ...) to pass the new descriptive error
while leaving the existing p.Logger.WarnContext(ctx, "invalid policy binding
encoding", slog.Any("error", err)) intact.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 5e09cb2d-8fd5-4489-97ec-b51a70a866b6
📒 Files selected for processing (5)
docs/Contributing.mdsdk/tdf.gosdk/tdf_test.gosdk/tdferrors.goservice/kas/access/rewrap.go
Problem
Today, every KAS
InvalidArgument(400) error is classified asErrRewrapBadRequest→ErrTampered, regardless of the actual cause. That meanserrors.Is(err, sdk.ErrTampered)returnstruefor misconfiguration errors like a wrong key ID, unsupported key type, or missing EC-wrapped support — not just actual integrity failures like policy binding mismatches or DEK decryption failures.This makes
ErrTamperedunreliable as a signal. SDK consumers cannot distinguish "this TDF was tampered with" from "your KAS setup is wrong," which undermines the tamper detection API.Approach
Changing gRPC status codes for policy binding failures would leak information about computations involving secret key material. Instead, this PR splits the error signal at the message level:
KAS (
service/kas/access/rewrap.go): non-secret 400s now return descriptive messages (e.g."unsupported key type","key access object is nil","ec-wrapped not enabled"). Errors involving secret key material — policy binding, DEK decryption, corrupted policy body, malformed binding encoding — keep the generic"bad request"to avoid leaking secret-derived information.SDK (
sdk/tdf.go): classifies errors based on the message. The substring"desc = bad request"anchored to the gRPC status description field = potential tamper (ErrRewrapBadRequestunderErrTampered); anything else withInvalidArgument= misconfiguration (ErrKASRequestError, notErrTampered). Per-KAO errors are serialized as plain strings through the proto response (not as gRPC status errors), so substring matching is the only classification mechanism available.Shared contract: the generic message pattern is defined as
kasGenericBadRequestinsdk/tdferrors.gowith a cross-reference comment inservice/kas/access/rewrap.goso both sides stay in sync. The companion xtest PR (fix(xtest): rename policy binding assertions to reflect tamper classification tests#422) provides runtime enforcement across Go, Java, and JS SDKs.Error classification
errors.Is(ErrTampered)?"bad request"(generic)ErrRewrapBadRequest"bad request"(generic)ErrRewrapBadRequest"bad request"(generic)ErrRewrapBadRequest"unsupported key type")ErrKASRequestError"forbidden"ErrRewrapForbiddenBreaking changes
1.
ErrTamperedno longer matches KAS misconfiguration errorserrors.Is(err, sdk.ErrTampered)no longer matches KAS misconfiguration errors (descriptive 400s). Consumers who relied on this to catch all KAS 400s should add a check forErrKASRequestError:2.
ErrRewrapForbiddenis now underErrKASRequestErrorErrRewrapForbiddennow wrapsErrKASRequestError, meaningerrors.Is(err, ErrKASRequestError)matches both misconfiguration 400s and forbidden 403s. PreviouslyErrRewrapForbiddenwas a standalone error. This groups all KAS request-level errors under one sentinel, separate from the tamper hierarchy.3. Pre-existing descriptive
err400calls reclassified16 pre-existing
err400calls in the early request validation path (e.g."invalid request body","missing request body","clientPublicKey failure","bad key for rewrap") already had descriptive messages. Under the old code, ALLInvalidArgumenterrors were classified asErrRewrapBadRequest(tamper) regardless of message. Now they are correctly classified asErrKASRequestError(misconfiguration). These are all client-side validation errors that do not involve secret key material, so this is the correct classification.4.
tdf3Rewrapno longer short-circuits on per-KAO errorsPreviously,
tdf3Rewraphad an early-return path whereverifyRewrapRequestserrors (other thanerrNoValidKeyAccessObjects) would short-circuit with a top-levelerr400("invalid request"), discarding per-KAO results. Now per-KAO results are always preserved in the results map even when errors occur, so tamper signals from individual KAOs reach the SDK instead of being replaced by a generic top-level error.Test plan
TestGetKasErrorToReturn— all 11 descriptive messages, generic "bad request" →ErrTampered, desc-prefix anchoring, middleware false-positive preventiontamper-error-splitfeature flag and strict assertion that policy binding failures produce tamper errors (not KAS request errors)🤖 Generated with Claude Code
Summary by CodeRabbit
Documentation
Bug Fixes
Tests