feat: add user metadata support for RLS policies #825

TylerHillery wants to merge 22 commits into master
Conversation
Pull Request Test Coverage Report for Build 21612994975 (Coveralls)
@TylerHillery awesome! Can we also pass in the
I thought we couldn't do this because we don't get the object metadata until after the upload. We could switch the order, but then we would have to ensure we delete the object if the user can't upload, and we would be doing the "expensive" upload work when we didn't have to. Much quicker to test RLS first and skip uploading if we don't have to. Here is the `upload` method:

```ts
/**
 * Extracts file information from the request and uploads the buffer
 * to the remote storage
 * @param request
 * @param options
 */
async upload(request: UploadRequest) {
  // NOTE: canUpload() gets called inside prepareUpload
  const version = await this.prepareUpload(request)
  try {
    const file = request.file
    const s3Key = this.location.getKeyLocation({
      tenantId: this.db.tenantId,
      bucketId: request.bucketId,
      objectName: request.objectName,
    })
    // NOTE: we don't get object metadata until here
    const objectMetadata = await this.backend.uploadObject(
      storageS3Bucket,
      s3Key,
      version,
      file.body,
      file.mimeType,
      file.cacheControl,
      request.signal
    )
    if (request.file.xRobotsTag) {
      objectMetadata.xRobotsTag = request.file.xRobotsTag
    }
    if (file.isTruncated()) {
      throw ERRORS.EntityTooLarge()
    }
    return this.completeUpload({
      ...request,
      version,
      objectMetadata: objectMetadata,
      userMetadata: { ...request.userMetadata },
    })
  } catch (e) {
    await ObjectAdminDelete.send({
      name: request.objectName,
      bucketId: request.bucketId,
      tenant: this.db.tenant(),
      version: version,
      reqId: this.db.reqId,
    })
    throw e
  }
}
```
For other fields like etag and lastModified, it's fine if they are empty on the first RLS check.
```ts
.from(bucketName)
.signUploadObjectUrl(objectName, urlPath as string, uploadSignedUrlExpirationTime, owner, {
  upsert: request.headers['x-upsert'] === 'true',
  userMetadata: userMetadata,
```
I think we don't actually support userMetadata on upload signed urls (which we should), so I'm glad you found this.

I believe we need to store the userMetadata in the returned JWT payload; otherwise, this information will be lost during the signedUpload request. We need to be careful, though, if we decide to store the metadata in the JWT, because it can be abused with large payloads. For now we can restrict the amount of data we store in the JWT to something like 1MB. I think we already have a constant MAX_CUSTOM_METADATA_SIZE, which can be reused (refactored into a validation function, potentially in limits.ts).

But we also need to add the second part (on the upload route) to store the userMetadata received from the JWT. This way, the metadata can't be tampered with.
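The size restriction mentioned above could be sketched roughly like this. This is a minimal illustration, not the project's actual code: the helper name `assertMetadataFitsInToken` is hypothetical, and the 1 MiB value is only the figure discussed in this thread (the real `MAX_CUSTOM_METADATA_SIZE` may differ).

```typescript
// Hypothetical guard: cap the serialized size of user metadata before it is
// embedded in a signed-upload JWT. Name and value are illustrative only.
const MAX_CUSTOM_METADATA_SIZE = 1024 * 1024 // 1 MiB, per the discussion above

function assertMetadataFitsInToken(userMetadata: Record<string, unknown>): void {
  const serialized = JSON.stringify(userMetadata)
  // Byte length, not string length: multi-byte characters count fully.
  const bytes = Buffer.byteLength(serialized, 'utf8')
  if (bytes > MAX_CUSTOM_METADATA_SIZE) {
    throw new Error(`user metadata too large: ${bytes} bytes`)
  }
}

// Small payloads pass; oversized ones throw before any JWT is signed.
assertMetadataFitsInToken({ team: 'design' })
let rejected = false
try {
  assertMetadataFitsInToken({ blob: 'x'.repeat(2 * 1024 * 1024) })
} catch {
  rejected = true
}
```

Running the check before signing keeps oversized metadata out of the token entirely, rather than producing a token that downstream services then reject.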
I will do a follow up PR to handle the upload part.
Force-pushed 57b96fb to 05a322a
I have updated canUpload to accept a metadata parameter that gets passed to the RLS check. Only content-length and content-type get passed through. Multipart uploads required a little more work as the
src/storage/uploader.ts
Outdated
```ts
bucketId: string
objectName: string
file: FileUpload
userMetadata: Record<string, unknown> | undefined
```
why do we diverge from optionality? `userMetadata?: Record<string, unknown>`
While implementing this change I found it very easy to forget to pass userMetadata somewhere it should be passed, so I made it required but allowed undefined when it's not needed.

I called this out in the PR description as I wasn't sure if there was a better way and wasn't 100% confident about this solution. If you think optionality is better here, I am happy to change it back.
I see your point, but I would go with optional since it's more idiomatic. Maybe as a cleanup afterwards.
reverted back to optional
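The tradeoff discussed above can be sketched with two illustrative interfaces (these are not the real `UploadRequest` definitions): a required-but-possibly-`undefined` field makes the compiler flag any call site that forgets the field, while an optional field lets a forgotten pass-through compile silently.

```typescript
// Illustrative types only, not the project's actual interfaces.
interface RequiredStyle {
  objectName: string
  // Must appear at every call site, even as an explicit `undefined`,
  // so a forgotten pass-through is a compile error.
  userMetadata: Record<string, unknown> | undefined
}

interface OptionalStyle {
  objectName: string
  // May be omitted entirely; forgetting to forward it compiles silently.
  userMetadata?: Record<string, unknown>
}

const explicit: RequiredStyle = { objectName: 'a.txt', userMetadata: undefined }
const omitted: OptionalStyle = { objectName: 'a.txt' } // legal; metadata silently dropped
// const broken: RequiredStyle = { objectName: 'a.txt' } // compile error: userMetadata missing
```

Both spellings produce the same runtime value; the difference is purely which mistakes the type checker can catch, which is why the stricter form was attractive while wiring the field through many call sites.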
src/storage/database/knex.ts
Outdated
```diff
  owner_id: owner,
- user_metadata: metadata,
+ user_metadata: userMetadata,
+ metadata: metadata,
```
don't we need a guard rule for the new column, similar to `user_metadata`?
yes you're right. I added a guard rule but it feels a bit hacky. The reason I didn't add it to the rules in normalizedColumns is that a metadata column does exist in the objects table. It's my understanding any metadata column would get removed regardless of the table it's selecting from.

So instead I checked if the latest migration was >= s3-multipart-uploads-metadata and added it to the list of columns to select.

I almost feel normalizedColumns should be table dependent. This might be better in a follow-up PR though.
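The migration-gated column selection described above could look roughly like this. Everything here is a sketch: the migration list, `hasMigration`, and `multipartColumns` are hypothetical names, and the real code compares against the tenant's actual applied-migration state.

```typescript
// Hypothetical ordered migration list; only relative order matters here.
const migrationOrder = [
  'initial',
  'custom-metadata',
  's3-multipart-uploads-metadata', // assumed to add the `metadata` column
]

// True when the tenant's latest applied migration is at or past `required`.
function hasMigration(latestApplied: string, required: string): boolean {
  return migrationOrder.indexOf(latestApplied) >= migrationOrder.indexOf(required)
}

// Select the new column only when the migration that created it has run.
function multipartColumns(latestApplied: string): string[] {
  const columns = ['id', 'version', 'user_metadata']
  if (hasMigration(latestApplied, 's3-multipart-uploads-metadata')) {
    columns.push('metadata')
  }
  return columns
}
```

The point of the guard is that a tenant still on an older migration never gets a query referencing a column that doesn't exist yet in its schema.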
Force-pushed 05a322a to 5d4edd0
Force-pushed 5d4edd0 to 8c316f6
📝 Walkthrough

Adds a nullable jsonb `metadata` column for s3 multipart uploads.

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant HTTP_Route as HTTP Route / TUS Entry
    participant Uploader
    participant Storage
    participant DB
    participant RLS
    Client->>HTTP_Route: Upload request (headers: x-metadata, content-type, content-length)
    HTTP_Route->>Uploader: call canUpload/prepareUpload (includes userMetadata, metadata)
    Uploader->>Storage: create/prepare multipart (passes userMetadata, metadata)
    Storage->>DB: insert/find multipart row (user_metadata, metadata)
    DB->>RLS: evaluate policy with user_metadata & metadata
    RLS-->>DB: allow/deny
    DB-->>Storage: result
    alt allowed
        Storage->>Uploader: success (tokens/parts)
        Uploader-->>Client: signed URL / upload acceptance
    else denied
        Uploader-->>Client: error (403/400)
    end
```
```ts
properties: {
  'x-upsert': { type: 'string' },
  'content-type': { type: 'string' },
  'content-length': { type: 'string' },
```
don't we need `x-metadata` as well, since we try to read it below?
```ts
const customMd = request.headers['x-metadata']

if (typeof customMd === 'string') {
  userMetadata = parseUserMetadata(customMd)
```
this parser returns `Record<string, string>` by casting, due to its expectation (and S3 compatibility), but there is no validation; values could be anything.

This is an existing helper, so feel free to add a TODO to be addressed later.
I looked into the supabase-js and S3 parts of this: the type is wrong but the handling is correct, and we return an x-amz-meta-missing header for non-string/non-printable values.
Another thing to note: when it's set via a multipart/form-data content-type, we check/limit its size, but not when it comes from a header. However, the limit is 1MB (I first misread it as 1kb, in which case a header could definitely have exceeded it), so this is a valid assumption. We could reuse the code instead of having two parsers as a cleanup, but not in this PR.
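The validation the TODO above asks for could be sketched like this. This is an assumption-laden illustration: `parseUserMetadataStrict` is a hypothetical name, the base64-encoded-JSON transport for `x-metadata` and the 1 MiB cap are assumptions from this thread, and the printable-ASCII rule mirrors S3's user-metadata convention rather than the project's actual helper.

```typescript
// Sketch only: assumed 1 MiB cap and base64-JSON encoding for x-metadata.
const METADATA_LIMIT_BYTES = 1024 * 1024

function parseUserMetadataStrict(header: string): Record<string, string> {
  if (Buffer.byteLength(header, 'utf8') > METADATA_LIMIT_BYTES) {
    throw new Error('x-metadata header too large')
  }
  const decoded: unknown = JSON.parse(Buffer.from(header, 'base64').toString('utf8'))
  if (typeof decoded !== 'object' || decoded === null || Array.isArray(decoded)) {
    throw new Error('x-metadata must encode a JSON object')
  }
  const out: Record<string, string> = {}
  for (const [key, value] of Object.entries(decoded)) {
    // Keep only printable-ASCII string values, mirroring S3's rules.
    if (typeof value !== 'string' || !/^[\x20-\x7e]*$/.test(value)) {
      throw new Error(`metadata value for "${key}" is not a printable string`)
    }
    out[key] = value
  }
  return out
}
```

Unlike a bare cast, every failure mode here (oversized header, non-object payload, non-string value) is rejected explicitly instead of flowing through as a wrongly-typed `Record<string, string>`.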
```ts
const uploader = new Uploader(this.storage.backend, this.storage.db, this.storage.location)

const multipart = await this.shouldAllowPartUpload(UploadId, ContentLength, maxFileSize)
```
this ordering can cause issues for valid uploads because it mutates the state.
For example (numbers are arbitrary)
- in progress 50mb and new part is 5mb
- in progress becomes 55mb
- RLS fails
- next request in progress becomes 60mb
- RLS checks but size fails because it was expecting at most 55mb
good catch, changed the order to have canUpload called before shouldAllowPartUpload. What I don't like is that this adds an extra call to findMultipartUpload, as it is also called in shouldAllowPartUpload. I thought about moving canUpload inside shouldAllowPartUpload, but it uses asSuperUser before starting the transaction, which would bypass RLS.

Can you confirm my understanding? Is there any way to do this that I'm missing that doesn't involve the extra findMultipartUpload call?
```diff
  const multipart = await this.storage.db
    .asSuperUser()
-   .findMultipartUpload(UploadId, 'id,version')
+   .findMultipartUpload(UploadId, 'id,version,user_metadata,metadata')
```
we need the column guard for these reads similar to create as well
added the guard to the findMultipartUpload with a TODO comment to refactor once we change normalizeColumns to be table aware.
src/storage/object.ts
Outdated
```ts
objectName: destinationKey,
owner,
isUpsert: upsert,
userMetadata: userMetadata,
```
similar to metadata, aren't we supposed to use origin.user_metadata if copyMetadata is false, since that's the value that eventually gets written?
yes we are, changed it to the same logic we use for metadata
I take it back; using the same logic breaks some tests. For userMetadata we don't want to merge the two: either use all of the origin metadata when copying, or use the new destination metadata.
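The rule settled on above (all-origin or all-destination, never a merge) reduces to a single selection, sketched here with illustrative names:

```typescript
// Sketch of the copy-metadata rule: take the origin's metadata wholesale,
// or the newly supplied destination metadata, never a merge of the two.
function pickUserMetadata(
  copyMetadata: boolean,
  origin: Record<string, string> | undefined,
  destination: Record<string, string> | undefined
): Record<string, string> | undefined {
  return copyMetadata ? origin : destination
}
```

Because the result is always exactly one of the two inputs, a key present only in the non-selected record can never leak into the copied object, which is what the failing merge-based tests were guarding against.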
Can we also add more tests please?

metadata is checked at sign time but mutable at upload time (to be addressed separately by me when this is merged)
Force-pushed 8c316f6 to 6ea5123
Added all the above except for
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/storage/object.ts (1)
805-817: ⚠️ Potential issue | 🔴 Critical: Signed-upload RLS validation is bypassable because policy-checked metadata is not bound to the JWT.

At lines 805–812, `canUpload` evaluates `options.userMetadata` and `options.metadata` against row-level security policies, but at lines 815–817 the generated token only contains `{ owner, url, upsert }`. The downstream upload handlers (`uploadSignedObject` and the tus lifecycle) extract only those three fields and never enforce that request metadata matches the values validated at signing time. An attacker can obtain a signed URL with policy-compliant metadata, then submit different metadata during upload, bypassing the RLS check.

Add `userMetadata` and `metadata` to `SignedUploadToken`, include them in the signing payload, and enforce an exact match (or reject caller-provided values) during upload execution.
userMetadataandmetadatatoSignedUploadToken, include them in the signing payload, and enforce exact match (or reject caller-provided values) during upload execution.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/storage/object.ts` around lines 805 - 817, canUpload currently validates options.userMetadata and options.metadata against RLS but the signed token only includes {owner, url, upsert}, allowing an attacker to change metadata during upload; update the SignedUploadToken type to include userMetadata and metadata, include those fields in the payload passed to signJWT (in the code path that calls getJwtSecret and signJWT), and then modify the upload handlers (uploadSignedObject and the tus lifecycle handler) to read the token’s userMetadata/metadata and enforce an exact match (or reject any caller-provided metadata) before accepting the upload; use the existing canUpload, getJwtSecret, signJWT, SignedUploadToken, uploadSignedObject, and tus lifecycle handlers as the touchpoints.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/storage/object.ts`:
- Around line 805-817: canUpload currently validates options.userMetadata and
options.metadata against RLS but the signed token only includes {owner, url,
upsert}, allowing an attacker to change metadata during upload; update the
SignedUploadToken type to include userMetadata and metadata, include those
fields in the payload passed to signJWT (in the code path that calls
getJwtSecret and signJWT), and then modify the upload handlers
(uploadSignedObject and the tus lifecycle handler) to read the token’s
userMetadata/metadata and enforce an exact match (or reject any caller-provided
metadata) before accepting the upload; use the existing canUpload, getJwtSecret,
signJWT, SignedUploadToken, uploadSignedObject, and tus lifecycle handlers as
the touchpoints.
📒 Files selected for processing (1)
src/storage/object.ts
Force-pushed f184734 to b4676ca
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/storage/object.ts (1)
805-819: ⚠️ Potential issue | 🔴 Critical: Metadata passed to the authorization check is not bound to signed upload tokens.

At line 805, `canUpload()` validates the operation using `options.userMetadata` and `options.metadata`. However, the JWT signed at line 816 omits both fields. During upload execution in `uploadSignedObject.ts`, `uploadFromRequest()` extracts metadata from the current request, not from the token, allowing a user to sign a URL with one metadata set and upload with a different one, bypassing the original authorization check.

Fix: include the metadata fields in the signed JWT payload:
Suggested fix:

```diff
- const token = await signJWT(
-   { owner, url, upsert: Boolean(options?.upsert) },
+ const token = await signJWT(
+   {
+     owner,
+     url,
+     upsert: Boolean(options?.upsert),
+     userMetadata: options?.userMetadata,
+     metadata: options?.metadata,
+   },
    urlSigningKey,
    expiresIn
  )
```

Also validate that request metadata at upload time matches the token-bound metadata (or re-run `canUpload()` with the same metadata set under the original authorization context).
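The upload-time exact-match check suggested in the review could be sketched as follows. The function names are hypothetical, and the key-sorted JSON comparison is a simplification that is only sound for flat records; nested metadata would need a real deep-equal.

```typescript
// Sketch: compare the metadata the client sends at upload time against the
// metadata that was bound into the signed token at sign time.
function canonical(md: Record<string, unknown> | undefined): string {
  if (!md) return ''
  // Sort keys so logically-equal records serialize identically.
  return JSON.stringify(
    Object.fromEntries(Object.entries(md).sort(([a], [b]) => a.localeCompare(b)))
  )
}

function metadataMatchesToken(
  fromRequest: Record<string, unknown> | undefined,
  fromToken: Record<string, unknown> | undefined
): boolean {
  return canonical(fromRequest) === canonical(fromToken)
}
```

An alternative to comparing is to ignore request-supplied metadata entirely and use only the token's values; either way, the values checked by RLS at sign time are the ones that take effect at upload time.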
📒 Files selected for processing (16)
- migrations/tenant/0057-s3-multipart-uploads-metadata.sql
- src/http/routes/object/getSignedUploadURL.ts
- src/http/routes/tus/index.ts
- src/http/routes/tus/lifecycle.ts
- src/internal/database/migrations/types.ts
- src/storage/database/adapter.ts
- src/storage/database/knex.ts
- src/storage/object.ts
- src/storage/protocols/s3/s3-handler.ts
- src/storage/schemas/multipart.ts
- src/storage/uploader.ts
- src/test/cdn.test.ts
- src/test/rls.test.ts
- src/test/rls_tests.yaml
- src/test/s3-protocol.test.ts
- src/test/scanner.test.ts
💤 Files with no reviewable changes (1)
- src/test/scanner.test.ts
🚧 Files skipped from review as they are similar to previous changes (4)
- src/storage/database/adapter.ts
- src/http/routes/object/getSignedUploadURL.ts
- src/test/rls_tests.yaml
- src/storage/schemas/multipart.ts
Force-pushed f33de9d to 2dd0938
What kind of change does this PR introduce?

resolves #757 by adding support for referencing the `user_metadata` field in RLS policies for inserts

What is the current behavior?

Currently `user_metadata` is not passed into `canUpload` checks, so RLS policies can't reference it during file uploads.

What is the new behavior?

Now `user_metadata` is passed through all upload paths (standard HTTP, S3, TUS, multipart) so RLS policies can use it.

Additional context

- Updated `canUpload` to accept `userMetadata`, and updated all callers to pass it in. Because `canUpload` gets called in `prepareUpload`, which gets called in `upload`, I also updated all `upload` calls to ensure `userMetadata` is passed through. I found it difficult to ensure I caught every case though, e.g. it was very easy to miss that `uploadNewObject` needed to be updated in `uploadFromRequest`. I'm open to feedback on whether there is a more robust solution that could be caught with TypeScript types.
- Added `canUpload` checks for multipart uploads. This required changing the order of certain operations so that `user_metadata` was available. Wasn't entirely sure if these changes were okay.
- Moved `userMetadata` from the `FileUpload` interface to the `UploadRequest`, but wasn't sure if this was the right approach, so I would appreciate some feedback.