
@txase
Contributor

@txase txase commented Aug 3, 2025

This change allows s3:// data paths to include additional parameters as query strings. The default parameters remain optimized for the official AWS S3 service, but any OpenDAL S3 parameter can be set. For example, to override the default true value for enable_virtual_host_style, you can set the data config to:

s3://my-bucket?enable_virtual_host_style=false

Boolean parameters can be set to true by simply adding the parameter:

s3://my-bucket?enable_versioning
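
As a rough illustration of the two URL forms above, the following is a minimal sketch of how such a path could be split into its location and parameters. The function name `parse_s3_path` is hypothetical and is not taken from this PR; the sketch also skips percent-decoding, which a real implementation would need.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper (not the PR's code): splits `s3://location?key=value&flag`
/// into the location part and its query parameters.
/// A bare key without `=` is recorded as "true".
/// Assumption: no percent-decoding is performed.
fn parse_s3_path(path: &str) -> Option<(String, BTreeMap<String, String>)> {
    // Anything that is not an s3:// path is rejected.
    let rest = path.strip_prefix("s3://")?;

    // Everything before the first `?` is the bucket (plus optional subpath).
    let (location, query) = match rest.split_once('?') {
        Some((location, query)) => (location, query),
        None => (rest, ""),
    };

    let mut params = BTreeMap::new();
    for pair in query.split('&').filter(|p| !p.is_empty()) {
        // `enable_versioning` alone is shorthand for `enable_versioning=true`.
        let (key, value) = pair.split_once('=').unwrap_or((pair, "true"));
        params.insert(key.to_string(), value.to_string());
    }
    Some((location.to_string(), params))
}
```

With this sketch, `parse_s3_path("s3://my-bucket?enable_versioning")` yields the location `my-bucket` and a single parameter `enable_versioning` mapped to `"true"`.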

Fixes #6112

Copilot AI review requested due to automatic review settings August 3, 2025 20:20

Copilot AI left a comment


Pull Request Overview

This PR adds support for OpenDAL S3 parameters via query strings in S3 URLs, allowing users to customize S3 configurations while maintaining optimized defaults for AWS S3. The change enables parameters like enable_virtual_host_style=false or boolean flags like enable_versioning to be passed directly in the S3 URL.

Key changes:

  • Added parameter parsing from S3 URL query strings with type inference for booleans, numbers, and strings
  • Refactored credential loading to support static credentials alongside the existing AWS SDK chain
  • Implemented parameter validation with blocked parameters (bucket, root) that cannot be overridden
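
The type inference and blocked-parameter check summarized in the list above could look roughly like the following. This is a sketch under assumptions, not the PR's implementation: the `ParamValue` enum, `infer_value`, and `check_param` are hypothetical names, and the only blocked keys used are the two mentioned in the summary (`bucket`, `root`).

```rust
/// Hypothetical typed value for a query-string parameter.
#[derive(Debug, PartialEq)]
enum ParamValue {
    Bool(bool),
    Number(i64),
    Text(String),
}

/// Parameters the summary describes as not overridable.
const BLOCKED: [&str; 2] = ["bucket", "root"];

/// Tries bool first, then integer, and falls back to a plain string.
fn infer_value(raw: &str) -> ParamValue {
    if let Ok(b) = raw.parse::<bool>() {
        return ParamValue::Bool(b);
    }
    if let Ok(n) = raw.parse::<i64>() {
        return ParamValue::Number(n);
    }
    ParamValue::Text(raw.to_string())
}

/// Rejects blocked keys; otherwise returns the inferred value.
fn check_param(key: &str, raw: &str) -> Result<ParamValue, String> {
    if BLOCKED.contains(&key) {
        return Err(format!("S3 parameter `{key}` cannot be overridden"));
    }
    Ok(infer_value(raw))
}
```

Checking blocked keys before inference keeps the error path independent of whatever value was supplied.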

@ProbstDJakob

When this has been merged, it would be great if the official container images were built with S3 support enabled.

@tessus
Contributor

tessus commented Oct 21, 2025

@BlackDex since you are on a roll today merging PRs, this seems like low-hanging fruit... ;-)

@kaiyou

kaiyou commented Nov 29, 2025

Thank you so much for this much-anticipated feature.

I have tested this for a couple of days and deployed it to production today. It works really well and also enables compatibility with Garage, yet another S3 cluster implementation.

I have feedback about the admin experience though: ICON_CACHE_FOLDER, SENDS_FOLDER and ATTACHMENTS_FOLDER, when not explicitly provided, are automatically derived from the more general DATA_FOLDER in config.rs. One might either set all of these explicitly, e.g. to store data in separate buckets, or simply set DATA_FOLDER to an S3 URL and let OpenDAL deal with derived subpaths.

However, when DATA_FOLDER contains a query string, it can no longer be concatenated with /attachments, /sends or /icons_cache. For instance:

s3://mybucket?endpoint=my-service.tld&region=myregion

... leads to the following attachment folder URL:

s3://mybucket?endpoint=my-service.tld&region=myregion/attachments

The subpath is now part of the region in the query string, which just fails (due to / being forbidden in region names and sigv4 scope construction). I suspect other situations might lead to more subtle failure modes, where requests are handled properly but data is directed to the bucket root instead of the intended subpath.

It would be really nice if the subpath was properly inserted before the query string. This either requires some more logic in config.rs, parsing and manipulating the OpenDAL URL, or a separate configuration variable for the query string. Otherwise, a note in .env.template would help avoid some failure modes.
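
To make the first option concrete, here is a minimal sketch of inserting the subpath before the query string. `join_subpath` is a hypothetical helper, not existing code in config.rs, and it treats the URL as a plain string split at the first `?` rather than using a full URL parser.

```rust
/// Hypothetical helper: appends `subpath` to the path portion of `base`,
/// keeping any query string at the end of the result.
fn join_subpath(base: &str, subpath: &str) -> String {
    // Separate the path from the query string, if there is one.
    let (path, query) = match base.split_once('?') {
        Some((path, query)) => (path, Some(query)),
        None => (base, None),
    };

    // Avoid doubled slashes at the join point.
    let joined = format!(
        "{}/{}",
        path.trim_end_matches('/'),
        subpath.trim_start_matches('/')
    );

    // Re-attach the query string after the new subpath.
    match query {
        Some(q) => format!("{joined}?{q}"),
        None => joined,
    }
}
```

Applied to the example above, `join_subpath("s3://mybucket?endpoint=my-service.tld&region=myregion", "attachments")` gives `s3://mybucket/attachments?endpoint=my-service.tld&region=myregion`, so the region value is left intact. Plain local paths without a query string are joined as before.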

@BlackDex
Collaborator

Thanks @kaiyou for testing! This really helps us.

@dani-garcia
Owner

Looking through the changelog on the latest OpenDAL version, it looks like Operator::from_uri is now available:
apache/opendal#5445
https://docs.rs/opendal/latest/opendal/struct.Operator.html#method.from_uri

I'm wondering if that would also cover this use case, potentially in a generic way that would make it easier to enable more backends in the future?


Development

Successfully merging this pull request may close these issues.

S3 default storage class breaks Minio compatibility
