Skip to content

Conversation

@craigfurman
Copy link

@craigfurman craigfurman commented Jul 21, 2025

I have a use case for sending non-utf8 query parameters (specifically, announcing to BitTorrent trackers from a BitTorrent client). I'm using Req.Test to assert against the request payload my application sends, and am hitting an error of the form "invalid UTF-8 on urlencoded params, got byte...".

This commit disables utf8 validation on Req.Test's plug. I am not 100% sure, but I don't think this will break anything else (well, apart from implicit utf8-validation of query parameters, but explicit test assertions on the values / pattern-matching should cover most use cases). If desired, we can perhaps make it an option.

@wojtekmach
Copy link
Owner

Hey, thank you for the PR. I haven't really thought about it, but I think the current behavior could be considered a guard rail against people accidentally sending out incorrect data. I'd prefer to stay with stock plug things as close as possible, including this particular behavior. Seems that's what URI.new! does too:

iex> URI.new!("https://foo?bar=\xFF")
** (FunctionClauseError) no function clause matching in :uri_string.parse_query/2

    The following arguments were given to :uri_string.parse_query/2:

        # 1
        <<255>>

        # 2
        %{}

That being said, yeah, it should be possible to support your use case because what else are you gonna do. Do you have any thoughts about opting in into relaxed parsing, what would be a reasonable API for you?

@craigfurman
Copy link
Author

craigfurman commented Oct 29, 2025

Hi @wojtekmach, re: guard rail against sending out incorrect data: it seems like a narrow use case to me, to be honest. Such users would be implicitly relying on Req.Test to validate the utf8-encoding of their query params, but without asserting or pattern matching on their content in that test. On the other hand, inverting this behaviour would technically be a breaking change, so perhaps it's best to be conservative.

Plus, admittedly, testing clients of http servers that accept non-utf8 parameters is also a rather narrow use case 😅

Of course you're best-placed to decide on this as the maintainer, and I'm happy to make this opt-in if that works better for the project. I haven't tried this out (and haven't looked at this code in a while), but perhaps it could work to extend the valid values of the plug Request option in a non-breaking way.

The general form could be:

Req.request(plug: {Req.Test, name: name, validate_utf8_query: true})

But Req.request(plug: {Req.Test, name}) would continue to be supported, and would be translated to Req.request(plug: {Req.Test, name: name}) for backwards compatibility reasons. This leaves the door open to adding more config options in the future.

What do you think?

@craigfurman
Copy link
Author

I opened #515 which contains something similar to the above comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants