preliminary minimal changes#251
preliminary minimal changes#251zjmarlow wants to merge 27 commits intoraku-community-modules:mainfrom
Conversation
|
Appears to be failing 2 tests for me on MacOS: |
|
Should HTTP::UserAgent be producing valid Messages, or is it up to the users of the library to use it in such a way that it only produces valid messages? Some of the test messages are themselves invalid, so what needs changing will depend on the library's purpose. |
|
Well, that is a good question. What this module really needs, is a maintainer. Would you be willing to take on that role? You apparently have a need to fix things in it :-) |
|
Yes, but I'll need guidance and feedback as I maintain it, if that's okay? Some first questions:
I'll start looking through the other open issues, too. |
|
What would the best channel / medium be for discussing this module? |
|
Either #raku on libera.chat or on Discord |
|
There are enough changes to get things conformant that it would be difficult not to break existing behavior. My plan is to provide alternate classes that can be imported with:
if that is acceptable, though I'm open to other options. |
|
That would make it opt-in, and as such backward compatible. 👍 |
|
I wasn't able to get exporting to work how I was hoping, so instead, the strict versions are just named after the originals with "-Strict" attached. They are subclasses of the originals. The new classes are:
They are all made available with
so that it is easy to just import the strict classes. Behavior when mixing strict and original classes is "undefined". The only change to the original should be the is-chunked multi methods in the HTTP::Message class. One checks the message's own headers, the other takes an arbitrary header object. The following new testcases were added:
I still need to do some testing with the HTTP::UserAgent-Strict class before turning this draft request into a full pull request, but this should be a good place for review and verifying testcases (the new behavior passed for me locally across Windows, Linux, and Mac). |
|
need to fix body-less messages. will comment again once fixed. |
| unit class HTTP::Cookies; | ||
|
|
||
| use HTTP::Cookie; | ||
| use HTTP::Response:auth<zef:raku-community-modules>; |
There was a problem hiding this comment.
I think these were qualified like this because there are un-related packages that provide similarly named modules, so I'd be careful about removing that.
There was a problem hiding this comment.
Yeah, the removal of this will requires an explanation
There was a problem hiding this comment.
I missed that one. It is added back now. I wasn't aware of the similarly named modules.
|
My apologies. I am now reading through the META6.json and distributions documentation and committed some fixes. Will the version's patch number eventually need to be updated? |
|
Okay, the META6.json should be fixed. I did not change the version number. |
|
What is the process for completing the pull request / merging? |
I'm not sure there is a strict process, but I'd like one or two more reviews before pushing the button 🤣 |
|
Fair enough. I wasn't sure how it worked. =) |
|
I only prepared this module for an initial community modules release. I have currently no affinity with this module at all, so removed myself as a reviewer. |
lib/HTTP/UserAgent.rakumod
Outdated
| } | ||
|
|
||
| sub _clear-url(Str $url is copy) { | ||
| our sub _clear-url(Str $url is copy) { |
There was a problem hiding this comment.
Why are these subs changed to our scope?
There was a problem hiding this comment.
So that they could be used in the Strict classes. I can copy the subs over to the new classes, if that is better.
There was a problem hiding this comment.
I don't know that I'd duplicate code, so I'm not suggesting an alternative. But here you are essentially making this function part of the public API, i.e. users can now call HTTP::UserAgent::_clear-url(...). You could mark is is-implementation-detail but even that feels like a code smell.
|
I haven't really looked closely at the code, but my initial thought a strict mode would be implemented in one of two ways:
Option 2 also allows for those e.g. I also agree with the other comment that the naming seems a bit weird. Specifically I think the existence of a duplicate UserAgent class at all is what feels off. |
|
I don't think this approach has an advantage over option 1, so I will look back over the changes in the new classes and see about having different methods used based on a flag provided in the original classes. The flag would need to be provided for the UserAgent, as well as the Message classes. I am not familiar with Net::HTTP but will take a look at that approach. The purpose of these changes is to address invalid output, as opposed to checking for invalid input. The original code was appending an extra CRLF to requests, which some browser driver implementations (implementing the WebDriver protocol) do not accept, specifically for POST requests. |
|
I am working through option 1. The general approach is to have a |
|
Flag implementation committed. Since the issue to address was mainly output, strict expects valid input, but does not validate it. Meaning the #226 case would probably want a non-strict UserAgent with a strict Request. That combination has not been tested yet and I am working on testcases to cover the various mixed cases. The latest commit also removes the ::Strict classes and reverts the our subs back to their original. |
|
@zjmarlow Can you give me a bit more context? I can't recall what you're referring to / in what way I've been involved in this issue. |
|
Sorry, I was too quick with the auto-complete. It was meant to be in reference to #226 , which was opened by bbkr. |
|
Testcases added for the various UserAgent and Request strict / non-strict interactions. |
ugexe
left a comment
There was a problem hiding this comment.
That's quite a lot of changes, nice! Unfortunately making lots of changes also has drawbacks... would it be possible to clean up this PR further? There are multiple commits fixing things broken in/from previous commits which makes reviewing things commit by commit frustrating. The PR title is still "preliminary changes" and is described as being a draft pull request (among other things), which makes understanding the git history more difficult. There are also many leftover comments in the code itself.
I left quite a few comments, but that was just from a cursory review so I suspect there are things I missed. Hopefully it is not too discouraging... you've bitten off a rather large piece of work.
lib/HTTP/Message.rakumod
Outdated
|
|
||
| method new($content?, *%fields) { | ||
| my $header = HTTP::Header.new(|%fields); | ||
| method new($content?, Bool :$strict, *%fields) { |
There was a problem hiding this comment.
This seems like a bit of an unfortunate api design. What if I want to send a header named strict?
There was a problem hiding this comment.
Instead of named, strict could be the final positional and default to False. If that's acceptable, I'd like to keep it consistent by changing all the signatures to have strict as the final positional with False as default. This particular case might require a multi method, though:
multi method new($content, Bool $strict = False, *%fields)
multi method new(Bool $strict = False, *%fields) is default| # pop zero-length Str that occurs after last chunk | ||
| # what to do if this doesn't happen? | ||
| @lines.pop if @lines %2; | ||
| @lines = grep *, |
There was a problem hiding this comment.
What is the purpose of grepping on *? It is also hard to read this chain since it is mixing routine and method calls.
There was a problem hiding this comment.
The full purpose of the chain is to take lines two at a time - first line is the chunk length, second the chunk content. If the chunk length looks valid, pass along the content, otherwise use grep to filter it out. It should probably fully validate.
The larger problem is that if the content contains embedded CRLFs, the method is broken. I might try my hand at a correct implementation.
lib/HTTP/Message.rakumod
Outdated
| } | ||
|
|
||
| method parse($raw_message) { | ||
| method !parse ( $raw_message ) { |
There was a problem hiding this comment.
!parse isn't a great name for this function, especially given there is a public function of the same name. !parse-strict or some such would make it more clear what the purpose of this function is.
Also it seems like there is a lot of logic that could be shared between !parse and parse. Refactoring these would make it easier to maintain.
There was a problem hiding this comment.
refactored, including adding various private methods that parse status / request line, header, and content
| # technically incorrect - content allowed to contain embedded CRLFs | ||
| my @lines = $content.split: $CRLF; | ||
| # pop zero-length Str that occurs after last chunk | ||
| # what to do if this doesn't happen? |
There was a problem hiding this comment.
I don't know, what should this do?
There was a problem hiding this comment.
One option would be: if the user specified throw-exceptions, to throw with 'truncated last chunk'. The changes are meant to focus on producing strict output rather than demand strict input, so I'm not sure about this - your thoughts?
There was a problem hiding this comment.
It seems like a strange design to only have special http header objects for one type of header. Is there any reason this needs its own object? The motivation for it isn't clear to me as the commit message on this file is just "initial strict implementation".
There was a problem hiding this comment.
ETags can be tagged as weak validators and might be handled differently in those cases.
| # headers container | ||
| has @.fields; | ||
|
|
||
| grammar Grammar::Strict { |
There was a problem hiding this comment.
The other grammar in this file is called HTTP::Header::Grammar which makes it clear what it is used for. To me a name like Grammar::Strict in the same file does not suggest it has any relation to HTTP::Header::Grammar even though it appears that is the case based on the parse method that returns one of these classes.
There was a problem hiding this comment.
I don't think I understand the comment. The only relation between the grammars is that they serve the same purpose. Otherwise the strict grammar is based on the RFCs and supports weak ETags. Could you please let me know if my explanation missed the mark?
|
Hello @ugexe , thank you for taking the time to do this. Would you prefer a separate PR with only the final changes committed for any further reviews? I have some other work to take care of so it will likely be over the next few days that I get through the change requests. |
|
I think the test failure is just due to incorrect skip count. I've fixed it locally. I'm still working through the change requests and will mention in a comment when changes have been pushed. In the meantime, I might have questions / need feedback. |
|
Okay, all changes I think I understood have been applied. I can push what I have now, or, if you prefer, wait until the rest of the change request conversations have been resolved. |
Yes please. |
|
The signature change where strict is the last positional instead of an associative has been implemented enough to get tests passing. The rest of the methods should probably be updated at some point to make requesting strict processing consistent. |
draft pull request with minimal changes to handle eol. also updates is-chunked check to include the case where multiple Transfer Encodings are present (chunked should always be last).
requesting feedback before final changes.