Update dependency org.jsoup:jsoup to v1.22.2 #337
Open
renovate[bot] wants to merge 1 commit into main
This PR contains the following updates:
1.21.2 → 1.22.2
Release Notes
jhy/jsoup (org.jsoup:jsoup)
v1.22.2
Improvements
- NodeTraversor support for in-place DOM rewrites during NodeVisitor.head(). Current-node edits such as remove, replace, and unwrap now recover more predictably, while traversal stays within the original root subtree. This makes single-pass tree cleanup and normalization visitors easier to write, for example when unwrapping presentational elements or replacing text nodes as you walk the DOM. #2472
- […] Cleaner may be reused across concurrent threads, and that shared Safelist instances should not be mutated while in use. #2473
- TagSet for current HTML elements: added dialog, search, picture, and slot; made ins, del, button, audio, video, and canvas inline by default (Tag#isInline(), aligned to phrasing content in the spec); and added readable Element.text() boundaries for controls and embedded objects via the new Tag.TextBoundary option. This improves pretty-printing and keeps normalized text from running adjacent words together. #2493
Bug Fixes
- […] re2j dependency when not present. #2459
- […] NodeTraversor regression in 1.21.2 where removing or replacing the current node during head() could revisit the replacement node and loop indefinitely. The traversal docs now also clarify which inserted nodes are visited in the current pass. #2472
- […] available() call throws IOException, as seen on JDK 8 HttpURLConnection. #2474
- Cleaner no longer makes relative URL attributes in the input document absolute when cleaning or validating a Document. URL normalization now applies only to the cleaned output, and Safelist.isSafeAttribute() is side effect free. #2475
- Cleaner no longer duplicates enforced attributes when the input Document preserves attribute case. A case-variant source attribute is now replaced by the enforced attribute in the cleaned output. #2476
- […] HttpClient, because the JDK would silently ignore that proxy and attempt to connect directly. Those requests now fall back to the legacy HttpURLConnection transport instead, which does support SOCKS. #2468
- Connection.Response.streamParser() and DataUtil.streamParser(Path, ...) could fail on small inputs without a declared charset, if the initial 5 KB charset sniff fully consumed the input and closed it before the stream parse began. #2483
- […] <!DOCTYPE root [<!ENTITY name "value">]>, now round-trip correctly. The subset is preserved as raw text only; entities are not expanded and external DTDs are not loaded. #2486
Build Changes
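The SOCKS item in the v1.22.2 bug fixes (#2468) describes a transport decision: requests with a SOCKS proxy go to the legacy HttpURLConnection transport, because the JDK HttpClient would ignore that proxy. A minimal sketch of that decision, using only JDK types, might look like the following. The class name, the Transport enum, and the choose method are hypothetical and are not jsoup's API; the sketch also assumes HttpClient is otherwise the preferred transport.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;

public class TransportChoiceSketch {
    /** Hypothetical labels for the two transports named in the release notes. */
    enum Transport { HTTP_CLIENT, HTTP_URL_CONNECTION }

    /**
     * Picks a transport for a request. A null proxy means no per-request proxy was set.
     * SOCKS proxies are routed to the legacy transport, mirroring the fix described above.
     * Assumption: every other case may use HttpClient.
     */
    static Transport choose(Proxy proxy) {
        if (proxy != null && proxy.type() == Proxy.Type.SOCKS) {
            return Transport.HTTP_URL_CONNECTION;
        }
        return Transport.HTTP_CLIENT;
    }

    public static void main(String[] args) {
        // Unresolved addresses avoid any DNS lookup; hosts and ports are placeholders.
        Proxy socks = new Proxy(Proxy.Type.SOCKS, InetSocketAddress.createUnresolved("localhost", 1080));
        Proxy http = new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved("localhost", 8080));

        System.out.println(choose(socks)); // HTTP_URL_CONNECTION
        System.out.println(choose(http));  // HTTP_CLIENT
        System.out.println(choose(null));  // HTTP_CLIENT
    }
}
```

Callers that set a SOCKS proxy per request should not need to change anything for this fix; the sketch only illustrates why such requests now take a different path internally.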
v1.22.1
Improvements
- […] re2j regular expression engine for regex-based CSS selectors (e.g. [attr~=regex], :matches(regex)), which ensures linear-time performance for regex evaluation. This allows safer handling of arbitrary user-supplied query regexes. To enable, add the com.google.re2j dependency to your classpath, e.g.: […]
  (If you already have that dependency in your classpath, but you want to keep using the Java regex engine, you can disable re2j via System.setProperty("jsoup.useRe2j", "false").) You can confirm that the re2j engine has been enabled correctly by calling org.jsoup.helper.Regex.usingRe2j(). #2407
- […] Parser#unescape(String, boolean) that unescapes HTML entities using the parser's configuration (e.g. to support error tracking), complementing the existing static utility Parser.unescapeEntities(String, boolean). #2396
- […] org.jsoup.parser.Parser#setMaxDepth. #2421
Changes
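For the re2j opt-out mentioned in the v1.22.1 improvements (#2407), a small sketch of setting the system property at startup could look like this. The property name and value come from the release notes; the class and method names are hypothetical, and the sketch assumes the property has to be set before jsoup first evaluates a regex-based selector. The jsoup call is left as a comment so the example runs with the JDK alone.

```java
public class Re2jToggleSketch {
    // Property name as given in the release notes.
    static final String PROPERTY = "jsoup.useRe2j";

    /** Requests the Java regex engine and returns the previous property value (may be null). */
    static String disableRe2j() {
        return System.setProperty(PROPERTY, "false");
    }

    public static void main(String[] args) {
        // Assumption: do this early, before any jsoup selector using a regex is run.
        disableRe2j();
        System.out.println(PROPERTY + "=" + System.getProperty(PROPERTY));

        // With jsoup 1.22.1 or later on the classpath, the notes say this reports the active engine:
        // boolean active = org.jsoup.helper.Regex.usingRe2j();
    }
}
```

The same property can also be passed on the command line as -Djsoup.useRe2j=false, which sets it before any application code runs.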
Bug Fixes
- Elements of an Element were not correctly invalidated in Node#replaceWith(Node), which could lead to incorrect results when subsequently calling Element#children(). #2391
- […] [attr=" foo "]). Now matches align with the CSS specification and browser engines. #2380
- […] ProxySelector.getDefault()) was ignored. Now, the system proxy is used if a per-request proxy is not set. #2388, #2390
- ValidationException could be thrown in the adoption agency algorithm with particularly broken input. Now logged as a parse error. #2393
- IndexOutOfBoundsException could be thrown when parsing a body fragment with crafted input. Now logged as a parse error. #2397, #2406
- […] parent child selector) across many retained threads, their memoized results could also be retained, increasing memory use. These results are now cleared immediately after use, reducing overall memory consumption. #2411
- Parser now preserves any custom TagSet applied to the parser. #2422, #2423
- […] Tag.Void now parse and serialize like the built-in void elements: they no longer consume following content, and the XML serializer emits the expected self-closing form. #2425
- <br> element is once again classified as an inline tag (Tag.isBlock() == false), matching common developer expectations and its role as phrasing content in HTML, while pretty-printing and text extraction continue to treat it as a line break in the rendered output. #2387, #2439
- […] Jsoup.connect(url).get(). On responses without a charset header, the initial charset sniff could sometimes (depending on buffering / available() behavior) be mistaken for end-of-stream and a partial parse reused, dropping trailing content. #2448
- TagSet copies no longer mutate their template during lazy lookups, preventing cross-thread ConcurrentModificationException when parsing with shared sessions. #2453
- […] <svg> foreignObject content nested within a <p>, which could incorrectly move the HTML subtree outside the SVG. #2452
Internal Changes
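The system proxy item in the v1.22.1 bug fixes (#2388, #2390) describes a fallback order: a per-request proxy wins, otherwise the selector from ProxySelector.getDefault() is consulted. The sketch below shows that order with JDK classes only. It is not jsoup's implementation; ProxyFallbackSketch, resolve, and fixedSelector are hypothetical names, and the host names are placeholders.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.SocketAddress;
import java.net.URI;
import java.util.Collections;
import java.util.List;

public class ProxyFallbackSketch {
    /** Returns the per-request proxy if set, else the first system proxy, else NO_PROXY. */
    static Proxy resolve(Proxy perRequest, URI uri) {
        if (perRequest != null) {
            return perRequest;
        }
        ProxySelector selector = ProxySelector.getDefault();
        if (selector == null) {
            return Proxy.NO_PROXY; // no system-wide selector installed
        }
        List<Proxy> candidates = selector.select(uri);
        return (candidates == null || candidates.isEmpty()) ? Proxy.NO_PROXY : candidates.get(0);
    }

    /** Test helper: a selector that always offers the same proxy. */
    static ProxySelector fixedSelector(final Proxy proxy) {
        return new ProxySelector() {
            @Override
            public List<Proxy> select(URI uri) {
                return Collections.singletonList(proxy);
            }

            @Override
            public void connectFailed(URI uri, SocketAddress sa, IOException ioe) {
                // Nothing to record in this sketch.
            }
        };
    }

    public static void main(String[] args) {
        ProxySelector previous = ProxySelector.getDefault();
        try {
            Proxy system = new Proxy(Proxy.Type.HTTP,
                    InetSocketAddress.createUnresolved("system-proxy.example", 8080));
            Proxy perRequest = new Proxy(Proxy.Type.HTTP,
                    InetSocketAddress.createUnresolved("request-proxy.example", 3128));
            ProxySelector.setDefault(fixedSelector(system));
            URI uri = URI.create("https://example.com/");

            System.out.println(resolve(perRequest, uri) == perRequest); // true
            System.out.println(resolve(null, uri) == system);           // true
        } finally {
            ProxySelector.setDefault(previous); // restore the JVM-wide default
        }
    }
}
```

Because ProxySelector.setDefault changes a JVM-wide setting, the example restores the previous selector in a finally block.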
- […] org.jsoup.internal.Functions (for removal in v1.23.1). This was previously used to support older Android API levels without full java.util.function coverage; jsoup now requires core library desugaring so this indirection is no longer necessary. #2412
Configuration
📅 Schedule: (UTC)
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about this update again.
This PR was generated by Mend Renovate. View the repository job log.