Skip to content

Remove feature gating from detect_encoding()#952

Merged
dralley merged 1 commit into
tafia:masterfrom
dralley:detect-encoding
May 1, 2026
Merged

Remove feature gating from detect_encoding()#952
dralley merged 1 commit into
tafia:masterfrom
dralley:detect-encoding

Conversation

@dralley
Copy link
Copy Markdown
Collaborator

@dralley dralley commented Apr 27, 2026

Have it return a generic enum so that it can be used with or without the encoding feature.

It was suggested that this be split off from #947

use crate::reader::EncodingRef;
#[cfg(feature = "encoding")]
use encoding_rs::{Encoding, UTF_8};
use encoding_rs;
Copy link
Copy Markdown
Collaborator Author

@dralley dralley Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted it to be 100% clear where types were coming from, as some of the constants in mod encoding are very similar to other types or constants from encoding_rs, and especially with arguments and return values it's best that it be clear they are foreign types.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree

Comment thread src/reader/async_tokio.rs

use tokio::io::{self, AsyncBufRead, AsyncBufReadExt, AsyncRead, ReadBuf};

use crate::encoding;
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Used by the macros.

Comment thread src/reader/mod.rs
if let Some(encoding) = $reader.detect_encoding() $(.$await)? ? {
if $self.state.encoding.can_be_refined() {
$self.state.encoding = crate::reader::EncodingRef::BomDetected(encoding);
$self.state.encoding = crate::reader::EncodingRef::BomDetected(encoding.encoding());
Copy link
Copy Markdown
Collaborator Author

@dralley dralley Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned on the previous PR, detect_encoding() returns / returned Some() even in cases where no BOM was found, thus this is perhaps misnamed.

However, the initial XML characters function like a BOM, and the documentation mentions it, so I see an argument for leaving this be.

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 27, 2026

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 95.12195% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 56.40%. Comparing base (a759d65) to head (eaa5a0b).
⚠️ Report is 12 commits behind head on master.

Files with missing lines Patch % Lines
src/encoding.rs 93.33% 2 Missing ⚠️
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #952      +/-   ##
==========================================
+ Coverage   55.08%   56.40%   +1.31%     
==========================================
  Files          44       44              
  Lines       16911    17607     +696     
==========================================
+ Hits         9316     9931     +615     
- Misses       7595     7676      +81     
Flag Coverage Δ
unittests 56.40% <95.12%> (+1.31%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@dralley dralley requested a review from Mingun April 27, 2026 02:33
@dralley
Copy link
Copy Markdown
Collaborator Author

dralley commented Apr 28, 2026

After this PR, can we go ahead and cut a release that includes the attribute value normalization?

@Mingun
Copy link
Copy Markdown
Collaborator

Mingun commented Apr 28, 2026

After this PR, can we go ahead and cut a release that includes the attribute value normalization?

Agree, I also want to release new version soon. I will review this and try to fix the latest issues and then release new version. If the latest issues will hard to fix quickly, then I'll release new version without their fixes.

Copy link
Copy Markdown
Collaborator

@Mingun Mingun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems two small improvements could be made, but other is good

Comment thread src/encoding.rs Outdated
Comment thread src/encoding.rs Outdated
Comment thread src/encoding.rs Outdated
use crate::reader::EncodingRef;
#[cfg(feature = "encoding")]
use encoding_rs::{Encoding, UTF_8};
use encoding_rs;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree

Have it return a generic enum so that it can be used with or without the
encoding feature.
@dralley dralley force-pushed the detect-encoding branch from c0cab37 to eaa5a0b Compare May 1, 2026 17:49
@dralley
Copy link
Copy Markdown
Collaborator Author

dralley commented May 1, 2026

re-pushed

@dralley dralley merged commit 8549d20 into tafia:master May 1, 2026
7 checks passed
@dralley dralley deleted the detect-encoding branch May 1, 2026 18:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants