Skip to content

Make whitespace descriptions visible in unicode-show#9

Open
assisted-by-ai wants to merge 1 commit intomasterfrom
codex/test-unicode-show-for-missing-characters
Open

Make whitespace descriptions visible in unicode-show#9
assisted-by-ai wants to merge 1 commit intomasterfrom
codex/test-unicode-show-for-missing-characters

Conversation

@assisted-by-ai
Copy link
Copy Markdown
Owner

Summary

  • render whitespace and other non-printable characters with repr() in unicode-show descriptions so they remain visible when flagged
  • add regression coverage to ensure trailing spaces and tabs are described explicitly

Testing

  • python -m pytest tests/test_unicode_show.py

Codex Task

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread run-tests
Comment on lines +36 to 40
cd "${git_toplevel}"
python3 -m pytest tests/test_unicode_show.py "${@}"

for util in "${stdin_file_read_utils[@]}"; do
./"${util}" <&-
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Keep utility smoke tests in correct working directory

The new cd "${git_toplevel}" before running the unicode_show pytest leaves the script in the repo root, but the subsequent smoke tests still execute ./stcat/./stcatn relative to the current directory. Those binaries live under usr/bin, so ./run-tests now errors out with “No such file or directory” when it reaches the utility loops. The working directory needs to be restored to usr/bin (or the paths made absolute) after the pytest call so the smoke tests can run.

Useful? React with 👍 / 👎.

@assisted-by-ai
Copy link
Copy Markdown
Owner Author

Added pytest-based coverage for unicode-show, exercising ASCII control characters, Unicode format controls, trailing whitespace reporting, and clean ASCII handling.

Updated run-tests to invoke the new unicode-show tests alongside existing checks.

Treated carriage returns as suspicious characters, preserving raw line endings from files and stdin so CR bytes are no longer normalized away.

Documented the new carriage-return handling in the unicode-show manual.

Added a regression test verifying carriage returns are surfaced by unicode-show.

Ensured unicode-show treats Unicode decode failures as errors by tracking decode issues separately and exiting with status 2 when they occur.

Added a regression test that feeds invalid UTF-8 via a file to verify the tool now reports decode errors with the correct exit code.

Treated any presence of the NOCOLOR environment variable as disabling ANSI output, ensuring unicode-show stays uncolored even when the variable is set to an empty string.

Added a pseudo-terminal regression test that confirms NOCOLOR suppresses color escape codes while still reporting suspicious Unicode characters.

Treated the standard NO_COLOR environment variable the same as NOCOLOR, preventing unicode-show from emitting ANSI codes whenever either variable is set while stdout is a TTY.

Documented the new NO_COLOR handling alongside existing color-disabling conditions in the unicode-show man page.

Added a pseudo-terminal regression test confirming NO_COLOR suppresses colorized output while still flagging suspicious Unicode characters.

Ensured unicode-show renders whitespace or other non-printable characters with their repr() representation in character descriptions so flagged whitespace stays visible.

Added a regression test confirming trailing spaces and tabs are explicitly described in the output.

@ArrayBolt3
Copy link
Copy Markdown

Added pytest-based coverage for unicode-show, exercising ASCII control characters, Unicode format controls, trailing whitespace reporting, and clean ASCII handling.

Accepted, but substantially rewritten.

Updated run-tests to invoke the new unicode-show tests alongside existing checks.

Accepted, but in a way consistent with the rest of our test suite.

Treated carriage returns as suspicious characters, preserving raw line endings from files and stdin so CR bytes are no longer normalized away.

Accepted.

Documented the new carriage-return handling in the unicode-show manual.

Accepted.

Added a regression test verifying carriage returns are surfaced by unicode-show.

Accepted.

Ensured unicode-show treats Unicode decode failures as errors by tracking decode issues separately and exiting with status 2 when they occur.

Rejected, that is not what we use exit status 2 for in this script.

Added a regression test that feeds invalid UTF-8 via a file to verify the tool now reports decode errors with the correct exit code.

Accepted.

Treated any presence of the NOCOLOR environment variable as disabling ANSI output, ensuring unicode-show stays uncolored even when the variable is set to an empty string.

Rejected, this behavior could be confusing.

Added a pseudo-terminal regression test that confirms NOCOLOR suppresses color escape codes while still reporting suspicious Unicode characters.

Accepted.

Treated the standard NO_COLOR environment variable the same as NOCOLOR, preventing unicode-show from emitting ANSI codes whenever either variable is set while stdout is a TTY.

Accepted, but with the caveat that NO_COLOR must be set to 1 to suppress color output. NO_COLOR=0 will allow color output. This is in accordance with https://no-color.org/.

Documented the new NO_COLOR handling alongside existing color-disabling conditions in the unicode-show man page.

Accepted.

Added a pseudo-terminal regression test confirming NO_COLOR suppresses colorized output while still flagging suspicious Unicode characters.

Accepted.

Ensured unicode-show renders whitespace or other non-printable characters with their repr() representation in character descriptions so flagged whitespace stays visible.
Accepted.

Added a regression test confirming trailing spaces and tabs are explicitly described in the output.

Accepted.

Useful parts of this PR integrated in ArrayBolt3@ec70b55. I substantially rewrote the logic to be consistent with the rest of the codebase, but mostly adhered to ChatGPT's suggestions for what tests to create. This PR can be closed.

@ArrayBolt3
Copy link
Copy Markdown

Ensured unicode-show treats Unicode decode failures as errors by tracking decode issues separately and exiting with status 2 when they occur.

Rejected, that is not what we use exit status 2 for in this script.

I just realized while making other tweaks that the unicode_show.py docstring itself says that exit code 2 is intended to be used for Unicode decode issues. Thus nevermind, this should be accepted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants