Make whitespace descriptions visible in unicode-show by assisted-by-ai · Pull Request #9 · assisted-by-ai/helper-scripts

assisted-by-ai · 2025-12-02T09:15:41Z

Summary

render whitespace and other non-printable characters with repr() in unicode-show descriptions so they remain visible when flagged
add regression coverage to ensure trailing spaces and tabs are described explicitly

Testing

python -m pytest tests/test_unicode_show.py

Codex Task

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2025-12-02T09:20:45Z

+cd "${git_toplevel}"
+python3 -m pytest tests/test_unicode_show.py "${@}"
+
 for util in "${stdin_file_read_utils[@]}"; do
  ./"${util}" <&-


Keep utility smoke tests in correct working directory

The new cd "${git_toplevel}" before running the unicode_show pytest leaves the script in the repo root, but the subsequent smoke tests still execute ./stcat/./stcatn relative to the current directory. Those binaries live under usr/bin, so ./run-tests now errors out with “No such file or directory” when it reaches the utility loops. The working directory needs to be restored to usr/bin (or the paths made absolute) after the pytest call so the smoke tests can run.

Useful? React with 👍 / 👎.

assisted-by-ai · 2025-12-03T10:43:10Z

Added pytest-based coverage for unicode-show, exercising ASCII control characters, Unicode format controls, trailing whitespace reporting, and clean ASCII handling.

Updated run-tests to invoke the new unicode-show tests alongside existing checks.

Treated carriage returns as suspicious characters, preserving raw line endings from files and stdin so CR bytes are no longer normalized away.

Documented the new carriage-return handling in the unicode-show manual.

Added a regression test verifying carriage returns are surfaced by unicode-show.

Ensured unicode-show treats Unicode decode failures as errors by tracking decode issues separately and exiting with status 2 when they occur.

Added a regression test that feeds invalid UTF-8 via a file to verify the tool now reports decode errors with the correct exit code.

Treated any presence of the NOCOLOR environment variable as disabling ANSI output, ensuring unicode-show stays uncolored even when the variable is set to an empty string.

Added a pseudo-terminal regression test that confirms NOCOLOR suppresses color escape codes while still reporting suspicious Unicode characters.

Treated the standard NO_COLOR environment variable the same as NOCOLOR, preventing unicode-show from emitting ANSI codes whenever either variable is set while stdout is a TTY.

Documented the new NO_COLOR handling alongside existing color-disabling conditions in the unicode-show man page.

Added a pseudo-terminal regression test confirming NO_COLOR suppresses colorized output while still flagging suspicious Unicode characters.

Ensured unicode-show renders whitespace or other non-printable characters with their repr() representation in character descriptions so flagged whitespace stays visible.

Added a regression test confirming trailing spaces and tabs are explicitly described in the output.

ArrayBolt3 · 2025-12-07T04:45:39Z

Added pytest-based coverage for unicode-show, exercising ASCII control characters, Unicode format controls, trailing whitespace reporting, and clean ASCII handling.

Accepted, but substantially rewritten.

Updated run-tests to invoke the new unicode-show tests alongside existing checks.

Accepted, but in a way consistent with the rest of our test suite.

Treated carriage returns as suspicious characters, preserving raw line endings from files and stdin so CR bytes are no longer normalized away.

Accepted.

Documented the new carriage-return handling in the unicode-show manual.

Accepted.

Added a regression test verifying carriage returns are surfaced by unicode-show.

Accepted.

Ensured unicode-show treats Unicode decode failures as errors by tracking decode issues separately and exiting with status 2 when they occur.

Rejected, that is not what we use exit status 2 for in this script.

Added a regression test that feeds invalid UTF-8 via a file to verify the tool now reports decode errors with the correct exit code.

Accepted.

Treated any presence of the NOCOLOR environment variable as disabling ANSI output, ensuring unicode-show stays uncolored even when the variable is set to an empty string.

Rejected, this behavior could be confusing.

Added a pseudo-terminal regression test that confirms NOCOLOR suppresses color escape codes while still reporting suspicious Unicode characters.

Accepted.

Treated the standard NO_COLOR environment variable the same as NOCOLOR, preventing unicode-show from emitting ANSI codes whenever either variable is set while stdout is a TTY.

Accepted, but with the caveat that NO_COLOR must be set to 1 to suppress color output. NO_COLOR=0 will allow color output. This is in accordance with https://no-color.org/.

Documented the new NO_COLOR handling alongside existing color-disabling conditions in the unicode-show man page.

Accepted.

Added a pseudo-terminal regression test confirming NO_COLOR suppresses colorized output while still flagging suspicious Unicode characters.

Accepted.

Ensured unicode-show renders whitespace or other non-printable characters with their repr() representation in character descriptions so flagged whitespace stays visible.
Accepted.

Added a regression test confirming trailing spaces and tabs are explicitly described in the output.

Accepted.

Useful parts of this PR integrated in ArrayBolt3@ec70b55. I substantially rewrote the logic to be consistent with the rest of the codebase, but mostly adhered to ChatGPT's suggestions for what tests to create. This PR can be closed.

ArrayBolt3 · 2025-12-07T15:53:24Z

Ensured unicode-show treats Unicode decode failures as errors by tracking decode issues separately and exiting with status 2 when they occur.

Rejected, that is not what we use exit status 2 for in this script.

I just realized while making other tweaks that the unicode_show.py docstring itself says that exit code 2 is intended to be used for Unicode decode issues. Thus nevermind, this should be accepted.

Clarify whitespace descriptions in unicode-show

ab65d7f

assisted-by-ai added the codex label Dec 2, 2025 — with ChatGPT Codex Connector

chatgpt-codex-connector bot reviewed Dec 2, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make whitespace descriptions visible in unicode-show#9

Make whitespace descriptions visible in unicode-show#9
assisted-by-ai wants to merge 1 commit intomasterfrom
codex/test-unicode-show-for-missing-characters

assisted-by-ai commented Dec 2, 2025

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Dec 2, 2025

Uh oh!

assisted-by-ai commented Dec 3, 2025

Uh oh!

ArrayBolt3 commented Dec 7, 2025

Uh oh!

ArrayBolt3 commented Dec 7, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

assisted-by-ai commented Dec 2, 2025

Summary

Testing

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Dec 2, 2025

Choose a reason for hiding this comment

Uh oh!

assisted-by-ai commented Dec 3, 2025

Uh oh!

ArrayBolt3 commented Dec 7, 2025

Uh oh!

ArrayBolt3 commented Dec 7, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants