Cover remaining C1 controls in stdisplay tests#8
Cover remaining C1 controls in stdisplay tests#8assisted-by-ai wants to merge 1 commit intomasterfrom
Conversation
|
stdisplay builds an allowlist of safe characters: printable ASCII (0x20–0x7E), newline, tab, and allowed SGR escape sequences. Everything else—including other control characters, Unicode beyond ASCII, and any escape sequence that is not an allowed SGR—is replaced with underscores. Conclusion Added a dedicated test case class covering hostile non-SGR escape sequences (OSC, DCS, kitty graphics, UTF-8 mode, and C1 CSI) to ensure they are neutralized by stdisplay. Added coverage for APC, PM, SOS, and C1 APC control strings to verify stdisplay redacts them outside the SGR allowlist. Added zero-width and right-to-left override examples to the shared escape-case corpus to verify they are reduced to safe placeholders. Introduced coverage for single C1 control characters and 8-bit string commands to confirm stdisplay neutralizes them. Added Unicode formatting controls (bidi overrides/isolates, zero-width joiner, BOM, and related markers) to the shared escape corpus to confirm they are reduced to safe placeholders. Added coverage for additional C0/C1 controls and nested OSC/DCS clipboard payloads to ensure stdisplay neutralizes them with underscores. Added regression coverage for additional C1 control bytes (NEL, PAD/SS3, and STS–EPA range) to confirm stdisplay replaces them with safe placeholders. |
| ] | ||
|
|
||
|
|
||
| class TestSTDisplayMaliciousCases(unittest.TestCase): |
There was a problem hiding this comment.
These don't have to go in their own class.
| ("\x1b]0;evil title\x07", "_]0;evil title_"), | ||
| ("\x1bP1;2|malicious\x1b\\", "_P1;2|malicious_\\"), | ||
| ("\x1b_Gf=24,s=1,v=1;AAAA\x1b\\", "__Gf=24,s=1,v=1;AAAA_\\"), | ||
| ("\x1b%Gpayload", "_%Gpayload"), |
There was a problem hiding this comment.
ESC % doesn't seem to be a C1 control sequence that I can tell.
All accepted. (I didn't audit to make sure all of the sequences in this PR are the sequences ChatGPT says it included, but really any weird Unicode and escape sequences should be caught by stdisplay, so I don't care too much about whether the commands being sanitized out are well-formed or even valid. We only cared about this for the CSI commands because stdisplay is supposed to semi-interpret and allow through some CSI commands.) Added the test cases in ArrayBolt3@bb095e1. (Did not use the redundant code from this.) This PR can be closed. |
Summary
Testing
Codex Task