gh-133710: Enhance pyrepl auto-indent #140710
Lib/_pyrepl/readline.py
Outdated
    # update stack
    if char in "\"'" and (i == 0 or buffer[i - 1] != "\\"):
        if str_delims and str_delims[-1] == char:
            str_delims.pop()
        else:
            str_delims.append(char)
You'd also have to add f-string and t-string support to that. So you'd have to parse braces, but only if they're not escaped and they're in the right string prefixes.
At this stage, I'm afraid this function will balloon into a poor reimplementation of a decent chunk of the tokenizer. But we already use the tokenizer properly for syntax highlighting. We even have specific functionality to discover unterminated multiline strings.
Look at _pyrepl.utils and think about how that could be reused here instead of the manual approach.
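To illustrate the tokenizer-based direction suggested here, the following is a minimal sketch that uses the standard-library `tokenize` module rather than `_pyrepl.utils` itself. The helper name `ends_with_block_colon` is hypothetical and not part of pyrepl. It only shows that a tokenizer already separates strings, f-strings, and comments from real `:` operators, so none of that needs to be tracked by hand.

```python
import io
import tokenize

# Token types that carry no meaning for "what did the user type last?"
_SKIP = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}

def ends_with_block_colon(source: str) -> bool:
    """Hypothetical helper: True if the last significant token is a ':'
    operator that sits outside any open bracket."""
    last = None
    depth = 0  # nesting depth of (), [], {}
    try:
        for tok in tokenize.generate_tokens(io.StringIO(source).readline):
            if tok.type == tokenize.ERRORTOKEN:
                # Older tokenizers report e.g. an unterminated string this way.
                return False
            if tok.type in _SKIP:
                continue
            if tok.type == tokenize.OP:
                if tok.string in {"(", "[", "{"}:
                    depth += 1
                elif tok.string in {")", "]", "}"}:
                    depth -= 1
            last = tok
    except (tokenize.TokenError, SyntaxError):
        # Incomplete or invalid input (open bracket, unterminated string, ...):
        # be conservative and do not report a block colon.
        return False
    return (
        depth == 0
        and last is not None
        and last.type == tokenize.OP
        and last.string == ":"
    )
```

For example, `ends_with_block_colon("if x:  # note")` is true, while a colon inside a string literal or a trailing comment is not counted.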
Thanks for your review! I have made the change to rely on _pyrepl.utils.gen_colors for a tokenizer-based parsing of strings and comments.
This modifies `_should_auto_indent()` to scan characters from the start of the buffer up to the target position `pos`, tracking context so that irrelevant content inside strings and comments is ignored, and then checks whether the conditions for auto-indent are met.

It uses a stack (`str_delims`) to track quotes (`"` or `'`). When inside a string (stack not empty), characters like `#` or `:` are treated as literal text and are ignored. A flag (`in_comment`) marks when a `#` (not inside a string) has been encountered, so that all subsequent characters on that line are ignored as comment text. It also tracks the indentation level of the line containing the last non-whitespace character (`lastchar_line_indent`), which helps ensure the colon is at a valid indentation level to trigger auto-indent.

The function returns `True` if the last non-whitespace character (outside comments and strings) is a colon (`:`) and the indentation level of the line where `pos` resides is not greater than the indentation of that colon's line.

Hi @wiggin15, since this `_should_auto_indent()` function is modified from your original implementation (#119606), I would really appreciate it if you could help review it when you have some spare time.
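The scan described above can be sketched roughly as follows. This is an illustrative reconstruction, not the code in the PR: the function name `should_auto_indent_sketch` is hypothetical, the buffer is assumed to be a plain `str`, and a single `open_quote` variable stands in for the `str_delims` stack. Triple-quoted strings, string prefixes, and f-string braces are deliberately not handled.

```python
def should_auto_indent_sketch(buffer: str, pos: int) -> bool:
    """Hypothetical sketch of the character scan (not the PR's code)."""
    open_quote = None          # quote char of the string we are inside, if any
    in_comment = False         # True after an unquoted '#' until end of line
    last_char = None           # last significant character seen
    last_char_line_indent = 0  # indent of the line holding last_char
    cur_indent = 0             # indent of the line currently being scanned
    at_line_start = True

    for i in range(pos):
        char = buffer[i]
        if char == "\n":
            in_comment = False
            cur_indent = 0
            at_line_start = True
            continue
        if open_quote is not None:
            # Inside a string: '#' and ':' are literal text.
            if char == open_quote and buffer[i - 1] != "\\":
                open_quote = None
            continue
        if in_comment:
            continue
        if at_line_start and char in " \t":
            cur_indent += 1    # simplification: a tab counts as one column
            continue
        at_line_start = False
        if char.isspace():
            continue
        if char == "#":
            in_comment = True
            continue
        if char in "\"'":
            open_quote = char
        # Any other significant character (including a quote) becomes last_char.
        last_char = char
        last_char_line_indent = cur_indent

    # The colon must be the last significant character, and the line holding
    # `pos` must not already be indented deeper than the colon's line.
    return last_char == ":" and cur_indent <= last_char_line_indent
```

With this sketch, `should_auto_indent_sketch("if x:  # note:", 14)` is true, while a buffer that already has an indented blank line after the colon (`"if x:\n    "`) is not indented again.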