The current parser implementation contains two areas of unnecessary complexity:
- Macro-based token dispatch (
token_dispatch!)
- Manual stateful block parsing using
balanced_block_with_min and a Cell state machine
Both can be simplified for improved readability and maintainability without sacrificing functionality.
1. Replace token_dispatch! Macro with a Method
Instead of using the token_dispatch! macro, introduce a dispatch method on TokenStream that takes a closure. This method would loop over tokens and allow the closure to handle each token, making the code more idiomatic and easier to follow.
Example Implementation:
impl<'a> TokenStream<'a> {
/// Drive a user-supplied closure over every peeked token;
/// the closure must advance the stream itself.
pub fn dispatch<F>(&mut self, mut f: F)
where
F: FnMut(&mut Self, SyntaxKind, Span),
{
while let Some(&(kind, ref span_ref)) = self.peek() {
let span = span_ref.clone();
f(self, kind, span);
}
}
}
Call-site change:
// before:
// token_dispatch!(st, {
// SyntaxKind::T_LPAREN => lp,
// SyntaxKind::T_RPAREN => rp,
// });
// after:
st.stream.dispatch(|stream, kind, span| {
match kind {
SyntaxKind::T_LPAREN => lp(&mut st, span),
SyntaxKind::T_RPAREN => rp(&mut st, span),
_ => stream.advance(),
}
});
2. Replace Manual Stateful Block Parsing with Recursive Combinators
The current approach uses a Cell-based depth counter and manual state management. This can be replaced with a recursive parser combinator (e.g., using chumsky::recursive), which is flatter and easier to read.
Example Implementation:
use chumsky::recursive;
pub(super) fn balanced_block(
open: SyntaxKind,
close: SyntaxKind,
) -> impl Parser<SyntaxKind, (), Error = Simple<SyntaxKind>> + Clone {
recursive(|block| {
just(open)
.padded_by(inline_ws().repeated())
.ignore_then(
block
.or(filter(|k: &SyntaxKind| *k != open && *k != close).ignored())
.padded_by(inline_ws().repeated())
.repeated(),
)
.then_ignore(just(close))
.ignored()
})
}
// For nonempty blocks:
pub(super) fn balanced_block_nonempty(
open: SyntaxKind,
close: SyntaxKind,
) -> impl Parser<SyntaxKind, (), Error = Simple<SyntaxKind>> + Clone {
recursive(|block| {
just(open)
.padded_by(inline_ws().repeated())
.ignore_then(
block
.or(filter(|k: &SyntaxKind| *k != open && *k != close).ignored())
.padded_by(inline_ws().repeated())
.repeated()
.at_least(1),
)
.then_ignore(just(close))
.ignored()
})
}
This approach eliminates the need for internal mutable state and results in a much cleaner parser.
Action Items
These changes will make the parser codebase simpler, more idiomatic, and easier to maintain.
I created this issue for @leynos from #73 (comment).
Tips and commands
Getting Help
The current parser implementation contains two areas of unnecessary complexity:
token_dispatch!)balanced_block_with_minand aCellstate machineBoth can be simplified for improved readability and maintainability without sacrificing functionality.
1. Replace
token_dispatch!Macro with a MethodInstead of using the
token_dispatch!macro, introduce adispatchmethod onTokenStreamthat takes a closure. This method would loop over tokens and allow the closure to handle each token, making the code more idiomatic and easier to follow.Example Implementation:
Call-site change:
2. Replace Manual Stateful Block Parsing with Recursive Combinators
The current approach uses a
Cell-based depth counter and manual state management. This can be replaced with a recursive parser combinator (e.g., usingchumsky::recursive), which is flatter and easier to read.Example Implementation:
This approach eliminates the need for internal mutable state and results in a much cleaner parser.
Action Items
token_dispatch!macro into adispatchmethod onTokenStream.balanced_block_with_min+Cell) with a recursive combinator approach.These changes will make the parser codebase simpler, more idiomatic, and easier to maintain.
I created this issue for @leynos from #73 (comment).
Tips and commands
Getting Help