Refactor large parser modules into smaller components under 400 lines (#223)#259
Refactor large parser modules into smaller components under 400 lines (#223)#259leynos wants to merge 22 commits into
Conversation
- Extract rule-body term classification into a dedicated module `rule/classification.rs`. - Move S-expression formatting for Expr into a separate `expr/sexpr.rs` module. - Split Pratt parser postfix parsing, delay-postfix, and diff-marker handling into separate files under `expression/pratt/`. - Remove inline implementations and validations, organizing parser logic into coherent units. - Revamp expression parser tests to use fixture modules, improving maintainability. This refactor improves code organization, readability, and test structure without changing external behavior. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
|
Note Currently processing new changes in this PR. This may take a few minutes, please wait... ⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Plus Run ID: 📒 Files selected for processing (5)
WalkthroughRefactor large parser modules into focused submodules: extract S‑expression rendering, move rule-body term classification to a new module, split Pratt postfix parsing into postfix/diff/delay helpers, migrate expression test matrices into fixture modules, add test helpers for atom markers, and update Makefile/Cargo/docs. ChangesParser refactor cohort
Sequence Diagram(s) (The changes introduce a multi-component sequential flow in Pratt postfix parsing and satisfy the sequence-diagram conditions.) sequenceDiagram
participant Parser as Pratt parser
participant Postfix as postfix.rs
participant Diff as diff.rs
participant Delay as delay.rs
Parser->>Postfix: parse_postfix(lhs)
Postfix->>Diff: handle_diff_marker() / apply_pending_diff()
Postfix->>Delay: parse_delay_postfix()
Postfix-->>Parser: final Expr (with AtomDiff/AtomDelay applied)
Possibly related issues
Possibly related PRs
Suggested labels
Caution Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional.
❌ Failed checks (2 errors)
✅ Passed checks (16 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
Reviewer's GuideRefactors parser internals by extracting S-expression formatting, rule-body classification, and Pratt postfix parsing into dedicated modules, and reworks expression tests around a reusable fixtures system without changing observable parsing behavior. Sequence diagram for the new rule body term parsing pipelinesequenceDiagram
participant RuleBodyExpression
participant RuleClassificationModule as classification
participant PatternParser as parse_pattern
participant ExpressionParser as parse_expression
participant AggregationLogic as aggregation_tracking
participant Errors as errors_vec
RuleBodyExpression->>RuleClassificationModule: parse_rule_body_term(raw, literal_span, first_aggregation_span, errors)
alt raw contains_assignment
RuleClassificationModule->>PatternParser: parse_pattern(parts.pattern)
PatternParser-->>RuleClassificationModule: Result_Pattern_or_errors
alt pattern_ok
RuleClassificationModule->>ExpressionParser: parse_expression(parts.value)
ExpressionParser-->>RuleClassificationModule: Result_Expr_or_errors
alt value_ok
RuleClassificationModule-->>RuleBodyExpression: Some_RuleBodyTerm_Assignment
else value_errors
RuleClassificationModule->>Errors: append(shifted_errors)
RuleClassificationModule-->>RuleBodyExpression: None
end
else pattern_errors
RuleClassificationModule->>Errors: append(shifted_errors)
RuleClassificationModule-->>RuleBodyExpression: None
end
else raw_is_expression
RuleClassificationModule->>ExpressionParser: parse_expression(trimmed_raw)
ExpressionParser-->>RuleClassificationModule: Result_Expr_or_errors
alt expr_ok
RuleClassificationModule->>AggregationLogic: classify_expression(expr, ctx)
AggregationLogic-->>RuleClassificationModule: Option_RuleBodyTerm
RuleClassificationModule-->>RuleBodyExpression: Option_RuleBodyTerm
else expr_errors
RuleClassificationModule->>Errors: append(errors)
RuleClassificationModule-->>RuleBodyExpression: None
end
end
Class diagram for rule body classification and Expr S-expression formattingclassDiagram
class Expr {
+to_sexpr() String
}
class Literal {
}
class Pattern {
+to_source() String
}
class MatchArm {
pattern Pattern
body Expr
}
class RuleBodyTerm {
}
class RuleAssignment {
pattern Pattern
value Expr
}
class RuleAggregation {
project Expr
key Expr
source AggregationSource
}
class RuleForLoop {
pattern Pattern
iterable Expr
guard Expr
body_terms Vec_RuleBodyTerm_
}
class AggregationSource {
<<enum>>
GroupBy
LegacyAggregate
}
class ClassificationContext {
-literal_span Span
-first_aggregation_span Option_Span_
-errors Vec_SimpleSyntaxKind__
}
class ForLoopComponents {
pattern Pattern
iterable Expr
guard Option_Expr_
body Expr
}
class RuleClassifier {
+parse_rule_body_term(raw String, literal_span Span, first_aggregation_span Option_Span_, errors Vec_SimpleSyntaxKind__) Option_RuleBodyTerm_
-parse_assignment(parts AssignmentParts, literal_span Span) Result_Option_RuleBodyTerm__Vec_SimpleSyntaxKind___
-classify_expression(expr Expr, ctx ClassificationContext) Option_RuleBodyTerm_
-classify_aggregation_with_tracking(args Vec_Expr_, source AggregationSource, ctx ClassificationContext) Option_RuleBodyTerm_
-classify_for_loop(components ForLoopComponents, ctx ClassificationContext) RuleBodyTerm
-classify_for_body_with_aggregation_tracking(body Expr, ctx ClassificationContext) Vec_RuleBodyTerm_
-aggregation_source_for(name String) Option_AggregationSource_
-invocation_aggregation_source(callee Expr) Option_AggregationSource_
-validate_aggregation(term RuleBodyTerm, literal_span Span, first_aggregation_span Option_Span_, errors Vec_SimpleSyntaxKind__) bool
-aggregation_arity_error(literal_span Span, source AggregationSource) Vec_SimpleSyntaxKind__
-multiple_aggregations_error(first_span Span, second_span Span) SimpleSyntaxKind
}
Expr --> Literal
Expr --> Pattern
Expr --> MatchArm
RuleBodyTerm <.. RuleAssignment
RuleBodyTerm <.. RuleAggregation
RuleBodyTerm <.. RuleForLoop
RuleForLoop "*" o-- "body_terms" RuleBodyTerm
RuleAggregation --> AggregationSource
RuleClassifier ..> ClassificationContext
RuleClassifier ..> ForLoopComponents
RuleClassifier ..> RuleBodyTerm
RuleClassifier ..> RuleAggregation
RuleClassifier ..> RuleAssignment
ClassificationContext --> Span
ClassificationContext --> AggregationSource
ForLoopComponents --> Pattern
ForLoopComponents --> Expr
File-Level Changes
Assessment against linked issues
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
|
@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix. Include the file and symbol names indicated in the issue at the head of your response. src/parser/tests/expression/fixtures/errors.rs Comment on lines +217 to +232 fn collection_error_cases() -> Vec<CountedErrorCase> {
vec![
CountedErrorCase {
src: "[1, 2",
min_errs: 1,
},
CountedErrorCase {
src: "{a: 1",
min_errs: 1,
},
CountedErrorCase {
src: "{a, b}",
min_errs: 1,
},
]
}❌ New issue: Code Duplication |
|
@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix. Include the file and symbol names indicated in the issue at the head of your response. src/parser/expression/pratt.rs Comment on file use chumsky::error::Simple;
use crate::parser::ast::{Expr, StringLiteral};
use crate::parser::span_utils::parse_u32_decimal;❌ New issue: Code Duplication |
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
Share the setup and teardown flow used by the Pratt parser's struct literal activation and suspension helpers. Keep the existing public helper signatures and delegate their state changes through one private scope method.
Default `CARGO` to the executable found on `PATH`, with a fallback to the standard Rustup installation path under `HOME`. This lets hook environments that omit the Cargo bin directory still run the repository Makefile targets.
Add a private helper for expression parser error cases that all expect a single minimum error. Keep the existing case order and fixture function signatures while removing repeated `CountedErrorCase` construction.
|
@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix. Include the file and symbol names indicated in the issue at the head of your response. Ensure that this is validated against the current version of the codegraph. If further refinement to address this finding would be deleterious, please supply a clear explanatory one to two paragraph markdown message I can paste into the CodeScene web ui's diagnostic suppression function so this diagnostic can be silenced. Code Duplicationsrc/parser/tests/expression/fixtures/errors.rs: What lead to degradation?The module contains 4 functions with similar structure: collection_error_cases,control_flow_error_cases,operator_error_cases,postfix_counted_error_cases Why does this problem occur?Duplicated code often leads to code that's harder to change since the same logical change has to be done in multiple functions. More duplication gives lower code health. How to fix it?A certain degree of duplicated code might be acceptable. The problems start when it is the same behavior that is duplicated across the functions in the module, ie. a violation of the Don't Repeat Yourself (DRY) principle. DRY violations lead to code that is changed together in predictable patterns, which is both expensive and risky. DRY violations can be identified using CodeScene's X-Ray analysis to detect clusters of change coupled functions with high code similarity. Read More |
This comment was marked as resolved.
This comment was marked as resolved.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/parser/ast/rule/classification.rs`:
- Line 61: The check that treats missing assignment values uses
parts.value.is_empty(), which misses whitespace-only values like "x = ";
update the logic to trim the string before checking emptiness (e.g., use
parts.value.trim().is_empty()) so that the code path in the classification logic
(where parts.value is inspected) treats whitespace-only assignments as missing
expressions and triggers the targeted diagnostic instead of falling through to a
generic parse error.
In `@src/parser/tests/expression/fixtures/valid/postfix.rs`:
- Around line 9-67: Add two happy-path fixtures to the
postfix_expression_cases() vec: one for the apostrophe diff postfix (e.g., src
"e'") and one for the delay postfix (e.g., src "f-<1>"), each as an
ExpressionCase entry; set their expected AST using the same test helper pattern
used elsewhere (e.g., postfix_diff(var("e")) and postfix_delay(var("f"),
lit_num("1")) or the equivalent helper names in the tests) so the postfix parser
branches exercised by src/parser/expression/pratt/postfix.rs (apostrophe diff
handling and -<...> delay handling) are asserted to succeed and cannot regress
silently.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 0ae55951-bb4d-44a2-a93e-b1901d4665ef
📒 Files selected for processing (18)
Makefilesrc/parser/ast/expr.rssrc/parser/ast/expr/sexpr.rssrc/parser/ast/rule.rssrc/parser/ast/rule/classification.rssrc/parser/expression/pratt.rssrc/parser/expression/pratt/delay.rssrc/parser/expression/pratt/diff.rssrc/parser/expression/pratt/postfix.rssrc/parser/tests/expression.rssrc/parser/tests/expression/fixtures.rssrc/parser/tests/expression/fixtures/errors.rssrc/parser/tests/expression/fixtures/valid.rssrc/parser/tests/expression/fixtures/valid/basic.rssrc/parser/tests/expression/fixtures/valid/collections.rssrc/parser/tests/expression/fixtures/valid/control_flow.rssrc/parser/tests/expression/fixtures/valid/postfix.rssrc/parser/tests/expression/fixtures/valid/structured.rs
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
|
@coderabbitai please create new GitHub issues for the following: Out-of-scope: GitHub issues requiredDocstring Coverage (
|
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 88
Both include backlinks to PR |
…ule docs Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Add `docs/developers-guide.md` for the issue `#223` parser split and update `docs/parser-implementation-notes.md` and `docs/contents.md` so the ownership notes and index stay aligned with the current module layout.
Document the extracted `pub(super)` parser helpers and add focused module-local tests for the rule classifier and Pratt postfix boundaries. Add inline `insta` snapshots for `Expr::to_sexpr()` so the extracted S-expression formatter has snapshot coverage without introducing separate snapshot fixture files.
Align the `AtomDiff` and `AtomDelay` helper docs with the postfix node constructors used by parser fixtures.
Remove the extra comment above `postfix_expression_cases` so the fixture module body matches the focused helper split requested for the large-method cleanup.
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/parser/expression/pratt/postfix.rs (1)
151-151:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winBind the discarded
next_tok()result explicitly.This comma-advance statement discards the
Optionreturn value without explicit binding. The other three call sites in this file now uselet _ = self.ts.next_tok();but this one was missed.Proposed fix
- self.ts.next_tok(); + let _ = self.ts.next_tok();🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/parser/expression/pratt/postfix.rs` at line 151, Replace the lone call that discards the Option returned from self.ts.next_tok() with an explicit binding like the other call sites; locate the instance in src/parser/expression/pratt/postfix.rs (inside the same function where other next_tok() calls are used) and change the statement to bind the result (e.g., let _ = self.ts.next_tok();) so the Option is explicitly handled and the pattern matches the other three call sites.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@src/parser/expression/pratt/postfix.rs`:
- Line 151: Replace the lone call that discards the Option returned from
self.ts.next_tok() with an explicit binding like the other call sites; locate
the instance in src/parser/expression/pratt/postfix.rs (inside the same function
where other next_tok() calls are used) and change the statement to bind the
result (e.g., let _ = self.ts.next_tok();) so the Option is explicitly handled
and the pattern matches the other three call sites.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 1cdb6394-c399-40ae-990b-71f8e787089c
📒 Files selected for processing (11)
Cargo.tomldocs/contents.mddocs/developers-guide.mddocs/parser-implementation-notes.mdsrc/parser/ast/expr/sexpr.rssrc/parser/ast/rule/classification.rssrc/parser/expression/pratt/delay.rssrc/parser/expression/pratt/diff.rssrc/parser/expression/pratt/postfix.rssrc/parser/tests/expression/fixtures/valid/postfix.rssrc/test_util/expressions.rs
Make the comma consume in `parse_args` explicit by binding the returned `Option`, matching the other intentional token-consumption sites in the postfix parser.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/parser/expression/pratt/postfix.rs`:
- Around line 171-188: The new tests miss exercising the diff and delay branches
in parse_postfix, so add unit tests that call parse_postfix (via
parse_expression) to cover the "'" (diff) postfix and the "-<" (delay) postfix,
including a test that sets up a pending diff and then ensures a delay is
rejected (verifying the guard that rejects delay while a diff is pending);
specifically add tests that assert the parsed AST or error messages for the "'"
and "-<" cases and one that confirms parse_postfix returns the expected error
when a delay is attempted while a diff is pending, referencing parse_postfix,
diff postfix, and delay postfix in your assertions.
- Around line 163-166: The module-wide Clippy suppression on the tests module is
too broad; remove the inner attribute `#![expect(clippy::expect_used, ...)]`
from the `mod tests` declaration and instead add item-level
`#[expect(clippy::expect_used, reason = "...")]` attributes to each test
function that calls `expect()`/`expect_err()` (for example
`qualified_callee_uses_call_postfix` and `trailing_argument_comma_is_rejected`)
so the lint is narrowly scoped to only those test items.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 79925021-05af-4cec-8d42-9af29b36a9d4
📒 Files selected for processing (1)
src/parser/expression/pratt/postfix.rs
Add focused `parse_expression` tests for the Pratt postfix diff and delay paths, including the pending-diff guard before delay parsing. Narrow the `expect_used` Clippy expectations to the individual test cases that need them instead of suppressing the whole test module.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/parser/expression/pratt/postfix.rs`:
- Around line 151-155: Capture the separator token's span before calling
self.ts.next_tok() and use that span when calling self.ts.push_error so the
diagnostic points at the comma instead of `)`/EOF; specifically, in the postfix
parsing code around the trailing-comma check, call something like let sep_span =
self.ts.peek_span().unwrap_or_else(|| self.ts.eof_span()) (or an equivalent that
inspects the separator) before invoking self.ts.next_tok(), then pass sep_span
into self.ts.push_error(sep_span, "unexpected trailing comma in argument list")
instead of using the later peek_span()/eof_span.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 5d60a935-9967-447d-8e3e-02cc15596718
📒 Files selected for processing (1)
src/parser/expression/pratt/postfix.rs
Capture the comma span before consuming it in Pratt postfix argument parsing, then use that span for the trailing-comma diagnostic. Update the call parsing regression expectation so it verifies the diagnostic now points at the comma rather than the closing parenthesis.
Capture consumed postfix separator spans explicitly and report trailing argument commas at the comma token. Remove the defensive diff-marker EOF fallback so the helper exposes the fallible token consume directly. Strengthen parser tests by checking the wrapped diff expression, assignment pattern and value details, and S-expression rendering beside snapshots.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
Summary
Changes
Core Refactor
Tests and Fixtures
Miscellaneous
How to test
Notes
◳ Generated by DevBoxer ◰
ℹ️ Tag @devboxerhub to ask questions and address PR feedback
📎 Task: https://www.devboxer.com/task/4e7f979f-30bc-45b5-844c-4e890802856c
📝 Closes #223
Summary by Sourcery
Refactor parser internals and tests into smaller, focused modules while preserving existing parsing behavior.
Enhancements:
Tests: