Fix nested delimiter parsing#47
Conversation
Reviewer's GuideThis PR refactors the Class diagram for DelimiterDepth helper structclassDiagram
class DelimiterDepth {
+usize parens
+usize angles
+usize squares
+usize braces
+open_paren()
+close_paren() bool
+open_angle()
+close_angle() bool
+open_square()
+close_square() bool
+open_brace()
+close_brace() bool
+is_top_level() bool
}
Class diagram for updated parse_name_type_pairs functionclassDiagram
class parse_name_type_pairs {
+iter: Iterator
+depth: DelimiterDepth
+name: Option<String>
+buf: String
+parse logic using DelimiterDepth
}
parse_name_type_pairs --> DelimiterDepth
File-Level Changes
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
|
Warning Rate limit exceeded@leynos has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 0 minutes and 13 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (1)
Summary by CodeRabbit
Summary by CodeRabbit
WalkthroughThe Changes
Sequence Diagram(s)sequenceDiagram
participant Test as Test Function
participant Parser as parse_name_type_pairs
participant Tokenizer as Token Stream
Test->>Parser: Call with parameter string
loop For each token
Parser->>Tokenizer: Get next token
alt Token is opening delimiter
Parser->>Parser: Increment respective counter
else Token is closing delimiter
Parser->>Parser: Decrement respective counter
end
alt Token is comma and all counters are zero
Parser->>Parser: Split name/type pair
end
alt Token is closing parenthesis and all counters are zero
Parser->>Parser: End parsing
end
end
Parser-->>Test: Return parsed name/type pairs
Possibly related PRs
Poem
✨ Finishing Touches
🧪 Generate Unit Tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Hey @leynos - I've reviewed your changes - here's some feedback:
- Extract the repeated
parens == 0 && angles == 0 && squares == 0 && braces == 0check into a helper (e.g.is_top_level()) to reduce duplication and improve readability. - Encapsulate the four delimiter counters into a small struct (e.g.
DelimiterDepth) with increment/decrement and emptiness-check methods for cleaner code organization. - Consider switching to a recursive descent or small state‐machine approach for nested generics/arrays parsing to avoid manual delimiter counting and improve maintainability.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Extract the repeated `parens == 0 && angles == 0 && squares == 0 && braces == 0` check into a helper (e.g. `is_top_level()`) to reduce duplication and improve readability.
- Encapsulate the four delimiter counters into a small struct (e.g. `DelimiterDepth`) with increment/decrement and emptiness-check methods for cleaner code organization.
- Consider switching to a recursive descent or small state‐machine approach for nested generics/arrays parsing to avoid manual delimiter counting and improve maintainability.
## Individual Comments
### Comment 1
<location> `src/parser/mod.rs:1284` </location>
<code_context>
+ angles = angles.saturating_sub(1);
+ buf.push_str(t.text());
+ }
+ SyntaxKind::T_SHL => {
+ angles += 2;
+ buf.push_str(t.text());
+ }
+ SyntaxKind::T_SHR => {
</code_context>
<issue_to_address>
Handling of T_SHL and T_SHR may not be robust for nested or mixed angle brackets.
The current approach may miscount angles when << or >> are used outside of nested generics, potentially causing incorrect parsing in such edge cases. Please review whether this logic handles all relevant scenarios correctly.
</issue_to_address>
### Comment 2
<location> `src/parser/mod.rs:1264` </location>
<code_context>
}
SyntaxKind::T_RPAREN => {
- if depth == 0 {
+ if parens == 0 && angles == 0 && squares == 0 && braces == 0 {
if let Some(n) = name.take() {
let ty = buf.trim();
</code_context>
<issue_to_address>
The check for all delimiter counters being zero may not handle malformed input gracefully.
Malformed input with unbalanced delimiters could cause incorrect parsing or runtime errors. Please add error handling for these cases.
</issue_to_address>
### Comment 3
<location> `src/parser/mod.rs:1273` </location>
<code_context>
break;
}
- depth -= 1;
+ parens = parens.saturating_sub(1);
+ buf.push_str(t.text());
+ }
+ SyntaxKind::T_LT => {
</code_context>
<issue_to_address>
Use of saturating_sub prevents underflow but may mask logic errors.
Consider explicitly handling cases where the closing delimiter does not match an opening one, rather than relying on saturating_sub to prevent underflow.
Suggested implementation:
```rust
if parens == 0 {
// Handle unmatched closing parenthesis explicitly.
// You may want to log an error, return, or break as appropriate.
// For now, we'll break out of the loop.
break;
} else {
parens -= 1;
buf.push_str(t.text());
}
```
```rust
if angles == 0 {
// Handle unmatched closing angle bracket explicitly.
// You may want to log an error, return, or break as appropriate.
// For now, we'll break out of the loop.
break;
} else {
angles -= 1;
buf.push_str(t.text());
}
```
</issue_to_address>
### Comment 4
<location> `tests/parser.rs:660` </location>
<code_context>
"function complex(p: (u32, (u8, string))): bool {\n}\n"
}
+#[fixture]
+fn function_generic_params() -> &'static str {
+ "function example(arg: Vec<(u32, string)>, map: Map<string, u64>): bool {}\n"
+}
</code_context>
<issue_to_address>
Consider adding tests for more deeply nested generics and arrays.
Adding tests for cases like `Vec<Map<string, Vec<u8>>>` or `[Vec<u32>]` would help ensure the delimiter parsing logic handles nested generics and arrays correctly.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| SyntaxKind::T_LPAREN => { | ||
| depth += 1; | ||
| parens += 1; | ||
| buf.push_str(t.text()); |
There was a problem hiding this comment.
issue: Handling of T_SHL and T_SHR may not be robust for nested or mixed angle brackets.
The current approach may miscount angles when << or >> are used outside of nested generics, potentially causing incorrect parsing in such edge cases. Please review whether this logic handles all relevant scenarios correctly.
| } | ||
| SyntaxKind::T_RPAREN => { | ||
| if depth == 0 { | ||
| if parens == 0 && angles == 0 && squares == 0 && braces == 0 { |
There was a problem hiding this comment.
issue: The check for all delimiter counters being zero may not handle malformed input gracefully.
Malformed input with unbalanced delimiters could cause incorrect parsing or runtime errors. Please add error handling for these cases.
| SyntaxKind::T_LPAREN => { | ||
| depth += 1; | ||
| parens += 1; | ||
| buf.push_str(t.text()); |
There was a problem hiding this comment.
suggestion (bug_risk): Use of saturating_sub prevents underflow but may mask logic errors.
Consider explicitly handling cases where the closing delimiter does not match an opening one, rather than relying on saturating_sub to prevent underflow.
Suggested implementation:
if parens == 0 {
// Handle unmatched closing parenthesis explicitly.
// You may want to log an error, return, or break as appropriate.
// For now, we'll break out of the loop.
break;
} else {
parens -= 1;
buf.push_str(t.text());
} if angles == 0 {
// Handle unmatched closing angle bracket explicitly.
// You may want to log an error, return, or break as appropriate.
// For now, we'll break out of the loop.
break;
} else {
angles -= 1;
buf.push_str(t.text());
}| #[fixture] | ||
| fn function_generic_params() -> &'static str { |
There was a problem hiding this comment.
suggestion (testing): Consider adding tests for more deeply nested generics and arrays.
Adding tests for cases like Vec<Map<string, Vec<u8>>> or [Vec<u32>] would help ensure the delimiter parsing logic handles nested generics and arrays correctly.
|
@sourcery-ai review |
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
src/parser/mod.rs (2)
1324-1339: Verify error handling for unbalanced parentheses.The current implementation uses
close_paren()which returnsfalsefor unmatched closing parentheses and breaks the loop. However, this may not provide clear error reporting to users about malformed input.Consider adding explicit error handling or logging when
close_paren()returnsfalseto help users identify malformed parameter lists.
1340-1362: Enhanced angle bracket handling for generics.The addition of
T_LT,T_GT,T_SHL, andT_SHRhandling correctly supports generic type parsing. The special handling of double angle brackets (T_SHLopening twice,T_SHRclosing twice) is appropriate for nested generics.Verify that the T_SHL and T_SHR handling correctly manages all edge cases, particularly when these tokens appear outside of generic contexts.
#!/bin/bash # Description: Search for usage patterns of T_SHL and T_SHR in the codebase to understand potential edge cases # Expected: Find examples of these tokens and their typical usage contexts echo "Searching for T_SHL and T_SHR usage patterns:" rg -A 3 -B 3 "T_SHL|T_SHR" echo -e "\nSearching for double angle bracket patterns in source:" rg "<<|>>" --type rust
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
src/parser/mod.rs(4 hunks)tests/parser.rs(2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
`**/*.rs`: Comment why, not what. Explain assumptions, edge cases, trade-offs, o...
**/*.rs: Comment why, not what. Explain assumptions, edge cases, trade-offs, or complexity. Don't echo the obvious.
Comments must use en-GB-oxendict spelling and grammar.
Function documentation must include clear examples.
Name things precisely. Use clear, descriptive variable and function names. For booleans, prefer names with is, has, or should.
Each file should encapsulate a coherent module. Group related code (e.g., models + utilities + fixtures) close together.
Group by feature, not layer. Colocate views, logic, fixtures, and helpers related to a domain concept rather than splitting by type.
Every module must begin with a module level (//! ) comment explaining the module's purpose and utility.
Document public APIs using Rustdoc comments (///) so documentation can be generated with cargo doc.
Place function attributes after doc comments.
Do not use return in single-line functions.
Prefer immutable data and avoid unnecessary mut bindings.
Handle errors with the Result type instead of panicking where feasible.
Avoid unsafe code unless absolutely necessary and document any usage clearly.
Clippy warnings MUST be disallowed.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Prefer expect over allow.
Use predicate functions for conditional criteria with more than two branches.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error consider using Arc to reduce the amount of data returned.
Prefer .expect() over .unwrap().
📄 Source: CodeRabbit Inference Engine (AGENTS.md)
List of files the instruction was applied to:
src/parser/mod.rstests/parser.rs
`**/*.rs`: * Seek to keep the cyclomatic complexity of functions no more than 12...
**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.
Adhere to single responsibility and CQRS
Place function attributes after doc comments.
Do not use
returnin single-line functions.Move conditionals with >2 branches into a predicate function.
Avoid
unsafeunless absolutely necessary.Every module must begin with a
//!doc comment that explains the module's purpose and utility.Comments must use en-GB-oxendict spelling and grammar.
Lints must not be silenced except as a last resort.
#[allow]is forbidden.- Only narrowly scoped
#[expect(lint, reason = "...")]is allowed.- No lint groups, no blanket or file-wide suppression.
- Include
FIXME:with link if a fix is expected.Use
rstestfixtures for shared setup and to avoid repetition between tests.Replace duplicated tests with
#[rstest(...)]parameterised cases.Prefer
mockallfor mocks/stubs.Prefer
.expect()over.unwrap()Ensure that any API or behavioural changes are reflected in the documentation in
docs/Ensure that any completed roadmap steps are recorded in the appropriate roadmap in
docs/
⚙️ Source: CodeRabbit Configuration File
List of files the instruction was applied to:
src/parser/mod.rstests/parser.rs
🧬 Code Graph Analysis (1)
tests/parser.rs (1)
src/parser/mod.rs (4)
parse(223-252)errors(212-214)parameters(1802-1804)return_type(1808-1826)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: build-test
🔇 Additional comments (6)
src/parser/mod.rs (5)
1236-1298: Well-structured delimiter tracking implementation.The
DelimiterDepthstruct provides a clean abstraction for tracking nested delimiters. The consistent pattern of open/close methods for each delimiter type enhances code readability and maintainability.
1307-1307: Improved delimiter tracking initialisation.The replacement of the single
depthcounter withDelimiterDepth::default()correctly initialises tracking for all delimiter types.
1320-1322: Correct parentheses handling.The logic properly increments parentheses depth and appends the token text to the buffer.
1363-1384: Comprehensive square bracket and brace handling.The implementation correctly handles square brackets for arrays and braces for other constructs, following the same pattern as parentheses and angle brackets.
1385-1397: Correct comma and colon handling with multi-delimiter awareness.The logic properly checks
depth.is_top_level()before treating commas as parameter separators and colons as name/type separators, ensuring nested delimiters don't interfere with parameter parsing.tests/parser.rs (1)
778-845: Thorough validation of enhanced parsing capability.The test functions properly verify that:
- Parsing completes without errors
- Function count is correct
- Parameter names and types are correctly extracted
- Return types are properly identified
The tests directly validate the core functionality enhanced by the
DelimiterDepthimprovements.
|
@coderabbitai Are these comments now resolved? Overall Comments
Individual CommentsComment 1
</code_context> <issue_to_address> The current approach may miscount angles when << or >> are used outside of nested generics, potentially causing incorrect parsing in such edge cases. Please review whether this logic handles all relevant scenarios correctly. Comment 2
</code_context> <issue_to_address> Malformed input with unbalanced delimiters could cause incorrect parsing or runtime errors. Please add error handling for these cases. Comment 3
</code_context> <issue_to_address> Consider explicitly handling cases where the closing delimiter does not match an opening one, rather than relying on saturating_sub to prevent underflow. Suggested implementation: if parens == 0 {
// Handle unmatched closing parenthesis explicitly.
// You may want to log an error, return, or break as appropriate.
// For now, we'll break out of the loop.
break;
} else {
parens -= 1;
buf.push_str(t.text());
} if angles == 0 {
// Handle unmatched closing angle bracket explicitly.
// You may want to log an error, return, or break as appropriate.
// For now, we'll break out of the loop.
break;
} else {
angles -= 1;
buf.push_str(t.text());
}</issue_to_address> Comment 4 +#[fixture]
<issue_to_address> Adding tests for cases like |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 1495 🏁 Script executed: Length of output: 7759 🏁 Script executed: Length of output: 1730 @leynos Based on my analysis of the current code, most of the review comments have been resolved! Here's the status: ✅ RESOLVED Comments:Overall Comments:
Individual Comments:
|
|
Stale |
Summary
parse_name_type_pairsTesting
make fmtmake lintmake testhttps://chatgpt.com/codex/tasks/task_e_68672ddda7788322964ac58898d58af3
Summary by Sourcery
Replace the ad-hoc depth counter in parse_name_type_pairs with a dedicated DelimiterDepth tracker and enhance the parser to correctly handle nested generic (<>) and array ([]) delimiters in function parameter lists.
New Features:
Bug Fixes:
Enhancements:
Tests: