25 commits
8802bab
refactor(parser): modularize rule classification and expression parsing
leynos Apr 23, 2026
d916021
Extract struct literal scope handling
leynos May 1, 2026
4257685
Resolve Cargo for non-login make runs
leynos May 1, 2026
6d8c068
Share counted expression error fixtures
leynos May 1, 2026
147aea3
Fix whitespace-only assignment value diagnostics and expand pratt mod…
leynos May 3, 2026
7082666
Relax diff-marker early peek-and-error to defer validation to postfix…
leynos May 4, 2026
a7a827e
Apply cargo fmt to expression_postfix.rs
leynos May 4, 2026
ccb5ff8
Add atom_diff and atom_delay test helpers with postfix fixture coverage
leynos May 4, 2026
27ec6f6
Add S'(x) and A()-<10> postfix fixture cases
leynos May 4, 2026
1c531ff
Fix pratt postfix architectural violations and record issue #223 in r…
leynos May 4, 2026
e36fb4c
Split postfix expression fixture groups
leynos May 7, 2026
bd73231
Limit Markdown lint to branch changes
leynos May 7, 2026
15f3f2d
Propagate Markdown lint setup failures
leynos May 7, 2026
7de8f93
Preserve Markdown lint failures
leynos May 7, 2026
0549ba9
Document parser module boundaries (#223)
leynos May 19, 2026
333a035
Add parser module regression tests (#223)
leynos May 19, 2026
82aad29
Clarify AtomDiff test helpers
leynos May 19, 2026
841ef5f
Align postfix fixture helper body
leynos May 19, 2026
a0508e5
Bind comma token consume in postfix parser
leynos May 19, 2026
1a6f9b8
Cover postfix diff and delay branches
leynos May 19, 2026
bdd4bd9
Report trailing comma errors at the separator
leynos May 25, 2026
324810e
Tighten postfix parser test coverage (#223)
leynos May 25, 2026
dab06f2
Strengthen postfix parser assertions (#223)
leynos May 26, 2026
de3e89f
Narrow classification test lint expectations (#223)
leynos May 26, 2026
81c28f5
Cover parser fixture docs and postfix helpers (#223)
leynos May 26, 2026
1 change: 1 addition & 0 deletions Cargo.toml
@@ -22,6 +22,7 @@ test-support = []

[dev-dependencies]
rstest = "0.25"
insta = "1.47.2"
ddlint = { path = ".", features = ["test-support"] }

[lints.clippy]
10 changes: 8 additions & 2 deletions Makefile
@@ -2,7 +2,7 @@
nixie tools

APP ?= ddlint
-CARGO ?= cargo
+CARGO ?= $(or $(shell command -v cargo 2>/dev/null),$(HOME)/.cargo/bin/cargo)
BUILD_JOBS ?=
CLIPPY_FLAGS ?= --all-targets --all-features -- -D warnings
MDLINT ?= markdownlint
@@ -45,7 +45,13 @@ check-fmt: ## Verify formatting
$(CARGO) fmt --all -- --check

markdownlint: ## Lint Markdown files
-	find . -type f -name '*.md' -not -path './target/*' -print0 | xargs -0 $(MDLINT)
+	git rev-parse --verify origin/main >/dev/null
+	@set -e; \
+	tmp=$$(mktemp); \
+	trap 'rm -f "$$tmp"' EXIT; \
+	git diff --name-only -z --diff-filter=ACMRT origin/main...HEAD -- \
+		'*.md' '*.markdown' '*.mdx' > "$$tmp"; \
+	if [ -s "$$tmp" ]; then xargs -0 $(MDLINT) < "$$tmp"; fi

nixie: ## Validate Mermaid diagrams
find . -type f -name '*.md' -not -path './target/*' -print0 | xargs -0 $(NIXIE)
2 changes: 2 additions & 0 deletions docs/contents.md
@@ -14,6 +14,8 @@ This page lists the current documents in the `docs/` directory.
roadmap for the `ddlint` linter.
- [Documentation style guide](./documentation-style-guide.md): Authoring
conventions for Markdown structure, spelling, formatting, and ADR content.
- [Developer guide](./developers-guide.md): Parser module boundaries and
ownership notes for the current implementation split.
- [Differential Datalog parser syntax specification updated](./differential-datalog-parser-syntax-spec-updated.md):
Normative reference for DDlog syntax, lexer rules, operator precedence, and
parser desugarings.
56 changes: 56 additions & 0 deletions docs/developers-guide.md
@@ -0,0 +1,56 @@
# Developer guide

This guide records the parser module structure introduced by issue `#223`.
It is intentionally narrow and documents ownership boundaries rather than the
full parsing pipeline.

## Parser module structure

### `src/parser/ast/expr/sexpr.rs`

- Owns S-expression rendering helpers for `Expr`.
- Supports test and fixture comparisons without coupling callers to debug
formatting.
- Should remain presentation-only; parsing and semantic classification belong
elsewhere.

### `src/parser/ast/rule/classification.rs`

- Owns rule-body term classification for raw literals.
- Handles assignment parsing, aggregation detection, and `for`-loop lowering
within the rule-body helper path.
- Keeps rule-body classification separate from the public `Rule` wrapper so
`rule.rs` stays focused on the surface API.

### `src/parser/expression/pratt/postfix.rs`

- Owns postfix dispatch for the Pratt parser.
- Routes function calls, bit slices, field access, tuple indexing, method
calls, and delay postfixes to the appropriate helper.
- Coordinates the pending diff-marker state across the postfix chain.

### `src/parser/expression/pratt/diff.rs`

- Owns diff-marker tracking and validation.
- Wraps completed postfix expressions in `Expr::AtomDiff` when a diff marker
is pending.
- Emits the targeted diagnostics for duplicate, misplaced, or dangling diff
markers.

### `src/parser/expression/pratt/delay.rs`

- Owns `expr -<N>` postfix parsing.
- Consumes the `-<` token pair, reads the delay literal, and returns
`Expr::AtomDelay` on success.
- Keeps delay-specific validation separate from the generic postfix loop.
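
As a rough illustration of how the three Pratt submodules described above could cooperate, the sketch below models a postfix chain over a toy token stream. It is not the crate's implementation: `Tok`, `Parser`, `parse_tokens`, the `u32` delay type, and the error strings are all hypothetical, and only the `Variable`, `Call`, `AtomDiff`, and `AtomDelay` variant names are taken from `Expr`.

```rust
// Hypothetical, simplified model of the postfix chain; not the crate's real API.
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    Ident(String),
    Apostrophe, // diff marker, as in S'(x)
    LParen,
    RParen,
    Comma,
    MinusLt, // stands in for the `-<` token pair
    Number(u32),
    Gt,
}

#[derive(Debug, PartialEq)]
enum Expr {
    Variable(String),
    Call { callee: Box<Expr>, args: Vec<Expr> },
    AtomDiff { expr: Box<Expr> },
    AtomDelay { delay: u32, expr: Box<Expr> },
}

struct Parser {
    toks: Vec<Tok>,
    pos: usize,
}

impl Parser {
    fn peek(&self) -> Option<Tok> {
        self.toks.get(self.pos).cloned()
    }

    fn bump(&mut self) -> Option<Tok> {
        let tok = self.peek();
        if tok.is_some() {
            self.pos += 1;
        }
        tok
    }

    // Dispatcher: routes each postfix form and tracks the pending diff marker.
    fn parse_postfix(&mut self) -> Result<Expr, String> {
        let mut expr = match self.bump() {
            Some(Tok::Ident(name)) => Expr::Variable(name),
            other => return Err(format!("expected identifier, found {:?}", other)),
        };
        let mut pending_diff = false;
        loop {
            match self.peek() {
                Some(Tok::Apostrophe) if pending_diff => {
                    return Err("duplicate diff marker".to_string());
                }
                Some(Tok::Apostrophe) => {
                    self.bump();
                    pending_diff = true;
                }
                Some(Tok::LParen) => {
                    expr = self.parse_call(expr)?;
                    if pending_diff {
                        // Wrap the completed postfix expression.
                        expr = Expr::AtomDiff { expr: Box::new(expr) };
                        pending_diff = false;
                    }
                }
                Some(Tok::MinusLt) if !pending_diff => expr = self.parse_delay(expr)?,
                _ => break,
            }
        }
        if pending_diff {
            return Err("dangling diff marker".to_string());
        }
        Ok(expr)
    }

    fn parse_call(&mut self, callee: Expr) -> Result<Expr, String> {
        self.bump(); // consume `(`
        let mut args = Vec::new();
        if self.peek() != Some(Tok::RParen) {
            loop {
                args.push(self.parse_postfix()?);
                if self.peek() == Some(Tok::Comma) {
                    self.bump();
                } else {
                    break;
                }
            }
        }
        match self.bump() {
            Some(Tok::RParen) => Ok(Expr::Call { callee: Box::new(callee), args }),
            other => Err(format!("expected `)`, found {:?}", other)),
        }
    }

    // Delay helper: consumes `-<`, the delay literal, and the closing `>`.
    fn parse_delay(&mut self, expr: Expr) -> Result<Expr, String> {
        self.bump(); // consume `-<`
        let delay = match self.bump() {
            Some(Tok::Number(n)) => n,
            other => return Err(format!("expected delay literal, found {:?}", other)),
        };
        match self.bump() {
            Some(Tok::Gt) => Ok(Expr::AtomDelay { delay, expr: Box::new(expr) }),
            other => Err(format!("expected `>`, found {:?}", other)),
        }
    }
}

fn parse_tokens(toks: Vec<Tok>) -> Result<Expr, String> {
    let mut parser = Parser { toks, pos: 0 };
    let expr = parser.parse_postfix()?;
    match parser.peek() {
        None => Ok(expr),
        Some(tok) => Err(format!("unexpected trailing token {:?}", tok)),
    }
}
```

In this toy model, the tokens for `S'(x)` yield an `AtomDiff` wrapping the call, the tokens for `A()-<10>` yield an `AtomDelay` with delay `10`, and a marker that is never followed by a call is rejected as dangling. Keeping the diff flag in the dispatcher while the delay logic lives in its own function is one way to mirror the split between `postfix.rs`, `diff.rs`, and `delay.rs`.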

## Boundary rules

- Keep formatting helpers in `sexpr.rs` rather than mixing them into the core
expression parser.
- Keep rule-body classification in `classification.rs` rather than adding
helper-stage logic to `rule.rs`.
- Keep postfix dispatch in `postfix.rs`; add new postfix behaviour there only
when it needs shared chain state.
- Keep diff-marker state and delay parsing in their dedicated submodules so
`pratt.rs` remains the central parser entry point.
26 changes: 24 additions & 2 deletions docs/parser-implementation-notes.md
@@ -14,6 +14,9 @@ code review. It is intentionally non-normative.
architecture and public contracts.
- This document is the worked implementation guide for current module mapping,
migration sequencing, and non-regression invariants.
- `docs/developers-guide.md` records the parser module boundaries introduced by
issue `#223` and should be kept in sync with this document when those
boundaries move.
- If behaviour differs from the syntax spec, track that mismatch in
`docs/parser-conformance-register.md` rather than duplicating decision text
here.
@@ -31,6 +34,18 @@ The parser stack is split into three cooperating layers:
- AST wrappers and expression parsing (`src/parser/ast/*`,
`src/parser/expression/*`): build typed views and expression trees.

Issue `#223` sharpened the internal ownership boundaries inside the AST and
expression layers:

- `src/parser/ast/expr/sexpr.rs` owns S-expression rendering helpers for
`Expr` and is presentation-only.
- `src/parser/ast/rule/classification.rs` owns rule-body term classification
and the aggregation-tracking path used by `Rule` helpers.
- `src/parser/expression/pratt/postfix.rs`,
`src/parser/expression/pratt/diff.rs`, and
`src/parser/expression/pratt/delay.rs` split the Pratt postfix chain into a
dispatcher plus two focused helpers for diff markers and delay postfixes.

Current pipeline guarantees are intentionally narrow:

- `parse()` builds the CST-backed `Parsed` result, collects top-level `for`
@@ -101,6 +116,9 @@ The following invariants must hold throughout the crate split:
Primary implementation points:

- `src/parser/expression/pratt.rs`
- `src/parser/expression/pratt/postfix.rs`
- `src/parser/expression/pratt/diff.rs`
- `src/parser/expression/pratt/delay.rs`
- `src/parser/expression/qualified.rs`

### Struct-literal guard
@@ -171,6 +189,7 @@ Implementation points:

- `src/parser/span_scanners/rules.rs`
- `src/parser/ast/rule.rs`
- `src/parser/ast/rule/classification.rs`

Aggregation and lowering stage boundaries are tracked in the conformance
register. The aggregation boundary itself is now fixed as helper-stage in the
@@ -287,8 +306,8 @@ Important invariants:
rejects capitalized names (e.g., `Foo`) as invalid.
- The span scanner keeps non-`extern` rejection separate from the
output-signature check and emits the targeted diagnostic
-  `transformer declarations require ':' followed by at least one output identifier`
-  when the colon or first output identifier is missing.
+  `transformer declarations require ':' followed by at least one output
+  identifier` when the colon or first output identifier is missing.

These helpers are shared intentionally to keep declaration parsing consistent
across top-level constructs.
@@ -346,10 +365,13 @@ token names or their human-readable equivalents.
- Tokenization and keyword policy: `src/tokenizer.rs`
- Entry parse orchestration: `src/parser/mod.rs`
- Pratt parser: `src/parser/expression/pratt.rs`
- Pratt postfix helpers:
`src/parser/expression/pratt/{postfix,diff,delay}.rs`
- Prefix/infix helpers: `src/parser/expression/*.rs`
- Centralized diagnostic messages: `src/parser/error_messages.rs`
- Rule span scanning: `src/parser/span_scanners/rules.rs`
- Top-level scanners: `src/parser/span_scanners/*.rs`
- AST wrappers: `src/parser/ast/*.rs`
- Rule-body classification helpers: `src/parser/ast/rule/classification.rs`
- Shared parse utilities: `src/parser/ast/parse_utils/*.rs`
- Test assertion helpers: `src/test_util/assertions.rs`
9 changes: 9 additions & 0 deletions docs/roadmap.md
@@ -120,6 +120,15 @@ split parser library surfaces defined in `docs/adr-001-parser-crate-split.md`.
- [x] 2.3.2. Parse aggregation and FlatMap constructs in rule bodies.
See docs/differential-datalog-parser-syntax-spec-updated.md §5.12 and
docs/differential-datalog-parser-syntax-spec-updated.md §6.1.
- [x] 2.3.3. Refactor oversized parser modules to satisfy the 400-line
maintainability guideline (issue `#223`). Split `rule.rs`, `expr.rs`, `pratt.rs`,
and the expression test module into focused submodules:
`src/parser/ast/rule/classification.rs`,
`src/parser/ast/expr/sexpr.rs`,
`src/parser/expression/pratt/delay.rs`,
`src/parser/expression/pratt/diff.rs`,
`src/parser/expression/pratt/postfix.rs`, and
`src/parser/tests/expression/fixtures/`. See PR `#259`.

### 2.4. Syntax-spec lexical and expression conformance

147 changes: 2 additions & 145 deletions src/parser/ast/expr.rs
@@ -5,6 +5,8 @@
//! can inspect without re-parsing tokens.
use std::fmt;

mod sexpr;

use super::number::NumberLiteral;
use super::pattern::Pattern;
use super::string_literal::StringLiteral;
@@ -308,148 +310,3 @@ pub enum Expr {
/// for later lowering stages, not current parser behaviour.
MapLit(Vec<(Expr, Expr)>),
}
impl Expr {
    /// Display the expression as a simple S-expression for tests.
    #[must_use]
    pub fn to_sexpr(&self) -> String {
        match self {
            Self::Literal(Literal::Number(n)) => n.to_sexpr(),
            Self::Literal(Literal::String(s)) => s.to_sexpr(),
            Self::Literal(Literal::Bool(b)) => b.to_string(),
            Self::Variable(name) => name.clone(),
            Self::Apply { callee, args } | Self::Call { callee, args } => format_nary(
                "call",
                std::iter::once(callee.to_sexpr()).chain(args.iter().map(Self::to_sexpr)),
            ),
            Self::MethodCall { recv, name, args } => format_nary(
                "method",
                std::iter::once(recv.to_sexpr())
                    .chain(std::iter::once(name.clone()))
                    .chain(args.iter().map(Self::to_sexpr)),
            ),
            Self::FieldAccess { expr, field } => {
                format_nary("field", [expr.to_sexpr(), field.clone()])
            }
            Self::TupleIndex { expr, index } => {
                format_nary("tuple-index", [expr.to_sexpr(), index.clone()])
            }
            Self::BitSlice { expr, hi, lo } => {
                format_nary("bitslice", [expr.to_sexpr(), hi.to_sexpr(), lo.to_sexpr()])
            }
            Self::Struct { name, fields } => format_nary(
                "struct",
                std::iter::once(name.clone())
                    .chain(fields.iter().map(|(n, e)| format_field(n, &e.to_sexpr()))),
            ),
            Self::Tuple(items) => format_nary("tuple", items.iter().map(Self::to_sexpr)),
            Self::Closure { params, body } => format_nary(
                "closure",
                [format!("({})", params.join(" ")), body.to_sexpr()],
            ),
            Self::IfElse {
                condition,
                then_branch,
                else_branch,
            } => format_if_else(condition, then_branch, else_branch),
            Self::Unary { op, expr } => format_nary(op.symbol(), std::iter::once(expr.to_sexpr())),
            Self::AtomDiff { expr } => format_nary("diff", [expr.to_sexpr()]),
            Self::AtomDelay { delay, expr } => {
                format_nary("delay", [delay.to_string(), expr.to_sexpr()])
            }
            Self::Binary { op, lhs, rhs } => {
                format_nary(op.symbol(), [lhs.to_sexpr(), rhs.to_sexpr()])
            }
            Self::Group(e) => format_nary("group", std::iter::once(e.to_sexpr())),
            Self::ForLoop {
                pattern,
                iterable,
                guard,
                body,
            } => format_for_loop(pattern, iterable, guard.as_deref(), body),
            Self::Match { scrutinee, arms } => format_match(scrutinee, arms),
            Self::Break => "(break)".to_string(),
            Self::Continue => "(continue)".to_string(),
            Self::Return { value } => format_nary("return", [value.to_sexpr()]),
            Self::VecLit(items) => format_nary("vec", items.iter().map(Self::to_sexpr)),
            Self::MapLit(entries) => {
                format_nary("map", entries.iter().map(|(k, v)| format_kv(k, v)))
            }
        }
    }
}

#[inline]
fn format_field(name: &str, value: &str) -> String {
    let mut s = String::with_capacity(name.len() + value.len() + 3);
    s.push('(');
    s.push_str(name);
    s.push(' ');
    s.push_str(value);
    s.push(')');
    s
}

#[inline]
fn format_kv(key: &Expr, value: &Expr) -> String {
    let k = key.to_sexpr();
    let v = value.to_sexpr();
    let mut s = String::with_capacity(k.len() + v.len() + 10);
    s.push_str("(entry ");
    s.push_str(&k);
    s.push(' ');
    s.push_str(&v);
    s.push(')');
    s
}

fn format_nary<I>(label: &str, parts: I) -> String
where
    I: IntoIterator<Item = String>,
{
    let parts_vec: Vec<String> = parts.into_iter().collect();
    let cap = 2 + label.len() + parts_vec.iter().map(|p| 1 + p.len()).sum::<usize>();
    let mut out = String::with_capacity(cap);
    out.push('(');
    out.push_str(label);
    for part in parts_vec {
        out.push(' ');
        out.push_str(&part);
    }
    out.push(')');
    out
}

fn format_if_else(condition: &Expr, then_branch: &Expr, else_branch: &Expr) -> String {
    format_nary(
        "if",
        [
            condition.to_sexpr(),
            then_branch.to_sexpr(),
            else_branch.to_sexpr(),
        ],
    )
}

fn format_for_loop(
    pattern: &Pattern,
    iterable: &Expr,
    guard: Option<&Expr>,
    body: &Expr,
) -> String {
    let mut parts = vec![pattern.to_source(), iterable.to_sexpr()];
    if let Some(cond) = guard {
        parts.push(cond.to_sexpr());
    }
    parts.push(body.to_sexpr());
    format_nary("for", parts)
}

fn format_match(scrutinee: &Expr, arms: &[MatchArm]) -> String {
    let arm_parts = arms
        .iter()
        .map(|arm| format_nary("arm", [arm.pattern.to_source(), arm.body.to_sexpr()]));
    format_nary(
        "match",
        std::iter::once(scrutinee.to_sexpr()).chain(arm_parts),
    )
}