Correct terminology from AST to CST in documentation #13

Open
horner wants to merge 1 commit into Doctave:main from horner:patch-1


horner commented Oct 19, 2025

When Dossier says “AST provided by Tree-sitter”, that’s slightly imprecise. What it really means is:

Dossier uses Tree-sitter’s parse tree (a CST) as input and treats it like an AST for its analysis layer.

Tree-sitter itself always produces a Concrete Syntax Tree (CST): a complete structural representation of the source text according to the grammar, including punctuation and keyword tokens.
Dossier then walks that tree and builds its own semantic model (the JSON “Entity” schema).
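
To make the distinction concrete, here is a minimal Python sketch of walking a concrete syntax tree and deriving an entity from it. The `Node` class, the node type names, and the entity fields (`kind`, `title`, `members`) are all hypothetical stand-ins chosen for illustration; the sketch uses neither Tree-sitter's real API nor Dossier's actual Entity schema.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy CST node (hypothetical; not Tree-sitter's node type)."""
    type: str
    text: str = ""
    children: list["Node"] = field(default_factory=list)


def build_entity(node: Node) -> dict:
    """Derive a small entity dict from a `function_declaration` CST node."""
    name = next(c.text for c in node.children if c.type == "identifier")
    params = next(c for c in node.children if c.type == "parameters")
    return {
        "kind": "function",
        "title": name,
        # Punctuation tokens such as "(" and "," are CST nodes too; skip them.
        "members": [p.text for p in params.children if p.type == "identifier"],
    }


# CST for `function add(a, b)`: every token, including punctuation, is a node.
cst = Node("function_declaration", children=[
    Node("function", "function"),
    Node("identifier", "add"),
    Node("parameters", children=[
        Node("(", "("),
        Node("identifier", "a"),
        Node(",", ","),
        Node("identifier", "b"),
        Node(")", ")"),
    ]),
])

entity = build_entity(cst)
print(entity)
```

The traversal reads directly from the full parse tree and discards punctuation as it goes, which is the sense in which a CST can serve as the working AST.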

So the phrase “AST provided by Tree-sitter” is shorthand for:

  • “We use the syntax tree Tree-sitter gives us, and we call it our AST because we don’t do a separate abstraction pass.”

That’s common in many tools that:

  • don’t need to normalize or simplify syntax (e.g., doc extractors, highlighters),
  • or treat Tree-sitter’s CST as “good enough” for semantic traversal.
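
One way a tool can treat a CST as "good enough" is to skip punctuation and keyword tokens during traversal instead of building a second tree. Tree-sitter distinguishes named nodes from anonymous ones, which supports this style; the sketch below only imitates that idea with a hypothetical `is_named` flag on a toy `Node` class and does not call Tree-sitter itself.

```python
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class Node:
    """Toy CST node (hypothetical stand-in for a parser's node type)."""
    type: str
    is_named: bool = True
    children: list["Node"] = field(default_factory=list)


def named_nodes(node: Node) -> Iterator[Node]:
    """Yield named nodes in pre-order, skipping anonymous tokens."""
    if node.is_named:
        yield node
    for child in node.children:
        yield from named_nodes(child)


# CST for `(a + b)`: the parentheses and the operator are anonymous tokens.
cst = Node("parenthesized_expression", children=[
    Node("(", is_named=False),
    Node("binary_expression", children=[
        Node("identifier"),
        Node("+", is_named=False),
        Node("identifier"),
    ]),
    Node(")", is_named=False),
])

ast_view = [n.type for n in named_nodes(cst)]
print(ast_view)
```

No separate abstraction pass runs here: the AST-like view is just a filter applied while walking the CST.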

In summary

| Term | What it is | How Dossier uses it |
| --- | --- | --- |
| CST | Full parse tree from Tree-sitter, including punctuation and syntax variants | ✅ Consumed directly |
| AST | Usually a simplified, semantic model | 🚫 Not separate; Dossier reuses the CST |
| Entity JSON | Dossier’s normalized output schema | ✅ Built from CST traversal |

So when you see “AST provided by Tree-sitter,” read it as “Tree-sitter CST used as the working AST.”

Updated references from AST to CST in README.