Markdown Compiler
The codebase for the project is entirely hand-coded: every line was written by a human developer. No AI-generated code was used in development. AI was used only for architectural brainstorming and for refining complex regular expressions.
Two-Pass Parsing Strategy
- Markdown Text -> AST (Abstract Syntax Tree)
  - First pass: The engine scans the document to identify high-level structural blocks (e.g., caption block, heading, list block, code block, table block, quote, unmarkable block, image, formula block) and generates a token stream.
  - Second pass: Inline parsing runs only on terminal nodes that contain raw text content (e.g., paragraph, heading, list_item, quote, unknown_text).
  - Design constraint: Inline parsing does not decompose nodes further into atomic units. Once a bold or underline token is matched, its content is stored as-is. Breaking inline tokens down to the character level adds complexity without any practical benefit.
- AST -> HTML
  - The final output is generated by a recursive tree-walk of the AST, so that nested hierarchies are translated into valid, semantic HTML.
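The pipeline above can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not the project's actual code: the `AstNode` shape, the function names (`parseBlocks`, `parseInline`, `toAST`, `render`), and the `**bold**` / `__underline__` markers are all hypothetical, and only three block types are handled.

```typescript
// Illustrative node shape (an assumption); the project's real AST types may differ.
interface AstNode {
  type: string;          // e.g. "document", "heading", "paragraph", "code_block", "bold"
  value?: string;        // raw text captured for the node
  level?: number;        // heading depth
  children?: AstNode[];
}

const FENCE = "`".repeat(3); // code fence marker, built indirectly to keep this listing readable
const HEADING = /^(#{1,6})\s+(.*)$/;

// First pass: identify high-level structural blocks, leaving their text unparsed.
function parseBlocks(src: string): AstNode[] {
  const lines = src.split("\n");
  const blocks: AstNode[] = [];
  let i = 0;
  while (i < lines.length) {
    const line = lines[i];
    const heading = HEADING.exec(line);
    if (line.startsWith(FENCE)) {
      const body: string[] = [];
      i++;
      while (i < lines.length && !lines[i].startsWith(FENCE)) body.push(lines[i++]);
      i++; // skip the closing fence
      blocks.push({ type: "code_block", value: body.join("\n") });
    } else if (heading) {
      blocks.push({ type: "heading", level: heading[1].length, value: heading[2] });
      i++;
    } else if (line.trim() === "") {
      i++; // blank lines only separate blocks
    } else {
      // Consecutive plain lines form one paragraph.
      const para: string[] = [];
      while (i < lines.length && lines[i].trim() !== "" &&
             !lines[i].startsWith(FENCE) && !HEADING.test(lines[i])) {
        para.push(lines[i++]);
      }
      blocks.push({ type: "paragraph", value: para.join(" ") });
    }
  }
  return blocks;
}

// Second pass: inline parsing. A matched token keeps its content as-is
// and is not decomposed further.
function parseInline(text: string): AstNode[] {
  const re = /\*\*(.+?)\*\*|__(.+?)__/g; // assumed markers: **bold**, __underline__
  const nodes: AstNode[] = [];
  let last = 0;
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    if (m.index > last) nodes.push({ type: "text", value: text.slice(last, m.index) });
    nodes.push({ type: m[1] !== undefined ? "bold" : "underline", value: m[1] ?? m[2] });
    last = m.index + m[0].length;
  }
  if (last < text.length) nodes.push({ type: "text", value: text.slice(last) });
  return nodes;
}

// Run both passes; code blocks are skipped by the inline pass.
function toAST(src: string): AstNode {
  const children = parseBlocks(src).map((b): AstNode =>
    b.type === "code_block" ? b : { ...b, children: parseInline(b.value ?? "") });
  return { type: "document", children };
}

const escapeHtml = (s: string): string =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");

// AST -> HTML: recursive tree-walk; children are rendered before their parent wraps them.
function render(node: AstNode): string {
  const inner = node.children
    ? node.children.map(render).join("")
    : escapeHtml(node.value ?? "");
  switch (node.type) {
    case "heading": return `<h${node.level}>${inner}</h${node.level}>`;
    case "paragraph": return `<p>${inner}</p>`;
    case "code_block": return `<pre><code>${inner}</code></pre>`;
    case "bold": return `<strong>${inner}</strong>`;
    case "underline": return `<u>${inner}</u>`;
    default: return inner; // "document" and plain "text" nodes
  }
}

const html = render(toAST("# Title\n\nSome **bold** and __underlined__ text."));
console.log(html);
// prints: <h1>Title</h1><p>Some <strong>bold</strong> and <u>underlined</u> text.</p>
```

Because the inline pass never runs on `code_block` nodes, markers inside a fenced block are emitted literally, which mirrors the rule that only text-bearing terminal nodes are parsed for inline tokens.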
The core AST generation logic is tested to ensure parsing stability and prevent regressions:
- 70% code coverage: focus is placed on critical structural components.
- Core block testing: high-priority modules (tables, headers, lists, captions, paragraphs).
- npm v10.8.2, node v20.20.0
- TypeScript v5.3.2
- Webpack v5.105.3
- TS-Loader v9.5.4
- Tailwind CSS from website v4.0.12
- Shiki v4.0.2
- KaTeX v0.16.33
- Jest v30.2.0
- Clone the repository.
- Install the dependencies:
  yarn
- Run the compiler:
  yarn build
- Start the app and open http://localhost:8081 in your browser:
  yarn start
- To run the tests, use the command:
  yarn test
Please see entrypoint ./src/index.ts for the API reference.
API functions:
- convertMDtoHTML(txt: string) - converts Markdown text into HTML; returns an HTML string
- convertMDtoTokens(txt: string) - converts Markdown text into tokens; returns an array of tokens
- convertMDtoAST(txt: string) - converts Markdown text into an AST; returns the Abstract Syntax Tree
Directories:
- ./src - the main compiler code:
  - ../test - the test code
  - ../htmlblocks - the HTML blocks used to render the AST into HTML
  - ../content - the example text to parse
  - ../static - the static files (index.html, CSS styles)
  - ../types - integration with external libraries
- /dist - the compiled code and static files; run the build command first to generate it
Files:
- ./src/index.ts - the entrypoint of the compiler
- ./src/Grammar.ts - the grammar with RegExp rules
- ./src/Tokenizer.ts - the tokenizer class that builds the AST from Markdown text
- ./src/Render.ts - the compiler class that compiles the AST into HTML
