meugenom/markdown-ts-compiler

                        _       _
   _ __ ___   __ _ _ __| | ____| | _____      ___ __  
  | '_ ` _ \ / _` | '__| |/ / _` |/ _ \ \ /\ / / '_ \ 
  | | | | | | (_| | |  |   < (_| | (_) \ V  V /| | | |
  |_| |_| |_|\__,_|_|  |_|\_\__,_|\___/ \_/\_/ |_| |_|

                                  _ _       
         ___ ___  _ __ ___  _ __ (_) | ___ _ __ 
        / __/ _ \| '_ ` _ \| '_ \| | |/ _ \ '__|
       | (_| (_) | | | | | | |_) | | |  __/ |
        \___\___/|_| |_| |_| .__/|_|_|\___|_|
                           |_|               

Markdown Typescript Compiler

DEMO

IMPORTANT!

The codebase for this project is entirely hand-coded: every line was written by a human developer. No AI-generated code was used in its development. AI was used only for architectural brainstorming and for refining complex regular expressions.

Core Architecture

Two-Pass Parsing Strategy

  1. Markdown Text -> AST (Abstract Syntax Tree)
    • First-Pass: The engine scans the document to identify high-level structural blocks (e.g., caption block, heading, list block, code block, table block, quote, unmarkable block, image, formula block) and generates a token stream.
    • Second-Pass: Inline parsing runs only on terminal nodes that contain raw text content (e.g., paragraph, heading, list_item, quote, unknown_text).
    • Design Constraint: Inline Parsing does not decompose nodes further into atomic units. Once a bold or underline token is matched, its content is stored as-is. Breaking inline tokens down to the character level adds complexity without any practical benefit.
  2. AST -> HTML
    • The final output is generated via recursive tree-walk of the AST, ensuring that complex hierarchies are translated into valid, semantic HTML.
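
The two passes and the tree walk described above can be illustrated with a small, self-contained sketch. It is not the project's code: the `AstNode` shape and the functions `parseBlocks`, `parseInline`, `render`, and `compile` are hypothetical names chosen for illustration. It handles only headings, paragraphs, and bold text, and it builds nodes directly instead of going through a token stream.

```typescript
// Hypothetical node shape for illustration; the project's real AST types may differ.
interface AstNode {
  type: "document" | "heading" | "paragraph" | "text" | "bold";
  value?: string; // raw text for blocks, literal content for inline nodes
  depth?: number; // heading level (1-6)
  children: AstNode[];
}

// First pass: identify structural blocks only; inline syntax stays untouched in `value`.
function parseBlocks(markdown: string): AstNode {
  const children = markdown
    .split(/\n{2,}/)
    .map((chunk) => chunk.trim())
    .filter((chunk) => chunk.length > 0)
    .map((chunk): AstNode => {
      const heading = /^(#{1,6})\s+(.*)$/.exec(chunk);
      if (heading) {
        const [, hashes = "", text = ""] = heading;
        return { type: "heading", depth: hashes.length, value: text, children: [] };
      }
      return { type: "paragraph", value: chunk, children: [] };
    });
  return { type: "document", children };
}

// Second pass: split raw text into inline nodes. A matched bold span keeps its
// content as-is and is not decomposed any further.
function parseInline(raw: string): AstNode[] {
  const nodes: AstNode[] = [];
  const bold = /\*\*(.+?)\*\*/g;
  let last = 0;
  let match: RegExpExecArray | null;
  while ((match = bold.exec(raw)) !== null) {
    const [whole = "", inner = ""] = match;
    if (match.index > last) {
      nodes.push({ type: "text", value: raw.slice(last, match.index), children: [] });
    }
    nodes.push({ type: "bold", value: inner, children: [] });
    last = match.index + whole.length;
  }
  if (last < raw.length) {
    nodes.push({ type: "text", value: raw.slice(last), children: [] });
  }
  return nodes;
}

function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// AST -> HTML: recursive tree walk; children are rendered before their parent wraps them.
function render(node: AstNode): string {
  const inner = node.children.map((child) => render(child)).join("");
  switch (node.type) {
    case "document":
      return inner;
    case "heading": {
      const level = node.depth ?? 1;
      return `<h${level}>${inner}</h${level}>`;
    }
    case "paragraph":
      return `<p>${inner}</p>`;
    case "bold":
      return `<strong>${escapeHtml(node.value ?? "")}</strong>`;
    case "text":
      return escapeHtml(node.value ?? "");
  }
}

// Full pipeline: block pass, inline pass on text-bearing blocks, then rendering.
function compile(markdown: string): string {
  const ast = parseBlocks(markdown);
  for (const block of ast.children) {
    block.children = parseInline(block.value ?? "");
  }
  return render(ast);
}
```

With this sketch, `compile("# Title\n\nSome **bold** text")` returns `<h1>Title</h1><p>Some <strong>bold</strong> text</p>`. Keeping the block pass unaware of inline syntax means each pass can be tested on its own.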

Testing

The core AST generation logic is tested for parsing stability and regression prevention:

  • 70% Code Coverage: Focus is placed on critical structural components.
  • Core Block Testing: High-priority modules (tables, headers, lists, captions, paragraphs).

testing screenshot

Technologies Used:

  • npm v10.8.2, Node v20.20.0
  • TypeScript v5.3.2
  • Webpack v5.105.3
  • TS-Loader v9.5.4
  • Tailwind CSS v4.0.12 (from website)
  • Shiki v4.0.2
  • KaTeX v0.16.33
  • Jest v30.2.0

How to use it:

  1. Clone the repository.

  2. Install the dependencies:

    yarn
  3. Build the compiler:

    yarn build
  4. Start the local server, then open http://localhost:8081 in your browser:

    yarn start
  5. Run the tests:

    yarn test

API Reference:

Please see entrypoint ./src/index.ts for the API reference.

API functions:

  • convertMDtoHTML(txt: string) - converts Markdown text into HTML; returns an HTML string
  • convertMDtoTokens(txt: string) - converts Markdown text into tokens; returns an array of tokens
  • convertMDtoAST(txt: string) - converts Markdown text into an AST; returns the Abstract Syntax Tree

How to use it in your project:
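
A minimal sketch of calling the three API functions listed above. The `MarkdownCompilerApi` interface and the `inspectMarkdown` helper are hypothetical names introduced for illustration. The token and AST types are not documented in this README, so they are typed as `unknown` here. The import path in the closing comment is an assumption; check ./src/index.ts for the actual exports.

```typescript
// Shape of the three functions listed in the API section. Token and AST types
// are not documented in this README, so they are left as `unknown`.
interface MarkdownCompilerApi {
  convertMDtoHTML(txt: string): string;
  convertMDtoTokens(txt: string): unknown[];
  convertMDtoAST(txt: string): unknown;
}

// Hypothetical helper: runs all three conversions on the same Markdown input.
function inspectMarkdown(api: MarkdownCompilerApi, markdown: string) {
  return {
    html: api.convertMDtoHTML(markdown),
    tokenCount: api.convertMDtoTokens(markdown).length,
    ast: api.convertMDtoAST(markdown),
  };
}

// In a real project the functions would come from the entrypoint, for example:
//   import * as compiler from "./src/index"; // assumed import path and export style
//   const result = inspectMarkdown(compiler, "# Hello");
//   document.body.innerHTML = result.html;
```

Passing the API in as a parameter keeps the helper independent of how the compiler is bundled or imported in your build.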

Directories:

  • ./src - the main compiler code:

    • ../test - the test code
    • ../htmlblocks - the html blocks to parse AST into HTML
    • ../content - the example text to parse
    • ../static - the static files index.html, css styles
    • ../types - integration with external libraries
  • /dist - the compiled code and static files; generated by the build command

Files:

  • ./src/index.ts - the entrypoint of the compiler
  • ./src/Grammar.ts - the grammar, defined as RegExp rules
  • ./src/Tokenizer.ts - the tokenizer class that builds the AST from Markdown text
  • ./src/Render.ts - the compiler class that compiles the AST into HTML

Author:

meugenom

License: MIT

About

[NO-AI] - Compiler to parse markdown text to html, TS, AST-Tree, REGEXP
