Turn your markdown notes into a small local model you can chat with.
md2LLM reads an Obsidian vault or any folder of Markdown files, scores the notes, generates fine-tuning data, recommends a training path based on your hardware, and helps you run the resulting model locally with Ollama.
Overview • Features • Getting Started • Run The App • Workflow • Training Options • Project Structure • Troubleshooting
> **Tip:** You can use the full browser workflow, or generate `training_data.jsonl` from the CLI with `python -m pipeline.main`.
## Overview

The project is split into three stages:

```
Markdown notes
      |
      v
Training data generation
      |
      v
Fine-tuning on MLX / Unsloth / Colab
      |
      v
Local chat through Ollama
```
Typical flow:

- Select a vault, folder upload, zip archive, or an existing `training_data.jsonl` file.
- Choose the training goal: `knowledge`, `style`, `reasoning`, or `chatbot`.
- Generate JSONL examples from your notes using an LLM provider (sample record below).
- Pick a base model from the recommended Hugging Face repos.
- Fine-tune locally on Apple Silicon with MLX, on NVIDIA with Unsloth, or use the Colab fallback.
- Export or load the trained model and chat with it locally.
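For orientation, each line of `training_data.jsonl` is a standalone JSON example. The exact schema is defined by `pipeline/data_generator.py`, so treat this chat-style record as a hypothetical illustration rather than the guaranteed format:

```json
{"messages": [{"role": "user", "content": "What does my note on spaced repetition recommend?"}, {"role": "assistant", "content": "Review at expanding intervals and test recall before rereading."}]}
```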
## Features

- Browser-based workflow for vault upload, note review, generation, training, and chat
- Markdown parsing with frontmatter, tags, wikilinks, and note quality scoring
- Four dataset generation modes for different personalization goals
- Hardware-aware training recommendations
- Local training support for Apple Silicon and NVIDIA GPUs
- Google Colab fallback for low-memory or CPU-only machines
- Ollama integration for model discovery and local chat
- CLI entry point for dataset generation without the frontend
## Getting Started

You need:

- Python 3.10+
- Node.js 18+
- Ollama installed locally for chat and local model usage
- An `OPENAI_API_KEY` for the default data-generation path
Recommended hardware:

- Apple Silicon Mac with 16 GB+ RAM for MLX training
- NVIDIA GPU with CUDA for Unsloth training
- Low-memory or CPU-only machines should expect to use the Colab path
Clone the repository and install the Python and frontend dependencies:

```bash
git clone <your-repo-url>
cd ownLLM
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cd frontend
npm install
cd ..
```

Install the training backend that matches your machine:
```bash
# Apple Silicon
pip install -r requirements-mlx.txt

# NVIDIA GPU
pip install -r requirements-unsloth.txt
```

> **Important:** `requirements-unsloth.txt` assumes a CUDA-compatible PyTorch build is already installed. The comments in that file call this out explicitly.
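Before installing the Unsloth requirements, you can sanity-check that a CUDA build of PyTorch is present (a quick check, not part of the repo):

```python
import torch

# On a correctly configured machine this prints: True 12.1 (or whatever CUDA version is installed).
# A CPU-only build prints: False None.
print(torch.cuda.is_available(), torch.version.cuda)
```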
Copy the example environment file:

```bash
cp .env.example .env
```

At minimum, set:

```
OPENAI_API_KEY=your-openai-api-key-here
```

Optional:

- Set `HF_TOKEN` if you want to train gated Hugging Face models such as Llama 3.2.
- Use an Ollama model for data generation from the UI instead of OpenAI if you want a local generation path.
## Run The App

Start the FastAPI backend:

```bash
python -m server.app
```

In a second terminal, start the Vite frontend:

```bash
cd frontend
npm run dev
```

Then open:

- Frontend: http://localhost:5173
- API: http://localhost:8000
- API docs: http://localhost:8000/docs
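To confirm the backend is up, you can hit FastAPI's default OpenAPI endpoint (assuming the default port):

```bash
# Any JSON response means the API is running.
curl -s http://localhost:8000/openapi.json | head -c 200
```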
Build the frontend and let FastAPI serve it:

```bash
cd frontend
npm run build
cd ..
python -m server.app
```

Open http://localhost:8000.
## Workflow

The app accepts:

- A vault path on disk
- A zipped vault upload
- A folder upload from the browser
- An existing `training_data.jsonl` file if you already generated data elsewhere

The backend registers the vault, scans Markdown files, and stores job state under `output/` and `uploads/`.
Generation supports four goals:

| Goal | What it optimizes for |
|---|---|
| `knowledge` | Answering questions about what is in your notes |
| `style` | Reproducing your writing voice |
| `reasoning` | Connecting ideas the way your notes do |
| `chatbot` | Producing conversational first-person responses |
By default, the app uses `gpt-4o-mini` for training-data generation, but the UI also supports choosing another provider/model string.
The pipeline:

- reads Markdown files recursively
- skips excluded folders such as `.obsidian`, `_templates`, `templates`, `attachments`, and `assets` (see the sketch below)
- scores notes based on length, metadata, links, tags, and draft/raw markers
- generates training pairs and streams progress back to the UI
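The actual scan lives in `pipeline/vault_reader.py`; as a rough sketch of the recursive walk with exclusions (the function and constant names here are illustrative, not the module's real API):

```python
from pathlib import Path

# Folders skipped during the scan, per the list above.
EXCLUDED_DIRS = {".obsidian", "_templates", "templates", "attachments", "assets"}

def iter_markdown_files(vault: Path):
    """Yield every Markdown file under the vault, skipping excluded folders."""
    for path in vault.rglob("*.md"):
        if not any(part in EXCLUDED_DIRS for part in path.parts):
            yield path
```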
Generated datasets are written to `output/` and can be downloaded from the app.
## Training Options

The training flow checks:

- hardware capabilities
- whether the selected Hugging Face model is already cached locally
- whether a Hugging Face token is required

From there, the app routes you to the best path for your machine.
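The real checks live in the backend routes; as an illustrative sketch of the routing idea (not the app's actual code):

```python
import platform
import shutil

def recommend_training_path() -> str:
    """Illustrative only: Apple Silicon -> MLX, visible NVIDIA GPU -> Unsloth, else Colab."""
    if platform.system() == "Darwin" and platform.machine() == "arm64":
        return "mlx"
    if shutil.which("nvidia-smi") is not None:  # crude proxy for a CUDA-capable GPU
        return "unsloth"
    return "colab"
```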
### MLX (Apple Silicon)

Uses `training/train_mlx.py` and exports a GGUF model after MLX LoRA fine-tuning.

Notes:

- Best experience is on Apple Silicon with 16 GB+ RAM
- The MLX script explicitly warns against problematic Python environments
- If memory is too tight, the app recommends Colab instead of forcing local training
### Unsloth (NVIDIA)

Uses `training/train_unsloth.py` for 4-bit LoRA fine-tuning with Unsloth and then attempts GGUF export.

Notes:

- Best suited to CUDA-capable machines
- Larger models may still require significant VRAM
- Gated models may require `HF_TOKEN`
### Google Colab fallback

The fallback notebook lives at `training/md2LLM_colab.ipynb`.

This path is recommended when:

- no suitable local GPU is available
- RAM is too limited for a safe local run
- you want a predictable training environment without local ML dependencies
The UI recommends curated Hugging Face models such as:

- `Qwen/Qwen2.5-1.5B-Instruct`
- `Qwen/Qwen2.5-3B-Instruct`
- `Qwen/Qwen2.5-7B-Instruct`
- `microsoft/Phi-3-mini-4k-instruct`
- `meta-llama/Llama-3.2-1B-Instruct`
- `meta-llama/Llama-3.2-3B-Instruct`
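If you want to pre-fetch a base model into the local Hugging Face cache before training, you can do so with `huggingface-cli` (optional; the app also checks the cache itself, and the CLI ships with the `huggingface_hub` package):

```bash
huggingface-cli download Qwen/Qwen2.5-1.5B-Instruct
```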
After training, models are stored under `models/`, and local chat uses the Ollama API.
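Once the trained model is registered with Ollama, you can also talk to it directly over Ollama's HTTP API; `my-notes` below is a hypothetical model name:

```bash
curl http://localhost:11434/api/chat -d '{
  "model": "my-notes",
  "messages": [{"role": "user", "content": "What do my notes say about spaced repetition?"}],
  "stream": false
}'
```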
If you only want to generate the dataset:

```bash
python -m pipeline.main \
  --vault /path/to/your/vault \
  --goal knowledge \
  --model gpt-4o-mini \
  --output output/training_data.jsonl
```

Useful flags:

- `--validate` to run LLM quality validation on generated pairs
- `--no-resume` to ignore any existing checkpoint and start fresh
- `--min-quality` to override the default note threshold
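Putting the flags together (the `0.6` for `--min-quality` is a placeholder; check the script's help output for the actual scale and defaults):

```bash
python -m pipeline.main \
  --vault /path/to/your/vault \
  --goal style \
  --validate \
  --no-resume \
  --min-quality 0.6 \
  --output output/training_data.jsonl
```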
## Project Structure

```
frontend/   React + Vite app for the end-to-end workflow
server/     FastAPI app, routes, job state, and frontend serving
pipeline/   Vault reader, note scoring, and training-data generation
training/   MLX, Unsloth, export helpers, and Colab notebook
models/     Trained model artifacts and GGUF outputs
output/     Generated JSONL files, checkpoints, and session data
uploads/    Temporary uploaded vault contents
```
Key files:

- `server/app.py`
- `server/routes/training.py`
- `server/routes/jobs.py`
- `pipeline/vault_reader.py`
- `pipeline/data_generator.py`
- `training/train_mlx.py`
- `training/train_unsloth.py`
## Troubleshooting

If chat or model discovery fails, start Ollama locally:

```bash
ollama serve
```

If a Hugging Face model download fails, check:

- whether the model is gated
- whether you accepted the model license on Hugging Face
- whether `HF_TOKEN` is set in `.env`
If you have no suitable local GPU, use the Colab flow. The UI already recommends it on weak hardware, and the notebook is part of the repo.
If http://localhost:8000 shows nothing, build the frontend first:

```bash
cd frontend
npm run build
```

The FastAPI app only serves the SPA from `frontend/dist` when that build output exists.
The app runs locally, but any note content sent out to generate training pairs is subject to the policies of the LLM provider you choose for generation. Fine-tuning, model files, and local chat stay on your machine unless you deliberately use a hosted environment such as Colab.