Skip to content

seastarbot/treeindex-go

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

1 Commit
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

TreeIndex-Go ๐ŸŒณ

Vectorless RAG Framework for Go โ€” No embeddings, no vector DB, just LLM reasoning.

Inspired by PageIndex

English | ไธญๆ–‡


ไธญๆ–‡

่ฟ™ๆ˜ฏไป€ไนˆ๏ผŸ

TreeIndex ๆ˜ฏไธ€ไธช็บฏ Go ๅฎž็Žฐ็š„ RAG๏ผˆๆฃ€็ดขๅขžๅผบ็”Ÿๆˆ๏ผ‰ๆก†ๆžถใ€‚ๅฎƒไธไฝฟ็”จๅ‘้‡ๆ•ฐๆฎๅบ“๏ผŒไธๅšๆ–‡ๆœฌๅˆ‡็‰‡๏ผŒ่€Œๆ˜ฏๅฐ†ๆ–‡ๆกฃๆž„ๅปบไธบๅฑ‚็บงๆ ‘็ดขๅผ•๏ผŒ็„ถๅŽ็”จ LLM ๆŽจ็†ๆฅๅฎšไฝๅ’Œๆฃ€็ดข็›ธๅ…ณไฟกๆฏใ€‚

ๆ ธๅฟƒ็†ๅฟต

ไผ ็ปŸ RAG:  ๆ–‡ๆกฃ โ†’ ๅˆ‡็‰‡ โ†’ ๅ‘้‡ๅตŒๅ…ฅ โ†’ ็›ธไผผๅบฆๆœ็ดข โ†’ ้กถ้ƒจ็ป“ๆžœ
TreeIndex: ๆ–‡ๆกฃ โ†’ ๆ ‘็ดขๅผ• โ†’ LLM ๆŽจ็†ๅฏผ่ˆช โ†’ ็ฒพ็กฎๅ†…ๅฎนๆๅ–

ไธบไป€ไนˆไธ็”จๅ‘้‡๏ผŸ

  • ๅ‘้‡ๆœ็ดขไพ่ต– embedding ๆจกๅž‹็š„่ดจ้‡
  • ๅฐๆ–‡ๆกฃ/ไธ“ไธš้ข†ๅŸŸ embedding ๆ•ˆๆžœๅทฎ
  • ้œ€่ฆ้ขๅค–้ƒจ็ฝฒๅ‘้‡ๆ•ฐๆฎๅบ“๏ผˆPinecone/Chroma/Milvus๏ผ‰
  • LLM ๆœฌ่บซ็š„ๆŽจ็†่ƒฝๅŠ›่ถณๅคŸๅš็ฒพๅ‡†ๅฎšไฝ

็‰นๆ€ง

  • ๐Ÿ”ถ ้›ถๅค–้ƒจไพ่ต– โ€” ไป…ไฝฟ็”จ Go ๆ ‡ๅ‡†ๅบ“ + net/http ่ฐƒ LLM API
  • ๐ŸŒณ ๅฑ‚็บงๆ ‘็ดขๅผ• โ€” ๆ–‡ๆกฃ็ป“ๆž„่‡ช็„ถๆ˜ ๅฐ„ไธบๆ ‘
  • ๐Ÿง  ๆŽจ็†ๅผๆฃ€็ดข โ€” LLM ่ฏปๆ ‘ โ†’ ๅฎšไฝ โ†’ ๆๅ–๏ผŒไธๆ˜ฏๅ‘้‡ๅŒน้…
  • ๐Ÿ“„ ๅคšๆ ผๅผๆ”ฏๆŒ โ€” PDFใ€Markdown
  • ๐Ÿค– ๅคš LLM ๆ”ฏๆŒ โ€” OpenAIใ€Anthropic๏ผˆไปปๆ„ๅ…ผๅฎน API๏ผ‰
  • ๐Ÿ“ฆ ๅ•ไบŒ่ฟ›ๅˆถ โ€” go build ๅณๅฏ้ƒจ็ฝฒ
  • ๐Ÿ’พ ็ดขๅผ•ๆŒไน…ๅŒ– โ€” JSON ๆ–‡ไปถๅญ˜ๅ‚จ๏ผŒ้šๆ—ถๅŠ ่ฝฝๆŸฅ่ฏข

ๅฟซ้€Ÿๅผ€ๅง‹

# ็ผ–่ฏ‘
go build -o treeindex ./cmd/treeindex

# ๅฟซ้€ŸๆŸฅ็œ‹ๆ–‡ๆกฃๆ ‘๏ผˆไธ้œ€่ฆ LLM๏ผ‰
./treeindex tree document.md

# ๆž„ๅปบ็ดขๅผ•๏ผˆ้œ€่ฆ LLM API key๏ผ‰
export OPENAI_API_KEY=sk-xxx
./treeindex index document.pdf

# ๆŸฅ่ฏข
./treeindex query document.treeindex.json "What is the main conclusion?"

# ไนŸๆ”ฏๆŒ Anthropic
export ANTHROPIC_AUTH_TOKEN=sk-ant-xxx
./treeindex index document.md --provider anthropic

ๅ‘ฝไปค

ๅ‘ฝไปค ่ฏดๆ˜Ž
treeindex index <file> ๆž„ๅปบๆ ‘็ดขๅผ•๏ผˆๅซ LLM ๆ‘˜่ฆ๏ผ‰
treeindex tree <file> ๅฟซ้€ŸๆŸฅ็œ‹ๆ–‡ๆกฃๆ ‘๏ผˆไธ้œ€่ฆ LLM๏ผ‰
treeindex query <index> <query> ๆŸฅ่ฏข็ดขๅผ•ๆ–‡ๆกฃ
treeindex show <index> ๆ˜พ็คบๅฎŒๆ•ดๆ ‘็ป“ๆž„

้€‰้กน

ๆ ‡ๅฟ— ่ฏดๆ˜Ž ้ป˜่ฎคๅ€ผ
--no-summarize ่ทณ่ฟ‡ LLM ๆ‘˜่ฆ็”Ÿๆˆ false
--merge <len> ๅˆๅนถๅฐ่Š‚๏ผˆ< len ๅญ—็ฌฆ๏ผ‰ 200
--top <n> ่ฟ”ๅ›ž็ป“ๆžœๆ•ฐ้‡ 5
--provider <p> LLM ๆไพ›ๅ•† openai
--format <f> ่พ“ๅ‡บๆ ผๅผ (text/json) text

ไฝœไธบๅบ“ไฝฟ็”จ

package main

import (
    "github.com/seastarbot/treeindex-go/internal/indexer"
    "github.com/seastarbot/treeindex-go/internal/llm"
    "github.com/seastarbot/treeindex-go/internal/retriever"
)

func main() {
    // ๅˆ›ๅปบ LLM provider
    provider, _ := llm.NewProvider(nil)

    // ๆž„ๅปบ็ดขๅผ•
    idx := indexer.New(provider, true)
    tree, _ := idx.IndexFile("document.pdf")
    tree.SaveToFile("index.json")

    // ๆŸฅ่ฏข
    ret := retriever.New(provider)
    results, _ := ret.Retrieve(tree, "What is the conclusion?", 5)
    answer, _ := ret.Answer("What is the conclusion?", results)
    fmt.Println(answer)
}

ๆžถๆž„

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”    โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”    โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  Parser  โ”‚โ”€โ”€โ”€โ–ถโ”‚  Indexer โ”‚โ”€โ”€โ”€โ–ถโ”‚  Tree    โ”‚
โ”‚ PDF/MD   โ”‚    โ”‚ LLMๆ‘˜่ฆ  โ”‚    โ”‚ JSONๅญ˜ๅ‚จ โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜    โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜    โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                                     โ”‚
                                     โ–ผ
                                โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
                                โ”‚ Retrieverโ”‚
                                โ”‚ LLMๆŽจ็†  โ”‚
                                โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

่ฎธๅฏ่ฏ

MIT License


English

What is this?

TreeIndex is a pure Go implementation of a RAG (Retrieval-Augmented Generation) framework. Instead of vector embeddings and similarity search, it builds a hierarchical tree index from documents and uses LLM reasoning to navigate and retrieve relevant content.

The Idea

Traditional RAG:  Document โ†’ Chunks โ†’ Embeddings โ†’ Vector Search โ†’ Top-K
TreeIndex:        Document โ†’ Tree Index โ†’ LLM Reasoning โ†’ Precise Extraction

Why skip vectors?

  • Vector search depends heavily on embedding model quality
  • Small documents and domain-specific content get poor embeddings
  • Requires deploying a vector database (Pinecone/Chroma/Milvus)
  • LLMs are already good at understanding document structure and finding information

Features

  • ๐Ÿ”ถ Zero dependencies โ€” Go stdlib + net/http only
  • ๐ŸŒณ Hierarchical tree index โ€” Document structure naturally maps to a tree
  • ๐Ÿง  Reasoning retrieval โ€” LLM reads the tree โ†’ locates โ†’ extracts
  • ๐Ÿ“„ Multi-format โ€” PDF, Markdown
  • ๐Ÿค– Multi-LLM โ€” OpenAI, Anthropic (any compatible API)
  • ๐Ÿ“ฆ Single binary โ€” Just go build and deploy
  • ๐Ÿ’พ Persistent indexes โ€” JSON file storage, load and query anytime

Quick Start

# Build
go build -o treeindex ./cmd/treeindex

# Quick tree view (no LLM needed)
./treeindex tree document.md

# Build index (requires LLM API key)
export OPENAI_API_KEY=sk-xxx
./treeindex index document.pdf

# Query
./treeindex query document.treeindex.json "What is the main conclusion?"

# Also supports Anthropic
export ANTHROPIC_AUTH_TOKEN=sk-ant-xxx
./treeindex index document.md --provider anthropic

Commands

Command Description
treeindex index <file> Build tree index (with LLM summaries)
treeindex tree <file> Quick tree view (no LLM needed)
treeindex query <index> <query> Query an indexed document
treeindex show <index> Display full tree structure

Options

Flag Description Default
--no-summarize Skip LLM summary generation false
--merge <len> Merge small sections (< len chars) 200
--top <n> Number of results to return 5
--provider <p> LLM provider (openai/anthropic) openai
--format <f> Output format (text/json) text

As a Library

package main

import (
    "github.com/seastarbot/treeindex-go/internal/indexer"
    "github.com/seastarbot/treeindex-go/internal/llm"
    "github.com/seastarbot/treeindex-go/internal/retriever"
)

func main() {
    // Create LLM provider
    provider, _ := llm.NewProvider(nil)

    // Build index
    idx := indexer.New(provider, true)
    tree, _ := idx.IndexFile("document.pdf")
    tree.SaveToFile("index.json")

    // Query
    ret := retriever.New(provider)
    results, _ := ret.Retrieve(tree, "What is the conclusion?", 5)
    answer, _ := ret.Answer("What is the conclusion?", results)
    fmt.Println(answer)
}

Architecture

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”    โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”    โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  Parser  โ”‚โ”€โ”€โ”€โ–ถโ”‚  Indexer โ”‚โ”€โ”€โ”€โ–ถโ”‚  Tree    โ”‚
โ”‚ PDF/MD   โ”‚    โ”‚ LLM Summ.โ”‚    โ”‚ JSON     โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜    โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜    โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                                     โ”‚
                                     โ–ผ
                                โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
                                โ”‚ Retrieverโ”‚
                                โ”‚ LLM Rsn. โ”‚
                                โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

License

MIT License

About

No description or website provided.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages